Core frameworks
Hugging Face library for pretrained diffusion models and generative image, audio, and 3D pipelines.
Hugging Face library providing pretrained transformer models and tooling for NLP, vision, audio, and multimodal machine learning.
Composable Python library for accelerator-oriented array computing with automatic differentiation, vectorization, and just-in-time compilation.
High-level, multi-backend deep learning API for building, training, and deploying neural networks in Python.
Open source machine learning framework with tensor computation, automatic differentiation, and production deployment tooling.
Classical & tabular ML
CatBoost is Yandex's gradient boosting library with strong categorical feature handling for classification and ranking.
category_encoders provides scikit-learn compatible transformers for encoding categorical variables.
Featuretools is a Python framework for automated feature engineering with relational and temporal data.
imbalanced-learn provides tools for learning from imbalanced datasets and integrates with scikit-learn.
LightGBM is Microsoft's fast gradient boosting framework using tree-based learning algorithms for ML and ranking.
StatsForecast is a Nixtla library for statistical forecasting.
PyOD is a Python library for detecting outlying objects in multivariate data.
Python machine learning library for classification, regression, clustering, dimensionality reduction, model selection, and preprocessing.
statsmodels provides classes and functions for estimating statistical models and conducting statistical tests.
XGBoost is a scalable gradient boosting library for supervised learning and production ML workloads.
Graph ML
Deep Graph Library is a framework for building and training graph neural networks.
NetworkX is a Python package for creating, manipulating, and studying complex networks.
PyTorch Geometric is a library for deep learning on irregular structures such as graphs.
Spektral is a Python library for graph deep learning built on Keras and TensorFlow.
Reinforcement learning
Gymnasium is a standard API and collection of environments for reinforcement learning research and development.
PettingZoo provides multi-agent reinforcement learning environments with a common Python API.
Stable-Baselines3 provides reliable PyTorch implementations of reinforcement learning algorithms.
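To make the "standard API" idea concrete, here is a minimal sketch of an environment that follows the Gymnasium-style convention: reset returns an observation and an info dict, and step returns observation, reward, terminated, truncated, and info. CoinFlipEnv and run_episode are hypothetical names made up for illustration; the block deliberately uses only the standard library and does not import gymnasium, so it shows the shape of the interface rather than the library itself.

```python
import random


class CoinFlipEnv:
    """Toy environment (hypothetical) mirroring Gymnasium-style signatures."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self._rng = random.Random()
        self._steps = 0
        self._coin = 0

    def reset(self, seed=None):
        # Re-seed only when a seed is given, then start a fresh episode.
        if seed is not None:
            self._rng = random.Random(seed)
        self._steps = 0
        self._coin = self._rng.randint(0, 1)
        return self._coin, {}  # (observation, info)

    def step(self, action):
        # Reward 1.0 when the action matches the coin that was just observed.
        reward = 1.0 if action == self._coin else 0.0
        self._steps += 1
        self._coin = self._rng.randint(0, 1)
        terminated = False                        # no natural end state here
        truncated = self._steps >= self.max_steps  # time limit reached
        return self._coin, reward, terminated, truncated, {}


def run_episode(env, policy, seed=0):
    """Run one episode and return the total reward collected."""
    obs, _info = env.reset(seed=seed)
    total = 0.0
    done = False
    while not done:
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total


# A policy that simply echoes the observation always matches the coin.
total_reward = run_episode(CoinFlipEnv(max_steps=5), lambda obs: obs)
```

Because any agent loop only depends on the reset and step signatures, the same run_episode function would work with any environment that follows this convention.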
Compute & data engines
Apache Arrow defines a cross-language columnar memory format for efficient analytics and data interchange.
Apache Spark MLlib is Spark's scalable machine learning library for distributed data processing workflows.
cuDF is a GPU DataFrame library in the RAPIDS ecosystem for loading, joining, aggregating, and filtering data.
Dask is a flexible Python library for parallel and distributed computing on larger-than-memory datasets.
In-process analytical database designed for fast OLAP queries and local data analysis.
Numba is an open source JIT compiler that translates Python and NumPy code into fast machine code.
RAPIDS is a suite of GPU-accelerated open source data science and analytics libraries.
Distributed AI compute framework for scaling Python applications, machine learning workloads, and model serving.
Training & efficiency
MIT HAN Lab implementation of Activation-aware Weight Quantization for efficient compressed LLM inference.
Library providing k-bit optimizers, matrix multiplication, and quantization routines for efficient deep learning training and inference.
Microsoft deep learning optimization library for distributed training, inference, compression, and scaling very large models.
Fast exact attention kernels with IO-aware algorithms for efficient Transformer training and inference.
Reference implementation for GPTQ one-shot post-training quantization of large language models.
Hugging Face Accelerate simplifies running PyTorch training and inference across distributed and mixed-precision setups.
Large-scale GPU training framework for transformer language models from NVIDIA.
Hugging Face library for parameter-efficient fine-tuning methods such as LoRA on large pretrained models.
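Several entries above concern weight quantization. As a rough illustration of the basic idea only, the sketch below shows plain symmetric absmax round-to-nearest quantization to signed 8-bit integers. This is a simpler baseline, not the AWQ or GPTQ algorithm and not the API of any library listed here; quantize_absmax and dequantize are hypothetical helper names.

```python
def quantize_absmax(weights, bits=8):
    """Map floats to signed integers using one shared scale (absmax scheme)."""
    qmax = 2 ** (bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from integer levels."""
    return [v * scale for v in q]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)
# Each restored weight differs from the original by at most half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The single shared scale is the main weakness of this baseline: one large outlier stretches the range and costs precision for every other weight, which is the kind of problem more careful quantization schemes try to reduce.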
Interpretability & robustness
Adversarial Robustness Toolbox provides attacks, defenses, training methods, and metrics for adversarial ML.
Captum is a PyTorch model interpretability library for understanding feature and neuron importance.
CleverHans is a library for benchmarking machine learning systems against adversarial examples.
Foolbox is a Python toolbox for creating adversarial examples against ML models.
LIME explains individual predictions of machine learning classifiers and other models.
SHAP explains machine learning model predictions using Shapley-value-based attribution methods.
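The Shapley value idea behind attribution methods can be sketched in a few lines: a feature's attribution is its average marginal contribution over all orderings in which features are added. The block below computes this exactly by brute force for a three-feature toy model. It is an illustration of the concept under stated assumptions (a zero baseline, a made-up toy_model), not the SHAP library's API or its approximation algorithms; all names are hypothetical.

```python
from itertools import permutations


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every feature ordering."""
    contrib = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        present = set()
        prev = value_fn(frozenset(present))
        for i in order:
            present.add(i)
            cur = value_fn(frozenset(present))
            contrib[i] += cur - prev   # marginal contribution of feature i
            prev = cur
    return [c / len(orderings) for c in contrib]


def toy_model(x):
    # Hypothetical model with one interaction term between features 0 and 2.
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]


instance = (1.0, 1.0, 1.0)   # the prediction being explained
baseline = (0.0, 0.0, 0.0)   # assumed reference input for "absent" features


def value_fn(subset):
    # Features in the subset take the instance value, the rest the baseline.
    mixed = [instance[i] if i in subset else baseline[i] for i in range(3)]
    return toy_model(mixed)


phi = shapley_values(value_fn, 3)
# The attributions sum to toy_model(instance) - toy_model(baseline).
```

Enumerating orderings costs n! model evaluations, so this only works for a handful of features; practical tools rely on sampling or model-specific shortcuts instead.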
Privacy & federated learning
Flower is a framework for building federated learning and federated AI systems.
Opacus is a PyTorch library for training neural networks with differential privacy.
PySyft enables data science on private data through privacy-preserving and remote data access workflows.
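As a small illustration of what a federated learning system aggregates, the sketch below shows weighted federated averaging: each client trains locally and sends back its parameters plus the number of examples it used, and the server averages the parameters weighted by example count. This is a generic sketch of the averaging step only, not the API of Flower or any other framework listed here; federated_average is a hypothetical name, and parameters are flattened to plain lists of floats for simplicity.

```python
def federated_average(client_updates):
    """Average client parameters, weighted by each client's example count.

    client_updates: list of (weights, num_examples) pairs, where weights is
    a list of floats and all clients share the same parameter length.
    """
    total = sum(n for _, n in client_updates)
    if total == 0:
        raise ValueError("no training examples reported by any client")
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        if len(weights) != dim:
            raise ValueError("clients disagree on parameter length")
        for j, w in enumerate(weights):
            averaged[j] += w * n / total
    return averaged


updates = [
    ([1.0, 2.0], 1),   # client A trained on 1 example
    ([3.0, 4.0], 3),   # client B trained on 3 examples
]
global_weights = federated_average(updates)
```

Only parameters and counts cross the network in this scheme; the raw training data stays on each client, which is the property federated frameworks build on.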
Apps & platforms
Unified data and AI lakehouse platform for building, training, serving, and governing ML models and AI agents.
Gradio is a Python library for building and sharing machine learning apps and demos.
Hugging Face hosts models, datasets, Spaces, and libraries for building, sharing, and deploying ML systems.
Hugging Face Hub is a platform for sharing models, datasets, and machine learning demos, and for managing repositories.
Streamlit is an open source Python framework for building interactive data and AI web apps.