ML Frameworks & Tooling

52 tools for ML frameworks & tooling.

Core frameworks

  • Hugging Face library for pretrained diffusion models and generative image, audio, and 3D pipelines.

  • Hugging Face library providing pretrained transformer models and tooling for NLP, vision, audio, and multimodal machine learning.

  • Composable Python library for accelerator-oriented array computing with automatic differentiation, vectorization, and just-in-time compilation.

  • High-level, multi-backend deep learning API for building, training, and deploying neural networks in Python.

  • Open-source machine learning framework with tensor computation, automatic differentiation, and production deployment tooling.

Classical & tabular ML

  • CatBoost is Yandex's gradient boosting library with strong categorical feature handling for classification and ranking.

  • category_encoders provides scikit-learn-compatible transformers for encoding categorical variables.

  • Featuretools is a Python framework for automated feature engineering with relational and temporal data.

  • imbalanced-learn provides tools for learning from imbalanced datasets and integrates with scikit-learn.

  • LightGBM is Microsoft's fast gradient boosting framework using tree-based learning algorithms for ranking, classification, and other ML tasks.

  • StatsForecast is a Nixtla library for statistical forecasting.

  • PyOD is a Python library for detecting outlying objects in multivariate data.

  • Python machine learning library for classification, regression, clustering, dimensionality reduction, model selection, and preprocessing.

  • statsmodels provides classes and functions for estimating statistical models and conducting statistical tests.

  • XGBoost is a scalable gradient boosting library for supervised learning and production ML workloads.

Graph ML

  • Deep Graph Library is a framework for building and training graph neural networks.

  • NetworkX is a Python package for creating, manipulating, and studying complex networks.

  • PyTorch Geometric is a library for deep learning on irregular structures such as graphs.

  • Spektral is a Python library for graph deep learning built on Keras and TensorFlow.

Reinforcement learning

  • Gymnasium is a standard API and collection of environments for reinforcement learning research and development.

  • PettingZoo provides multi-agent reinforcement learning environments with a common Python API.

  • Stable-Baselines3 provides reliable PyTorch implementations of reinforcement learning algorithms.

Compute & data engines

  • Apache Arrow defines a cross-language columnar memory format for efficient analytics and data interchange.

  • Apache Spark MLlib is Spark's scalable machine learning library for distributed data processing workflows.

  • cuDF is a GPU DataFrame library in the RAPIDS ecosystem for loading, joining, aggregating, and filtering data.

  • Dask is a flexible Python library for parallel and distributed computing on larger-than-memory datasets.

  • In-process analytical database designed for fast OLAP queries and local data analysis.

  • Numba is an open-source JIT compiler that translates Python and NumPy code into fast machine code.

  • RAPIDS is a suite of GPU-accelerated open-source data science and analytics libraries.

  • Distributed AI compute framework for scaling Python applications, machine learning workloads, and model serving.

Training & efficiency

  • MIT HAN Lab implementation of Activation-aware Weight Quantization (AWQ) for efficient compressed LLM inference.

  • Library providing k-bit optimizers, matrix multiplication, and quantization routines for efficient deep learning training and inference.

  • Microsoft deep learning optimization library for distributed training, inference, compression, and scaling very large models.

  • Fast, exact attention kernels with IO-aware algorithms for efficient Transformer training and inference.

  • Reference implementation for GPTQ one-shot post-training quantization of large language models.

  • Hugging Face Accelerate simplifies running PyTorch training and inference across distributed and mixed-precision setups.

  • Large-scale GPU training framework for transformer language models from NVIDIA.

  • Hugging Face library for parameter-efficient fine-tuning methods such as LoRA on large pretrained models.
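
To make the LoRA idea mentioned above concrete, here is a minimal standard-library sketch of a low-rank weight update. It is not the API of any library in this list: the dimensions, the `matmul` and `effective_weight` helpers, and the random values are all hypothetical and chosen only for illustration. It assumes the usual LoRA formulation, where a frozen weight W is adjusted by a scaled product B·A of two small trainable matrices, with B starting at zero.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Hypothetical sizes: a 6x8 layer adapted with rank r=2 and scaling alpha=4.
d_out, d_in, r, alpha = 6, 8, 2, 4.0
rng = random.Random(0)

W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen weights
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable, r x d_in
B = [[0.0] * r for _ in range(d_out)]                               # trainable, d_out x r, zero init

def effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A) without modifying W."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W_eff = effective_weight(W, A, B, alpha, r)

full_params = d_out * d_in          # parameters updated by full fine-tuning
lora_params = r * (d_in + d_out)    # parameters updated by the low-rank adapter
print(full_params, lora_params)
```

Because B is initialized to zero, the adapted layer starts out identical to the frozen one, and only the entries of A and B would be updated during training.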

Interpretability & robustness

  • Adversarial Robustness Toolbox provides attacks, defenses, training methods, and metrics for adversarial ML.

  • Captum is a PyTorch model interpretability library for understanding feature and neuron importance.

  • CleverHans is a library for benchmarking machine learning systems against adversarial examples.

  • Foolbox is a Python toolbox for creating adversarial examples against ML models.

  • LIME explains individual predictions of machine learning classifiers and other models.

  • SHAP explains machine learning model predictions using attribution methods based on Shapley values.
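
As a rough illustration of what Shapley-value attribution means, the sketch below computes exact Shapley values for a tiny toy model by averaging each feature's marginal contribution over every feature ordering. It uses only the standard library and does not call SHAP itself; the toy model, the input `x`, the all-zero baseline, and the helper names are assumptions made for this example. Enumerating all orderings is only feasible for a handful of features.

```python
from itertools import permutations

def shapley_values(value_fn, n_features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        present = set()
        prev = value_fn(frozenset(present))
        for i in order:
            present.add(i)
            cur = value_fn(frozenset(present))
            totals[i] += cur - prev   # contribution of feature i in this ordering
            prev = cur
    return [t / len(orderings) for t in totals]

# Hypothetical toy model with one interaction term between features 0 and 2.
def toy_model(v):
    return 2 * v[0] + 3 * v[1] + v[0] * v[2]

x = [1.0, 2.0, 4.0]          # instance to explain
baseline = [0.0, 0.0, 0.0]   # value used for features "absent" from a coalition

def value_fn(coalition):
    """Model output when only the features in `coalition` take their real values."""
    v = [x[i] if i in coalition else baseline[i] for i in range(len(x))]
    return toy_model(v)

phi = shapley_values(value_fn, len(x))
print(phi)  # [4.0, 6.0, 2.0]
```

The attributions sum to the difference between the model output for `x` (12.0) and for the baseline (0.0), and the interaction term `x0 * x2` is split evenly between features 0 and 2.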

Privacy & federated learning

  • Flower is a framework for building federated learning and federated AI systems.

  • Opacus is a PyTorch library for training neural networks with differential privacy.

  • PySyft enables data science on private data through privacy-preserving and remote data access workflows.
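
For readers unfamiliar with differentially private training, the following standard-library sketch shows the core step commonly used in DP-SGD: clip each per-sample gradient to a maximum norm, sum the clipped gradients, add Gaussian noise scaled to that norm, and average. It is not Opacus code; the function names, the gradient values, and the parameter settings are hypothetical, and it does no privacy accounting.

```python
import math
import random

def clip(grad, max_norm):
    """Scale a gradient vector down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]

def private_mean_gradient(per_sample_grads, max_norm, noise_multiplier, rng):
    """Clip per-sample gradients, add Gaussian noise to their sum, then average."""
    clipped = [clip(g, max_norm) for g in per_sample_grads]
    dim = len(per_sample_grads[0])
    summed = [sum(g[j] for g in clipped) for j in range(dim)]
    sigma = noise_multiplier * max_norm   # noise std is tied to the clipping norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_sample_grads) for v in noisy]

# Hypothetical per-sample gradients for a two-parameter model.
grads = [[3.0, 4.0], [0.3, 0.4]]
rng = random.Random(0)
update = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.0, rng=rng)
print(update)
```

Clipping bounds how much any single example can influence the update, which is what lets the added noise mask individual contributions.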

Apps & platforms

  • Unified data and AI lakehouse platform for building, training, serving, and governing ML models and AI agents.

  • Gradio is a Python library for building and sharing machine learning apps and demos.

  • Hugging Face hosts models, datasets, Spaces, and libraries for building, sharing, and deploying ML systems.

  • Hugging Face Hub is a platform for sharing models, datasets, and machine learning demos and for managing repositories.

  • Streamlit is an open-source Python framework for building interactive data and AI web apps.