Monitoring & Observability | MLNotebooks Tools

LLM observability & evals

Arize Phoenix
Arize Phoenix is an open-source AI observability and evaluation platform for LLM and RAG applications.
Braintrust
Platform for evaluating, monitoring, and improving AI applications with datasets, experiments, prompts, and traces.
Helicone
Open-source observability platform for logging, monitoring, caching, and analyzing LLM requests.
Langfuse
Langfuse is an open-source LLM engineering platform for tracing, evaluation, prompt management, and metrics.
LangSmith
LangSmith is an observability and evaluation platform for debugging, testing, and monitoring LLM apps.
TruLens
Open-source observability and evaluation tooling for tracking and improving LLM applications.

Arize
Arize AI provides observability and evaluation tools for troubleshooting ML models and LLM applications in production.
Evidently
Evidently provides open-source and managed tools to evaluate, test, and monitor AI and ML systems.
Fiddler
Fiddler is an AI observability platform for monitoring, explaining, and improving ML models and LLM applications.
WhyLabs
WhyLabs provides AI observability for monitoring data quality, model behavior, and production ML applications.

Comet ML
Comet is an experiment management and model production platform for tracking, comparing, and optimizing ML work.
MLflow
Open-source platform for managing the machine learning and generative AI lifecycle, from tracking to deployment.
Neptune.ai
Neptune is an experiment tracker and metadata store for logging, organizing, and comparing machine learning runs.
Weights & Biases
Developer platform for experiment tracking, model evaluation, dataset versioning, and ML observability.

Hyperopt
Hyperopt is a Python library for serial and parallel optimization over search spaces.
Optuna
Open-source automatic hyperparameter optimization framework for machine learning with define-by-run search spaces.