Language models
Google lite BERT variant with parameter sharing and factorized embeddings for efficient NLP pretraining.
Google bidirectional Transformer language model for pretraining contextual representations for NLP tasks.
Microsoft disentangled-attention Transformer models for improved natural language understanding.
Hugging Face distilled BERT family providing smaller, faster Transformer models for NLP.
Google pretraining method using replaced-token detection for sample-efficient Transformer language representations.
OpenAI autoregressive Transformer language model for text generation and language modeling research.
Selective state space model architecture for efficient long-sequence language modeling.
Meta optimized BERT pretraining recipe and models implemented in fairseq for robust NLP representations.
Google text-to-text Transformer framework and models casting NLP tasks into a unified sequence generation format.
Transformer-XL-based model that uses permutation language modeling for generalized autoregressive pretraining.
NLP libraries
AllenAI research library for building and evaluating deep learning models for NLP.
Facebook AI sequence modeling toolkit for translation and text generation research.
NLP framework for state-of-the-art sequence labeling and embeddings.
Python library for topic modeling, document similarity, and training unsupervised vector space representations at scale.
Hugging Face Tokenizers provides fast modern tokenizers used for NLP model preprocessing.
KenLM is a toolkit for building, querying, and using statistical language models.
Efficient neural machine translation framework written in C++ for research and production use.
Python toolkit providing corpora, lexical resources, and classic NLP algorithms for language processing research and teaching.
Open-source ecosystem for neural machine translation and sequence learning toolkits.
SentencePiece is an unsupervised text tokenizer and detokenizer for neural text generation systems.
Industrial-strength Python NLP library for tokenization, tagging, parsing, named entity recognition, and production pipelines.
Stanford NLP Python library with neural pipelines for tokenization, POS tagging, parsing, NER, and sentiment analysis across many languages.
Python library offering a simple API for common NLP tasks such as tagging and sentiment analysis.
tiktoken is a fast BPE tokenizer for OpenAI model text tokenization.
Embeddings
BAAI FlagEmbedding models and tools for dense retrieval and embedding generation, including the BGE cross-encoder rerankers.
Cohere Embed models and API for semantic search, RAG, classification, and clustering.
Microsoft EmbEddings from bidirEctional Encoder rEpresentations models for text embedding and retrieval tasks.
AllenNLP contextual word representation model using deep bidirectional language models.
Meta library for efficient text classification and word representations using subword information.
Stanford unsupervised word embedding algorithm based on global word-word co-occurrence statistics.
Alibaba DAMO general text embedding model for semantic similarity and dense retrieval.
Jina AI embedding models and API for multilingual, multimodal, and long-context retrieval use cases.
Nomic open text embedding model for long-context semantic search and retrieval.
OpenAI embedding model family for converting text into vectors for search, clustering, and retrieval.
Python framework for sentence, text, and image embeddings using Transformer models.
High-performance Hugging Face inference server for text embeddings and reranking models.
All-in-one embeddings database for semantic search and language model workflows.
Voyage AI embedding and reranking models for retrieval, search, and RAG applications.
Google toolkit for learning efficient word vector representations from large text corpora.
Vector databases & indexes
Spotify Annoy is a C++ and Python library for approximate nearest-neighbor search with memory-mapped indexes.
Open-source AI application database for embeddings, vector search, and retrieval workflows.
Elasticsearch is Elastic's distributed search and analytics engine for full-text, vector, and hybrid retrieval workloads.
Facebook AI Similarity Search is a library for efficient similarity search and clustering of dense vectors.
HNSWlib is a lightweight C++ and Python library for approximate nearest-neighbor search using HNSW graphs.
Open-source vector database for AI applications built on the Lance columnar data format.
Marqo is an AI-native search engine and API for multimodal vector search and retrieval.
PostgreSQL extension that adds vector similarity search for embeddings inside Postgres.
Managed vector database for building search, recommendation, and RAG applications at scale.
Vector similarity search engine and database for high-performance neural search and RAG systems.
Google ScaNN performs efficient vector similarity search at scale for maximum inner product and nearest-neighbor queries.
Serverless vector database focused on low-cost, large-scale similarity search.
Search and serving engine for vector search, recommendation, and large-scale inference.
Open-source vector database with hybrid search, generative search, and scalable AI-native data storage.
Retrieval engines & rerankers
Anserini is a Lucene-based toolkit for reproducible information retrieval research.
Apache Lucene is a high-performance Java search library for indexing and ranked retrieval.
Classic probabilistic bag-of-words ranking function used in search engines and information retrieval.
Cohere Rerank is an API model for reordering search results and documents by semantic relevance to a query.
Stanford late-interaction neural retrieval model for efficient and effective passage search.
Meta Dense Passage Retrieval implementation for open-domain question answering.
Open-source search and analytics suite with full-text search and vector search capabilities.
Pyserini is a Python toolkit for reproducible information retrieval research and retrieval pipelines.
NAVER sparse lexical and expansion model for neural information retrieval.
Tantivy is a Rust full-text search engine library inspired by Apache Lucene and used to build search systems.
Terrier is an open-source search engine and information retrieval platform.
Whoosh is a pure Python library for indexing text and searching indexed content.
Agents & RAG frameworks
Microsoft AutoGen is a framework for building and evaluating multi-agent AI applications.
CrewAI is a Python framework and platform for orchestrating role-based multi-agent automations.
Stanford framework for programming and optimizing language-model pipelines with declarative modules.
Guidance lets developers control language models with constrained generation and structured prompting.
Deepset's open-source framework for building production-ready LLM applications and RAG pipelines.
LangChain is a framework for building LLM applications with chains, agents, retrieval, and integrations.
Letta is a framework and platform for building stateful agents with memory and tool use.
LlamaIndex is a framework for connecting data to LLM apps with agents, workflows, and RAG pipelines.
Mastra is a TypeScript agent framework for workflows, memory, evals, and integrations in AI apps.
NVIDIA framework for adding programmable guardrails and safety controls to conversational AI apps.
OpenAI Agents SDK is a Python toolkit for building agentic applications with tools, handoffs, and tracing.
Library for structured text generation with LLMs using regexes and type constraints.
Microsoft SDK for orchestrating AI agents and integrating LLMs with application workflows.
Evaluation & benchmarks
Heterogeneous benchmark suite and codebase for zero-shot information retrieval evaluation.
Open-source LLM evaluation framework for testing RAG and language-model applications.
Giskard provides testing and evaluation tools to detect risks in AI models and LLM applications.
General Language Understanding Evaluation benchmark suite for natural language understanding systems.
Microsoft Machine Reading Comprehension dataset and benchmark for passage ranking and QA tasks.
Massive Text Embedding Benchmark for evaluating text embedding models across many tasks.
Open-source tool for testing and red-teaming prompts and LLM applications.
Framework for evaluating retrieval-augmented generation and LLM applications with metrics and test data generation.
Stanford Question Answering Dataset benchmark for reading comprehension question answering systems.
More challenging language understanding benchmark suite building on GLUE for NLU evaluation.
Benchmark for resolving real GitHub software issues using language models and coding agents.
trec_eval is the NIST tool for evaluating ad hoc retrieval runs using TREC measures.