LLM rankings
1051 models from 37 providers across 8 categories. One clear metric per list.
Categories
Each leaderboard uses a single scoring rule—open one to see the full table.
Largest Context Window
Models ranked by maximum context window size.
Best for RAG
Models best suited for Retrieval-Augmented Generation workloads.
Best for AI Agents
Models with the capabilities needed to power autonomous agent workflows.
Best for Document Processing
Models with large enough context to process long documents in one pass.
Best Value per Context Token
Models offering the most context window per dollar of input cost.
Best Multimodal
Vision-capable models with the largest context windows.
Best Reasoning
Models with extended thinking or strong reasoning capabilities.
Best for Chatbots
Fast, cost-efficient models ideal for real-time conversational applications.
How rankings work
One metric per list, no subjective blending.
Every category ranks on a single verifiable spec — context window size, price per token, cost per context token, or declared capabilities like vision and tool use. Only active models are included, so you never compare against retired IDs.
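As a rough illustration of single-metric ranking, the sketch below filters out inactive models and sorts by exactly one key. The `Model` fields, the sample entries, and the definition of "context per dollar" (context window divided by USD price per million input tokens) are assumptions made for this example, not the site's actual data model or formula.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Model:
    # Hypothetical record; field names are illustrative only.
    id: str
    context_window: int          # maximum tokens per request
    input_price_per_mtok: float  # USD per 1M input tokens
    active: bool = True


def rank(models: List[Model], metric: Callable[[Model], float],
         descending: bool = True) -> List[Model]:
    """Rank active models by one metric, with no blending of scores."""
    active = [m for m in models if m.active]
    return sorted(active, key=metric, reverse=descending)


def context_per_dollar(m: Model) -> float:
    # Assumed definition: context tokens per USD of per-million input price.
    return m.context_window / m.input_price_per_mtok


# Made-up sample data, not real models or prices.
models = [
    Model("model-a", 1_000_000, 2.00),
    Model("model-b", 200_000, 0.25),
    Model("model-c", 128_000, 0.50),
    Model("model-old", 2_000_000, 1.00, active=False),  # retired, excluded
]

largest = rank(models, lambda m: m.context_window)

# Skip zero-priced entries so the ratio is always defined.
priced = [m for m in models if m.input_price_per_mtok > 0]
best_value = rank(priced, context_per_dollar)

print([m.id for m in largest])
print([m.id for m in best_value])
```

Because each list uses its own key, the same model can lead one list and trail another: here the largest window and the best value per dollar belong to different entries.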
Context windows have grown from tens of thousands to millions of tokens. That helps with big documents and codebases, but long-context quality and pricing still vary widely. Use these lists to match window size and cost to what you actually send on each call.
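To make "what you actually send on each call" concrete, here is a minimal sketch that estimates input cost for one request and checks whether the prompt fits a given window. The function names, the per-million-token pricing unit, and the idea of reserving room for output inside the window are assumptions for illustration; providers differ on how output tokens count against the limit.

```python
def input_cost_usd(prompt_tokens: int, price_per_mtok: float) -> float:
    """Estimated input cost of one call, given a USD price per 1M input tokens."""
    return prompt_tokens / 1_000_000 * price_per_mtok


def fits_window(prompt_tokens: int, context_window: int,
                reserved_output_tokens: int = 0) -> bool:
    """True if the prompt plus any reserved output budget fits in the window."""
    return prompt_tokens + reserved_output_tokens <= context_window


# Hypothetical call: a 50k-token prompt against a 128k window at $2.00 per 1M tokens.
cost = input_cost_usd(50_000, 2.00)
ok = fits_window(50_000, 128_000, reserved_output_tokens=4_000)
print(f"estimated input cost: ${cost:.4f}, fits: {ok}")
```

A window far larger than the typical prompt adds no benefit on its own, so comparing the per-call estimate across candidates is usually more informative than comparing maximum window sizes.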
Where to start
- Chatbots & assistants: Best for chatbots, Best for agents
- Long documents: Largest context, Best for documents
- RAG: Best for RAG
- Low cost: Cheapest per context token
FAQ
Common questions about how these lists are built.