Best for RAG

Models best suited for Retrieval-Augmented Generation workloads.

Models ranked: 784

Top context: 131K (Llama3 2 11b Vision)

Leading provider: OpenAI (141 models in this ranking)

Leaderboard

Rank	Model	Provider	Context	Max output	Input $/M	RAG score
1	Llama3 2 11b Vision	Meta	131K	131K	$0.015/M	131K ctx
2	Llama3 2 3b	Meta	131K	131K	$0.015/M	131K ctx
3	Granite 4.0 Micro	IBM	131K	131K	$0.017/M	131K ctx
4	gpt-oss-20b	OpenAI	131K	131K	$0.020/M	131K ctx
5	Llama 3 2 3b	Meta	131K	131K	$0.020/M	131K ctx
6	Meta Llama 3 1 8b Instruct	Meta	131K	131K	$0.020/M	131K ctx
7	Mistral Nemo Instruct 2407	Mistral	131K	131K	$0.020/M	131K ctx
8	Llama Guard 3 8B	Meta	131K	131K	$0.020/M	131K ctx
9	Qwen2 Vl 7b	Alibaba	131K	131K	$0.020/M	131K ctx
10	Llama 3 1 8b	Meta	131K	16K	$0.020/M	131K ctx
11	Meta Llama 3 1 8b	Meta	128K	2K	$0.020/M	128K ctx
12	Hermes3 8b	Nous Research	131K	131K	$0.025/M	131K ctx
13	Llama3 1 8b	Meta	128K	128K	$0.025/M	128K ctx
14	Deepseek R1 Distill Llama 8b	Meta	131K	—	$0.025/M	131K ctx
15	Llama 3 2 1b	Meta	128K	128K	$0.027/M	128K ctx
16	Qwen3 4b Fp8	Alibaba	128K	20K	$0.030/M	128K ctx
17	Llama 4 Maverick 17b 128e Instruct Fp8	Meta	1M	16K	$0.050/M	1000K ctx
18	Qwen Turbo Latest	Alibaba	1M	16K	$0.050/M	1000K ctx
19	Llama 4 Scout 17b 16e	Meta	10M	16K	$0.050/M	10000K ctx
20	Qwen Turbo 2024 11 01	Alibaba	1M	8K	$0.050/M	1000K ctx
21	Qwen Turbo 2025 04 28	Alibaba	1M	16K	$0.050/M	1000K ctx
22	Amazon Nova Micro	Amazon	128K	10K	$0.035/M	128K ctx
23	Nova Micro 1.0	Amazon	128K	10K	$0.035/M	128K ctx
24	Apac Amazon Nova Micro	Amazon	128K	10K	$0.037/M	128K ctx
25	Command R7B (12-2024)	Cohere	128K	4K	$0.037/M	128K ctx

Showing 25 of 784 models
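
As a rough illustration of how the Context, Max output, and Input $/M columns might be used together when sizing a RAG request, the sketch below estimates the input cost of a prompt and checks whether it fits a model. The `ModelRow` class and both helper functions are hypothetical names introduced for this example, not part of any leaderboard API. It assumes that "K" means 1,000 tokens, that "$/M" means US dollars per million input tokens, and that prompt and output tokens share the context window; providers may differ on all three points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    """One leaderboard row (hypothetical structure for illustration)."""
    name: str
    context_tokens: int           # context window, assuming K = 1,000 tokens
    max_output_tokens: int        # maximum output length in tokens
    input_usd_per_million: float  # input price, assumed USD per 1M tokens


def input_cost_usd(row: ModelRow, prompt_tokens: int) -> float:
    """Estimate the input-side cost of a single request."""
    return prompt_tokens / 1_000_000 * row.input_usd_per_million


def fits(row: ModelRow, prompt_tokens: int, output_tokens: int) -> bool:
    """Check the request against both limits.

    Assumes prompt and output share the context window, which is
    common but not universal.
    """
    within_context = prompt_tokens + output_tokens <= row.context_tokens
    within_output = output_tokens <= row.max_output_tokens
    return within_context and within_output


# Values copied from rows 3 and 25 of the table above.
granite = ModelRow("Granite 4.0 Micro", 131_000, 131_000, 0.017)
command_r7b = ModelRow("Command R7B (12-2024)", 128_000, 4_000, 0.037)

if __name__ == "__main__":
    # A RAG prompt with 100K tokens of retrieved passages and an 8K-token answer.
    for row in (granite, command_r7b):
        cost = input_cost_usd(row, 100_000)
        print(f"{row.name}: ${cost:.4f} input, fits={fits(row, 100_000, 8_000)}")
```

The check on `max_output_tokens` matters because several rows pair a large context window with a much smaller output limit, so a model can accept a long retrieved context and still be unable to return a long answer.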