Context Window
Which LLM context window fits your use case?
Compare context windows, rank models by use case, and calculate how much Mem0 can reduce your token usage.
What you get
Every LLM context window, ranked and ready
Browse every model by how much text it can handle, find the right one for your use case, and see how Mem0 reduces what you need to send.
Models indexed
1051+
Providers covered
37
Largest context
10M
Use-case rankings
8
Context window spectrum
Models grouped by how much text they can handle
Largest: Llama 4 Scout 17b 128e Instruct Maas at 10M tokens
Rankings by use case
The right model depends on more than size. See what actually works for each job.
- RAG: Best context for retrieval
- AI Agents: Supports tools and actions
- Documents: Read entire files at once
- Chatbots: Fast and conversational
Mem0 cuts context by ~80%
Mem0 remembers what matters so you don't have to send the full history every time. Smaller requests mean faster, cheaper AI.
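The savings math is simple: if memory retrieval lets you send a few short memories instead of the full history, the reduction is one minus the ratio of the two token counts. A minimal sketch, not the Mem0 API — the 4-characters-per-token ratio is a rough rule of thumb, and the inputs are illustrative:

```python
# Illustrative sketch (not the Mem0 API): estimate input-token savings from
# sending a few retrieved memories instead of the full chat history.
# Assumes ~4 characters per token, a common rough heuristic for English text.

def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate: ~4 characters per token."""
    return max(1, len(text) // chars_per_token)

def context_savings(full_history: list[str], retrieved: list[str]) -> float:
    """Fraction of input tokens saved by sending memories instead of history."""
    full = sum(estimate_tokens(m) for m in full_history)
    reduced = sum(estimate_tokens(m) for m in retrieved)
    return 1 - reduced / full

# Hypothetical session: 10 history messages vs. 2 retrieved memories,
# each ~40 characters long.
history = ["x" * 40] * 10
memories = ["x" * 40] * 2
print(f"{context_savings(history, memories):.0%} less to send")
```

Under these toy inputs the savings come out to 80%; real reductions depend on how long the history is and how many memories are retrieved per call.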
All Models
Rankings
Models ranked by use case
Not all models are equal for every job. Pick the ranking that matches what you're building.
Largest Context Window
1051 models
Models ranked by maximum context window size.
Best for RAG
1051 models
Models best suited for Retrieval-Augmented Generation workloads.
Best for AI Agents
1051 models
Models with the capabilities needed to power autonomous agent workflows.
Best for Document Processing
1051 models
Models with large enough context to process long documents in one pass.
Best Value per Context Token
1051 models
Models offering the most context window per dollar of input cost.
Best Multimodal
1051 models
Vision-capable models with the largest context windows.
Best Reasoning
1051 models
Models with extended thinking or strong reasoning capabilities.
Best for Chatbots
1051 models
Fast, cost-efficient models ideal for real-time conversational applications.
Powered by Mem0
Use a smaller model.
Get better results.
Mem0 gives your AI long-term memory so you stop re-sending context on every call. That means you can use a smaller, faster, cheaper model — and still get better answers.
Example: a multi-turn chat session
80% less to send — works with any model
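The multi-turn pattern above can be sketched as follows. This is a toy illustration of the idea, not Mem0's implementation: memories are ranked by naive word overlap with the incoming message and only the top few are sent, where a real system would extract memories with an LLM and rank them with embeddings. All names and example memories here are hypothetical:

```python
# Toy sketch of memory-backed chat: instead of resending the whole
# conversation each turn, keep short memories and send only the few most
# relevant to the new message. Scoring is naive word overlap, purely for
# illustration -- a production system (e.g. Mem0) uses embeddings.

def relevance(memory: str, query: str) -> int:
    """Count words shared between a memory and the query (case-insensitive)."""
    return len(set(memory.lower().split()) & set(query.lower().split()))

def build_context(memories: list[str], query: str, top_k: int = 2) -> list[str]:
    """Pick the top_k memories most relevant to the incoming message."""
    ranked = sorted(memories, key=lambda m: relevance(m, query), reverse=True)
    return [m for m in ranked[:top_k] if relevance(m, query) > 0]

memories = [
    "User prefers vegetarian food",
    "User lives in Berlin",
    "User is training for a marathon",
]
print(build_context(memories, "Suggest a vegetarian restaurant in Berlin"))
# -> ['User lives in Berlin', 'User prefers vegetarian food']
```

Each turn, the model receives the new message plus two short memories rather than the entire transcript, which is where the token reduction comes from.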