Context Window

The context window is the amount of text, measured in tokens, that a model can process as input at one time. LLMs offer few other ways to interact with them. Simply put, you do almost everything through the context window. What goes in the context window:

User prompt: The text describing the task or action you want the model to perform.
System prompt: The text that tells the model how to behave.
Few-shot examples: Examples of inputs and outputs that help the model perform better.

Pretty much everything goes in the context window, which makes it critical and, in many situations, a bottleneck.
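To make the idea concrete, here is a minimal sketch of how the three pieces listed above might be combined into one context and checked against a window size. The `Context` class, the `Input:`/`Output:` layout for examples, and the rule of thumb of roughly 4 characters per token are all assumptions made for illustration; a real tokenizer will report different counts, and real APIs structure prompts in their own ways.

```python
from dataclasses import dataclass, field

# Assumption: roughly 4 characters per token for English text.
# This is only a rule of thumb, not how any real tokenizer works.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a crude token estimate based on character count."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


@dataclass
class Context:
    """Hypothetical container for everything sent to the model."""
    system_prompt: str
    user_prompt: str
    few_shot_examples: list[tuple[str, str]] = field(default_factory=list)

    def render(self) -> str:
        """Join system prompt, examples, and user prompt into one text."""
        parts = [self.system_prompt]
        for example_input, example_output in self.few_shot_examples:
            parts.append(f"Input: {example_input}\nOutput: {example_output}")
        parts.append(self.user_prompt)
        return "\n\n".join(parts)

    def token_estimate(self) -> int:
        """Estimate how many tokens the rendered context occupies."""
        return estimate_tokens(self.render())

    def fits(self, window_size: int) -> bool:
        """Check whether the estimate stays within the given window size."""
        return self.token_estimate() <= window_size


context = Context(
    system_prompt="You are a concise assistant.",
    user_prompt="Classify the sentiment of: 'The battery life is great.'",
    few_shot_examples=[("I love it", "positive"), ("It broke in a day", "negative")],
)
print(context.token_estimate(), context.fits(4096))
```

Every few-shot example added to the list grows the rendered text, so the examples compete with the user prompt for the same limited space.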

The table below compares context window sizes across models from OpenAI, Google, xAI, Meta, and Anthropic.

Model	Context Window Size
GPT-3.5	4,096 tokens
GPT-4	8,192 tokens
GPT-4-turbo	128,000 tokens
Gemini 2.0 Flash	1,000,000 tokens
Gemini 2.0 Pro	2,000,000 tokens
Grok 3	1,000,000 tokens
Grok 4 Fast	2,000,000 tokens
LLaMA 3	8,192 tokens
Claude Sonnet 4.5	200,000 tokens
Claude Sonnet Corp.	1,000,000 tokens
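
As a rough illustration of why these numbers matter, the sketch below looks up which models from the table could accept an input of a given size. The `CONTEXT_WINDOWS` dictionary and the `models_that_fit` helper are hypothetical names introduced here; the sizes are copied from the table as listed and may not match current vendor documentation.

```python
# Context window sizes in tokens, copied from the table above.
# Treat these as illustrative; vendors change limits over time.
CONTEXT_WINDOWS = {
    "GPT-3.5": 4_096,
    "GPT-4": 8_192,
    "GPT-4-turbo": 128_000,
    "Gemini 2.0 Flash": 1_000_000,
    "Gemini 2.0 Pro": 2_000_000,
    "Grok 3": 1_000_000,
    "Grok 4 Fast": 2_000_000,
    "LLaMA 3": 8_192,
    "Claude Sonnet 4.5": 200_000,
}


def models_that_fit(input_tokens: int) -> list[str]:
    """Return the models whose window is at least input_tokens, in table order."""
    return [
        model
        for model, window_size in CONTEXT_WINDOWS.items()
        if window_size >= input_tokens
    ]


# Example: an input of about 10,000 tokens rules out the smallest windows.
print(models_that_fit(10_000))
```

This simple check only compares input size against the listed window; it says nothing about how well a model uses a long context.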

The Art of Sense: A Philosophy of Modern AI