Context Window

The context window is the amount of text a model can process as input. LLMs offer very few ways to interact with them: practically everything you do goes through the context window. What goes in the context window:

  • User prompt: The text describing the task or action you want the model to perform.
  • System prompt: The text that tells the model how to behave.
  • Few-shot examples: Example inputs and outputs that help the model perform better.

Pretty much everything goes through the context window, which makes it critical and, in many situations, a bottleneck.
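
To make this concrete, here is a minimal sketch of how the pieces above are assembled into a single input, assuming an OpenAI-style chat message format; the prompts and examples are placeholders, not recommendations.

```python
# Sketch: the system prompt, few-shot examples, and user prompt are all
# concatenated into one sequence of messages. Everything in this sequence
# consumes space in the model's context window.

system_prompt = "You are a concise assistant that answers in bullet points."

few_shot_examples = [
    {"role": "user", "content": "Summarize: The cat sat on the mat."},
    {"role": "assistant", "content": "- A cat sat on a mat."},
]

user_prompt = "Summarize: Context windows limit how much text a model can read."

# The full input the model sees: system prompt first, then examples, then the task.
messages = (
    [{"role": "system", "content": system_prompt}]
    + few_shot_examples
    + [{"role": "user", "content": user_prompt}]
)

for message in messages:
    print(f"{message['role']}: {message['content']}")
```

Every item in that list is tokenized and counted against the model's context window, along with the tokens the model generates in response.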

The table below compares context window sizes across models from OpenAI, Google, xAI, Meta, and Anthropic.

| Model | Context Window Size |
|---|---|
| GPT-3.5 | 4,096 tokens |
| GPT-4 | 8,192 tokens |
| GPT-4-turbo | 128,000 tokens |
| Gemini 2.0 Flash | 1,000,000 tokens |
| Gemini 2.0 Pro | 2,000,000 tokens |
| Grok 3 | 1,000,000 tokens |
| Grok 4 Fast | 2,000,000 tokens |
| LLaMA 3 | 8,192 tokens |
| Claude Sonnet 4.5 | 200,000 tokens |
| Claude Sonnet Corp. | 1,000,000 tokens |
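
Because these limits are hard caps, it is common to budget tokens before sending a request. The sketch below is a rough illustration under assumptions: it uses the tiktoken package with its cl100k_base encoding as a stand-in tokenizer (not the exact tokenizer of every model listed), and the limits are taken from the table above.

```python
# Rough sketch of budgeting a prompt against a model's context window.
# Assumes `pip install tiktoken`; cl100k_base is a generic stand-in tokenizer.
import tiktoken

CONTEXT_WINDOWS = {
    "gpt-3.5": 4_096,
    "gpt-4": 8_192,
    "gpt-4-turbo": 128_000,
}

def fits_in_context(text: str, model: str, reserve_for_output: int = 512) -> bool:
    """Return True if the prompt plus a reserved output budget fits the window."""
    encoding = tiktoken.get_encoding("cl100k_base")
    prompt_tokens = len(encoding.encode(text))
    return prompt_tokens + reserve_for_output <= CONTEXT_WINDOWS[model]

if __name__ == "__main__":
    prompt = "Summarize the history of context windows in large language models."
    print(fits_in_context(prompt, "gpt-4"))  # True: a short prompt fits easily
```

Reserving some of the window for the model's output matters because the generated tokens share the same budget as the input.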