Advanced Context Window Management

Claude Code does a great job of managing the context window. It is not just a wrapper on top of an LLM API; it is a blend of generative AI and engineering. Here is how Claude Code manages the context window in an advanced way.

Right now Claude Code has a limit of 200k tokens. Before this gets confusing, we need to distinguish between two kinds of tokens: payable tokens and context window tokens. If you are on a subscription-based plan, you have a token limit, and once you hit it you go through a “cool down” period. If you use a pay-per-token API plan, you have no such limit because you pay for every token. These are the payable tokens.

The other kind is context window tokens. Whether you are on a flat-fee subscription or an API plan, you have a limit of 200k context window tokens. If you look at the image at the top, I can explain some of the advanced magic Claude Code is doing.

First we need to understand how the context window works. Every input and output token in the conversation consumes context window tokens. So the more questions you ask and the more answers you get, the more of your context window you use. Eventually you will run out of context window tokens, and then Claude will compact or clear the conversation.
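To make the accounting concrete, here is a minimal sketch of a conversation filling up the window. The token counts per turn are made up for illustration, and the helper names are hypothetical; this is a toy model, not how Claude Code counts tokens internally.

```python
CONTEXT_LIMIT = 200_000  # context window size quoted in this article


def used_tokens(turns):
    """Sum question and answer tokens over every turn of the conversation."""
    return sum(question + answer for question, answer in turns)


def remaining_tokens(turns, limit=CONTEXT_LIMIT):
    """Tokens still available before the window is full."""
    return limit - used_tokens(turns)


# Hypothetical (question_tokens, answer_tokens) for three turns
turns = [(1_200, 3_500), (800, 6_000), (2_000, 12_500)]

print(used_tokens(turns))       # 26000
print(remaining_tokens(turns))  # 174000
```

Every new turn adds to the total, so a long session approaches the limit even if each individual question is small.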

/compact summarizes the conversation and reduces the space it takes up in the context window. You can see how the context window is being used with the /context command. If that is not enough, Claude will clear the conversation history (/clear), freeing up more context window tokens. You can run these commands anytime you want. You can also run /export at any time to export the conversation to a file. This is useful because you can use the file in future prompts, as history, or as a base for new conversations.
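The idea behind compaction can be sketched in a few lines. This is only an illustration under assumptions: the function names, the number of recent messages kept, and the 500-token summary size are all hypothetical, and a real summary would be written by the model rather than by a placeholder string.

```python
def compact(messages, keep_recent=2, summary_tokens=500):
    """Replace all but the most recent messages with one short summary entry.

    messages: list of (text, token_count) tuples, oldest first.
    summary_tokens: assumed size of the summary (hypothetical value).
    """
    split = max(len(messages) - keep_recent, 0)
    older, recent = messages[:split], messages[split:]
    if not older:
        return list(messages)  # nothing old enough to summarize
    summary = (f"[summary of {len(older)} earlier messages]", summary_tokens)
    return [summary] + recent


def total_tokens(messages):
    """Total tokens held by a list of (text, token_count) messages."""
    return sum(tokens for _, tokens in messages)


# Hypothetical conversation history with made-up token counts
history = [
    ("q1", 1_000), ("a1", 4_000),
    ("q2", 1_500), ("a2", 6_000),
    ("q3", 900), ("a3", 3_000),
]
compacted = compact(history)

print(total_tokens(history))    # 16400
print(total_tokens(compacted))  # 4400
```

The trade-off is that details from the older messages survive only to the extent the summary captures them, which is one reason exporting the full conversation first can be worthwhile.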

So Claude Code does a great job of managing context window tokens for you. There is one more trick: Claude Code sub-agents. The beauty of sub-agents is twofold. First, each sub-agent has its own context window of 200k tokens, so if you have multiple sub-agents working for you, each one has its own context window. This is very useful because different sub-agents can work on different tasks without interfering with each other, and they do not use the parent's context window tokens. Second, sub-agents can be spawned and killed at will. Claude can figure out that you need a sub-agent for a specific task, spawn it, use it, and then kill it when the task is done. This way you can manage your context window tokens even more efficiently.
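The isolation between parent and sub-agent windows can be modeled roughly as follows. The `Context` class and `run_subagent` function are hypothetical names for this sketch, not Claude Code APIs, and it assumes that the only thing landing in the parent window is the short result the sub-agent hands back.

```python
from dataclasses import dataclass

CONTEXT_LIMIT = 200_000  # per-window limit quoted in this article


@dataclass
class Context:
    """A toy context window that tracks how many tokens are in use."""
    limit: int = CONTEXT_LIMIT
    used: int = 0

    def add(self, tokens: int) -> None:
        if self.used + tokens > self.limit:
            raise RuntimeError("context window exhausted")
        self.used += tokens


def run_subagent(parent: Context, work_tokens: int, result_tokens: int) -> int:
    """Do the work in a fresh window and hand only the result to the parent."""
    child = Context()          # own window, independent of the parent
    child.add(work_tokens)     # files read, tool output, intermediate steps
    parent.add(result_tokens)  # assumption: only the final report reaches the parent
    return child.used          # the child window is discarded after this


main = Context()
main.add(30_000)  # hypothetical usage so far in the main conversation
child_used = run_subagent(main, work_tokens=120_000, result_tokens=2_000)

print(child_used)  # 120000
print(main.used)   # 32000
```

In this model the 120k tokens of heavy work never touch the main conversation, which is why delegating large, self-contained tasks keeps the parent window free for longer.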
