This content originally appeared on DEV Community and was authored by Dr. Furqan Ullah
Main Models
GPT-4.1 — 128,000 tokens
GPT-5 mini — 128,000 tokens
GPT-5 — 128,000 tokens
GPT-4o — 128,000 tokens
o3-mini — 200,000 tokens
Claude Sonnet 3.5 — 90,000 tokens
Claude Sonnet 3.7 — 200,000 tokens
Claude Sonnet 4 — 128,000 tokens
Claude Sonnet 4.5 — 200,000 tokens (standard) / 1,000,000 tokens (beta)
Gemini 2.0 Flash — 1,000,000 tokens
Gemini 2.5 Pro — 128,000 tokens
o4-mini — 128,000 tokens (picker) / 200,000 tokens (full version)
Grok Code Fast 1 — 128,000 tokens
Smaller Models
GPT-3.5 Turbo — 16,384 tokens
GPT-4 — 32,768 tokens
GPT-4 Turbo — 128,000 tokens
GPT-4o mini — 128,000 tokens
How big is that, really?
Let’s take Claude Sonnet 4.5 with a 200,000-token context window.
Suppose a large C++ or JavaScript file has ~2,000 lines, and each line averages ≈ 15 tokens (code, whitespace, and comments included). That puts one file at roughly 30,000 tokens.
That means Claude Sonnet 4.5 can process around 6 full files of 2,000 lines each at once. If you’re using the 1,000,000-token extended version, that jumps to ~33 files.
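The arithmetic above can be sketched in a few lines of Python. The per-line and per-file figures are the article's rough assumptions, not measured values; real token counts vary with the language, coding style, and the model's tokenizer.

```python
# Back-of-envelope estimate of how many source files fit in a context window.
# TOKENS_PER_LINE and LINES_PER_FILE are the article's assumptions (~15 tokens
# per line, ~2,000 lines per file); actual counts vary by tokenizer and codebase.

TOKENS_PER_LINE = 15
LINES_PER_FILE = 2_000

def files_per_window(window_tokens: int) -> int:
    """Whole 2,000-line files that fit in a context window of the given size."""
    tokens_per_file = TOKENS_PER_LINE * LINES_PER_FILE  # ≈ 30,000 tokens/file
    return window_tokens // tokens_per_file

print(files_per_window(200_000))    # standard Claude Sonnet 4.5 window -> 6
print(files_per_window(1_000_000))  # extended (beta) window -> 33
```

In practice the window also has to hold your prompt, the assistant's replies, and any system instructions, so the usable budget for source files is smaller than these ceilings suggest.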
Conclusion
So the next time your AI assistant suddenly “forgets” what was said earlier or mixes up details halfway through your project, remember: it isn't confused, it's simply “lost in the middle.”
Once the context window fills up, older information is dropped to make room for new input.
That is why the “lost in the middle” problem happens: even AI can only remember so much at once.