If you've coded alongside an AI agent, this might sound familiar:
"Didn't we just agree on that design decision? Why are you ignoring it?"
LLMs are fundamentally stateless. Cross a session boundary and the slate is wiped clean — architectural decisions, library rationale, hard-won project constraints all evaporate. Every time this happens, you're back to re-explaining the same ground.
The most common solution today is recording context in CLAUDE.md. Drop a context file in your project root, let the coding agent read it at the start of every session, and carry context forward. It's simple, easy to adopt, and widely practiced.
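For instance, a minimal context file might look like this (the contents are illustrative, not a prescribed template):

```markdown
# Project conventions

- Use PostgreSQL; SQLite is only for local tests.
- Store all timestamps in UTC (ISO 8601).
- Public API routes are versioned under /v1.
```

The agent reads this once per session and treats it as standing instructions.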
But as a project progresses, CLAUDE.md tends to bloat. More decisions, more exceptions, more "let's write this down too." And here's where things get tricky.
More Context Isn't Always Better
A study by Chroma Research ("Context Rot") evaluated 18 LLMs and showed that simply increasing input token count — with task complexity held constant — is enough to degrade output quality.
Even more telling is a Findings of EMNLP 2025 paper: even when retrieval works correctly and the relevant information is present in the context, the sheer volume of surrounding tokens degrades reasoning quality. "Having" the right information and "leveraging" it effectively are fundamentally different problems.
Persisting more context and staying focused within it are in fundamental tension.
From "Dump Everything" to "Deliver What's Needed"
If CLAUDE.md is a static, full-context injection approach, the alternative is dynamic, just-in-time retrieval — serving only the relevant context at the moment it's needed.
sqlew is an MCP tool that structures and accumulates design decisions and project constraints during development, letting AI agents retrieve only the context they need via MCP. No RAG or embedding setup required — it's operational in 30 seconds.
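To make the contrast concrete, here is a toy sketch of the just-in-time idea in Python. This is not sqlew's actual API; all names, the tag-overlap scoring, and the example decisions are hypothetical, chosen only to show "retrieve a few relevant records" versus "inject everything".

```python
# Toy illustration of just-in-time context retrieval: decisions are stored
# as small tagged records, and only the most relevant ones are surfaced for
# a given task. Everything here is hypothetical, not sqlew's real interface.
from dataclasses import dataclass, field


@dataclass
class Decision:
    summary: str
    tags: set[str] = field(default_factory=set)


class ContextStore:
    def __init__(self) -> None:
        self._decisions: list[Decision] = []

    def record(self, summary: str, *tags: str) -> None:
        self._decisions.append(Decision(summary, set(tags)))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        """Return up to k decisions whose tags overlap the query words."""
        words = set(query.lower().split())
        scored = sorted(
            self._decisions,
            key=lambda d: len(d.tags & words),
            reverse=True,
        )
        # Drop decisions with zero overlap instead of padding the prompt.
        return [d.summary for d in scored[:k] if d.tags & words]


store = ContextStore()
store.record("Use PostgreSQL, not SQLite, for the job queue", "database", "queue")
store.record("All public APIs must be versioned under /v1", "api", "versioning")
store.record("Frontend state lives in Zustand, not Redux", "frontend", "state")

# Instead of injecting every recorded decision, fetch only what the task touches:
print(store.retrieve("add a new api endpoint"))
```

A real MCP server would do this over the protocol with richer matching, but the shape of the trade is the same: the agent's prompt carries two or three relevant constraints instead of the whole history.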
Context persistence matters. But it's a game of signal-to-noise ratio, not volume. Delivering the right information, at the right time, in the right amount — that's what we're building toward.
References
- "Context Rot: How Increasing Input Tokens Impacts LLM Performance" — Chroma Research — https://research.trychroma.com/context-rot
- "Context Length Alone Hurts LLM Performance Despite Perfect Retrieval" — ACL Findings (EMNLP 2025) — https://aclanthology.org/2025.findings-emnlp.1264.pdf