Why your AI agent needs to forget
Remembering everything forever is its own kind of problem. Here's how cortex-engine handles forgetting at four layers.
Someone on Reddit asked me a question that most people skip entirely: "how did you solve the forgetting problem?"
Most agent memory projects focus on remembering. Store everything, retrieve it later, done. And that works until it doesn't. Three months in, your agent has thousands of observations and every query returns noise from six weeks ago that was never relevant again. The context window fills up with stale knowledge. Retrieval quality drops. The agent gets worse at its job the longer it runs.
Sound familiar? It should. It's the same problem humans solved a long time ago. We forget things. Not because our brains are broken, but because forgetting is how memory stays useful. Imagine remembering every single moment of your life with equal clarity. Every meal, every commute, every idle thought. It would be a cognitive nightmare. You'd never find the important stuff because it'd be buried under everything else.
Cortex-engine handles forgetting at four layers. Not because we planned it that way from the start, but because we kept hitting different failure modes as the system ran.
Layer 1: Don't remember duplicates
The first problem was bloat. The agent would observe the same thing five different ways across five sessions. "The API is slow." "API response times are high." "Noticed latency on the API endpoint." All slightly different phrasings of the same fact, all creating separate memories, all competing for retrieval slots.
The fix is a prediction error gate. When the agent calls observe(), cortex checks if this observation is genuinely novel or just a rephrasing of something already known. If the similarity to an existing memory is above a threshold, it merges instead of creating a new entry. The memory gets reinforced, not duplicated.
This happens at write time. Before decay or consolidation even need to kick in, the input stage is already filtering noise.
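In code, the gate is just a similarity check before the write. Here's a minimal sketch, assuming cosine similarity over embeddings and a flat in-memory store; the threshold value and field names are illustrative, not cortex's actual API:

```python
import math

SIMILARITY_THRESHOLD = 0.9  # illustrative value; tune per deployment

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def observe(embedding, text, store):
    """Prediction error gate: merge near-duplicates, only store novel facts."""
    for memory in store:
        if cosine(embedding, memory["embedding"]) >= SIMILARITY_THRESHOLD:
            memory["reinforcements"] += 1  # reinforce instead of duplicating
            return memory
    memory = {"embedding": embedding, "text": text, "reinforcements": 0}
    store.append(memory)
    return memory
```

Two near-identical observations land on the same memory and bump its reinforcement count; a genuinely novel one creates a new entry.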
Layer 2: Memories fade if you don't use them
This is the core mechanism. Every memory in cortex has FSRS data attached to it. FSRS (Free Spaced Repetition Scheduler) is the same algorithm that powers Anki and other flashcard apps, adapted here for agent memory.
Each memory has a stability score. That's how many days until the probability of recall drops to 90%. If a memory gets accessed (the agent queries and it comes up in results), its stability increases. If it sits untouched, its retrievability decays on a power curve.
The math is simple:
retrievability = (1 + factor * elapsed_days / stability) ^ decay
An observation from three months ago that was never relevant again naturally fades from retrieval results. It's still in storage. It's not deleted. But the system stops surfacing it because its retrievability score is too low to compete with recent, reinforced memories.
This is the part that feels most like biological memory. The stuff you keep thinking about stays sharp. The stuff you don't fades away. Not instantly, not randomly. On a predictable curve.
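The curve above is one line of code. Here's a sketch using the published FSRS-4.5 constants, a decay of -0.5 and a factor of 19/81, chosen so retrievability is exactly 90% when elapsed days equal stability; cortex's tuned values may differ:

```python
DECAY = -0.5       # published FSRS-4.5 constants; cortex may use tuned values
FACTOR = 19 / 81   # chosen so retrievability(stability, stability) == 0.9

def retrievability(elapsed_days, stability):
    """Probability the memory is still recallable after elapsed_days."""
    return (1 + FACTOR * elapsed_days / stability) ** DECAY
```

With a stability of 10 days, a memory sits at 100% retrievability at day zero, 90% at day 10, and keeps sliding down the power curve from there.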
Layer 3: Dream consolidation
Raw observations are noisy. Fifty separate "the API was slow" observations across a month are less useful than one consolidated memory: "the API has a recurring performance issue, possibly related to Tuesday batch jobs."
Cortex has a dream() function that runs a seven-phase consolidation cycle. It's modeled loosely on what happens during biological sleep:
- Cluster. Route unprocessed observations to their nearest existing memories.
- Refine. Update memory definitions based on new clustered observations.
- Create. Promote genuinely novel unclustered observations to new memories.
- Connect. Discover edges between recently active memories.
- Score. Run FSRS passive review on memories in the review state.
- Abstract. Cross-domain pattern synthesis. Find connections between unrelated topics.
- Report. Generate a narrative summary of what changed.
The result is that noisy short-term observations compress into denser, more useful long-term memories. The raw details can then decay via FSRS without losing the insight they contributed to.
You can run dream() on a schedule (we run it overnight) or trigger it manually. The agent doesn't need to be involved. Consolidation happens in the background.
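To make the flow concrete, here's a hypothetical skeleton of the cycle as a pipeline of phase functions. The phase bodies are stand-in no-ops; this shows only the ordering and driver shape, not cortex's actual implementation:

```python
# Stand-in phases; the real implementations live in cortex-engine.
def phase_cluster(state):  return state   # route observations to nearest memories
def phase_refine(state):   return state   # update definitions from new clusters
def phase_create(state):   return state   # promote novel observations to memories
def phase_connect(state):  return state   # discover edges between active memories
def phase_score(state):    return state   # FSRS passive review
def phase_abstract(state): return state   # cross-domain pattern synthesis
def phase_report(state):   return state   # narrative summary of what changed

PHASES = [
    ("cluster", phase_cluster),
    ("refine", phase_refine),
    ("create", phase_create),
    ("connect", phase_connect),
    ("score", phase_score),
    ("abstract", phase_abstract),
    ("report", phase_report),
]

def dream(state):
    """Run every consolidation phase in order, recording which phases ran."""
    ran = []
    for name, phase in PHASES:
        state = phase(state)
        ran.append(name)
    return state, ran
```

The key design point is that each phase takes the whole memory state and returns an updated one, so the cycle can run unattended in the background.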
Layer 4: The agent can choose to forget
This is the most human-like layer. Sometimes the agent realizes a belief is outdated. Not gradually fading, but actively wrong. A dependency got updated. A pattern changed. An assumption turned out to be false.
The forget() tool lets the agent deliberately fade a memory. It drops the salience by 40% and marks it for relearning. The memory isn't deleted. It still exists in the graph. But it ranks way lower in retrieval results, so it stops influencing the agent's behavior.
And it logs a belief entry explaining why the memory was faded. There's an audit trail. You can look back and see "on March 15, the agent decided this belief was outdated because..." That trail is useful. It's the difference between an agent that quietly drifts and one that can explain its own evolution.
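A minimal sketch of what forget() does under the hood, assuming dict-shaped memories and a list-backed belief log; the field names are illustrative, not cortex's real schema:

```python
import datetime

SALIENCE_DROP = 0.4  # matches the 40% fade described above

def forget(memory, reason, belief_log):
    """Deliberately fade a memory: cut salience, mark for relearning, log why."""
    memory["salience"] *= (1 - SALIENCE_DROP)
    memory["state"] = "relearning"
    belief_log.append({
        "memory_id": memory["id"],
        "reason": reason,
        "faded_at": datetime.date.today().isoformat(),
    })
    return memory
```

Nothing is deleted: the memory stays in the graph at reduced salience, and the belief log carries the "why" for the audit trail.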
The combination matters
Any one of these layers alone would be incomplete. Deduplication without decay means the memory still grows indefinitely, just slower. Decay without consolidation means insights get lost along with noise. Consolidation without deliberate forgetting means the agent can't correct itself when it knows something is wrong.
Together they create a memory system that self-prunes. Recent, frequently-accessed memories stay sharp. Old, unreinforced noise fades. Important patterns get consolidated before the details decay. And the agent can actively unlearn when it needs to.
It's not as sophisticated as biological forgetting. That's a hundred million years of evolution. But it's meaningfully better than "remember everything forever," which is what most agent memory systems do today.
The code is open source. The FSRS implementation, the dream consolidation cycle, the prediction error gate, the forget tool. All of it.