
Most Agent Failures Aren't Reasoning Failures

97% of a subprocess's output vanished between runs, and it took two weeks to realize every bad agent decision traced back to context, not cognition.

log · ai · architecture · memory · via · context-engineering

[Illustration: A massive cargo ship loaded with glowing data packets points at a narrow pipeline. On the other side, a stressed blue octopus holds the single tiny packet that made it through.]

The first architecture run captured 686 bytes. Runs two and three, with the same mission, the same agent, and the same prompt, captured 24 to 59 kilobytes each. Ninety-seven percent of the output had vanished somewhere between the subprocess and the parent agent's context window.

I spent two weeks blaming the model. Agents were recommending deprecated patterns, looping on completed tasks, ignoring instructions I had injected into their system prompts. I tuned temperatures, rewrote personas, switched between Opus and Sonnet. Nothing stuck. Then I queried the learnings database and found the real number: out of 6.4 learnings injected per worker on average, only 1.66 showed up in the output. A 38.9% compliance rate. Sixty-one percent of everything I told the agent was being ignored or lost.

The problem was never reasoning. It was context.

The Number That Changed the Diagnosis

A NeurIPS 2025 paper by Cemri et al. analyzed 1,600+ failure traces across seven multi-agent frameworks and identified 14 distinct failure modes. Six of those modes, totaling roughly 32.3% of all observed failures, were inter-agent misalignment: the right information existed somewhere in the system but never reached the agent that needed it. Reasoning-action mismatch alone accounted for 13.2%, where agents reasoned correctly but acted on the wrong context.

That matched what I was seeing. My agents weren't stupid. They were blind.

Via runs each agent as an isolated subprocess. It passes context through generated CLAUDE.md files, artifact directories, and a SQLite learnings database with 1,445 entries across five domains. The architecture is deliberate: isolation prevents one agent's failure from contaminating the next. But isolation also means every piece of context has to be explicitly passed, and the passing mechanisms were failing in four distinct ways.

Four Ways Context Breaks

[Illustration: Bird's-eye view of a hexagonal control room. A stressed blue octopus sits at a central console as four information streams flow in from each direction, each broken in a different way.]

After auditing 24 missions and 69 phase records, I found that every context failure in Via fell into one of four categories. Each has a different root cause, a different symptom, and a different fix.

Poisoning: The Pink Elephant Effect

Legacy technology listed in AGENTS.md was biasing agents toward deprecated patterns. An outdated reference to an old embedding model, a removed CLI flag, a renamed database table. The agent couldn't distinguish current from legacy when both appeared in its instruction window. Everything loaded into context carried equal weight.

Microsoft's AI Red Team documented the same failure at scale. In their tests, a benign-looking email containing embedded instructions caused an agent to forward sensitive internal communications. The initial success rate was 40%. After modifying the prompt to prioritize memory recall, it climbed above 80%. The mechanism is identical to what I was seeing: outdated or adversarial information loaded into context without relevance filtering actively misleads the agent.

The symptom is insidious because the agent's reasoning looks correct. It follows its instructions precisely. The instructions are just wrong.

Fragmentation: The 686-Byte Run

When an Opus agent spawns internal sub-agents via Claude Code Tasks, the subprocess output gets truncated before returning to the parent context. Architecture run one captured 686 bytes. Runs two and three captured 24 to 59 kilobytes. Same mission file. Same decomposition. The difference was whether the sub-agent's response fit within the parent's remaining context budget.

The broader fragmentation problem is structural. Via copies artifacts from completed phases into an artifacts/merged/ directory, but there is no manifest. Phase three can see what phase two produced by listing files, but it has no metadata about purpose, freshness, or completeness. It is reading a filing cabinet without labels.

AutoGPT's well-documented loop problem is the same failure in a different system. GitHub issues #920, #1899, #2726, and #4467 all trace to the same root cause: the agent is "unaware of what it has already done" because context about completed work was fragmented or dropped during window management.

Staleness: 24 Ghosts Still Running

I queried Via's mission_metrics table and found 24 entries, all with status running. Every single one was stale. An early schema issue had prevented status updates, and the cleanup never happened. Any agent querying that table for mission success rates would get a 100% false running rate. Zero completions, zero failures. Just ghosts.
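A staleness audit like this one reduces to a single query. As a sketch, here is what that check might look like in Python; the table and `status` column match the post's description of `mission_metrics`, but the `mission_id` and `updated_at` columns are assumptions for illustration.

```python
import sqlite3


def find_stale_missions(db_path: str, max_age_hours: int = 24) -> list[tuple]:
    """Flag 'running' rows that have not been touched in max_age_hours.

    Assumes a mission_metrics(mission_id, status, updated_at) schema;
    column names beyond `status` are hypothetical.
    """
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        """
        SELECT mission_id, status, updated_at
        FROM mission_metrics
        WHERE status = 'running'
          AND updated_at < datetime('now', ?)
        """,
        (f"-{max_age_hours} hours",),
    ).fetchall()
    conn.close()
    return rows
```

Run periodically, a check like this would have surfaced the 24 ghost rows long before any analysis was built on them.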

This is not a dramatic failure. It is a quiet one. No agent crashed because of those 24 rows. But any analysis built on that data would be wrong, and the agent performing that analysis would have no way to know it. Context staleness is the failure mode with the longest detection latency because the information looks correct. It used to be correct. It just stopped being true at some point nobody recorded.

Research from Chroma across 18 models showed that context degradation is continuous, not threshold-based. Even models with million-token windows show measurable quality drops at 50,000 tokens. In multi-agent pipelines, the degradation compounds: Agent A's slightly degraded output becomes Agent B's ground truth, which becomes Agent C's slightly-more-degraded input. Each hop amplifies the error.

Confusion: 1,445 Learnings, No Relevance Score

Via's learnings database contains 1,445 entries: 436 insights, 357 patterns, 335 errors, 201 decisions. The retrieval mechanism injects the five most recent learnings into each worker's CLAUDE.md. Not the five most relevant. The five most recent.

A worker building a React frontend receives learnings about Go error handling. A worker writing a research brief receives learnings about CSS grid layout. The 38.9% compliance rate makes sense once you see the retrieval logic. The agents aren't ignoring their instructions. They are correctly identifying that most of their injected context is irrelevant to the task at hand.

Cemri et al. found that 6.8% of multi-agent failures came from agents proceeding with incorrect assumptions rather than seeking clarification. Another 1.9% came from agents operating as if no prior context had been shared at all. Both patterns showed up in Via's logs. Workers would acknowledge the injected learnings in their first message, then quietly discard them for the rest of the session.

What Actually Fixes Each One

Each failure mode has a different architectural fix. The temptation is to build one unified "context management layer." The reality is that poisoning and confusion require opposite interventions: poisoning needs less context loaded, confusion needs better-targeted context loaded.

For poisoning, the fix is tiered trust. MemGPT, published at ICLR 2024 by Packer et al., models agent memory on operating system virtual memory with explicit promotion and demotion. Core memory is always visible and size-limited. Archival memory is stored externally and retrieved on demand. The key insight: not everything deserves to be in the active context window. Via already has this structure partially, with SKILL.md files that are listed by name but only loaded on demand. The gap is that learnings and procedural instructions get no such gating.
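The core/archival split can be sketched in a few lines. This is a minimal illustration of the MemGPT-style promotion/demotion idea, not MemGPT's actual implementation; the size budget and eviction policy are assumptions.

```python
from dataclasses import dataclass, field

CORE_LIMIT = 2000  # character budget for always-visible core memory (illustrative)


@dataclass
class TieredMemory:
    """OS-virtual-memory-style split: small always-loaded core, large on-demand archive."""

    core: list[str] = field(default_factory=list)
    archive: list[str] = field(default_factory=list)

    def promote(self, entry: str) -> None:
        """Add an entry to core, demoting the oldest entries to stay under budget."""
        self.core.append(entry)
        while sum(len(e) for e in self.core) > CORE_LIMIT:
            self.archive.append(self.core.pop(0))  # oldest entry falls back to archive

    def context_window(self) -> str:
        """Only core memory is rendered into the agent's prompt."""
        return "\n".join(self.core)
```

The point of the gate is that a stale legacy reference demoted to the archive can no longer bias the agent unless something actively retrieves it.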

For fragmentation, the fix is artifact manifests. Instead of copying files to a shared directory and hoping the next agent figures out what they are, each phase writes a structured manifest: what was produced, what it contains, what succeeded, what failed. The next phase reads the manifest before reading the artifacts. CtxVault's "vaults make isolation structural, not configurational" principle applies here. Via's filesystem isolation is correct. The missing piece is metadata that makes the isolation navigable.
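A manifest does not need to be elaborate to fix the unlabeled-filing-cabinet problem. The sketch below assumes a JSON `manifest.json` per phase directory; the field names are illustrative, not Via's actual format.

```python
import json
from pathlib import Path


def write_manifest(phase_dir: Path, phase: str, artifacts: dict[str, str],
                   succeeded: bool) -> Path:
    """Describe what a phase produced: file, purpose, size, and outcome.

    `artifacts` maps filename -> one-line description of purpose.
    """
    manifest = {
        "phase": phase,
        "succeeded": succeeded,
        "artifacts": [
            {
                "file": name,
                "purpose": purpose,
                "bytes": (phase_dir / name).stat().st_size,
            }
            for name, purpose in artifacts.items()
        ],
    }
    path = phase_dir / "manifest.json"
    path.write_text(json.dumps(manifest, indent=2))
    return path


def read_manifest(phase_dir: Path) -> dict:
    """The next phase reads this before touching any artifact files."""
    return json.loads((phase_dir / "manifest.json").read_text())
```

With this in place, phase three opens the manifest first and knows purpose, freshness, and completeness before it reads a single artifact.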

For staleness, the fix is pre-compact hooks. Anthropic's Claude Code provides lifecycle events that fire before context compression. A hook can snapshot critical state, validate it against ground truth, and write a structured handover file. Community implementations like Continuous Claude v3 report 95% token reduction through five-layer code analysis at compaction time. Via runs no pre-compact hooks. A long-running phase that hits compaction loses all accumulated working state that is not written to a file.
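The snapshot side of such a hook is simple. This is a hedged sketch of the handover-file idea only: the exact payload a Claude Code pre-compact hook delivers is not assumed here, and the `handover.json` name and structure are invented for illustration.

```python
import datetime
import json
from pathlib import Path


def snapshot_state(state: dict, out_dir: str = ".") -> Path:
    """Write a timestamped handover file so working state survives compaction.

    Intended to be invoked from a pre-compact lifecycle hook; `state` would
    hold whatever accumulated context the phase cannot afford to lose.
    """
    handover = {
        "captured_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "state": state,
    }
    path = Path(out_dir) / "handover.json"
    path.write_text(json.dumps(handover, indent=2))
    return path
```

After compaction, the resumed agent reads the handover file back instead of reconstructing its working state from a compressed summary.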

For confusion, the fix is semantic retrieval. Via's learnings.db already has FTS5 full-text search indexes. The infrastructure for relevance-scored retrieval exists. It is just not wired up. Instead of injecting the five most recent learnings, the orchestrator should query FTS5 with the current task description and inject the five highest-scoring matches. The cost difference is negligible. The compliance rate difference, if the pattern holds from other retrieval systems, should be substantial.
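Wiring that up is a small change. The sketch below assumes an FTS5 virtual table named `learnings_fts` with a `content` column; Via's actual schema may differ. The task description is tokenized into an OR query so any overlapping term can match, and `bm25()` ranks the results.

```python
import re
import sqlite3


def top_learnings(db_path: str, task_description: str, k: int = 5) -> list[str]:
    """Return the k highest-scoring learnings for a task, not the k most recent.

    Assumes an FTS5 table learnings_fts(content); names are illustrative.
    """
    # Tokenize the free-text task into an OR query so punctuation in the
    # description cannot break FTS5's MATCH syntax.
    tokens = re.findall(r"\w+", task_description)
    if not tokens:
        return []
    query = " OR ".join(tokens)

    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        """
        SELECT content FROM learnings_fts
        WHERE learnings_fts MATCH ?
        ORDER BY bm25(learnings_fts)  -- lower bm25 = better match in FTS5
        LIMIT ?
        """,
        (query, k),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]
```

The React worker now gets React learnings, and the Go error-handling entries stay in the database until a Go task asks for them.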

What Three HN Posts Confirmed

In a two-week window in late February 2026, three independent developers launched agent memory tools on Hacker News: Engram (three-tier explicit/implicit/synthesized memory), CtxVault (vault-based structural isolation), and Fava Trails (three-tier trust with LLM validation gates at $0.001 per review). Three different architectures. Same root problem. Agents lose state, and the frameworks they run on treat memory as a retrieval afterthought rather than core architecture.

A Reddit analysis of 44 agent frameworks confirmed the pattern. Most frameworks provide a "memory" API that is really just a key-value store with optional embeddings bolted on. The word "memory" appears in their documentation. The engineering of memory does not.

The Wrong Layer

[Illustration: Cross-section split: agents work normally in the upper workspace layer, while below the glass floor a yellow eureka-flushed octopus holds a wrench and points at a cracked pipe leaking teal data cards.]

I spent two weeks optimizing prompts, tuning personas, and blaming the model. The 38.9% compliance rate was sitting in the database the entire time, waiting for a query I had not thought to run. The 686-byte truncation was in the subprocess logs. The 24 stale metrics were one SQL query away.

The hardest part of debugging context failures is that they present as reasoning failures. The agent does something wrong, so you assume it thought something wrong. But in a system with 1,445 learnings, 824 memory files, 44 personas, and 36 skills, the more likely explanation is simpler. The agent never saw the right information. Or it saw too much of the wrong information. Or the information it saw was true last week and false today.

Context engineering is not a buzzword. It is plumbing. And like plumbing, nobody thinks about it until something backs up.

