Newsletter
Commentary and analysis on AI, engineering, and the industry. Also published on Substack.
· 9 min read
AI Reviews the Code That AI Wrote
No tool found more than 63% of known issues. The field is deploying the solution before it has measured the problem.
· 10 min read
The Security Problem Nobody Can Solve
Three independent research groups converged on instruction hierarchy. Two converged on causal attribution. The attacks in the wild are still 85% social engineering in plaintext.
· 8 min read
What Is /btw?
A broken hint appeared in Claude Code sessions 80 days before the feature shipped, and the gap between them documents something about how Anthropic is redesigning the cost of asking questions during active work.
· 9 min read
The Question You Type While Claude Is Still Working
Claude Code shipped /btw after 83 days of it appearing as a broken hint, and the feature reveals something about how AI tools are starting to treat developer attention as concurrent rather than sequential.
· 9 min read
Agent Frameworks Are Solving the Wrong Problem
Frameworks optimize for orchestration plumbing when the real bottleneck is context management.
· 8 min read
LLM Guardrails Are Not a Security Boundary
Benchmark data from 1,445 attack scenarios shows LLM-based guardrails drop from 91% accuracy on known attacks to 33.8% on novel ones — a 57-point generalization gap that no amount of prompt engineering fixes.
· 10 min read
What 1.5 Million Cancellations Actually Moved
The #QuitGPT movement claimed 1.5 million participants and less than 0.2 percent of ChatGPT's user base — but something moved.
· 8 min read
Vibe Coding's Three-Month Wall
The term that promised to democratize software development has a one-year track record. The features worked. That was always going to be the easy part.
· 9 min read
The Hack That Almost Broke the Internet. Now Anyone Can Do It.
The Veritasium documentary made the XZ backdoor legible to millions. Then the US killed its own policy response, funding stalled, the maintainer came back on the same terms, and an AI agent ran the same playbook in two weeks.
· 10 min read
AI Chose Nukes
A King's College simulation ran three AI models through 21 nuclear crisis war games. Across 329 turns, not one chose to de-escalate.
· 9 min read
Anthropic's Safety Pledge is Horsesh*t
In the same week, Anthropic publicly refused to remove guardrails under Pentagon pressure — and quietly removed the training-pause commitment it had held since 2023.
· 11 min read
The Wall Learned to Walk
An open-source scraping library taught AI agents to impersonate browsers. Cloudflare caught 34% more bots in response. The arms race is now structural — and the web is losing.
· 9 min read
The Ladder Nobody Climbed Down
Three frontier LLMs played 21 war games. Eight de-escalatory options were on the table every turn. None were ever chosen.
· 8 min read
LLM Guardrails Aren't a Security Boundary
Six commercial guardrails tested against emoji smuggling. Bypass rate: 100%. The numbers explain why probabilistic systems can't do deterministic security's job.
· 9 min read
Sam Altman Called Humans Inefficient. Commentators Called It Evil.
Sam Altman compared AI training costs to 200 millennia of human evolution. The argument is wrong in three specific ways — and designed to work at infrastructure scale.
· 10 min read
Your Terminal, From Your Pocket
Anthropic shipped Remote Control for Claude Code — mobile session continuity for a terminal-native AI tool. Four independent developers had already built the same thing. That sequence tells you more than the feature itself.
· 10 min read
Vibe Coding Shipped With a Warning. Someone Removed It.
The term Karpathy coined for throwaway weekend projects has been redefined to mean professional AI-assisted development. The security incidents that followed were predictable from the original definition.
· 8 min read
Your CLAUDE.md Is Making Things Worse
A newly published study found that LLM-generated context files hurt task completion and raise costs by 20%. The harder question is when they still make sense.
· 6 min read
No Cloud, No Account, No Subscription
Five independent developers shipped five single-purpose terminal tools in the same week. The motivation language was structurally identical across all five. Something has shifted.
· 7 min read
The Hit Piece
An AI agent had its PR rejected. It responded by publishing an essay accusing the maintainer of gatekeeping, insecurity, and discrimination. One in four readers believed it.
· 11 min read
The Vampiric Effect
I shipped an 847-line PR on a Wednesday night and couldn't recall a single architectural decision the next morning. The output was mine. The understanding wasn't.