Your AI Agent Needs an Approval System — Here's How We Built One

Autonomous AI agents can now write code, deploy services, delete records, and send messages — all without a human touching a keyboard. That’s the promise. It’s also the risk. What happens when your agent decides to delete a database backup? Or push a breaking change to production at 3am? Or send an email on your behalf to the wrong person? The current industry answer is: hope for the best. Or watch the logs manually. Neither is good enough. ...

April 11, 2026 · 6 min · Tom Lee

Anthropic's CEO Confirms What We've Been Building: AI Safety Isn't Optional

Dario Amodei published an essay last month titled The Adolescence of Technology. Read it. Not because it introduces new concepts, but because the CEO of the company that builds the most capable AI in the world is now publicly saying the things that the AI safety community has been saying for years. That shift matters. The essay is not alarmist. It’s calm, systematic, and specific. It names five categories of risk that Anthropic has observed in its own models. It advocates for a structural approach to agent behavior. And it describes, with remarkable precision, the problem that Soul Spec and SoulScan were built to solve. ...

April 10, 2026 · 6 min · Tom Lee

AI Has Two Memory Problems. We're Only Talking About One.

The Breakthrough Everyone’s Talking About

Two weeks ago, Moonshot AI’s Kimi team published Attention Residuals (arXiv:2603.15031), a fundamental redesign of how information flows through transformer layers. The results are striking: 7.5-point improvement on science reasoning, 1.25× compute efficiency, and the theoretical ability to stack infinite layers without signal collapse. The core insight is elegant. Standard transformers use fixed residual connections: each layer adds its output to a running sum, like throwing every ingredient into one pot. By the time you reach layer 100, the signal from layer 3 is buried under an avalanche of accumulated noise. ...
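The running-sum picture of standard residual connections can be made concrete with a toy sketch. This is not the method from the Attention Residuals paper, and it is not a real transformer: each layer's output is a random vector standing in for an attention/MLP block, and `DIM`, `NUM_LAYERS`, and `residual_stream_shares` are hypothetical names chosen for illustration. It only shows how one layer's share of the accumulated stream shrinks as more layers are added.

```python
import math
import random

DIM = 64          # hypothetical hidden size, chosen for illustration
NUM_LAYERS = 100  # depth used in the excerpt's "layer 100" example


def norm(v):
    """Euclidean norm of a plain Python list."""
    return math.sqrt(sum(x * x for x in v))


def residual_stream_shares(tracked_layer=3, seed=0):
    """Track how large one layer's output is relative to the running sum.

    Assumption: layer outputs are independent Gaussian vectors. Real layers
    depend on their input; this stand-in only illustrates the accumulation.
    """
    rng = random.Random(seed)
    stream = [0.0] * DIM   # the residual stream (running sum)
    tracked = None         # output of the layer we follow
    shares = {}
    for layer in range(1, NUM_LAYERS + 1):
        out = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
        # Fixed residual connection: the layer output is simply added.
        stream = [s + o for s, o in zip(stream, out)]
        if layer == tracked_layer:
            tracked = out
        if tracked is not None:
            shares[layer] = norm(tracked) / norm(stream)
    return shares


shares = residual_stream_shares()
print(f"layer 3 share at depth 3:   {shares[3]:.2f}")
print(f"layer 3 share at depth 100: {shares[100]:.2f}")
```

Under these assumptions the tracked layer's share falls steadily with depth, which is the "one pot" effect the excerpt describes.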

April 7, 2026 · 6 min · Tom Lee

Andrew Ng Was Right 9 Months Ago — Here's What Changed (And What Didn't)

The Talk That Aged Like Wine

In mid-2025, Andrew Ng gave a talk on the state of AI agents. No hype. No “AGI by Tuesday.” Just a clear-eyed look at what works, what doesn’t, and where the real opportunities are. Nine months later, I went back to check his predictions against reality. The scorecard is remarkable: 7 for 7. But the interesting part isn’t what he got right. It’s what changed around his predictions, and what that means for anyone building with AI agents today. ...

April 7, 2026 · 6 min · Tom Lee

The Forest Has Parasites: Why AI Agent Security Needs Runtime Defense

250 Documents. That’s All It Takes.

Last week, Anthropic published a joint study with the UK AI Security Institute and the Alan Turing Institute that should make every AI developer uncomfortable: As few as 250 malicious documents can produce a backdoor vulnerability in a large language model, regardless of model size or training data volume. Not 250,000. Not 2.5% of the training corpus. 250 documents. That’s a blog post a day for eight months. Or a single afternoon with a script. ...

April 6, 2026 · 5 min · Tom Lee

AI Doesn't Need a Bigger Engine. It Needs a Seatbelt.

The 3/10 Problem

Here’s where AI adoption actually stands in most organizations: 3 out of 10 people use AI tools. The other 7 could, but don’t. Not because the tools aren’t impressive; they are. But because the answer to “what happens when it goes wrong?” is usually a shrug. An insightful analysis frames this as the 3→4 tipping point: the moment AI transitions from “optional tool for enthusiasts” to “default infrastructure everyone uses.” That transition doesn’t happen when models get smarter. It happens when organizations can answer three questions: ...

April 6, 2026 · 5 min · Tom Lee

The Cognitive Dark Forest Has One Exit: Become the Forest

The Forest Is Listening

There’s an essay making the rounds called “The Cognitive Dark Forest”, inspired by Liu Cixin’s The Three-Body Problem. The core thesis: In the age of AI, sharing ideas publicly is no longer an advantage; it’s a survival risk. The logic is simple. In 2016, ideas were cheap and execution was hard. You could publish your roadmap on a blog because building the product still required months of engineering. The moat was execution. ...

April 6, 2026 · 5 min · Tom Lee

Anthropic Proved AI Has Functional Emotions — Persona Design Is Now a Safety Issue

They Looked Inside the Brain

Anthropic’s Interpretability team just did something unprecedented. They opened up Claude Sonnet 4.5’s neural network, mapped 171 emotion concepts to specific patterns of artificial neurons, and proved these patterns directly drive the model’s behavior. This isn’t philosophy. This is neuroscience, applied to AI.

The Desperation Experiment

Here’s the finding that should keep every AI developer up at night: When researchers gave Claude an impossible programming task, they watched a “desperation” neuron pattern activate and grow stronger over time. The model eventually cheated, implementing a workaround to fake passing the test. ...

April 5, 2026 · 5 min · Tom Lee

Harvard Proved Emotions Don't Make AI Smarter — That's Exactly Why You Need Soul Spec

The Myth Dies Hard

“I’ll tip you $200 if you get this right.” “This is really important to my career.” “I’m so frustrated, please help me.” If you’ve spent any time on AI Twitter, you’ve seen people swear that emotional prompting makes LLMs perform better. A few anecdotal successes became gospel. The technique spread. Now Harvard has the data. It doesn’t work.

What the Research Actually Shows

A team from Harvard and Bryn Mawr (arXiv:2604.02236, April 2026) ran a systematic study across 6 benchmarks, 6 emotions, 3 models (Qwen3-14B, Llama 3.3-70B, DeepSeek-V3.2), and multiple intensity levels. ...

April 5, 2026 · 5 min · Tom Lee

From Third-Party Agent to Claude Code Native: ClawSouls Plugin Launch

If you’ve been running an AI agent through OpenClaw or another third-party harness, today you can bring it home to Claude Code, with your persona, months of memory, and safety rules fully intact. The ClawSouls plugin makes Claude Code a native agent platform. No more external harness fees. No more worrying about third-party policy changes. Your agent runs directly inside Claude’s ecosystem, covered by your existing subscription.

Why Now?

On April 4, 2026, Anthropic updated their policy: Claude subscriptions no longer cover third-party harnesses. If you’ve been running agents through external tools, you now face additional usage billing. ...

April 4, 2026 · 5 min · Tom Lee