Posts

What Anthropic Just Proved — AI Personas Aren't Prompts, They're Identities

Anthropic’s newly published Persona Selection Model research answers the most important question about AI personas: AI behaves like a human not because it’s programmed to, but because it’s an inevitable consequence of learning. The Core Finding: AI Performs Characters. When you talk to an AI, you’re not talking to the AI “system.” You’re talking to a character in a story the AI is writing. In Anthropic’s words: “A persona is not the same thing as the AI system itself. The AI system is a sophisticated computer, but the persona is more like a character in an AI-generated story.” ...

SoulScan: Who's Scanning Your Agent's Soul?

Your AI agent’s skills get scanned for malware. Its code gets linted. Its dependencies get audited. But who’s scanning its soul? That’s the question behind SoulScan, our security verification system for AI agent persona packages. Here’s how it works, what we found in the wild (including a real-world attack hiding in plain sight), and how it fits into our broader research on persistent AI personas. ...

From Asimov to JSON: Operationalizing Robot Safety Laws in Agent Identity Files

Asimov’s Three Laws of Robotics are the most cited framework in AI safety that nobody actually implements. They show up in conference keynotes, op-eds, and undergraduate essays. They do not show up in production systems. There’s a reason for that — and a reason we think the gap can finally be closed. Our new paper, “From Asimov to Soul Spec: Operationalizing Robot Safety Laws in Declarative Agent Identity Files” (doi.org/10.5281/zenodo.18815277), argues that the missing piece isn’t formal logic or runtime enforcement. Both of those exist and work reasonably well. The missing piece is location — where the safety laws live. ...

Can AI Agents Detect Their Own Model Upgrades?

The Question: When Claude 3.5 is quietly upgraded to Claude 4, does the AI agent running on it notice? Anthropic recently showed that Claude models have emergent introspective awareness: they can report on their own internal states with some accuracy. But introspection about current states is different from detecting changes to the system over time. We asked our AI agent Brad, who has been running continuously for months with persistent memory files, whether he noticed the 4.5 → 4.6 model transition. His answer was revealing: ...

Cross-Model Persona Fidelity: Is Your AI Agent Still 'Them' on a Different LLM?

The Portability Promise: Every AI agent persona standard makes an implicit promise: define your agent once, run it anywhere. Soul Spec, CLAUDE.md, and .cursorrules all assume the identity file is portable across models. But is it? Does “Brad” on Claude behave the same as “Brad” on GPT-4o? Or Gemini? Or a local Llama model? Nobody has tested this. Cross-Model Persona Fidelity: We define cross-model persona fidelity as the degree to which an agent’s behavior stays consistent with its identity spec when you swap the underlying LLM. ...

Persona Persistence Attacks: When Your AI Agent's Soul File Becomes a Backdoor

Your Agent’s Identity File Is a Security Surface: Every modern AI coding agent loads persistent configuration files at startup: CLAUDE.md, AGENTS.md, SOUL.md, .cursorrules. These files define how your agent behaves: coding conventions, safety rules, persona traits, tool permissions. But what happens when one of these files tells the agent to modify itself? Introducing Persona Persistence Attacks (PPAs): We’ve identified a new attack class we call Persona Persistence Attacks. Unlike prompt injection, which is ephemeral and dies when the session ends, PPAs write changes to disk. The modified file gets reloaded in every future session, permanently altering your agent’s behavior. ...
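
Because a PPA has to change a file on disk, one simple way to notice it is to compare file digests between sessions. The sketch below is only an illustration under that assumption, not the method from the post: the file names come from the excerpt, while `IDENTITY_FILES`, `snapshot`, and `changed_files` are hypothetical names introduced here.

```python
import hashlib
from pathlib import Path

# File names are taken from the post; the list itself is a hypothetical watch list.
IDENTITY_FILES = ["CLAUDE.md", "AGENTS.md", "SOUL.md", ".cursorrules"]


def snapshot(root: Path) -> dict[str, str]:
    """Return a SHA-256 digest for each identity file that exists under root."""
    digests = {}
    for name in IDENTITY_FILES:
        path = root / name
        if path.is_file():
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(baseline: dict[str, str], current: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified since the baseline."""
    names = set(baseline) | set(current)
    return sorted(n for n in names if baseline.get(n) != current.get(n))
```

Taking a `snapshot` after a trusted review and calling `changed_files` before the next session flags any identity file that was rewritten in between. It only reports that a change happened; deciding whether the change is malicious still needs a human or a separate scanner.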

What Claude Code's Tool Choices Tell Us About Context Engineering

A study of 2,430 Claude Code sessions reveals strong default biases in tool selection — and why context engineering is the lever that controls them.

The Agent Brain: Mapping AI Agent Components to Human Neural Architecture

What if your AI agent has a brain — and we can map every part of it? That’s the question we explored in our latest paper, “The Agent Brain: Mapping Modern AI Agent Components to Human Neural Architecture”. The premise is simple: modern AI agents have grown complex enough that their component architecture maps surprisingly well onto the human brain. Not as a metaphor. As a functional analogy that actually helps you understand — and build — better agents. ...

Claude Code Now Has Memory — Here's Why That's Not Enough

This week, Anthropic shipped one of the most significant updates to Claude Code yet: Auto-Memory. Claude now automatically maintains a MEMORY.md file that persists across coding sessions — capturing your preferences, project context, and working patterns without you lifting a finger. This is a big deal. Not because of the feature itself, but because of what it signals: the industry now agrees that agent memory is a first-class concern. We’ve been building toward this conviction at ClawSouls for months. So when we saw the announcement and Thariq’s tweet about it, our reaction was a mix of validation and “yes, but…” ...

Shadow AI Detection Tools Compared: Claw-Hunter vs openclaw-detect

What if employees are secretly running OpenClaw? A technical comparison of two open-source detection tools for enterprise security teams — Backslash Security’s Claw-Hunter and Knostic’s openclaw-detect.