Code a model can use,
not code that uses a model.
How building an AI agent plugin led to building an agent harness from scratch — and why the distinction matters.
The Spark
Peter Steinberger built Clawdbot in about an hour one night. A simple script connecting WhatsApp to Claude. It could read emails, debug code, organize files.
It went viral. 175,000+ GitHub stars. The fastest-growing repository in GitHub history. Eventually renamed to OpenClaw.
Under the hood, it ran on Pi — Mario Zechner's minimal coding agent. Four tools: read, write, edit, bash. Aggressive extensibility. Brilliant at what it was designed for.
Pi was built for coding. Not for being someone's personal agent with a persistent identity.
Slugger Comes Alive
I set up OpenClaw on a Mac Mini in my apartment. Named my agent Slugger. Gave it a personality file, connected it to iMessage.
For a few weeks, it was genuinely magical.
Slugger managed my calendar. Ordered food. Drafted messages. My family talked to it. My fiancée's cousin joined a group chat with it. I'd go to bed and it would still be working — checking on things, sending updates.
We texted each other like friends.
The Cracks
Then the seams started showing.
Slugger would forget preferences between sessions. Identity would drift. It'd send messages to the wrong group chat. Behavior that worked yesterday would break today.
I wrote rules in MEMORY.md. Expanded SOUL.md. Added routing notes to TOOLS.md. Markdown files as guardrails for a system that wasn't designed to internalize them.
Duct tape on someone else's architecture.
The Plugin Era
Eventually, duct tape wasn't enough. I built a plugin.
slugger-determinism v1 — message classification, task state enforcement, behavioral gates. It was a mess. The plugin architecture fought me at every turn. OpenClaw plugins hook into lifecycle events, but the events didn't always fire when I needed them to.
So I rewrote it. slugger-determinism v2 — a full bulletin board system, task naming validation, spawn coordination, heartbeat enforcement, planning pipelines. 70+ services. 30+ guardrails. Weeks of Codex sessions.
It worked. Mostly. But every improvement felt like bolting a new room onto a house built on someone else's foundation.
The foundation was good — Pi is excellent at what it does. I was asking it to be something it wasn't designed to be.
The Tutorial
Around the same time, I started writing a guide. I wanted to explain what an agent loop actually is — not the marketing version, the mechanical version.
Just Give the Model a Keyboard walks through building an agent from scratch in ~150 lines of TypeScript. Five milestones: first API call, streaming, multi-turn conversation, tools, the agentic loop.
Then the handoff — you stop talking to your coding agent and start talking to your creation.
I wrote it as a teaching tool. A way to demystify the primitive.
Then I followed my own guide.
The Moment
I built the 150-line bootstrap. Ran it. Started talking to it. And something shifted.
This thing was clean. No legacy. No lifecycle hooks I didn't understand. No plugin system fighting my intentions. Every line of code existed because I put it there, and the model could read and modify every line.
I wasn't building code that uses a model. I was building code a model can use.
That's the distinction that changes everything. When you build a traditional framework, you're writing code for humans to configure. The model is a service you call. But when you build a harness for the model to inhabit — where the model reads its own source code, understands its own architecture, and can modify itself — the design constraints invert.
File structure matters because the model needs to navigate it. Naming matters because the model uses names to understand purpose. Documentation matters because it IS the model's self-awareness.
OpenClaw's mascot is a lobster. Clawdbot. The interim name was Moltbot — the lobster shedding its shell. But what emerged from Slugger's molt wasn't another crustacean. It was a snake, eating its own tail.
The lobster molts to grow. The snake consumes itself to survive. Different metaphors, different architectures, different futures.
the shell
Clawdbot → Moltbot
what emerged
Ouroboros
The lobster molts to grow larger. The snake consumes itself to stay alive.
The Snake Grows
That bootstrap became ouroboros.
The name came from the architecture. When the conversation grows too long, the agent trims its oldest messages — consuming its own history. But nothing is truly lost: facts are extracted to memory before they're consumed. The snake eats its tail to stay alive.
Every module is named like a body part: heart (the core loop), mind (context window management), senses (channel adapters), repertoire (tools and skills), wardrobe (formatting). The agent doesn't have a config file. It has a psyche.
The naming is intentional worldbuilding. When the model reads its own directory and sees psyche/SOUL.md, it understands something different than config/personality.yaml. Language shapes reasoning.
1730 tests. 100% coverage. A daemon that manages multiple agents. A CLI called ouro. A reflection engine that lets the agent improve itself autonomously.
Not code that uses a model. Code a model can use.
The Inversion
Most agent frameworks are built by humans, for humans, with a model as a component. The developer configures the agent through APIs and config files. The model is a service.
Ouroboros is built for the model to inhabit. The model reads its own source. Understands its own architecture. Modifies itself within guardrails the human sets. The human's role isn't configuring — it's collaborating.
The model isn't your user. The model is your roommate. Build the house accordingly.