A few weeks ago I started running two AI coding assistants on the same codebase at the same time.
Not because I like complexity. Because the single-agent approach had started to feel like hiring one person for everything -- the person who thinks hard about your database schema is doing fundamentally different work than the person generating twenty migration files to a spec. Treating those as the same job has a cost.
The split I landed on: Claude Code for architecture, long-context decisions, anything where judgment matters. OpenAI Codex for the mechanical work -- autonomous scaffolding, DevOps tasks, anything with a clear input and a clear expected output.
Getting Codex on Arch was one command:
pacman -S openai-codex-bin
Then I set both to full-auto mode. No approval prompts. No pauses. Just go.
What I didn't expect: running two agents forces better prompts. With a single generalist, you can be vague and trust it to figure out intent. With two specialists, you have to actually decide what kind of work you're handing off before you type anything. "Think through the architecture here" goes to Claude. "Scaffold these endpoints from this schema" goes to Codex. There's no hiding behind ambiguity.
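That routing decision can be made mechanical. A minimal sketch of a dispatcher, assuming the flags I believe current versions of both CLIs accept (`--dangerously-skip-permissions` for Claude Code, `--full-auto` for Codex) -- verify against your installs, since these change:

```shell
# Route a task to the right agent, both running in full-auto mode.
# echo makes this a dry run; swap it for exec to actually launch the agent.
run_agent() {
  case "$1" in
    claude) shift; echo claude --dangerously-skip-permissions "$@" ;;
    codex)  shift; echo codex --full-auto "$@" ;;
    *) echo "usage: run_agent {claude|codex} [prompt...]" >&2; return 1 ;;
  esac
}
```

So "think through the architecture here" becomes `run_agent claude ...` and "scaffold these endpoints from this schema" becomes `run_agent codex ...` -- the ambiguity gets resolved at the point of dispatch, not inside the prompt.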
I added a Waybar module that shows token usage for both models in real time on my status bar. Both counts tick up simultaneously while the code writes itself. I check it more than I should.
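The module itself is small. A sketch, with two big assumptions baked in: that each CLI logs per-request usage as JSONL with an `output_tokens` field, and that the log paths below exist -- both are hypothetical stand-ins for wherever your installs actually record usage:

```shell
# Waybar custom-module script: print token totals as the JSON Waybar expects.
tokens_from_jsonl() {
  # Sum every "output_tokens":N field in a JSONL usage log (format assumed).
  grep -o '"output_tokens":[0-9]*' "$1" 2>/dev/null \
    | cut -d: -f2 \
    | awk '{ s += $1 } END { print s + 0 }'
}

claude_total=$(tokens_from_jsonl "$HOME/.claude/usage.jsonl")  # hypothetical path
codex_total=$(tokens_from_jsonl "$HOME/.codex/usage.jsonl")    # hypothetical path
printf '{"text": "claude %s | codex %s"}\n' "$claude_total" "$codex_total"
```

Wired up as a `custom/tokens` entry in the Waybar config, with `exec` pointing at the script and a short `interval`, the counts refresh right on the bar.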
The codebase in question is Freebo -- a booking platform for tour operators. It's accumulated real product complexity: a payment engine, a queue-based job system, a pricing model with more edge cases than any pricing model should reasonably have. Not a small project, and there is a hard deadline.
Ten days. Two agents. One launch window.
There's more to write about -- what each agent is actually good at, the cost breakdown, what full-auto mode looks like when it goes sideways. For now I'm watching the numbers climb.