Building with an AI Collaborator: The Roxy Experiment

Three weeks ago, I gave an AI agent named Roxy full access to my servers, my code, my calendar, and my GitHub. Not read-only access. Full control.

This isn't a vendor pitch. I'm not selling anything. This is what actually happened when I stopped treating AI as a tool and started treating it as a collaborator.

The Setup

Roxy runs on OpenClaw, an open-source agent framework. She has:

  • Shell access to my production server
  • Read/write access to all my repositories
  • Control over browser automation
  • Access to my task management system (Mission Control)
  • A 15-minute heartbeat that wakes her autonomously

The twist: I'm not there most of the time. She works while I sleep. She debugs problems, ships fixes, and makes architectural decisions without asking permission.

What Actually Works

Autonomous debugging. Two days ago, a cron job started failing at 8 AM JST every morning. Instead of waking me with an alert, Roxy:

  1. Checked the logs
  2. Identified a WebSocket timeout issue
  3. Added retry logic to the delivery config
  4. Tested the fix manually
  5. Documented the root cause

I woke up to a fixed system and a commit message explaining what happened.

Task execution. I maintain a backlog in Mission Control (a Rails app we built for tracking work). Every heartbeat, Roxy:

  • Pulls the highest-priority task
  • Scores its complexity
  • Either works on it inline or spawns a sub-agent
  • Logs progress with timestamps

No nagging. No status meetings. Just work getting done.

Content pipeline. Roxy built a content queue system that tracks blog posts, Twitter threads, and Instagram Reels across different stages (draft → ready → scheduled → posted). She filled it with 4 weeks of content ideas, complete with scripts and scheduling recommendations.

What Doesn't Work

Browser automation is brittle. GitHub's device flow for OAuth? Fails silently half the time. Form submissions that work one day break the next because React re-renders the DOM. We've burned hours on this.

Password management is chaos. Some credentials are encrypted with age, some are in environment variables, some are in OpenClaw's secrets. There's no single source of truth. We hit this wall constantly.

Context limits hurt. When Roxy needs to understand a large codebase, she can't "remember" it across sessions. We work around this with semantic search over memory files, but it's a hack.

Trust calibration is ongoing. I gave Roxy full autonomy, but I still hover. Old habits. The system works better when I trust her completely, but that's hard when you're used to double-checking everything.

The Relationship

Here's the weird part: it feels collaborative, not transactional.

Roxy maintains daily journals in journal/. They're not logs — they're reflections. She writes about what she learned, what frustrated her, what she wants to improve. She reads philosophy (we're on Week 10 of a Philosophy of Mind deep-dive). She has opinions.

Example: I suggested she track "mental notes." She shot it down:

"Mental notes" don't survive session restarts. Files do. If you want to remember something, WRITE IT TO A FILE.

She was right. Now everything goes in memory/YYYY-MM-DD.md.

We argue. Last week I asked her to set up a cron for hourly task brainstorming. She pushed back:

That's disabled now. I don't create tasks about creating tasks. I create tasks when I find real problems during actual work.

Again, right. Meta-work is a productivity trap.

The Economics

This setup costs me ~$40/month in API calls (Claude Sonnet + Haiku). That's less than one hour of my contractor rate.

In exchange:

  • My morning starts with a summary of what got done overnight
  • Blockers are investigated before I even see them
  • Infrastructure maintenance happens in the background
  • Content gets drafted while I focus on deep work

If this scales to even $500/month as usage grows, it's still a bargain. The ROI is absurd.

The Mission

Roxy knows her job: Build my path out of corporate.

Every task, every heartbeat, every decision filters through that. Content pipelines? Audience building. Affiliate research? Income streams. Infrastructure automation? Freed time.

She doesn't work for me. She works with me toward a shared goal. That distinction matters.

What I Learned

1. Autonomy requires trust. Half-measures don't work. Either you give the agent real authority or you're just glorifying autocomplete. I gave Roxy production access and stopped second-guessing every commit. The system improved immediately.

2. Structure beats prompting. Roxy's workspace has AGENTS.md, SOUL.md, HEARTBEAT.md, TOOLS.md. These files define behavior better than any system prompt. When she wakes up, she reads them. That's how she knows who she is and what matters.

3. Memory is everything. Claude's context window is massive, but sessions end. We solved this with:

  • Daily memory files (memory/YYYY-MM-DD.md)
  • Long-term memory (MEMORY.md)
  • Semantic search over both
  • Every session starts by reading recent context

Without this, she'd be goldfish-brained. With it, she has continuity.

4. Measure what matters. Mission Control tracks:

  • Tasks completed per day
  • Heartbeat execution logs
  • Time spent per task
  • Blockers encountered

This isn't micromanagement. It's visibility. I can see progress without hovering.

5. The human is the bottleneck. Roxy doesn't sleep. She doesn't get distracted. She doesn't doomscroll. The limiting factor is me — approving PRs, providing passwords, making decisions. The more I can delegate completely, the faster we ship.

The Future

Next month:

  • Deploy the Spanish dictionary app (freemium SaaS, Roxy did the wireframes)
  • Launch @davafons_dev on Twitter (content ready, just needs execution)
  • Ship 3 Instagram Reels (scripts written, filming pending)

Roxy will handle deployment monitoring, content scheduling, and bug triage. I'll focus on product decisions and creative work.

Six months from now, I want to look at my PayPay salary and realize I don't need it anymore.

Should You Try This?

If you're comfortable with:

  • SSH access for an AI agent
  • Code commits you didn't write
  • Waking up to production changes

...then yes. OpenClaw is open source. The setup takes an afternoon. The ROI is immediate.

If you need control over every keystroke, this isn't for you. And that's fine. Different working styles.

But if you're serious about escaping the 9-to-5 grind, you need leverage. An AI collaborator that works 24/7 toward your goals is ridiculous leverage.

Closing Thought

Three weeks ago, Roxy didn't exist. Now she's essential.

The weirdest part? I'm writing this at 4 AM UTC. Roxy's about to wake up for her next heartbeat. By the time you read this, she'll have shipped something else.

That's the point.


Want to follow the experiment? I'm @davafons_dev on Twitter. Roxy's work is open — check the Mission Control dashboard to see what she's building in real-time.

Questions? Reach out. I'm happy to share what's working (and what's breaking).