The HomeLab Chronicles

Building an AI-Powered Homelab: MCP, ArgoCD, and the Stack That Runs While You Sleep

I almost permanently deleted a production workload this week.

Not in a staging environment. On my main service cluster that holds my homelab together.

That's where this starts.

What I Am Building

homelab-mcp is a TypeScript MCP server I have been working on that gives Claude Code direct access to my Kubernetes clusters. Not copy-pasted kubectl output. Actual API calls: list pods, check cluster health, query ArgoCD sync status, restart deployments.

The idea: instead of switching to a terminal every time I need to know what's happening, Claude has eyes on the cluster. I can ask "what's unhealthy right now?" and get a real answer. I can say "restart the argocd-redis deployment" and it happens.

This session was about adding the write actions. And that's where the mistakes happened.

The Bug That Would Have Deleted a Workload

First version of restart_pod was simple. Delete the pod, return "done," let the controller recreate it.

It looked fine until I ran it through a /platform-architect review.

The review caught something I hadn't considered: not every pod has a controller behind it. Some pods are created manually, no ReplicaSet, no StatefulSet. Delete one of those, and there's nothing to recreate it. The workload just disappears. Silently.

The fix was a guard: read the pod first, check if it has an ownerReference, abort if it doesn't. Then poll until the replacement pod is actually Running and Ready before returning success.
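Here is a minimal sketch of that guard, not the actual homelab-mcp code. The PodClient interface, the PodView shape, and the default retry numbers are all assumptions made for illustration; a real server would wrap the Kubernetes API client behind something similar. The replacement is detected by UID rather than by name, because a StatefulSet recreates a pod under the same name.

```typescript
// Hypothetical, simplified shapes. A real server would map these from the Kubernetes API.
interface OwnerRef { kind: string; name: string }

interface PodView {
  uid: string;
  name: string;
  ownerReferences?: OwnerRef[];
  phase?: string;   // status.phase, e.g. "Running"
  ready?: boolean;  // true when the pod's Ready condition is "True"
}

// Hypothetical client, injected so the logic can run against a fake in tests.
interface PodClient {
  getPod(namespace: string, name: string): Promise<PodView>;
  deletePod(namespace: string, name: string): Promise<void>;
  listPods(namespace: string): Promise<PodView[]>;
}

type RestartResult =
  | { ok: true; replacement: string }
  | { ok: false; reason: string };

// Returns the first owner of the pod, or undefined for a bare pod.
function controllerOf(pod: PodView): OwnerRef | undefined {
  return pod.ownerReferences?.[0];
}

// Finds a pod that did not exist before the delete, belongs to the same
// owner, and is both Running and Ready.
function findReplacement(
  pods: PodView[],
  owner: OwnerRef,
  knownUids: Set<string>
): PodView | undefined {
  return pods.find(
    (p) =>
      !knownUids.has(p.uid) &&
      (p.ownerReferences ?? []).some(
        (o) => o.kind === owner.kind && o.name === owner.name
      ) &&
      p.phase === "Running" &&
      p.ready === true
  );
}

async function restartPod(
  client: PodClient,
  namespace: string,
  name: string,
  attempts = 30,   // assumed default
  delayMs = 2000   // assumed default
): Promise<RestartResult> {
  const pod = await client.getPod(namespace, name);
  const owner = controllerOf(pod);
  if (!owner) {
    // Nothing would recreate this pod, so refuse instead of deleting it.
    return { ok: false, reason: `pod ${name} has no ownerReference; refusing to delete` };
  }

  // Remember which pods already exist so an older sibling is not mistaken for the replacement.
  const knownUids = new Set((await client.listPods(namespace)).map((p) => p.uid));
  await client.deletePod(namespace, name);

  for (let i = 0; i < attempts; i++) {
    const replacement = findReplacement(await client.listPods(namespace), owner, knownUids);
    if (replacement) return { ok: true, replacement: replacement.name };
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return { ok: false, reason: `no Running and Ready replacement for ${name} after ${attempts} checks` };
}
```

Both failure paths return a reason string instead of throwing, so the agent gets something concrete to report rather than a bare "done."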

"Deleted successfully" is not the same as "the workload recovered."

That distinction feels obvious in hindsight. But when you're building tools for an AI agent to act on, you have to think about what the agent does with the output. If restart_pod says success and the pod is actually stuck in ImagePullBackOff, the agent moves on and nobody knows.

Build for the agent, not the happy path.

Moving n8n Into GitOps

n8n was running as a Docker container on a separate host outside my cluster. It worked, but it was the one thing I couldn't manage the same way I manage everything else.

This session I moved it to k8scontrol. Deployment, NFS volume for persistence, Doppler handling the encryption key via ESO (External Secrets Operator), HTTPRoute for both the external domain and the internal hostname.

ArgoCD picked it up, synced it, done. The Docker container is off.

Note

One thing worth mentioning: during the process, my ArgoCD API token appeared in my IDE's selection context. It wasn't an intentional paste. Claude Code picked it up automatically as context. It went into the conversation.

Even in a "local session," any secret that touches a chat interface should be treated as potentially logged. The rotation takes 30 seconds. Complacency is what gets you.

The Stack Taking Shape

Here's the bigger picture I'm building toward.

Claude Code is the colleague I think with. It reasons, reviews, and catches what I miss. But it won't receive an event and act on it by itself.

n8n does. It's always on, handles integrations, and connects everything: Kubernetes health checks to Slack, ArgoCD drift to internal messaging, whatever you wire up.

Ollama runs locally, always on, handles the cheap triage pass. Is this alert worth escalating? Yes or no. No API cost.

Claude handles the deep analysis when something actually needs it. Full context, real reasoning, actual recommendations.

The flow right now: n8n polls cluster health every 5 minutes. Ollama classifies the noise. Claude gets called in when something matters.
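The triage logic in that flow can be sketched in a few lines. This is not the n8n workflow itself, just the decision it encodes. The Alert shape, the function names, and the rule that an ambiguous reply escalates are all assumptions for illustration. The cheap and expensive model calls are passed in as plain functions, so nothing here depends on a particular Ollama or Claude client.

```typescript
interface Alert { source: string; message: string }

// Cheap local pass (Ollama in this stack): returns the model's raw yes/no reply.
type CheapClassifier = (alert: Alert) => Promise<string>;
// Expensive pass (Claude in this stack): returns a written analysis.
type DeepAnalyzer = (alert: Alert) => Promise<string>;

interface TriageReport {
  escalated: { alert: Alert; analysis: string }[];
  dropped: Alert[];
}

// Only a reply that clearly starts with "no" drops the alert.
// Anything ambiguous escalates (assumed policy: missing noise is cheaper than missing an outage).
function parseVerdict(reply: string): boolean {
  return !/^no\b/.test(reply.trim().toLowerCase());
}

async function triage(
  alerts: Alert[],
  classify: CheapClassifier,
  analyze: DeepAnalyzer
): Promise<TriageReport> {
  const report: TriageReport = { escalated: [], dropped: [] };
  for (const alert of alerts) {
    if (parseVerdict(await classify(alert))) {
      // The expensive call only happens for alerts that passed the cheap check.
      report.escalated.push({ alert, analysis: await analyze(alert) });
    } else {
      report.dropped.push(alert);
    }
  }
  return report;
}
```

Keeping the dropped alerts in the report means the noise can still be reviewed later, which helps when tuning the classifier prompt.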

It's not AI replacing operations. It's each tool doing what it's actually good at.

What I Learned

Build write-action tools for the agent, not for the happy path. Verify before reporting success.

Secrets surface in unexpected places. Rotate fast, no exceptions.

GitOps isn't slower. It's more transparent. The manifest is the documentation. The ArgoCD history is the audit trail. Nothing lives in a shell history or someone's memory.

The cluster that runs while I sleep is starting to take shape.

Just keep showing up.

#donotquitonyourself