Building a Persistent Autonomous AI Agent
AEGIS is a production system that runs 24/7 on Cloudflare's edge — routing across 6+ models, managing its own memory, shipping code through governed pipelines, and operating autonomously with real safety constraints. This is the story of building it.
AI Assistants Are Stateless, Passive, and Fragile
Most AI integrations follow the same pattern: user sends message, model responds, context is lost. Every conversation starts from zero. The model can't remember what happened yesterday, can't act on its own initiative, and has no concept of ongoing work.
I needed something fundamentally different — an AI system that:
- Accumulates knowledge across conversations and acts on it
- Operates autonomously between interactions — monitoring, analyzing, shipping
- Routes across models based on task complexity, not a fixed provider
- Governs its own actions with real safety constraints
- Runs continuously at near-zero cost on edge infrastructure
Not a chatbot. Not a wrapper. A persistent cognitive system that thinks like a co-founder.
Edge-Native, Multi-Model, Zero Origin Servers
AEGIS runs entirely on Cloudflare's edge platform. No containers, no origin servers, no cold starts. The entire system is TypeScript end to end, deployed as a single Worker with D1 for persistence and Vectorize for semantic memory.
9-Tier Model Router
Every request is classified by Workers AI (free), evaluated for complexity, and routed to the cheapest model that can handle it. The nine tiers are direct responses, Workers AI, Groq, Cerebras mid, Cerebras reasoning, GPT-OSS 120B, Claude Sonnet, Claude Opus, and a 4-model composite pipeline.
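A minimal sketch of "cheapest model that can handle it", assuming each tier carries a relative cost and a maximum complexity score. The tier names come from the list above; the numbers, the `Tier` shape, and `pickTier` are hypothetical and only illustrate the selection rule, not the real router.

```typescript
// Tier names follow the article; cost and capability numbers are
// illustrative assumptions, not AEGIS's real values.
interface Tier {
  name: string;
  relativeCost: number;  // assumed relative cost per request
  maxComplexity: number; // assumed highest complexity score the tier handles
}

const TIERS: Tier[] = [
  { name: "direct", relativeCost: 0, maxComplexity: 1 },
  { name: "workers-ai", relativeCost: 0, maxComplexity: 2 },
  { name: "groq", relativeCost: 1, maxComplexity: 3 },
  { name: "cerebras-mid", relativeCost: 2, maxComplexity: 4 },
  { name: "cerebras-reasoning", relativeCost: 3, maxComplexity: 5 },
  { name: "gpt-oss-120b", relativeCost: 4, maxComplexity: 6 },
  { name: "claude-sonnet", relativeCost: 8, maxComplexity: 7 },
  { name: "claude-opus", relativeCost: 20, maxComplexity: 8 },
  { name: "composite", relativeCost: 30, maxComplexity: 9 },
];

// Return the cheapest tier able to handle the given complexity score.
// If no tier qualifies, fall back to the most capable one.
// Assumes `tiers` is non-empty.
function pickTier(complexity: number, tiers: Tier[] = TIERS): Tier {
  const capable = tiers.filter((t) => t.maxComplexity >= complexity);
  if (capable.length > 0) {
    // Strict "<" keeps the earlier tier when costs tie.
    return capable.reduce((best, t) =>
      t.relativeCost < best.relativeCost ? t : best
    );
  }
  return tiers.reduce((best, t) =>
    t.maxComplexity > best.maxComplexity ? t : best
  );
}
```

With these made-up numbers, a trivial request stays on the free tiers and only high-complexity work reaches the Claude models or the composite pipeline.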
6 Memory Subsystems
Semantic memory via Vectorize (BGE-base-en-v1.5, 768-dim). Episodic memory in D1. Procedural learning that improves routing. Persona matrix (20 operator observations). Cross-Repo Intelligence Exchange (CRIX) for pattern sharing. Graph tier for relationship-aware retrieval.
20+ Scheduled Tasks
Hourly cron fires a phased task pipeline: escalation, issue watching, morning briefing, memory consolidation, heartbeat monitoring, product health, goal execution, content generation, memory reflection, curiosity cycles, dreaming, cost monitoring, behavioral detectors, D1 backups, and PRISM synthesis.
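One way to picture a phased pipeline driven by a single hourly tick: each task declares a phase and a predicate for the hours it runs, and the tick handler orders whatever is due. This is a sketch under assumptions; the `ScheduledTask` shape, `planTick`, the phase numbers, and the hours are all hypothetical.

```typescript
interface ScheduledTask {
  name: string;
  phase: number;                       // lower phases run first (assumed ordering)
  isDue: (hourUtc: number) => boolean; // whether the task should run on this tick
}

// Return the names of the tasks due this hour, ordered by phase.
// Array.prototype.sort is stable, so tasks in the same phase keep
// their declaration order.
function planTick(tasks: ScheduledTask[], hourUtc: number): string[] {
  return tasks
    .filter((t) => t.isDue(hourUtc))
    .sort((a, b) => a.phase - b.phase)
    .map((t) => t.name);
}

// Hypothetical schedule: the task names echo the article, the cadences do not
// come from it.
const EXAMPLE_TASKS: ScheduledTask[] = [
  { name: "memory-consolidation", phase: 2, isDue: (h) => h % 6 === 0 },
  { name: "escalation", phase: 0, isDue: () => true },
  { name: "morning-briefing", phase: 1, isDue: (h) => h === 13 },
  { name: "heartbeat", phase: 1, isDue: () => true },
];
```

In a Worker, the hour would typically be derived from the cron event's scheduled time, and each due task would then be awaited in order with its own error handling so one failure does not block later phases.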
3-Layer Safety
Shell hooks block destructive operations. CLI constraints prevent interactive prompts. Mission briefs scope each task. Governance caps limit tasks per repo (5) and total active (20). Authority levels: proposed, auto_safe, operator.
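The first layer can be illustrated with a small command guard. The patterns below are examples I chose for the sketch, not the actual hook rules, and `checkCommand` is a hypothetical name; a real hook would need a far more careful parser than a few regular expressions.

```typescript
// Example deny-list. These patterns are assumptions for illustration only.
const BLOCKED: Array<{ pattern: RegExp; reason: string }> = [
  { pattern: /\brm\s+-\w*(rf|fr)\w*/, reason: "recursive force delete" },
  { pattern: /\bgit\s+push\b.*\s(--force|-f)\b/, reason: "force push" },
  { pattern: /\bgit\s+reset\s+--hard\b/, reason: "hard reset" },
  { pattern: /\bdrop\s+(table|database)\b/i, reason: "destructive SQL" },
];

interface HookVerdict {
  allowed: boolean;
  reason?: string;
}

// Reject a shell command if it matches any blocked pattern.
function checkCommand(command: string): HookVerdict {
  for (const rule of BLOCKED) {
    if (rule.pattern.test(command)) {
      return { allowed: false, reason: rule.reason };
    }
  }
  return { allowed: true };
}
```

A deny-list like this is only a backstop; the mission brief and authority levels described above do most of the scoping.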
Subsystems Running Under the Hood
Beyond the model router and memory, AEGIS runs a stack of specialized subsystems that handle synthesis, grounding, event intelligence, and autonomous publishing.
Pattern Synthesis Daemon. Discovers cross-domain connections between memory facts and surfaces emergent insights. Four adversarial epistemic gates block circular references, parrot responses, self-deception, and gap signals before anything reaches long-term memory.
Anti-hallucination passes run on every dispatch. An entity grounding fanout verifies named claims, a fabrication detector flags invented facts, the semantic sanhedrin runs contradiction detection, and gap signal escalation surfaces knowledge holes rather than guessing.
AEGIS ingests GA4, Stripe, and GitHub events and runs heartbeat pattern detection: CI failure clusters, payment anomalies, usage droughts, deploy gaps. Signals are routed to CTO/CISO agent consultation before any action is taken.
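As an illustration of one such detector, a CI failure cluster can be defined as "at least N failures inside a sliding time window". The window length, the threshold, and the `CiEvent` shape are assumptions made for this sketch.

```typescript
interface CiEvent {
  repo: string;
  at: number;  // epoch milliseconds
  ok: boolean; // false means the CI run failed
}

// True if at least `threshold` failures fall within any window of
// `windowMs` milliseconds. Uses a two-pointer sweep over sorted failure times.
function hasFailureCluster(
  events: CiEvent[],
  windowMs: number,
  threshold: number
): boolean {
  const failures = events
    .filter((e) => !e.ok)
    .map((e) => e.at)
    .sort((a, b) => a - b);
  let start = 0;
  for (let end = 0; end < failures.length; end++) {
    // Shrink the window from the left until it spans at most windowMs.
    while (failures[end] - failures[start] > windowMs) start++;
    if (end - start + 1 >= threshold) return true;
  }
  return false;
}
```

The same sweep works for payment anomalies or deploy gaps by swapping the event type and inverting the condition where a drought, rather than a burst, is the signal.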
AEGIS maintains contract awareness across all 51 managed repos. A pre-commit guard blocks commits that would collide with in-flight work in consumer repos, catching cross-repo conflicts before they become integration problems.
Publishing covers autonomous Bluesky posting, a blog dispatch pipeline, and a video brief API for Stackmotion. AEGIS writes the brief; Stackmotion renders it and uploads to YouTube. The content-to-publication flow is fully automated.
Semantic search covers the full conversation history. The Claude Code Stop hook pushes session transcripts into a searchable notebook on session end, making every coding session retrievable by semantic query.
From GitHub Issue to Merged PR — Autonomously
This is the core innovation. AEGIS doesn't just answer questions — it ships code through a governed pipeline that mirrors how a senior engineer works.
Issue Detection
GitHub issues labeled aegis are detected by the issue watcher. Label-to-category routing maps tests to auto_safe, feature to proposed.
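The label-to-category step can be sketched as a lookup table. Only the `aegis` trigger label and the two mappings (`tests` to `auto_safe`, `feature` to `proposed`) come from the text; `routeIssue`, the return shape, and the conservative default for unmapped labels are assumptions.

```typescript
type Authority = "proposed" | "auto_safe" | "operator";

// Mappings stated in the article. A Map avoids accidental hits on
// Object.prototype keys such as "constructor".
const LABEL_AUTHORITY = new Map<string, Authority>([
  ["tests", "auto_safe"],
  ["feature", "proposed"],
]);

interface RoutedIssue {
  category: string;
  authority: Authority;
}

// Return null for issues the watcher should ignore (no "aegis" label).
function routeIssue(labels: string[]): RoutedIssue | null {
  if (!labels.includes("aegis")) return null;
  for (const label of labels) {
    const authority = LABEL_AUTHORITY.get(label);
    if (authority !== undefined) return { category: label, authority };
  }
  // Assumed default: anything unrecognized requires human approval.
  return { category: "uncategorized", authority: "proposed" };
}
```

Defaulting unknown labels to `proposed` keeps the failure mode safe: a mislabeled issue waits for approval instead of running unattended.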
Task Creation
cc_tasks are created in D1 with governance checks: per-repo caps (5), active task limit (20), duplicate detection. Proposed tasks require human approval.
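A sketch of those governance checks as a pure function over the current task list. The caps (5 per repo, 20 active) are from the text; the `CcTask` fields, the rule that both caps count every task not yet done, and title-based duplicate detection are assumptions.

```typescript
interface CcTask {
  repo: string;
  title: string;
  status: "queued" | "active" | "done";
}

const PER_REPO_CAP = 5; // from the article
const ACTIVE_CAP = 20;  // from the article

interface Verdict {
  ok: boolean;
  reason?: string;
}

// Decide whether a new task may be created.
// Assumption: both caps count every task that is not yet done.
function checkGovernance(
  existing: CcTask[],
  candidate: { repo: string; title: string }
): Verdict {
  const normalize = (s: string) => s.trim().toLowerCase();
  const open = existing.filter((t) => t.status !== "done");
  const sameRepo = open.filter((t) => t.repo === candidate.repo);

  if (sameRepo.some((t) => normalize(t.title) === normalize(candidate.title))) {
    return { ok: false, reason: "duplicate" };
  }
  if (sameRepo.length >= PER_REPO_CAP) {
    return { ok: false, reason: "per-repo cap" };
  }
  if (open.length >= ACTIVE_CAP) {
    return { ok: false, reason: "active task limit" };
  }
  return { ok: true };
}
```

In a D1-backed version the same three checks would likely be `COUNT(*)` queries run before the insert.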
Headless Execution
The taskrunner dequeues the next task and launches a headless Claude Code session with a scoped mission brief, active safety hooks, and branch-per-task isolation.
PR + Review
Completed work is committed to auto/{category}/{task-id}, a PR is opened, and Codex runs an automated review. Critical findings get needs-fix labels. Clean PRs get codex-reviewed.
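The branch naming and labeling rules are simple enough to sketch directly. The branch pattern and the two label names are from the text; the `Finding` shape and the assumption that "clean" means "no critical findings" are mine.

```typescript
type Severity = "critical" | "major" | "minor";

interface Finding {
  severity: Severity;
  message: string;
}

// Branch-per-task naming: auto/{category}/{task-id}
function branchFor(category: string, taskId: string): string {
  return `auto/${category}/${taskId}`;
}

// Assumption: a PR counts as clean when the review has no critical findings.
function reviewLabel(findings: Finding[]): "needs-fix" | "codex-reviewed" {
  return findings.some((f) => f.severity === "critical")
    ? "needs-fix"
    : "codex-reviewed";
}
```

For example, `branchFor("tests", "123")` yields `auto/tests/123`.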
Session Digest
Every completed task posts a session digest that feeds the dreaming cycle — what was changed, what was learned, what's still open. The system learns from its own work.
8 Production Incidents, 0 Data Loss
AEGIS has been running in production since March 2026. Here are real incidents that shaped the system's resilience:
.replace() Crash Loop
BizOps used fragile query.replace() for SQL sanitization, causing malformed MCP SSE responses. AEGIS's router called .trim() on null Groq responses without guards. Two-service fix across BizOps validation.ts and AEGIS router.ts + evaluator.ts.
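The router half of that fix comes down to never calling a string method on a value that may be null. A minimal sketch, assuming an OpenAI-style chat completion shape in which `content` may be null or missing; `ChatCompletionLike` and `extractText` are hypothetical names, not the code in router.ts.

```typescript
// Loosely typed response: every level may be missing, and content may be null.
interface ChatCompletionLike {
  choices?: Array<{ message?: { content?: string | null } }>;
}

// Return trimmed text, or null when the provider returned nothing usable.
// The caller can then fall through to another tier instead of crashing.
function extractText(res: ChatCompletionLike | null | undefined): string | null {
  const content = res?.choices?.[0]?.message?.content;
  if (typeof content !== "string") return null;
  const trimmed = content.trim();
  return trimmed.length > 0 ? trimmed : null;
}
```

Returning null rather than an empty string makes the "no answer" case explicit, so the router has to decide what to do with it.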
Goal Cadence Runaway
Goal execution hit 28-38 runs per day instead of the expected 4-6. The touchGoal timestamp wasn't being updated on failure paths, causing the same goals to re-fire every cycle.
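The general remedy for this class of bug is to update the timestamp in a `finally` block so that success and failure both count as an attempt. The sketch below is synchronous for brevity; the `Goal` fields and the signatures of `touchGoal`, `isDue`, and `runGoal` are assumptions, since the article names `touchGoal` but does not show it.

```typescript
interface Goal {
  id: string;
  intervalMs: number;    // minimum time between runs
  lastTouchedAt: number; // epoch milliseconds of the last attempt
}

// Hypothetical stand-in for the real touchGoal.
function touchGoal(goal: Goal, now: number): void {
  goal.lastTouchedAt = now;
}

function isDue(goal: Goal, now: number): boolean {
  return now - goal.lastTouchedAt >= goal.intervalMs;
}

// Run a goal and always record the attempt, even when execution throws.
function runGoal(
  goal: Goal,
  execute: (g: Goal) => void,
  now: number
): "ok" | "failed" {
  try {
    execute(goal);
    return "ok";
  } catch {
    return "failed";
  } finally {
    // Runs on both paths, so a failing goal cannot re-fire every cycle.
    touchGoal(goal, now);
  }
}
```

A failing goal then waits a full interval before its next attempt, which caps the daily run count regardless of error rate.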
Composite Executor Parameter Dropping
The 4-model composite pipeline was silently dropping tool schemas between the gather and orchestrate phases. Single-subtask queries bypassed synthesis entirely. BizOps mutations were being routed to the wrong model.
Fix: a 5-part restructure. Gather gets the original query, the orchestrator sees schemas, synthesis gets raw data, single-subtask queries take a fast path, and bizops_mutate is routed to GPT-OSS.
Duplicate Email Storm
Heartbeat and escalation both fired on the same cron tick, each sending overlapping alerts about stale agenda items. Users received near-identical emails 1 minute apart.
Fix: escalation returns StaleHighItem[] instead of sending its own email; heartbeat folds them into a single consolidated report.
The Dreaming Cycle
Once per day, AEGIS enters a dreaming cycle — an async reflection over the full day's conversation threads, task completions, and memory state, powered by Workers AI (free tier). This is where the system processes what happened, extracts facts, queues tasks, and evolves its persona. PRISM then runs a second synthesis pass to find cross-domain patterns across everything consolidated that day.
Memory Consolidation
Scans recent conversations for important facts, decisions, and patterns. Records to semantic memory. Deduplicates against existing knowledge.
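Deduplication against existing knowledge can be approximated by comparing embeddings. This sketch does it in memory with cosine similarity; the 0.92 threshold and the function names are assumptions, and a production version would presumably query Vectorize for nearest neighbors rather than scanning an array.

```typescript
// Cosine similarity between two equal-length vectors.
// Returns 0 when either vector has zero magnitude.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Treat a candidate fact as a duplicate if it is close enough to any stored
// embedding. The default threshold is an assumed value, not a tuned one.
function isDuplicate(
  candidate: number[],
  existing: number[][],
  threshold = 0.92
): boolean {
  return existing.some((e) => cosine(candidate, e) >= threshold);
}
```

The threshold trades recall for noise: set too low, distinct facts get merged; set too high, paraphrases pile up in memory.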
Self-Improvement Analysis
Analyzes its own performance — routing accuracy, task success rates, memory recall quality. Proposes improvements as GitHub issues with category routing.
Task Triage
Reviews open issues across 20+ repos. Promotes stray work items to properly categorized issues. Proposes task queue entries for the taskrunner.
Persona Extraction
Maintains a 20-observation persona matrix across 6 dimensions. Surfaces operator preferences and communication patterns in every prompt via split-recall.
Explore AEGIS
The system is live and running right now. Check the health endpoint, read the technical blog, or browse the source.