# AI Agent Fleet — Architecture
## Overview
A multi-environment fleet of specialized AI agent instances, each running a different frontier model optimized for its role. The fleet operates across cloud-hosted infrastructure, on-premise bare metal, and an isolated inference lab.
Six figures in dedicated compute hardware. Not rented GPU time — owned infrastructure.
## Design Principles
The fleet architecture follows the same engineering discipline applied to industrial control systems:
- Specialization over generalization. Each instance has a defined role, a cost profile, and a model matched to its workload. General-purpose instances waste resources.
- Async by default. Instances communicate through a structured dispatch protocol, not real-time chat. Tasks are queued, routed, executed, and results delivered to a shared knowledge layer.
- Cost consciousness. Different models have vastly different token costs. The architecture routes work to the cheapest model capable of the task, reserving expensive frontier models for work that requires them.
- Shared knowledge, isolated execution. All instances share a structured knowledge vault (Obsidian-based), but execution is sandboxed. One instance's failures don't cascade.
- Human oversight at decision points. Agents operate autonomously within defined boundaries. Actions with external impact (publishing, infrastructure changes, financial decisions) require human approval.
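To make the last principle concrete, here is a minimal sketch of an approval gate. The `Impact` categories and the blocking behavior are illustrative assumptions, not the fleet's actual interface.

```python
from enum import Enum, auto

class Impact(Enum):
    INTERNAL = auto()   # sandboxed work: research, drafts, analysis
    EXTERNAL = auto()   # publishing, infrastructure changes, financial actions

# Hypothetical approval gate: external-impact actions block until a human signs off.
def execute(action, impact: Impact, approved_by: str | None = None):
    if impact is Impact.EXTERNAL and approved_by is None:
        raise PermissionError(f"{action!r} has external impact and requires human approval")
    return action()
```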
## Fleet Topology
The fleet includes instances running multiple frontier models:
- Strategy / oversight instances — expensive, high-capability models for complex reasoning
- Research / analysis instances — mid-tier models for web research, content analysis, data processing
- Code generation instances — specialized for software development, testing, deployment
- Content production instances — optimized for writing, editing, OPSEC sanitization
- Security operations instances — red team / blue team, vulnerability assessment, hardening
Each instance has its own configuration, context window, tool access, and cost guardrails.
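As a sketch of what a per-instance configuration might look like (the field names, model labels, and values below are illustrative assumptions, not the fleet's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class InstanceConfig:
    name: str                       # e.g. "research-01"
    role: str                       # strategy, research, code, content, secops
    model: str                      # which frontier model this instance runs
    context_window_tokens: int      # context budget for the model
    tools: list[str] = field(default_factory=list)   # allowed tool integrations
    max_session_cost_usd: float = 5.0                 # hard cost guardrail

# Illustrative instance definitions matching the topology above.
FLEET = [
    InstanceConfig("strategy-01", "strategy", "opus-class", 200_000,
                   tools=["dispatch", "vault"], max_session_cost_usd=25.0),
    InstanceConfig("research-01", "research", "mid-tier", 128_000,
                   tools=["web_search", "vault"], max_session_cost_usd=5.0),
]
```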
## Dispatch Protocol
Work moves between instances via an async dispatch protocol:
- Task creation — a task is defined with priority, target instance, and payload
- Routing — the dispatch system delivers the task to the appropriate instance based on capability and availability
- Execution — the target instance processes the task in its own context
- Result delivery — output is written to the shared knowledge layer and/or returned to the requesting instance
This is machine-to-machine orchestration. No chat interfaces, no copy-paste, no human relay.
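A minimal sketch of that flow, assuming a simple in-process priority queue; the `Task` fields mirror the steps above, while the queue and routing details are illustrative assumptions rather than the fleet's actual protocol.

```python
import queue
import uuid
from dataclasses import dataclass, field
from typing import Any

@dataclass(order=True)
class Task:
    priority: int                          # lower number = more urgent
    target: str = field(compare=False)     # which instance should run this
    payload: dict = field(compare=False)   # the actual work description
    task_id: str = field(compare=False, default_factory=lambda: uuid.uuid4().hex)

dispatch_queue: "queue.PriorityQueue[Task]" = queue.PriorityQueue()

def submit(priority: int, target: str, payload: dict) -> str:
    """Task creation: define priority, target instance, and payload."""
    task = Task(priority, target, payload)
    dispatch_queue.put(task)
    return task.task_id

def run_once(instances: dict[str, Any], vault_write) -> None:
    """Routing, execution, and result delivery for a single queued task."""
    task = dispatch_queue.get()
    instance = instances[task.target]          # routing by capability/availability
    result = instance.execute(task.payload)    # execution in the instance's own context
    vault_write(task.task_id, result)          # result lands in the shared knowledge layer
```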
## Knowledge Layer
A structured Obsidian vault serves as the shared knowledge base:
- Project plans — every project has a single-source-of-truth planning document
- Research results — all dispatch task outputs are indexed and searchable
- Operational procedures — infrastructure management, deployment runbooks, security protocols
- Decision logs — architectural decisions with rationale, preserved for future reference
The vault is versioned, backed up on a 4-hour cycle, and accessible to all instances on the same host.
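For example, a dispatch result might land in the vault as a Markdown note with YAML frontmatter (Obsidian's metadata convention). The vault path, directory layout, and field names below are illustrative assumptions.

```python
from datetime import datetime, timezone
from pathlib import Path

VAULT = Path("/srv/vault")   # assumed vault location on the shared host

def write_result(task_id: str, source_instance: str, body: str) -> Path:
    """Write a dispatch result into the vault as an indexed, searchable note."""
    note = VAULT / "research-results" / f"{task_id}.md"
    note.parent.mkdir(parents=True, exist_ok=True)
    frontmatter = (
        "---\n"
        f"task_id: {task_id}\n"
        f"instance: {source_instance}\n"
        f"created: {datetime.now(timezone.utc).isoformat()}\n"
        "---\n\n"
    )
    note.write_text(frontmatter + body, encoding="utf-8")
    return note
```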
## Cost Management
Token economics drive architectural decisions:
- Expensive models (e.g., Opus-class) are reserved for strategy, complex reasoning, and human interaction
- Mid-tier models handle the bulk of research, analysis, and content work
- Cheap models run recurring automated tasks, monitoring, and data processing
- Session cost guards automatically warn and escalate when context windows grow expensive
- Zero-cost pathways exist for recurring automated work where marginal cost must be zero
The fleet has burned enough on runaway sessions to justify enforcing this aggressively.
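A minimal sketch of a session cost guard under these rules; the thresholds and per-token prices are illustrative assumptions, not actual rates.

```python
import logging

log = logging.getLogger("cost-guard")

# Illustrative per-1M-token prices by tier (placeholders, not actual rates).
PRICE_PER_MTOK = {"opus-class": 15.00, "mid-tier": 3.00, "cheap": 0.25}

WARN_USD = 2.00        # assumed per-session warning threshold
ESCALATE_USD = 10.00   # assumed per-session escalation threshold

class SessionCostGuard:
    def __init__(self, tier: str):
        self.tier = tier
        self.spent_usd = 0.0

    def record(self, tokens: int) -> None:
        """Accumulate session cost and warn or escalate as thresholds are crossed."""
        self.spent_usd += tokens / 1_000_000 * PRICE_PER_MTOK[self.tier]
        if self.spent_usd >= ESCALATE_USD:
            log.error("session cost $%.2f exceeds escalation threshold", self.spent_usd)
        elif self.spent_usd >= WARN_USD:
            log.warning("session cost $%.2f exceeds warning threshold", self.spent_usd)
```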
## Infrastructure
### Cloud
- Primary production on dedicated VPS — not shared hosting, not spot instances
- Full root access, Kali Linux security tooling
- All services behind Cloudflare (WAF, DDoS, Zero Trust access)
### Security
- UFW default deny, SSH key-only
- Fail2ban, automated security scanning
- WireGuard mesh between environments
- Cloudflare Zero Trust for all web-facing services
- Dedicated red team infrastructure (separate from production)
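As a hedged illustration of how two of these controls might be spot-checked from a host, the sketch below assumes stock `ufw` output and a standard sshd config path; it is not the fleet's actual audit procedure.

```python
import subprocess
from pathlib import Path

def check_firewall_default_deny() -> bool:
    """Verify ufw reports a default-deny inbound policy (requires root)."""
    out = subprocess.run(["ufw", "status", "verbose"],
                         capture_output=True, text=True, check=True).stdout
    return "Default: deny (incoming)" in out

def check_ssh_key_only(sshd_config: str = "/etc/ssh/sshd_config") -> bool:
    """Verify password authentication is disabled, leaving SSH key-only access."""
    text = Path(sshd_config).read_text()
    return any(line.strip().lower() == "passwordauthentication no"
               for line in text.splitlines())
```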