# AI Agent Fleet — Architecture
## Overview
A multi-environment fleet of specialized AI agent instances, each running a different frontier model optimized for its role. The fleet operates across cloud-hosted infrastructure, on-premise bare metal, and an isolated inference lab.
Six figures in dedicated compute hardware. Not rented GPU time — owned infrastructure.
## Design Principles
The fleet architecture follows the same engineering discipline applied to industrial control systems:
- Specialization over generalization. Each instance has a defined role, a cost profile, and a model matched to its workload. General-purpose instances waste resources.
- Async by default. Instances communicate through a structured dispatch protocol, not real-time chat. Tasks are queued, routed, executed, and results delivered to a shared knowledge layer.
- Cost consciousness. Different models have vastly different token costs. The architecture routes work to the cheapest model capable of the task, reserving expensive frontier models for work that requires them.
- Shared knowledge, isolated execution. All instances share a structured knowledge vault (Obsidian-based), but execution is sandboxed. One instance's failures don't cascade.
- Human oversight at decision points. Agents operate autonomously within defined boundaries. Actions with external impact (publishing, infrastructure changes, financial decisions) require human approval.
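To make the last principle concrete, here is a minimal sketch of an approval gate. The `Impact` categories and the blocking behavior are illustrative assumptions, not the fleet's actual interface.

```python
from enum import Enum, auto

class Impact(Enum):
    INTERNAL = auto()   # sandboxed work: research, drafts, analysis
    EXTERNAL = auto()   # publishing, infrastructure changes, financial actions

# Hypothetical approval gate: external-impact actions block until a human signs off.
def execute(action, impact: Impact, approved_by: str | None = None):
    if impact is Impact.EXTERNAL and approved_by is None:
        raise PermissionError(f"{action!r} has external impact and requires human approval")
    return action()
```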
## Fleet Topology
The fleet includes instances running multiple frontier models:
- Strategy / oversight instances — expensive, high-capability models for complex reasoning
- Research / analysis instances — mid-tier models for web research, content analysis, data processing
- Code generation instances — specialized for software development, testing, deployment
- Content production instances — optimized for writing, editing, OPSEC sanitization
- Security operations instances — red team / blue team, vulnerability assessment, hardening
Each instance has its own configuration, context window, tool access, and cost guardrails.
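As a sketch of what a per-instance configuration might look like (the field names, model labels, and values below are illustrative assumptions, not the fleet's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class InstanceConfig:
    name: str                       # e.g. "research-01"
    role: str                       # strategy, research, code, content, secops
    model: str                      # which frontier model this instance runs
    context_window_tokens: int      # context budget for the model
    tools: list[str] = field(default_factory=list)   # allowed tool integrations
    max_session_cost_usd: float = 5.0                 # hard cost guardrail

# Illustrative instance definitions matching the topology above.
FLEET = [
    InstanceConfig("strategy-01", "strategy", "opus-class", 200_000,
                   tools=["dispatch", "vault"], max_session_cost_usd=25.0),
    InstanceConfig("research-01", "research", "mid-tier", 128_000,
                   tools=["web_search", "vault"], max_session_cost_usd=5.0),
]
```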
## Dispatch Protocol
Work moves between instances via an async dispatch protocol:
- Task creation — a task is defined with priority, target instance, and payload
- Routing — the dispatch system delivers the task to the appropriate instance based on capability and availability
- Execution — the target instance processes the task in its own context
- Result delivery — output is written to the shared knowledge layer and/or returned to the requesting instance
This is machine-to-machine orchestration. No chat interfaces, no copy-paste, no human relay.
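A minimal sketch of that flow, assuming a simple in-process priority queue; the `Task` fields mirror the steps above, while the queue and routing details are illustrative assumptions rather than the fleet's actual protocol.

```python
import queue
import uuid
from dataclasses import dataclass, field
from typing import Any

@dataclass(order=True)
class Task:
    priority: int                          # lower number = more urgent
    target: str = field(compare=False)     # which instance should run this
    payload: dict = field(compare=False)   # the actual work description
    task_id: str = field(compare=False, default_factory=lambda: uuid.uuid4().hex)

dispatch_queue: "queue.PriorityQueue[Task]" = queue.PriorityQueue()

def submit(priority: int, target: str, payload: dict) -> str:
    """Task creation: define priority, target instance, and payload."""
    task = Task(priority, target, payload)
    dispatch_queue.put(task)
    return task.task_id

def run_once(instances: dict[str, Any], vault_write) -> None:
    """Routing, execution, and result delivery for a single queued task."""
    task = dispatch_queue.get()
    instance = instances[task.target]          # routing by capability/availability
    result = instance.execute(task.payload)    # execution in the instance's own context
    vault_write(task.task_id, result)          # result lands in the shared knowledge layer
```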
## Knowledge Layer
A structured Obsidian vault serves as the shared knowledge base:
- Project plans — every project has a single-source-of-truth planning document
- Research results — all dispatch task outputs are indexed and searchable
- Operational procedures — infrastructure management, deployment runbooks, security protocols
- Decision logs — architectural decisions with rationale, preserved for future reference
The vault is versioned, backed up on a 4-hour cycle, and accessible to all instances on the same host.
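For example, a dispatch result might land in the vault as a Markdown note with YAML frontmatter (Obsidian's metadata convention). The vault path, directory layout, and field names below are illustrative assumptions.

```python
from datetime import datetime, timezone
from pathlib import Path

VAULT = Path("/srv/vault")   # assumed vault location on the shared host

def write_result(task_id: str, source_instance: str, body: str) -> Path:
    """Write a dispatch result into the vault as an indexed, searchable note."""
    note = VAULT / "research-results" / f"{task_id}.md"
    note.parent.mkdir(parents=True, exist_ok=True)
    frontmatter = (
        "---\n"
        f"task_id: {task_id}\n"
        f"instance: {source_instance}\n"
        f"created: {datetime.now(timezone.utc).isoformat()}\n"
        "---\n\n"
    )
    note.write_text(frontmatter + body, encoding="utf-8")
    return note
```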
## Cost Management
Token economics drive architectural decisions:
- Expensive models (e.g., Opus-class) are reserved for strategy, complex reasoning, and human interaction
- Mid-tier models handle the bulk of research, analysis, and content work
- Cheap models run recurring automated tasks, monitoring, and data processing
- Session cost guards automatically warn and escalate when context windows grow expensive
- Zero-cost pathways exist for recurring automated work where marginal cost must be zero
The fleet has burned enough on runaway sessions to justify enforcing this aggressively.
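A minimal sketch of a session cost guard under these rules; the thresholds and per-token prices are illustrative assumptions, not actual rates.

```python
import logging

log = logging.getLogger("cost-guard")

# Illustrative per-1M-token prices by tier (placeholders, not actual rates).
PRICE_PER_MTOK = {"opus-class": 15.00, "mid-tier": 3.00, "cheap": 0.25}

WARN_USD = 2.00        # assumed per-session warning threshold
ESCALATE_USD = 10.00   # assumed per-session escalation threshold

class SessionCostGuard:
    def __init__(self, tier: str):
        self.tier = tier
        self.spent_usd = 0.0

    def record(self, tokens: int) -> None:
        """Accumulate session cost and warn or escalate as thresholds are crossed."""
        self.spent_usd += tokens / 1_000_000 * PRICE_PER_MTOK[self.tier]
        if self.spent_usd >= ESCALATE_USD:
            log.error("session cost $%.2f exceeds escalation threshold", self.spent_usd)
        elif self.spent_usd >= WARN_USD:
            log.warning("session cost $%.2f exceeds warning threshold", self.spent_usd)
```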
## Infrastructure
### Cloud
- Primary production on dedicated VPS — not shared hosting, not spot instances
- Full root access, Kali Linux security tooling
- All services behind Cloudflare (WAF, DDoS, Zero Trust access)
### Security
- UFW default deny, SSH key-only
- Fail2ban, automated security scanning
- WireGuard mesh between environments
- Cloudflare Zero Trust for all web-facing services
- Dedicated red team infrastructure (separate from production)
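As a hedged illustration of how two of these controls might be spot-checked from a host, the sketch below assumes stock `ufw` output and a standard sshd config path; it is not the fleet's actual audit procedure.

```python
import subprocess
from pathlib import Path

def check_firewall_default_deny() -> bool:
    """Verify ufw reports a default-deny inbound policy (requires root)."""
    out = subprocess.run(["ufw", "status", "verbose"],
                         capture_output=True, text=True, check=True).stdout
    return "Default: deny (incoming)" in out

def check_ssh_key_only(sshd_config: str = "/etc/ssh/sshd_config") -> bool:
    """Verify password authentication is disabled, leaving SSH key-only access."""
    text = Path(sshd_config).read_text()
    return any(line.strip().lower() == "passwordauthentication no"
               for line in text.splitlines())
```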