Runtime Guide
Provider Failover Configuration
The runtime fails over automatically when a provider call fails (rate limit, outage, or auth error). Configure an ordered list of providers in .agenthood/config.json:

    {
      "providers": [
        { "name": "opencode", "model": "deepseek-v4-flash", "priority": 1 },
        { "name": "opencode-go", "model": "deepseek-v4-flash", "priority": 2 },
        {
          "name": "anthropic",
          "model": "claude-sonnet-4-6",
          "models": ["claude-sonnet-4-6", "claude-haiku-3-5"],
          "priority": 3
        },
        { "name": "groq", "model": "llama-3.1-70b-versatile", "priority": 4 },
        { "name": "ollama", "model": "llama3.2", "baseUrl": "http://localhost:11434", "priority": 5 }
      ],
      "_comment_failover": "Optional circuit breaker and probe recovery settings",
      "failover": { "failureThreshold": 3, "cooldownMs": 60000, "probeEnabled": true }
    }

- providers[]: Ordered list of LLM providers. The first entry is the primary; subsequent entries are fallbacks, tried in order. Each entry supports models[] for model downgrade on failure.
- failover (optional): Circuit breaker tuning via failureThreshold (consecutive failures before skipping), cooldownMs (override cooldown), and probeEnabled (enable/disable probe recovery).
See .agenthood/config.example.json for the complete reference.
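To make the failover settings above more concrete, the following is a minimal sketch of how failureThreshold and cooldownMs could drive a per-provider circuit breaker. The ProviderBreaker class and its method names are hypothetical and are not taken from the runtime's source. probeEnabled is deliberately left out, because this guide does not describe how probe recovery behaves.

```typescript
// Hypothetical sketch: illustrates the documented settings, not the runtime's code.
interface BreakerSettings {
  failureThreshold: number; // consecutive failures before the provider is skipped
  cooldownMs: number;       // how long a tripped provider stays skipped
}

class ProviderBreaker {
  private consecutiveFailures = 0;
  private openedAt: number | null = null; // timestamp (ms) when the breaker tripped
  private readonly settings: BreakerSettings;

  constructor(settings: BreakerSettings) {
    this.settings = settings;
  }

  /** Returns true if the provider may be tried at time `now` (ms). */
  canTry(now: number): boolean {
    if (this.openedAt === null) return true;
    if (now - this.openedAt >= this.settings.cooldownMs) {
      // Cooldown elapsed: assume the breaker closes and counting starts over.
      this.openedAt = null;
      this.consecutiveFailures = 0;
      return true;
    }
    return false;
  }

  /** Any success resets the consecutive-failure count. */
  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAt = null;
  }

  /** Trips the breaker once the threshold of consecutive failures is reached. */
  recordFailure(now: number): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.settings.failureThreshold) {
      this.openedAt = now;
    }
  }
}
```

With the example values from the config (failureThreshold: 3, cooldownMs: 60000), a provider in this sketch would be skipped after its third consecutive failure and become eligible again one minute later.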
API Key Validation
The runtime validates LLM API keys at startup before making any provider calls. Only the configured provider's key is checked — if you set provider to "ollama", no key validation is performed.
The runtime automatically loads environment variables from a .env file in the project root (via dotenv).
Key resolution order (per provider):
1. providers[].apiKey: key in the config array entry
2. apiKey: top-level key in the config
3. process.env.<VAR>: environment variables (including those loaded from .env)
Supported environment variables:
- OPENCODE_API_KEY: OpenCode / OpenCodeGo
- OPENAI_API_KEY: OpenAI
- ANTHROPIC_API_KEY: Anthropic
- GROQ_API_KEY: Groq
If no key is found and the provider requires one, startup fails with:
OPENCODE_API_KEY not set for provider "opencode". Get a key at https://opencode.ai/auth
Run npx agenthood check to verify API key configuration as part of a full health check.
Ollama / Local Providers
Providers not in the known key list (ollama, custom-local) skip key validation entirely — no key required.
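The resolution order and the local-provider exemption described above can be sketched as a single lookup function. This is an illustration only: resolveApiKey, the config interfaces, and the provider-name keys in KEY_ENV_VARS (in particular "openai" and "opencode-go") are assumptions, and the environment is passed in as a plain object rather than read from process.env.

```typescript
// Hypothetical sketch of the documented key resolution order.
interface ProviderEntry {
  name: string;
  apiKey?: string;
}

interface RuntimeConfig {
  apiKey?: string;
  providers: ProviderEntry[];
}

// Environment variable per provider, following the list in this guide.
// The provider-name keys are assumed spellings.
const KEY_ENV_VARS: Record<string, string> = {
  "opencode": "OPENCODE_API_KEY",
  "opencode-go": "OPENCODE_API_KEY",
  "openai": "OPENAI_API_KEY",
  "anthropic": "ANTHROPIC_API_KEY",
  "groq": "GROQ_API_KEY",
};

function resolveApiKey(
  config: RuntimeConfig,
  providerName: string,
  env: Record<string, string | undefined>,
): string | undefined {
  const envVar = KEY_ENV_VARS[providerName];
  // Providers outside the known key list (e.g. ollama) need no key.
  if (envVar === undefined) return undefined;

  const entry = config.providers.find((p) => p.name === providerName);
  // 1. per-provider key, 2. top-level key, 3. environment variable
  const key = entry?.apiKey ?? config.apiKey ?? env[envVar];
  if (!key) {
    throw new Error(`${envVar} not set for provider "${providerName}".`);
  }
  return key;
}
```

Passing the environment as an argument keeps the function easy to test; a caller would typically hand it process.env after dotenv has loaded the .env file.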
Memory & Vector Store
The runtime includes a tiered memory system and a vector store, all persisted locally:
Memory Store
Pre-configured paths used by the runtime for memory persistence:
| Path | Store | Purpose |
|---|---|---|
| .agenthood/memory/ | LanceDB vector store (embedded) | Vector embeddings, semantic search, metadata filtering |
| .agenthood/residual.json | ResidualMemory | Decay-weighted trace signals from past agent runs |
| .agenthood/society-graph.json | KnowledgeGraphStore | Bidirectional structural relationships between entities |
The IMemoryStore interface at src/memory/IMemoryStore.ts unifies all memory tiers with common operations (set, get, delete, has, prune, stats). The InMemoryStore provides a synchronous TTL/LRU store for in-process caching. LanceDBStore (in src/memory/VectorStore.ts) implements both IVectorStore (vector search) and IMemoryStore<VectorRecord> (key-value access by id).
Residual Memory
ResidualMemory captures trace signals that no explicit tier claimed. It is automatically decayed at the start of each agent session and reinforced after each run. Decay follows an exponential rate (decayRate ^ daysElapsed), and signals below 0.1 strength are pruned automatically. The decayed signals are injected as soft context into the system prompt by PromptBuilder.
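The decay-and-prune step can be illustrated with a short sketch. It assumes that each signal's strength is multiplied by decayRate ^ daysElapsed; the ResidualSignal shape, the decaySignals function, and the decay rate used in the example are hypothetical. Only the formula and the 0.1 pruning threshold come from this guide.

```typescript
// Hypothetical sketch of exponential decay with pruning.
interface ResidualSignal {
  key: string;
  strength: number;
}

const PRUNE_THRESHOLD = 0.1; // signals below this strength are dropped

function decaySignals(
  signals: ResidualSignal[],
  decayRate: number,
  daysElapsed: number,
): ResidualSignal[] {
  // Assumed application of the documented formula: strength * decayRate ^ daysElapsed
  const factor = Math.pow(decayRate, daysElapsed);
  return signals
    .map((s) => ({ key: s.key, strength: s.strength * factor }))
    .filter((s) => s.strength >= PRUNE_THRESHOLD);
}

// Example with an illustrative decay rate of 0.5 over two days (factor 0.25).
const remaining = decaySignals(
  [
    { key: "prefers-small-diffs", strength: 1.0 },
    { key: "touched-auth-module", strength: 0.3 },
  ],
  0.5,
  2,
);
```

In this example the first signal decays to 0.25 and is kept, while the second decays to 0.075 and is pruned.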
Knowledge Graph
KnowledgeGraphStore stores named nodes and bidirectional relations. Use it for structural queries: "what connects these two components?" Path finding uses BFS and returns the shortest path. The graph persists to .agenthood/society-graph.json.
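For readers unfamiliar with BFS path finding over bidirectional relations, here is a minimal, self-contained sketch. It uses a plain adjacency map rather than the KnowledgeGraphStore API, which this guide does not document; addRelation, shortestPath, and the node names are hypothetical.

```typescript
// Hypothetical sketch: BFS shortest path over an undirected adjacency map.
type Graph = Map<string, string[]>;

/** Adds a bidirectional relation between nodes a and b. */
function addRelation(graph: Graph, a: string, b: string): void {
  const link = (from: string, to: string): void => {
    const neighbours = graph.get(from) ?? [];
    if (neighbours.indexOf(to) === -1) neighbours.push(to);
    graph.set(from, neighbours);
  };
  link(a, b);
  link(b, a);
}

/** Returns the shortest path from `from` to `to`, or null if none exists. */
function shortestPath(graph: Graph, from: string, to: string): string[] | null {
  if (!graph.has(from) || !graph.has(to)) return null;

  // previous[n] records the node from which n was first reached.
  const previous = new Map<string, string | null>();
  previous.set(from, null);
  const queue: string[] = [from];

  for (let head = 0; head < queue.length; head++) {
    const node = queue[head]!;
    if (node === to) {
      // Walk the predecessor chain back to the start.
      const path: string[] = [];
      let cur: string | null = node;
      while (cur !== null) {
        path.unshift(cur);
        cur = previous.get(cur) ?? null;
      }
      return path;
    }
    for (const next of graph.get(node) ?? []) {
      if (!previous.has(next)) {
        previous.set(next, node);
        queue.push(next);
      }
    }
  }
  return null;
}
```

Because BFS visits nodes in order of increasing distance, the first time the target is dequeued the recorded predecessor chain is a shortest path by hop count.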
RAG Pipeline
The runtime includes a modular RAG (Retrieval-Augmented Generation) pipeline for indexing and retrieving documents. The pipeline consists of three components:
Chunk Strategy
ChunkStrategy defines how documents are split into chunks before embedding. The built-in FixedSizeChunkStrategy splits text by approximate token count (512 tokens default) with configurable overlap (64 tokens default). Token count is estimated as characters / 4.

    import { FixedSizeChunkStrategy } from 'src/rag/ChunkStrategy.ts'

    const strategy = new FixedSizeChunkStrategy()
    const chunks = strategy.chunk(documentText, { chunkSize: 512, overlap: 64 })

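The arithmetic behind fixed-size chunking with overlap can be sketched independently of the real class. The function below is a hypothetical stand-in, not FixedSizeChunkStrategy itself; it only assumes the defaults stated above (512 tokens, 64 tokens of overlap) and the characters / 4 token estimate.

```typescript
// Hypothetical sketch: sliding-window chunking using the chars / 4 token estimate.
const CHARS_PER_TOKEN = 4;

function chunkByTokens(text: string, chunkSize = 512, overlap = 64): string[] {
  if (chunkSize < 1 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("require chunkSize >= 1 and 0 <= overlap < chunkSize");
  }
  const windowChars = chunkSize * CHARS_PER_TOKEN;           // size of each chunk
  const stepChars = (chunkSize - overlap) * CHARS_PER_TOKEN; // distance between chunk starts

  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += stepChars) {
    chunks.push(text.slice(start, start + windowChars));
    // Stop once a chunk has reached the end of the text.
    if (start + windowChars >= text.length) break;
  }
  return chunks;
}
```

With the defaults, each chunk covers about 2048 characters and consecutive chunks share about 256 characters, so a sentence cut at a chunk boundary still appears whole in one of the two neighbours.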
The MarkdownHierarchicalChunkStrategy implements parent-child chunking for Markdown documents. It splits by ## section headers into parent chunks, then sub-splits each parent into fixed-size child chunks for embedding. On retrieval, child vector matches resolve to parent content for full context.

    import { MarkdownHierarchicalChunkStrategy } from 'src/rag/ChunkStrategy.ts'

    const strategy = new MarkdownHierarchicalChunkStrategy()
    const { parents, children } = strategy.chunk(documentText, {
      filePath: 'doc.md',
      startLine: 1,
      endLine: 100,
    })

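The parent-child idea can be shown with a simplified sketch. It is not the MarkdownHierarchicalChunkStrategy implementation: the ParentChunk and ChildChunk shapes, the chunkMarkdown and resolveParent helpers, and the character-based child size are all assumptions made for illustration.

```typescript
// Hypothetical sketch of parent-child chunking for Markdown.
interface ParentChunk {
  id: number;
  heading: string; // text of the "## " header, or "" for content before the first header
  content: string;
}

interface ChildChunk {
  parentId: number;
  content: string;
}

function chunkMarkdown(
  markdown: string,
  childChars = 2048,
): { parents: ParentChunk[]; children: ChildChunk[] } {
  // Split at every line that starts with "## " (the header stays with its section).
  const sections = markdown.split(/^(?=## )/m).filter((s) => s.trim().length > 0);

  const parents: ParentChunk[] = sections.map((content, id) => {
    const firstLine = content.split("\n")[0] ?? "";
    const heading = firstLine.startsWith("## ") ? firstLine.slice(3).trim() : "";
    return { id, heading, content };
  });

  // Sub-split each parent into fixed-size children that remember their parent.
  const children: ChildChunk[] = [];
  for (const parent of parents) {
    for (let i = 0; i < parent.content.length; i += childChars) {
      children.push({ parentId: parent.id, content: parent.content.slice(i, i + childChars) });
    }
  }
  return { parents, children };
}

/** On retrieval, a matching child resolves to its parent's full content. */
function resolveParent(child: ChildChunk, parents: ParentChunk[]): ParentChunk | undefined {
  return parents.find((p) => p.id === child.parentId);
}
```

The small children give precise vector matches, while resolving to the parent returns the whole section so the model sees the surrounding context.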
Indexer
Indexer chunks documents, embeds them via ILLMProvider.embed(), and stores the vectors in an IVectorStore. It supports both single-file and recursive directory indexing.

    import { Indexer } from 'src/rag/Indexer.ts'

    const indexer = new Indexer({ embedder, vectorStore })
    await indexer.indexDocument('/path/to/file.md', fileContent)
    await indexer.indexDirectory('/path/to/project', (file) => file.endsWith('.md'))

    const stats = indexer.stats()
    // { totalDocuments: 10, totalChunks: 42, indexedExtensions: ['.md', '.ts'] }

Retriever
Retriever takes a natural language query, embeds it, and searches the vector store for the most relevant chunks. Optional KnowledgeGraphStore integration enriches results with relationship context.

    import { Retriever } from 'src/rag/Retriever.ts'

    const retriever = new Retriever(embedder, vectorStore, knowledgeGraphStore)
    const results = await retriever.retrieve('how do I configure failover?', {
      topK: 5,
      minScore: 0.5,
      metadataFilter: { source: 'docs/runtime-guide.md' },
    })
    // Each result: { content, score, source, chunkIndex, graphContext?, metadata }

Results can be filtered by minScore (relevance threshold) and metadataFilter (field-level filtering). When a KnowledgeGraphStore is provided, results are enriched with the node's label, type, and neighbor relationships.
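To show how topK, minScore, and metadataFilter interact, the sketch below ranks in-memory chunks against a query vector. It is a stand-in for the real Retriever and vector store: the StoredChunk shape and the searchChunks function are hypothetical, and cosine similarity is an assumed scoring metric that this guide does not confirm.

```typescript
// Hypothetical sketch: score, filter, and rank chunks in memory.
interface StoredChunk {
  content: string;
  vector: number[];
  metadata: Record<string, string>;
}

interface SearchOptions {
  topK: number;
  minScore?: number;
  metadataFilter?: Record<string, string>;
}

interface ScoredChunk {
  content: string;
  score: number;
  metadata: Record<string, string>;
}

/** Cosine similarity; returns 0 when either vector has zero length. */
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function searchChunks(query: number[], chunks: StoredChunk[], options: SearchOptions): ScoredChunk[] {
  const filter = options.metadataFilter ?? {};
  const minScore = options.minScore ?? -Infinity;
  return chunks
    // Field-level filtering: every requested field must match exactly.
    .filter((c) => Object.keys(filter).every((k) => c.metadata[k] === filter[k]))
    .map((c) => ({ content: c.content, score: cosine(query, c.vector), metadata: c.metadata }))
    // Relevance threshold, then best-first ordering, then truncation to topK.
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, options.topK);
}
```

Applying the metadata filter before scoring avoids computing similarities for chunks that could never be returned.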
Agentic RAG
AgenticRAG wraps the Retriever with a RetrievalClassifier that decides per-query whether to skip (answer from short-term memory), search vectors (semantic), traverse the knowledge graph (structural), or both. Each result includes provenance: which strategy was used, how many vector matches, graph hops, and source paths.

    import { AgenticRAG } from 'src/rag/AgenticRAG.ts'

    const agenticRag = new AgenticRAG({ embedder, vectorStore, knowledgeGraphStore })
    const results = await agenticRag.retrieve('how does failover work?', executionContext)
    // Each result: { content, strategy, vectorMatches, graphHops, sourcePaths }

Member Detection
MemberOrchestrator (Phase 1) detects which member(s) should handle a task based on keyword matching, changed file patterns, and task stage. It lives at src/reasoning/MemberOrchestrator.ts and supports all members.

    import { MemberOrchestrator } from 'src/reasoning/MemberOrchestrator.ts'

    const orchestrator = new MemberOrchestrator()
    const detected = orchestrator.detectMembers({
      userMessage: 'review this PR and check for security issues',
    })
    // [{ member: 'the-reviewer', score: 4 }, { member: 'the-auditor', score: 2 }]

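As a rough illustration of the keyword-matching part of detection only, the sketch below scores members by counting keyword hits in the user message. The keyword table, the detectByKeywords function, and the one-point-per-hit scoring are all hypothetical; the real MemberOrchestrator also considers changed file patterns and task stage, and its scores will differ.

```typescript
// Hypothetical sketch: keyword-based member scoring (one point per matching word).
interface Detection {
  member: string;
  score: number;
}

// Invented keyword table for illustration; not the runtime's actual rules.
const MEMBER_KEYWORDS: Record<string, string[]> = {
  "the-reviewer": ["review", "pr"],
  "the-auditor": ["security", "audit"],
};

function detectByKeywords(userMessage: string): Detection[] {
  // Lowercase and split on anything that is not a letter or digit.
  const words = userMessage
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0);

  const detections: Detection[] = [];
  for (const member of Object.keys(MEMBER_KEYWORDS)) {
    const keywords = MEMBER_KEYWORDS[member] ?? [];
    const score = words.filter((w) => keywords.indexOf(w) !== -1).length;
    if (score > 0) detections.push({ member, score });
  }
  // Highest-scoring member first.
  return detections.sort((a, b) => b.score - a.score);
}
```

Members with no keyword hits are omitted, so an unrelated message yields an empty list in this sketch.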
CLI Provider Override
Override the provider at runtime:
npx agenthood run <agent> "<task>" --provider ollama
This bypasses the configured provider chain and uses the specified provider directly.
Memory Tiers
The runtime includes four specialized memory implementations accessed via ExecutionContext.memory:
| Tier | Implementation | Backing | Purpose |
|---|---|---|---|
| ShortTerm | ShortTermMemoryImpl | In-memory ring buffer | Recent conversation context (default capacity: 20 entries) |
| LongTerm | LongTermMemoryImpl | LanceDB VectorStore | Persistent key-value storage across sessions |
| Episodic | EpisodicMemoryImpl | LanceDB VectorStore + ILLMProvider | Episode recall with semantic search |
| Project | ProjectMemoryImpl | KnowledgeGraphStore + filesystem | Project conventions and architectural decisions |
All four tiers are wired into createContext() in src/commands/run.ts and available to every agent via context.memory.
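The short-term tier's ring buffer behaviour (a fixed capacity where the oldest entry is overwritten first) can be sketched as follows. The RingBuffer class is a hypothetical illustration and not ShortTermMemoryImpl; only the default capacity of 20 comes from the table above.

```typescript
// Hypothetical sketch: fixed-capacity ring buffer that overwrites the oldest entry.
class RingBuffer<T> {
  private readonly slots: (T | undefined)[];
  private readonly capacity: number;
  private next = 0;  // index where the next entry will be written
  private count = 0; // number of entries currently stored

  constructor(capacity = 20) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new Error("capacity must be a positive integer");
    }
    this.capacity = capacity;
    this.slots = new Array<T | undefined>(capacity);
  }

  /** Stores an entry, overwriting the oldest one when the buffer is full. */
  push(entry: T): void {
    this.slots[this.next] = entry;
    this.next = (this.next + 1) % this.capacity;
    this.count = Math.min(this.count + 1, this.capacity);
  }

  /** Returns the stored entries from oldest to newest. */
  toArray(): T[] {
    const out: T[] = [];
    const start = (this.next - this.count + this.capacity) % this.capacity;
    for (let i = 0; i < this.count; i++) {
      out.push(this.slots[(start + i) % this.capacity] as T);
    }
    return out;
  }
}
```

Writes take constant time and memory use is bounded, which suits a store that only needs the most recent conversation turns.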
Society Index
SocietyIndexer indexes all Society members, ADRs, and conventions into a KnowledgeGraphStore and optionally into VectorStore. It is run automatically during agenthood init and loaded by agenthood run at startup. The index persists to .agenthood/society-graph.json.
Personalisation Store
PersonalisationStore stores per-project agent preferences: coding style, analysis depth, and primary domain. Preferences are either set explicitly (via agenthood init prompts or agenthood config set) or inferred from ResidualMemory high-weight signals.
Preferences are injected into the system prompt by PromptBuilder after residual memory hints:
Personalisation:
- style: concise (explicit)
- depth: high (explicit)
- domain: web (inferred)
Preferences persist to .agenthood/preferences.json.
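The prompt block shown above could be produced by a small formatter such as the one below. This is a sketch under assumptions: the Preference shape and the renderPersonalisation function are hypothetical and do not describe how PromptBuilder or PersonalisationStore are implemented.

```typescript
// Hypothetical sketch: render preferences in the prompt format shown above.
interface Preference {
  key: string;
  value: string;
  source: "explicit" | "inferred"; // set by the user, or derived from residual signals
}

function renderPersonalisation(preferences: Preference[]): string {
  const lines = preferences.map((p) => `- ${p.key}: ${p.value} (${p.source})`);
  return ["Personalisation:"].concat(lines).join("\n");
}

const block = renderPersonalisation([
  { key: "style", value: "concise", source: "explicit" },
  { key: "depth", value: "high", source: "explicit" },
  { key: "domain", value: "web", source: "inferred" },
]);
```

Keeping the source label next to each value lets the model weigh explicit choices more heavily than inferred ones.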
Logging
When failover activates, the runtime logs provider selection to stderr:
Using anthropic (primary)
anthropic failed, falling back to groq
groq failed, falling back to ollama
All providers exhausted