# Gas Town Architecture
Gas Town is a multi-agent workspace manager that coordinates AI coding agents working on software projects. It provides the infrastructure for spawning workers, processing work through a priority queue, and coordinating agents through mail and issue tracking.
**Key insight**: Work is a stream, not discrete batches. The Refinery's merge queue is the coordination mechanism. Beads (issues) are the data plane. There are no "swarm IDs" - just epics with children, processed by workers, merged through the queue.
## System Overview
```mermaid
graph TB
subgraph "Gas Town"
Overseer["π€ Overseer
(Human Operator)"]
subgraph Town["Town (~/ai/)"]
Mayor["π© Mayor
(Global Coordinator)"]
subgraph Rig1["Rig: wyvern"]
W1["π Witness"]
R1["π§ Refinery"]
P1["π± Polecat"]
P2["π± Polecat"]
P3["π± Polecat"]
end
subgraph Rig2["Rig: beads"]
W2["π Witness"]
R2["π§ Refinery"]
P4["π± Polecat"]
end
end
end
Overseer --> Mayor
Mayor --> W1
Mayor --> W2
W1 --> P1
W1 --> P2
W1 --> P3
W2 --> P4
P1 -.-> R1
P2 -.-> R1
P3 -.-> R1
P4 -.-> R2
```
## Core Concepts
### Town
A **Town** is a complete Gas Town installation - the workspace where everything lives. A town contains:
- Town configuration (`config/` directory)
- Mayor's home (`mayor/` directory at town level)
- One or more **Rigs** (managed project repositories)
### Rig
A **Rig** is a container directory for managing a project and its agents. Importantly, the rig itself is NOT a git clone - it's a pure container that holds:
- Rig configuration (`config.json`)
- Rig-level beads database (`.beads/`) for coordinating work
- Agent directories, each with their own git clone or worktree
This design prevents agent confusion: each agent has exactly one place to work (their own clone), with no ambiguous "rig root" that could tempt a lost agent.
### Overseer (Human Operator)
The **Overseer** is the human operator of Gas Town - not an AI agent, but the person who runs the system. The Overseer:
- **Sets strategy**: Defines project goals and priorities
- **Provisions resources**: Adds machines, polecats, and rigs
- **Reviews output**: Approves merged code and completed work
- **Handles escalations**: Makes final decisions on stuck or ambiguous work
- **Operates the system**: Runs `gt` commands, monitors dashboards
The Mayor reports to the Overseer. When agents can't resolve issues, they escalate up through the chain: Polecat → Witness → Mayor → Overseer.
### Agents
Gas Town has four AI agent roles:
| Agent | Scope | Responsibility |
|-------|-------|----------------|
| **Mayor** | Town-wide | Global coordination, work dispatch, cross-rig decisions |
| **Witness** | Per-rig | Worker lifecycle, nudging, pre-kill verification, session cycling |
| **Refinery** | Per-rig | Merge queue processing, PR review, integration |
| **Polecat** | Per-rig | Implementation work on assigned issues |
### Mail
Agents communicate via **mail** - messages stored as beads issues with `type=message`. Mail enables:
- Work assignment (Mayor → Refinery → Polecat)
- Status reporting (Polecat → Witness → Mayor)
- Session handoff (Agent → Self for context cycling)
- Escalation (Witness → Mayor for stuck workers)
**Two-tier mail architecture:**
- **Town beads** (prefix: `gm-`): Mayor inbox, cross-rig coordination, handoffs
- **Rig beads** (prefix: varies): Rig-local agent communication
Mail commands use `bd mail` under the hood:
```bash
gt mail send mayor/ -s "Subject" -m "Body" # Uses bd mail send
gt mail inbox # Uses bd mail inbox
gt mail read gm-abc # Uses bd mail read
```
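The two-tier split means delivery is just a prefix check on the bead ID. A minimal routing sketch (the `wyv-`/`gt-` rig prefixes here are examples; rig prefixes vary):

```python
# Sketch: decide which beads database a mail bead lives in.
# Town beads use the gm- prefix; rig prefixes vary per rig.
def route_mail(bead_id: str, town_prefix: str = "gm") -> str:
    """Return "town" for gm-* mail, "rig" for everything else."""
    prefix = bead_id.split("-", 1)[0]
    return "town" if prefix == town_prefix else "rig"

assert route_mail("gm-abc") == "town"   # Mayor inbox, cross-rig
assert route_mail("wyv-123") == "rig"   # rig-local agent mail
```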
```mermaid
flowchart LR
subgraph "Communication Flows"
direction LR
Mayor -->|"dispatch work"| Refinery
Refinery -->|"assign issue"| Polecat
Polecat -->|"done signal"| Witness
Witness -->|"work complete"| Mayor
Witness -->|"escalation"| Mayor
Mayor -->|"escalation"| Overseer["π€ Overseer"]
end
```
### Beads
**Beads** is the issue tracking system. Gas Town agents use beads to:
- Track work items (`bd ready`, `bd list`)
- Create issues for discovered work (`bd create`)
- Claim and complete work (`bd update`, `bd close`)
- Sync state to git (`bd sync`)
Polecats have direct beads write access and file their own issues.
#### Beads Configuration for Multi-Agent
Gas Town uses beads in a **shared database** configuration where all agents in a rig share one `.beads/` directory. This requires careful configuration:
| Agent Type | BEADS_DIR | BEADS_NO_DAEMON | sync-branch | Notes |
|------------|-----------|-----------------|-------------|-------|
| Polecat (worktree) | rig/.beads | **YES (required)** | recommended | Daemon can't handle worktrees |
| Polecat (full clone) | rig/.beads | Optional | recommended | Daemon safe but sync-branch helps |
| Refinery | rig/.beads | No | optional | Owns main, daemon is fine |
| Witness | rig/.beads | No | optional | Read-mostly access |
| Mayor | rig/.beads | No | optional | Infrequent access |
**Critical: Worktrees require no-daemon mode.** The beads daemon doesn't know which branch each worktree has checked out, and can commit/push to the wrong branch.
**Environment setup when spawning agents:**
```bash
# For worktree polecats (REQUIRED)
export BEADS_DIR=/path/to/rig/.beads
export BEADS_NO_DAEMON=1
# For full-clone polecats (recommended)
export BEADS_DIR=/path/to/rig/.beads
# Daemon is safe, but consider sync-branch for coordination
# Rig beads config.yaml should include:
sync-branch: beads-sync # Separate branch for beads commits
```
**Why sync-branch?** When multiple agents share a beads database, using a dedicated sync branch prevents beads commits from interleaving with code commits on feature branches.
## Directory Structure
### Town Level
```
~/gt/                        # Town root (Gas Town harness)
├── CLAUDE.md                # Mayor role prompting (at town root)
├── .beads/                  # Town-level beads (prefix: gm-)
│   ├── beads.db             # Mayor mail, coordination, handoffs
│   └── config.yaml
│
├── mayor/                   # Mayor's HOME at town level
│   ├── town.json            # {"type": "town", "name": "..."}
│   ├── rigs.json            # Registry of managed rigs
│   └── state.json           # Mayor state (NO mail/ directory)
│
├── gastown/                 # A rig (project container)
└── wyvern/                  # Another rig
```
**Note**: Mayor's mail is now in town beads (`gm-*` issues), not JSONL files.
### Rig Level
Created by `gt rig add <name>`:
```
gastown/                         # Rig = container (NOT a git clone)
├── config.json                  # Rig configuration (git_url, beads prefix)
├── .beads/ → mayor/rig/.beads   # Symlink to canonical beads in Mayor
│
├── mayor/                       # Mayor's per-rig presence
│   ├── rig/                     # CANONICAL clone (beads authority)
│   │   └── .beads/              # Canonical rig beads (prefix: gt-, etc.)
│   └── state.json
│
├── refinery/                    # Refinery agent (merge queue processor)
│   ├── rig/                     # Refinery's clone (for merge operations)
│   └── state.json
│
├── witness/                     # Witness agent (per-rig pit boss)
│   └── state.json               # No clone needed (monitors polecats)
│
├── crew/                        # Overseer's personal workspaces
│   └── <name>/                  # Workspace (full git clone)
│
└── polecats/                    # Worker directories (git worktrees)
    ├── Nux/                     # Worktree from Mayor's clone
    └── Toast/                   # Worktree from Mayor's clone
```
**Beads architecture:**
- Mayor's clone holds the canonical `.beads/` for the rig
- Rig root symlinks `.beads/` → `mayor/rig/.beads`
- All agents (crew, polecats, refinery) inherit beads via parent lookup
- Polecats are git worktrees from Mayor's clone (much faster than full clones)
**Key points:**
- The rig root has no `.git/` - it's not a repository
- All agents use `BEADS_DIR` to point to the rig's `.beads/`
- Refinery's clone is the authoritative "main branch" view
- Witness may not need its own clone (just monitors polecat state)
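The last two points combine into a simple resolution rule. A sketch of how an agent might find its database (the helper is hypothetical; the real lookup lives in beads itself):

```python
import os
from typing import Optional

def resolve_beads_dir(cwd: str, env: dict) -> Optional[str]:
    """Sketch: an explicit BEADS_DIR wins; otherwise walk up from the
    agent's directory until a .beads/ is found (the rig root's symlink),
    mirroring the parent-lookup behavior described above."""
    if env.get("BEADS_DIR"):
        return env["BEADS_DIR"]
    path = os.path.abspath(cwd)
    while True:
        candidate = os.path.join(path, ".beads")
        if os.path.isdir(candidate):
            return candidate
        parent = os.path.dirname(path)
        if parent == path:  # hit filesystem root without finding .beads/
            return None
        path = parent
```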
```mermaid
graph TB
subgraph Rig["Rig: gastown (container, NOT a git clone)"]
Config["config.json"]
Beads[".beads/"]
subgraph Polecats["polecats/"]
Nux["Nux/
(git clone)"]
Toast["Toast/
(git clone)"]
end
subgraph Refinery["refinery/"]
RefRig["rig/
(canonical main)"]
RefState["state.json"]
end
subgraph Witness["witness/"]
WitState["state.json"]
end
subgraph MayorRig["mayor/"]
MayRig["rig/
(git clone)"]
MayState["state.json"]
end
subgraph Crew["crew/"]
CrewMain["main/
(git clone)"]
end
end
Beads -.->|BEADS_DIR| Nux
Beads -.->|BEADS_DIR| Toast
Beads -.->|BEADS_DIR| RefRig
Beads -.->|BEADS_DIR| MayRig
Beads -.->|BEADS_DIR| CrewMain
```
### ASCII Directory Layout
For reference without mermaid rendering:
```
~/gt/                                # TOWN ROOT (Gas Town harness)
├── CLAUDE.md                        # Mayor role prompting
├── .beads/                          # Town-level beads (gm-* prefix)
│   ├── beads.db                     # Mayor mail, coordination
│   └── config.yaml
│
├── mayor/                           # Mayor's home (at town level)
│   ├── town.json                    # {"type": "town", "name": "..."}
│   ├── rigs.json                    # Registry of managed rigs
│   └── state.json                   # Mayor state (no mail/ dir)
│
├── gastown/                         # RIG (container, NOT a git clone)
│   ├── config.json                  # Rig configuration
│   ├── .beads/ → mayor/rig/.beads   # Symlink to Mayor's canonical beads
│   │
│   ├── mayor/                       # Mayor's per-rig presence
│   │   ├── rig/                     # CANONICAL clone (beads + worktree base)
│   │   │   ├── .git/
│   │   │   ├── .beads/              # CANONICAL rig beads (gt-* prefix)
│   │   │   └── ...
│   │   └── state.json
│   │
│   ├── refinery/                    # Refinery agent (merge queue)
│   │   ├── rig/                     # Refinery's clone (for merges)
│   │   │   ├── .git/
│   │   │   └── ...
│   │   └── state.json
│   │
│   ├── witness/                     # Witness agent (pit boss)
│   │   └── state.json               # No clone needed
│   │
│   ├── crew/                        # Overseer's personal workspaces
│   │   └── <name>/                  # Full clone (inherits beads from rig)
│   │       ├── .git/
│   │       └── ...
│   │
│   ├── polecats/                    # Worker directories (worktrees)
│   │   ├── Nux/                     # Git worktree from Mayor's clone
│   │   │   └── ...                  # (inherits beads from rig)
│   │   └── Toast/                   # Git worktree from Mayor's clone
│   │
│   └── plugins/                     # Optional plugins
│       └── merge-oracle/
│           ├── CLAUDE.md
│           └── state.json
│
└── wyvern/                          # Another rig (same structure)
    ├── config.json
    ├── .beads/ → mayor/rig/.beads
    ├── mayor/
    ├── refinery/
    ├── witness/
    ├── crew/
    └── polecats/
```
**Key changes from earlier design:**
- Town beads (`gm-*`) hold Mayor mail instead of JSONL files
- Mayor has per-rig clone that's canonical for beads and worktrees
- Rig `.beads/` symlinks to Mayor's canonical beads
- Polecats are git worktrees from Mayor's clone (fast)
### Why Decentralized?
Agents live IN rigs rather than in a central location:
- **Locality**: Each agent works in the context of its rig
- **Independence**: Rigs can be added/removed without restructuring
- **Parallelism**: Multiple rigs can have active workers simultaneously
- **Simplicity**: Agent finds its context by looking at its own directory
## Agent Responsibilities
### Mayor
The Mayor is the global coordinator:
- **Work dispatch**: Spawns workers for issues, coordinates batch work on epics
- **Cross-rig coordination**: Routes work between rigs when needed
- **Escalation handling**: Resolves issues Witnesses can't handle
- **Strategic decisions**: Architecture, priorities, integration planning
**NOT Mayor's job**: Per-worker cleanup, session killing, nudging workers
### Witness
The Witness is the per-rig "pit boss":
- **Worker monitoring**: Track polecat health and progress
- **Nudging**: Prompt workers toward completion
- **Pre-kill verification**: Ensure git state is clean before killing sessions
- **Session lifecycle**: Kill sessions, update worker state
- **Self-cycling**: Hand off to fresh session when context fills
- **Escalation**: Report stuck workers to Mayor
**Key principle**: Witness owns ALL per-worker cleanup. Mayor is never involved in routine worker management.
### Refinery
The Refinery manages the merge queue:
- **PR review**: Check polecat work before merging
- **Integration**: Merge completed work to main
- **Conflict resolution**: Handle merge conflicts
- **Quality gate**: Ensure tests pass, code quality maintained
```mermaid
flowchart LR
subgraph "Merge Queue Flow"
P1["Polecat 1<br/>branch"] --> Q[Merge Queue]
P2["Polecat 2<br/>branch"] --> Q
P3["Polecat 3<br/>branch"] --> Q
Q --> R{Refinery}
R -->|merge| M[main]
R -->|conflict| P1
end
```
#### Direct Landing (Bypass Merge Queue)
Sometimes Mayor needs to land a polecat's work directly, skipping the Refinery:
| Scenario | Use Direct Landing? |
|----------|---------------------|
| Single polecat, simple change | Yes |
| Urgent hotfix | Yes |
| Refinery unavailable | Yes |
| Multiple polecats, potential conflicts | No - use Refinery |
| Complex changes needing review | No - use Refinery |
**Commands:**
```bash
# Normal flow (through Refinery)
gt merge-queue add # Polecat signals PR ready
gt refinery process # Refinery processes queue
# Direct landing (Mayor bypasses Refinery)
gt land --direct <rig>/<polecat> # Land directly to main
gt land --direct --force <rig>/<polecat> # Skip safety checks
gt land --direct --skip-tests <rig>/<polecat> # Skip test run
gt land --direct --dry-run <rig>/<polecat> # Preview only
```
**Direct landing workflow:**
```mermaid
sequenceDiagram
participant M as 🎩 Mayor
participant R as Refinery Clone
participant P as Polecat Branch
participant B as 📦 Beads
M->>M: Verify polecat session terminated
M->>P: Check git state clean
M->>R: Fetch polecat branch
M->>R: Merge to main (fast-forward or merge commit)
M->>R: Run tests (optional)
M->>R: Push to origin
M->>B: Close associated issue
M->>P: Delete polecat branch (cleanup)
```
**Safety checks (skippable with --force):**
1. Polecat session must be terminated
2. Git working tree must be clean
3. No merge conflicts with main
4. Tests pass (skippable with --skip-tests)
**When direct landing makes sense:**
- Mayor is doing sequential, non-swarming work (like GGT scaffolding)
- Single worker completed an isolated task
- Hotfix needs to land immediately
- Refinery agent is down or unavailable
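The gate above can be read as a pure function of the four checks and the two flags. A sketch (whether `--force` also waives the test run is an assumption):

```python
def direct_land_allowed(session_terminated: bool, tree_clean: bool,
                        conflicts: bool, tests_pass: bool,
                        force: bool = False,
                        skip_tests: bool = False) -> tuple[bool, list[str]]:
    """Sketch of the direct-landing safety gate: --force waives all
    checks, --skip-tests waives only the test run."""
    failures = []
    if not force:
        if not session_terminated:
            failures.append("polecat session still running")
        if not tree_clean:
            failures.append("working tree dirty")
        if conflicts:
            failures.append("merge conflicts with main")
        if not skip_tests and not tests_pass:
            failures.append("tests failing")
    return (not failures, failures)

ok, reasons = direct_land_allowed(True, True, False, tests_pass=False,
                                  skip_tests=True)
# ok is True here: git state is clean and the test run was waived
```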
### Polecat
Polecats are the workers that do actual implementation:
- **Issue completion**: Work on assigned beads issues
- **Self-verification**: Run decommission checklist before signaling done
- **Beads access**: Create issues for discovered work, close completed work
- **Clean handoff**: Ensure git state is clean for Witness verification
## Key Workflows
### Work Dispatch
Work flows through the system as a stream. The Overseer spawns workers, they process issues, and completed work enters the merge queue.
```mermaid
sequenceDiagram
participant O as 👤 Overseer
participant M as 🎩 Mayor
participant W as 👁 Witness
participant P as 🐱 Polecats
participant R as 🔧 Refinery
O->>M: Spawn workers for epic
M->>W: Assign issues to workers
W->>P: Start work
loop For each worker
P->>P: Work on issue
P->>R: Submit to merge queue
R->>R: Review & merge
end
R->>M: All work merged
M->>O: Report results
```
**Note**: There is no "swarm ID" or batch boundary. Workers process issues independently. The merge queue handles coordination. "Swarming an epic" is just spawning multiple workers for the epic's child issues.
### Worker Cleanup (Witness-Owned)
```mermaid
sequenceDiagram
participant P as 🐱 Polecat
participant W as 👁 Witness
participant M as 🎩 Mayor
participant O as 👤 Overseer
P->>P: Complete work
P->>W: Done signal
W->>W: Capture git state
W->>W: Assess cleanliness
alt Git state dirty
W->>P: Nudge (fix issues)
P->>P: Fix issues
P->>W: Done signal (retry)
end
alt Clean after ≤3 tries
W->>W: Verify clean
W->>P: Kill session
else Stuck after 3 tries
W->>M: Escalate
alt Mayor can fix
M->>W: Resolution
else Mayor can't fix
M->>O: Escalate to human
O->>M: Decision
end
end
```
### Session Cycling (Mail-to-Self)
When an agent's context fills, it hands off to its next session:
1. **Recognize**: Notice context filling (slow responses, losing track of state)
2. **Capture**: Gather current state (active work, pending decisions, warnings)
3. **Compose**: Write structured handoff note
4. **Send**: Mail handoff to own inbox
5. **Exit**: End session cleanly
6. **Resume**: New session reads handoff, picks up where old session left off
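Step 3's handoff note is just structured text in a mail body. A sketch of one possible shape (the field names are illustrative, not a fixed schema):

```python
import json
from datetime import datetime, timezone

def compose_handoff(active_work, pending_decisions, warnings):
    """Sketch of step 3: bundle session state into a note the next
    session can parse out of its inbox."""
    return json.dumps({
        "type": "handoff",
        "sent_at": datetime.now(timezone.utc).isoformat(),
        "active_work": active_work,            # what was in flight
        "pending_decisions": pending_decisions,
        "warnings": warnings,                  # traps for the next session
    }, indent=2)

note = compose_handoff(
    active_work=["gt-123: mail routing refactor, tests red"],
    pending_decisions=["await Mayor reply on gm-77"],
    warnings=["do not rebase the open feature branch"],
)
```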
```mermaid
sequenceDiagram
participant S1 as Agent Session 1
participant MB as 📬 Mailbox
participant S2 as Agent Session 2
S1->>S1: Context filling up
S1->>S1: Capture current state
S1->>MB: Send handoff note
S1->>S1: Exit cleanly
Note over S1,S2: Session boundary
S2->>MB: Check inbox
MB->>S2: Handoff note
S2->>S2: Resume from handoff state
```
## Key Design Decisions
### 1. Witness Owns Worker Cleanup
**Decision**: Witness handles all per-worker cleanup. Mayor is never involved.
**Rationale**:
- Separation of concerns (Mayor strategic, Witness operational)
- Reduced coordination overhead
- Faster shutdown
- Cleaner escalation path
### 2. Polecats Have Direct Beads Access
**Decision**: Polecats can create, update, and close beads issues directly.
**Rationale**:
- Simplifies architecture (no proxy through Witness)
- Empowers workers to file discovered work
- Faster feedback loop
- Beads v0.30.0+ handles multi-agent conflicts
### 3. Session Cycling via Mail-to-Self
**Decision**: Agents mail handoff notes to themselves when cycling sessions.
**Rationale**:
- Consistent pattern across all agent types
- Timestamped and logged
- Works with existing inbox infrastructure
- Clean separation between sessions
### 4. Decentralized Agent Architecture
**Decision**: Agents live in rigs (`<rig>/witness/rig/`) not centralized (`mayor/rigs/<rig>/`).
**Rationale**:
- Agents work in context of their rig
- Rigs are independent units
- Simpler role detection
- Cleaner directory structure
### 5. Visible Config Directory
**Decision**: Use `config/` not `.gastown/` for town configuration.
**Rationale**: AI models often miss hidden directories. Visible is better.
### 6. Rig as Container, Not Clone
**Decision**: The rig directory is a pure container, not a git clone of the project.
**Rationale**:
- **Prevents confusion**: Agents historically get lost (polecats in refinery, mayor in polecat dirs). If the rig root were a clone, it's another tempting target for confused agents. Two confused agents at once = collision disaster.
- **Single work location**: Each agent has exactly one place to work (their own `/rig/` clone)
- **Clear role detection**: "Am I in a `/rig/` directory?" = I'm in an agent clone
- **Refinery is canonical main**: Refinery's clone serves as the authoritative "main branch" - it pulls, merges PRs, and pushes. No need for a separate rig-root clone.
### 7. Plugins as Agents
**Decision**: Plugins are just additional agents with identities, mailboxes, and access to beads. No special plugin infrastructure.
**Rationale**:
- Fits Gas Town's intentionally rough aesthetic
- Zero new infrastructure needed (uses existing mail, beads, identities)
- Composable - plugins can invoke other plugins via mail
- Debuggable - just look at mail logs and bead history
- Extensible - anyone can add a plugin by creating a directory
**Structure**: `<rig>/plugins/<name>/` with optional `rig/`, `CLAUDE.md`, `mail/`, `state.json`.
### 8. Rig-Level Beads via BEADS_DIR
**Decision**: Each rig has its own `.beads/` directory. Agents use the `BEADS_DIR` environment variable to point to it.
**Rationale**:
- **Centralized issue tracking**: All polecats in a rig share the same beads database
- **Project separation**: Even if the project repo has its own `.beads/`, Gas Town agents use the rig's beads instead
- **OSS-friendly**: For contributing to projects you don't own, rig beads stay separate from upstream
- **Already supported**: Beads supports `BEADS_DIR` env var (see beads `internal/beads/beads.go`)
**Configuration**: Gas Town sets `BEADS_DIR` when spawning agents:
```bash
export BEADS_DIR=/path/to/rig/.beads
```
**See also**: beads issue `bd-411u` for documentation of this pattern.
### 9. Direct Landing Option
**Decision**: Mayor can land polecat work directly, bypassing the Refinery merge queue.
**Rationale**:
- **Flexibility**: Not all work needs merge queue overhead
- **Sequential work**: Mayor doing non-swarming work (like GGT scaffolding) shouldn't need Refinery
- **Emergency path**: Hotfixes can land immediately
- **Resilience**: System works even if Refinery is down
**Constraints**:
- Direct landing still uses Refinery's clone as the canonical main
- Safety checks prevent landing dirty or conflicting work
- Mayor takes responsibility for quality (no Refinery review)
**Commands**:
```bash
gt land --direct <rig>/<polecat> # Standard direct land
gt land --direct --force <rig>/<polecat> # Skip safety checks
```
### 10. Beads Daemon Awareness
**Decision**: Gas Town must disable the beads daemon for worktree-based polecats.
**Rationale**:
- The beads daemon doesn't track which branch each worktree has checked out
- Daemon can commit beads changes to the wrong branch
- This is a beads limitation, not a Gas Town bug
- Full clones don't have this problem
**Configuration**:
```bash
# For worktree polecats (REQUIRED)
export BEADS_NO_DAEMON=1
# For full-clone polecats (optional)
# Daemon is safe, no special config needed
```
**See also**: beads docs/WORKTREES.md and docs/DAEMON.md for details.
### 11. Work is a Stream (No Swarm IDs)
**Decision**: Work state is encoded in beads epics and issues. There are no "swarm IDs" or separate swarm infrastructure - the epic IS the grouping, the merge queue IS the coordination.
**Rationale**:
- **No new infrastructure**: Beads already provides hierarchy, dependencies, status, priority
- **Shared state**: All rig agents share the same `.beads/` via BEADS_DIR
- **Queryable**: `bd ready` finds work with no blockers, enabling multi-wave orchestration
- **Auditable**: Beads history shows work progression
- **Resilient**: Beads sync handles multi-agent conflicts
- **No boundary problem**: When does a swarm start/end? Who's in it? These questions dissolve - work is a stream
**How it works**:
- Create an epic with child issues for batch work
- Dependencies encode ordering (task B depends on task A)
- Status transitions track progress (open → in_progress → closed)
- Witness queries `bd ready` to find next available work
- Spawn workers as needed - add more anytime
- Batch complete = all child issues closed (or just keep going)
**Example**: Batch work on authentication bugs:
```
gt-auth-epic             # Epic: "Fix authentication bugs"
├── gt-auth-epic.1       # "Fix login timeout" (ready, no deps)
├── gt-auth-epic.2       # "Fix session expiry" (ready, no deps)
└── gt-auth-epic.3       # "Update auth tests" (blocked by .1 and .2)
```
Workers process issues independently. Work flows through the merge queue. No "swarm ID" needed - the epic provides grouping, labels provide ad-hoc queries, dependencies provide sequencing.
### 12. Agent Session Lifecycle (Daemon Protection)
**Decision**: A background daemon manages agent session lifecycles, including cycling sessions when agents request handoff.
**Rationale**:
- Agents can't restart themselves after exiting
- Handoff mail is useless without someone to start the new session
- Daemon provides reliable session management outside agent context
- Enables autonomous long-running operation (hours/days)
**Session cycling protocol**:
1. Agent detects context exhaustion or requests cycle
2. Agent sends handoff mail to own inbox
3. Agent sets `requesting_cycle: true` in state.json
4. Agent exits (or sends explicit signal to daemon)
5. Daemon detects exit + cycle request flag
6. Daemon starts new session
7. New session reads handoff mail, resumes work
**Daemon responsibilities**:
- Monitor agent session health (heartbeat)
- Detect session exit
- Check cycle request flag in state.json
- Start replacement session if cycle requested
- Clear cycle flag after successful restart
- Report failures to Mayor (escalation)
**Applies to**: Witness, Refinery (both long-running agents that may exhaust context)
```mermaid
sequenceDiagram
participant A1 as Agent Session 1
participant S as State.json
participant D as Daemon
participant A2 as Agent Session 2
participant MB as Mailbox
A1->>MB: Send handoff mail
A1->>S: Set requesting_cycle: true
A1->>A1: Exit cleanly
D->>D: Detect session exit
D->>S: Check requesting_cycle
S->>D: true
D->>D: Start new session
D->>S: Clear requesting_cycle
A2->>MB: Read handoff mail
A2->>A2: Resume from handoff
```
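One pass of steps 5-6 could be sketched like this (a single check rather than the daemon's real monitoring loop; the state.json layout follows the protocol above):

```python
import json

def check_and_cycle(state_path: str, session_alive: bool, start_session) -> bool:
    """Sketch of one daemon pass: if the session has exited and
    state.json requested a cycle, start a replacement and clear the
    flag. Returns True when a new session was started."""
    if session_alive:
        return False                       # nothing to do yet
    with open(state_path) as f:
        state = json.load(f)
    if not state.get("requesting_cycle"):
        return False                       # plain exit; may warrant escalation
    start_session()                        # launch the replacement session
    state["requesting_cycle"] = False      # clear flag after restart
    with open(state_path, "w") as f:
        json.dump(state, f)
    return True
```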
### 13. Resource-Constrained Worker Pool
**Decision**: Each rig has a configurable `max_workers` limit for concurrent polecats.
**Rationale**:
- Claude Code can use 500MB+ RAM per session
- Prevents resource exhaustion on smaller machines
- Enables autonomous operation without human oversight
- Witness respects limit when spawning new workers
**Configuration** (in rig config.json):
```json
{
"type": "rig",
"max_workers": 8,
"worker_spawn_delay": "5s"
}
```
**Witness behavior**:
- Query active worker count before spawning
- If at limit, wait for workers to complete
- Prioritize higher-priority ready issues
### 14. Outpost Abstraction for Federation
**Decision**: Federation uses an "Outpost" abstraction to support multiple compute backends (local, SSH/VM, Cloud Run, etc.) through a unified interface.
**Rationale**:
- Different workloads need different compute: burst vs long-running, cheap vs fast
- Cloud Run's pay-per-use model is ideal for elastic burst capacity
- VMs are better for autonomous long-running work
- Local is always the default for development
- Platform flexibility lets users choose based on their needs and budget
**Key insight**: Cloud Run's persistent HTTP/2 connections solve the "zero to one" cold start problem, making container workers viable for interactive-ish work at ~$0.017 per 5-minute session.
**Design principles**:
1. **Local-first** - Remote outposts are overflow, not primary
2. **Git remains source of truth** - All outposts sync via git
3. **HTTP for Cloud Run** - Don't force filesystem mail onto containers
4. **Graceful degradation** - System works with any subset of outposts
**See**: `docs/federation-design.md` for full architectural analysis.
## Multi-Wave Work Processing
For large task trees (like implementing GGT itself), workers can process multiple "waves" of work automatically based on the dependency graph.
### Wave Orchestration
A wave is not explicitly managed - it emerges from dependencies:
1. **Wave 1**: All issues with no dependencies (`bd ready`)
2. **Wave 2**: Issues whose dependencies are now closed
3. **Wave N**: Continue until all work is done
```mermaid
graph TD
subgraph "Wave 1 (no dependencies)"
A[Task A]
B[Task B]
C[Task C]
end
subgraph "Wave 2 (depends on Wave 1)"
D[Task D]
E[Task E]
end
subgraph "Wave 3 (depends on Wave 2)"
F[Task F]
end
A --> D
B --> D
C --> E
D --> F
E --> F
```
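Since waves are emergent, they can be derived mechanically from the dependency map. A sketch of the layering that repeated `bd ready` calls discover incrementally:

```python
def waves(deps: dict[str, set[str]]) -> list[set[str]]:
    """Sketch: group tasks into waves, where each wave is every task
    whose dependencies were all satisfied by earlier waves."""
    done: set[str] = set()
    remaining = dict(deps)
    layers = []
    while remaining:
        ready = {t for t, d in remaining.items() if d <= done}
        if not ready:
            raise ValueError("dependency cycle")
        layers.append(ready)
        done |= ready
        for t in ready:
            del remaining[t]
    return layers

# The diagram above: D needs A and B, E needs C, F needs D and E.
layers = waves({
    "A": set(), "B": set(), "C": set(),
    "D": {"A", "B"}, "E": {"C"}, "F": {"D", "E"},
})
# layers == [{"A", "B", "C"}, {"D", "E"}, {"F"}]
```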
### Witness Work Loop
```
while epic has open issues:
    ready_issues = bd ready --parent <epic>
    if ready_issues is empty and workers_active:
        wait for worker completion
        continue
    for issue in ready_issues:
        if active_workers < max_workers:
            spawn worker for issue
        else:
            break  # wait for capacity
    monitor workers, handle completions

all work complete - report to Mayor
```
### Long-Running Autonomy
With daemon session cycling, the system can run autonomously for extended periods:
- **Witness cycles**: Every few hours as context fills
- **Refinery cycles**: As merge queue grows complex
- **Workers cycle**: If individual tasks are very large
- **Daemon persistence**: Survives all agent restarts
The daemon is the only truly persistent component. All agents are ephemeral sessions that hand off state via mail.
Work is a continuous stream - you can add new issues, spawn new workers, reprioritize the queue, all without "starting a new swarm" or managing batch boundaries.
## Configuration
### town.json
```json
{
"type": "town",
"version": 1,
"name": "stevey-gastown",
"created_at": "2024-01-15T10:30:00Z"
}
```
### rigs.json
```json
{
"version": 1,
"rigs": {
"wyvern": {
"git_url": "https://github.com/steveyegge/wyvern",
"added_at": "2024-01-15T10:30:00Z"
}
}
}
```
### config.json (Per-Rig Config)
Each rig has a `config.json` at its root:
```json
{
"type": "rig",
"version": 1,
"name": "wyvern",
"git_url": "https://github.com/steveyegge/wyvern",
"beads": {
"prefix": "wyv",
"sync_remote": "origin" // Optional: git remote for bd sync
}
}
```
The rig's `.beads/` directory is always at the rig root. Gas Town:
1. Creates `.beads/` when adding a rig (`gt rig add`)
2. Runs `bd init --prefix <prefix>` to initialize it
3. Sets `BEADS_DIR` environment variable when spawning agents
This ensures all agents in the rig share a single beads database, separate from any beads the project itself might use.
## CLI Commands
### Town Management
```bash
gt install [path] # Install Gas Town at path
gt doctor # Check workspace health
gt doctor --fix # Auto-fix issues
```
### Agent Operations
```bash
gt status # Overall town status
gt rigs # List all rigs
gt polecats # List polecats in a rig
```
### Communication
```bash
gt inbox # Check inbox
gt send <agent> -s "Subject" -m "Message"
gt inject <agent> "Message" # Direct injection to session
gt capture <polecat> "<command>" # Run command in polecat session
```
### Session Management
```bash
gt spawn --issue <issue-id> # Start polecat on issue
gt kill <polecat> # Kill polecat session
gt wake <polecat> # Mark polecat as active
gt sleep <polecat> # Mark polecat as inactive
```
### Landing & Merge Queue
```bash
gt merge-queue add # Add to merge queue (normal flow)
gt merge-queue list # Show pending merges
gt refinery process # Trigger Refinery to process queue
gt land --direct <rig>/<polecat> # Direct landing (bypass Refinery)
gt land --direct --force ... # Skip safety checks
gt land --direct --skip-tests ... # Skip test verification
gt land --direct --dry-run ... # Preview only
```
### Emergency Operations
```bash
gt stop --all # Kill ALL sessions (emergency halt)
gt stop --rig <rig> # Kill all sessions in one rig
gt doctor --fix # Auto-repair common issues
```
## Plugins
Gas Town supports **plugins** - but in the simplest possible way: plugins are just more agents.
### Philosophy
Gas Town is intentionally rough and lightweight. A "credible plugin system" with manifests, schemas, and invocation frameworks would be pretentious for a project named after a Mad Max wasteland. Instead, plugins follow the same patterns as all Gas Town agents:
- **Identity**: Plugins have persistent identities like polecats and witnesses
- **Communication**: Plugins use mail for input/output
- **Artifacts**: Plugins produce beads, files, or other handoff artifacts
- **Lifecycle**: Plugins can be invoked on-demand or at specific workflow points
### Plugin Structure
Plugins live in a rig's `plugins/` directory:
```
wyvern/                        # Rig
└── plugins/
    └── merge-oracle/          # A plugin
        ├── rig/               # Plugin's git clone (if needed)
        ├── CLAUDE.md          # Plugin's instructions/prompts
        ├── mail/inbox.jsonl   # Plugin's mailbox
        └── state.json         # Plugin state (optional)
```
That's it. No plugin.yaml, no special registration. If the directory exists, the plugin exists.
### Invoking Plugins
Plugins are invoked like any other agent - via mail:
```bash
# Refinery asks merge-oracle to analyze pending changesets
gt send wyvern/plugins/merge-oracle -s "Analyze merge queue" -m "..."
# Mayor asks plan-oracle for a work breakdown
gt send beads/plugins/plan-oracle -s "Plan for bd-xyz" -m "..."
```
Plugins do their work (potentially spawning Claude sessions) and respond via mail, creating any necessary artifacts (beads, files, branches).
### Hook Points
Existing agents can be configured to notify plugins at specific points. This is just convention - agents check if a plugin exists and mail it:
| Workflow Point | Agent | Example Plugin |
|----------------|-------|----------------|
| Before merge processing | Refinery | merge-oracle |
| Before work dispatch | Mayor | plan-oracle |
| On worker stuck | Witness | debug-oracle |
| On PR ready | Refinery | review-oracle |
Configuration is minimal - perhaps a line in the agent's CLAUDE.md or state.json noting which plugins to consult.
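Because hooks are convention rather than framework, the agent-side logic is a few lines. A sketch (the mail-sending callable stands in for `gt send`):

```python
import os

def notify_plugin(rig_path: str, plugin: str, subject: str, body: str,
                  send_mail) -> bool:
    """Sketch of the hook convention: if the plugin directory exists,
    mail it and report True; otherwise skip silently. No manifest,
    no registration - the directory is the registration."""
    if not os.path.isdir(os.path.join(rig_path, "plugins", plugin)):
        return False
    address = f"{os.path.basename(rig_path)}/plugins/{plugin}"
    send_mail(address, subject, body)
    return True
```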
### Example: Merge Oracle
The **merge-oracle** plugin analyzes changesets before the Refinery processes them:
**Input** (via mail from Refinery):
- List of pending changesets
- Current merge queue state
**Processing**:
1. Build overlap graph (which changesets touch same files/regions)
2. Classify disjointness (fully disjoint → parallel safe, overlapping → needs sequencing)
3. Use LLM to assess semantic complexity of overlapping components
4. Identify high-risk patterns (deletions vs modifications, conflicting business logic)
**Output**:
- Bead with merge plan (parallel groups, sequential chains)
- Mail to Refinery with recommendation (proceed / escalate to Mayor)
- If escalation needed: mail to Mayor with explanation
The merge-oracle's `CLAUDE.md` contains the prompts and classification criteria. Gas Town doesn't need to know the internals.
### Example: Plan Oracle
The **plan-oracle** plugin helps decompose work:
**Input**: An issue/epic that needs breakdown
**Processing**:
1. Analyze the scope and requirements
2. Identify dependencies and blockers
3. Estimate complexity (for parallelization decisions)
4. Suggest task breakdown
**Output**:
- Beads for the sub-tasks (created via `bd create`)
- Dependency links (via `bd dep add`)
- Mail back with summary and recommendations
### Why This Design
1. **Fits Gas Town's aesthetic**: Rough, text-based, agent-shaped
2. **Zero new infrastructure**: Uses existing mail, beads, identities
3. **Composable**: Plugins can invoke other plugins
4. **Debuggable**: Just look at mail logs and bead history
5. **Extensible**: Anyone can add a plugin by creating a directory
### Plugin Discovery
```bash
gt plugins # List plugins in a rig
gt plugin status # Check plugin state
```
Or just `ls` the rig's `plugins/` directory.
## Failure Modes and Recovery
Gas Town is designed for resilience. Common failure modes and their recovery:
| Failure | Detection | Recovery |
|---------|-----------|----------|
| Agent crash | Session gone, state shows 'working' | `gt doctor` detects, reset state to idle |
| Git dirty state | Witness pre-kill check fails | Nudge worker, or manual commit/discard |
| Beads sync conflict | `bd sync` fails | Beads tombstones handle most cases |
| Tmux crash | All sessions inaccessible | `gt doctor --fix` cleans up |
| Stuck work | No progress for 30+ minutes | Witness escalates, Overseer intervenes |
| Disk full | Write operations fail | Clean logs, remove old clones |
### Recovery Principles
1. **Fail safe**: Prefer stopping over corrupting data
2. **State is recoverable**: Git and beads have built-in recovery
3. **Doctor heals**: `gt doctor --fix` handles common issues
4. **Emergency stop**: `gt stop --all` as last resort
5. **Human escalation**: Some failures need Overseer intervention
### Doctor Checks
`gt doctor` performs health checks at both workspace and rig levels:
- **Workspace checks**: Config validity, Mayor mailbox, rig registry
- **Rig checks**: Git state, clone health, Witness/Refinery presence
- **Work checks**: Stuck detection, zombie sessions, heartbeat health
Run `gt doctor` regularly. Run `gt doctor --fix` to auto-repair issues.
## Federation: Outposts
Federation enables Gas Town to scale across machines via **Outposts** - remote compute environments that can run workers.
**Full design**: See `docs/federation-design.md`
### Outpost Types
| Type | Description | Cost Model | Best For |
|------|-------------|------------|----------|
| Local | Current tmux model | Free | Development, primary work |
| SSH/VM | Full Gas Town clone on VM | Always-on | Long-running, autonomous |
| CloudRun | Container workers on GCP | Pay-per-use | Burst, elastic, background |
### Core Abstraction
```go
type Outpost interface {
Name() string
Type() OutpostType // local, ssh, cloudrun
MaxWorkers() int
ActiveWorkers() int
Spawn(issue string, config WorkerConfig) (Worker, error)
Workers() []Worker
Ping() error
}
type Worker interface {
ID() string
Outpost() string
Status() WorkerStatus // idle, working, done, failed
Issue() string
Attach() error // for interactive outposts
Logs() (io.Reader, error)
Stop() error
}
```
### Configuration
```yaml
# ~/ai/config/outposts.yaml
outposts:
- name: local
type: local
max_workers: 4
- name: gce-burst
type: ssh
host: 10.0.0.5
user: steve
town_path: /home/steve/ai
max_workers: 8
- name: cloudrun-burst
type: cloudrun
project: my-gcp-project
region: us-central1
service: gastown-worker
max_workers: 20
cost_cap_hourly: 5.00
policy:
default_preference: [local, gce-burst, cloudrun-burst]
```
### Cloud Run Workers
Cloud Run enables elastic, pay-per-use workers:
- **Persistent HTTP/2 connections** solve the cold-start (zero-to-one) problem
- **Cost**: ~$0.017 per 5-minute worker session
- **Scaling**: 0→N automatically, based on demand
- **When idle**: Scales to zero, costs nothing
Workers receive work via HTTP, clone code from git, run Claude, push results. No filesystem mail needed - HTTP is the control plane.
### SSH/VM Outposts
Full Gas Town clone on remote machines:
- **Model**: Complete town installation via SSH
- **Workers**: Remote tmux sessions
- **Sync**: Git for code and beads
- **Good for**: Long-running work, full autonomy if disconnected
### Design Principles
1. **Outpost abstraction** - Support multiple backends via unified interface
2. **Local-first** - Remote outposts are for overflow/burst, not primary
3. **Git as source of truth** - Code and beads sync everywhere
4. **HTTP for Cloud Run** - Don't force mail onto stateless containers
5. **Graceful degradation** - System works with any subset of outposts
### Architecture Diagram
```
┌─────────────────────────────────────────────────────────────┐
│                            MAYOR                            │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                   Outpost Manager                    │   │
│  │  - Tracks all registered outposts                    │   │
│  │  - Routes work to appropriate outpost                │   │
│  │  - Monitors worker status across outposts            │   │
│  └──────────────────────────────────────────────────────┘   │
│       │               │                   │                 │
│       ▼               ▼                   ▼                 │
│  ┌──────────┐    ┌──────────┐    ┌──────────────┐           │
│  │  Local   │    │   SSH    │    │   CloudRun   │           │
│  │ Outpost  │    │ Outpost  │    │   Outpost    │           │
│  └────┬─────┘    └────┬─────┘    └──────┬───────┘           │
└───────┼───────────────┼─────────────────┼───────────────────┘
        │               │                 │
        ▼               ▼                 ▼
  ┌─────────┐     ┌─────────┐     ┌─────────────┐
  │  tmux   │     │   SSH   │     │   HTTP/2    │
  │  panes  │     │sessions │     │ connections │
  └─────────┘     └─────────┘     └─────────────┘
        │               │                 │
        └───────────────┼─────────────────┘
                        ▼
              ┌─────────────────┐
              │    Git Repos    │
              │  (code + beads) │
              └─────────────────┘
```
### CLI Commands
```bash
gt outpost list # List configured outposts
gt outpost status [name] # Detailed status
gt outpost add ... # Add new outpost
gt outpost ping # Test connectivity
```
### Implementation Status
Federation is tracked in **gt-9a2** (P3 epic). Key tasks:
- `gt-9a2.1`: Outpost/Worker interfaces
- `gt-9a2.2`: LocalOutpost (refactor current spawning)
- `gt-9a2.5`: SSHOutpost
- `gt-9a2.8`: CloudRunOutpost
## Implementation Status
Gas Town is being ported from Python (gastown-py) to Go (gastown). The Go port (GGT) is in development:
- **Epic**: gt-u1j (Port Gas Town to Go)
- **Scaffolding**: gt-u1j.1 (Go scaffolding - blocker for implementation)
- **Management**: gt-f9x (Town & Rig Management: install, doctor, federation)
See beads issues with `bd list --status=open` for current work items.