Operational State in Gas Town
Managing runtime state, degraded modes, and the Boot triage system.
Overview
Gas Town needs to track operational state: Is the Deacon's patrol muted? Is the system in degraded mode? When did state change, and why?
This document covers:
- Events: State transitions as beads
- Labels-as-state: Fast queries via role bead labels
- Boot: The dog that triages the Deacon
- Degraded mode: Operating without tmux
Events: State Transitions as Data
Operational state changes are recorded as event beads. Each event captures:
- What changed (event_type)
- Who caused it (actor)
- What was affected (target)
- Context (payload)
- When (created_at)
Event Types
| Event Type | Description | Payload |
|---|---|---|
| patrol.muted | Patrol cycle disabled | {reason, until?} |
| patrol.unmuted | Patrol cycle re-enabled | {reason?} |
| agent.started | Agent session began | {session_id?} |
| agent.stopped | Agent session ended | {reason, outcome?} |
| mode.degraded | System entered degraded mode | {reason} |
| mode.normal | System returned to normal | {} |
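A script that creates events may want to reject typos in the event type before calling bd. The following is a minimal sketch that checks a name against the six types in the table above; `is_known_event_type` is a hypothetical helper, not part of bd or gt, and the real system may accept other types.

```shell
# Return 0 if the argument is one of the event types listed in the table.
# is_known_event_type is a hypothetical helper, not a bd or gt command.
is_known_event_type() {
    case "$1" in
        patrol.muted|patrol.unmuted) return 0 ;;
        agent.started|agent.stopped) return 0 ;;
        mode.degraded|mode.normal)   return 0 ;;
        *)                           return 1 ;;
    esac
}
```

For example, `is_known_event_type patrol.muted || echo "unknown event type" >&2` could guard a `bd create` call.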
Creating Events
# Mute deacon patrol
bd create --type=event --event-type=patrol.muted \
--actor=human:overseer --target=agent:deacon \
--payload='{"reason":"fixing convoy deadlock","until":"gt-abc1"}'
# System entered degraded mode
bd create --type=event --event-type=mode.degraded \
--actor=system:daemon --target=rig:greenplace \
--payload='{"reason":"tmux unavailable"}'
Querying Events
# Recent events for an agent
bd list --type=event --target=agent:deacon --limit=10
# All patrol state changes
bd list --type=event --event-type=patrol.muted
bd list --type=event --event-type=patrol.unmuted
# Events in the activity feed
bd activity --follow --type=event
Labels-as-State Pattern
Events capture the full history. Labels cache the current state for fast queries.
Convention
Labels use <dimension>:<value> format:
- patrol:muted / patrol:active
- mode:degraded / mode:normal
- status:idle / status:working
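Because every label follows the same `<dimension>:<value>` shape, it can be split with plain parameter expansion. This is an illustrative sketch; the function names are hypothetical and nothing here calls bd.

```shell
# Split a "<dimension>:<value>" label into its two parts.
# These helpers are hypothetical; they only manipulate strings.
label_dimension() { printf '%s\n' "${1%%:*}"; }   # text before the first colon
label_value()     { printf '%s\n' "${1#*:}"; }    # text after the first colon

# Return 0 only for labels with a non-empty dimension and a non-empty value.
is_state_label() {
    case "$1" in
        ?*:?*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

Here `label_dimension patrol:muted` prints `patrol` and `label_value patrol:muted` prints `muted`.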
State Change Flow
- Create event bead (full context, immutable)
- Update role bead labels (current state cache)
# Mute patrol
bd create --type=event --event-type=patrol.muted ...
bd update role-deacon --add-label=patrol:muted --remove-label=patrol:active
# Unmute patrol
bd create --type=event --event-type=patrol.unmuted ...
bd update role-deacon --add-label=patrol:active --remove-label=patrol:muted
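The two steps above can be bundled so they always happen in the same order. The wrapper below is a sketch, not an existing command: `record_state_change` and its argument order are assumptions, while the bd flags are the ones used in the examples above.

```shell
# Record a state transition: event first (history), then labels (cache).
# record_state_change is a hypothetical wrapper around the bd calls above.
# usage: record_state_change <role-bead> <event-type> <actor> <target> \
#                            <payload-json> <new-label> <old-label>
record_state_change() {
    role_bead=$1 event_type=$2 actor=$3 target=$4
    payload=$5 new_label=$6 old_label=$7

    # Step 1: create the immutable event bead. Stop here on failure so the
    # label cache never gets ahead of the recorded history.
    bd create --type=event --event-type="$event_type" \
        --actor="$actor" --target="$target" --payload="$payload" || return 1

    # Step 2: swap the label on the role bead.
    bd update "$role_bead" --add-label="$new_label" --remove-label="$old_label"
}
```

Muting patrol would then look like `record_state_change role-deacon patrol.muted human:overseer agent:deacon '{"reason":"fixing convoy deadlock"}' patrol:muted patrol:active`. Creating the event first means a failure between the two steps leaves a stale label rather than an unrecorded change.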
Querying Current State
# Is deacon patrol muted?
bd show role-deacon | grep patrol:
# All agents with muted patrol
bd list --type=role --label=patrol:muted
# All agents in degraded mode
bd list --type=role --label=mode:degraded
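For scripting, the `grep` check above can be turned into a function that prints just the value. This sketch assumes `bd show` prints the bead's labels as plain `patrol:<value>` text somewhere in its output; the real output format may differ, and `patrol_state` is a hypothetical name.

```shell
# Print the current patrol state ("muted", "active", ...) for a role bead.
# Assumes `bd show` includes the label as literal `patrol:<value>` text.
# Prints nothing if no patrol label is found.
patrol_state() {
    bd show "$1" | grep -o 'patrol:[a-z]*' | head -n 1 | cut -d: -f2
}
```

A caller could then write `[ "$(patrol_state role-deacon)" = "muted" ] && echo "patrol is muted"`.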
Boot: The Deacon's Watchdog
See Watchdog Chain for the complete Daemon/Boot/Deacon architecture and design rationale.
Boot is a dog (Deacon helper) that triages the Deacon's health. The daemon pokes Boot instead of the Deacon directly, centralizing the "when to wake" decision in an agent that can reason about it.
Why Boot?
The daemon is dumb transport (ZFC principle). It can't decide:
- Is the Deacon stuck or just thinking?
- Should we interrupt or let it continue?
- Is the system in a state where nudging would help?
Boot is an agent that can observe and decide.
Boot's Lifecycle
Daemon tick
    │
    ├── Check: Is Boot already running? (marker file)
    │   └── Yes + recent: Skip this tick
    │
    └── Spawn Boot (fresh session each time)
        │
        └── Boot runs triage molecule
            ├── Observe (wisps, mail, git state, tmux panes)
            ├── Decide (start/wake/nudge/interrupt/nothing)
            ├── Act
            ├── Clean inbox (discard stale handoffs)
            └── Handoff (or exit in degraded mode)
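The first branch of the lifecycle, the marker file check, might be sketched as follows. The marker path, the five-minute default for "recent", and the choice to respawn when the marker is stale are all assumptions for illustration; the document does not specify them.

```shell
# Decide whether this daemon tick should spawn Boot.
# Returns 0 (spawn) when there is no marker or the marker is stale,
# 1 (skip) when the marker was modified within the last max_age_min minutes.
# The marker path and the 5-minute default are assumptions.
boot_should_spawn() {
    marker=$1
    max_age_min=${2:-5}

    [ -e "$marker" ] || return 0   # no marker: Boot is not running

    # find prints the path only if it was modified less than N minutes ago.
    if [ -n "$(find "$marker" -mmin "-$max_age_min" 2>/dev/null)" ]; then
        return 1                   # running and recent: skip this tick
    fi
    return 0                       # stale marker: spawn a fresh Boot
}
```

Treating a stale marker as "not running" keeps a crashed Boot from blocking every later tick.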
Boot is Always Fresh
Boot starts in a fresh session on each daemon tick instead of persisting between ticks. This is intentional:
- Narrow scope makes restarts cheap
- Fresh context avoids accumulated confusion
- Handoff mail provides continuity without session persistence
- No keepalive needed
Boot's Decision Guidance
Agents may take several minutes on legitimate work such as composing artifacts, running tools, or deep analysis, and ten minutes or more in edge cases.
To assess whether an agent is stuck:
- Check the agent's last reported activity (recent wisps, mail sent, git commits)
- Observe the tmux pane output over a 30-second window
- Look for signs of progress vs. signs of hanging (tool prompt, error loop, silence)
Agents work in small steps with feedback. Most tasks complete in 2-3 minutes, but the expected duration depends on the nature of the task.
Boot's options (increasing disruption):
- Let them continue (if progress is evident)
- gt nudge <agent> (gentle wake signal)
- Escape + chat (interrupt and ask what's happening)
- Request process restart (last resort, for true hangs)
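The ordering of these options can be expressed as a simple ladder. The sketch below is one possible policy, not Boot's actual logic (Boot is an agent that reasons about what it observes): it leaves an agent alone while progress is evident and otherwise moves one rung up from whatever was tried last. The action names are hypothetical labels for the four options listed above.

```shell
# Given the last action Boot tried, print the next more disruptive one.
# Action names are hypothetical labels for the four options listed above.
boot_escalate() {
    case "$1" in
        continue)  echo nudge ;;      # gt nudge <agent>
        nudge)     echo interrupt ;;  # Escape + chat
        interrupt) echo restart ;;    # request process restart
        restart)   echo restart ;;    # already at the last resort
        *)         echo continue ;;   # unknown input: least disruptive option
    esac
}

# Pick an action: do nothing while progress is evident, otherwise step
# one rung up from the last action taken.
boot_next_action() {
    progress=$1 last_action=$2
    if [ "$progress" = "yes" ]; then
        echo continue
    else
        boot_escalate "$last_action"
    fi
}
```

Stepping one rung at a time means a restart is only requested after a nudge and an interrupt have both failed to produce progress.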
Common false positives:
- Tool waiting for user confirmation
- Long-running test suite
- Large file read/write operations
Boot's Location
~/gt/deacon/dogs/boot/
Session name: gt-boot
Created/maintained by bd doctor.
Boot Commands
# Check Boot status
gt dog status boot
# Manual Boot run (debugging)
gt dog call boot
# Prime Boot with context
gt dog prime boot
Degraded Mode
Gas Town can operate without tmux, with reduced capabilities.
Detection
The daemon detects degraded mode mechanically and passes it to agents:
GT_DEGRADED=true # Set by daemon when tmux unavailable
Boot and other agents check this environment variable.
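A check of this variable might look like the sketch below. Treating only the exact value `true` as degraded is an assumption, since the document shows no other values; the function names are hypothetical.

```shell
# Return 0 when the daemon has flagged degraded mode.
# Only the exact value "true" counts; that strictness is an assumption.
is_degraded() {
    [ "${GT_DEGRADED:-}" = "true" ]
}

# Example use: decide up front which sources an agent may observe.
observation_sources() {
    if is_degraded; then
        echo "beads git"        # no tmux panes available
    else
        echo "beads git tmux"
    fi
}
```

Using `${GT_DEGRADED:-}` keeps the check safe in scripts that run with `set -u` when the variable is unset.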
What Changes in Degraded Mode
| Capability | Normal | Degraded |
|---|---|---|
| Observe tmux panes | Yes | No |
| Interactive interrupt | Yes | No |
| Session management | Full | Limited |
| Agent spawn | tmux sessions | Direct spawn |
| Boot lifecycle | Handoff | Exit |
Agents in Degraded Mode
In degraded mode, agents:
- Cannot observe other agents' pane output
- Cannot interactively interrupt stuck agents
- Focus on beads/git state observation only
- Report anomalies but can't fix interactively
Boot specifically:
- Runs to completion and exits (no handoff)
- Limited to: start deacon, file beads, mail overseer
- Cannot: observe panes, nudge, interrupt
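Boot's restrictions can be read as an allow-list that applies only when `GT_DEGRADED` is set. The following sketch encodes the lists above; the action names are hypothetical, and `boot_action_allowed` is not an existing gt command.

```shell
# Return 0 if Boot may take the given action in the current mode.
# Action names are hypothetical; the allow/deny split follows the lists above.
boot_action_allowed() {
    if [ "${GT_DEGRADED:-}" != "true" ]; then
        return 0    # normal mode: nothing is ruled out here
    fi
    case "$1" in
        start-deacon|file-bead|mail-overseer) return 0 ;;
        *)                                    return 1 ;;  # observe panes, nudge, interrupt
    esac
}
```

An allow-list fails closed: any action not explicitly named is refused in degraded mode.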
Recording Degraded Mode
# System entered degraded mode
bd create --type=event --event-type=mode.degraded \
--actor=system:daemon --target=rig:greenplace \
--payload='{"reason":"tmux unavailable"}'
bd update role-greenplace --add-label=mode:degraded --remove-label=mode:normal
Configuration vs State
| Type | Storage | Example |
|---|---|---|
| Static config | TOML files | Daemon tick interval |
| Operational state | Beads (events + labels) | Patrol muted |
| Runtime flags | Marker files | .deacon-disabled |
Static config rarely changes and doesn't need history. Operational state changes at runtime and benefits from an audit trail. Marker files are fast checks that can trigger deeper beads queries.
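The "fast check, then deeper query" idea can be sketched as below. The directory argument is hypothetical, since the document names the marker file (.deacon-disabled) but not where it lives, and pairing the marker with an event query is only one way the two could be combined.

```shell
# Fast path: test for the marker file first. Only when it exists do we
# pay for a beads query to look up the most recent related event.
# state_dir is a hypothetical location for the marker file.
deacon_disabled_reason() {
    state_dir=$1
    [ -e "$state_dir/.deacon-disabled" ] || return 1   # not disabled

    # Deeper check: the latest event targeting the Deacon.
    bd list --type=event --target=agent:deacon --limit=1
}
```

A caller can branch on the exit status for the cheap answer and read the output only when it needs the history.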
Commands Summary
# Create operational event
bd create --type=event --event-type=<type> \
--actor=<entity> --target=<entity> --payload='<json>'
# Update state label
bd update <role-bead> --add-label=<dim>:<val> --remove-label=<dim>:<old>
# Query current state
bd list --type=role --label=<dim>:<val>
# Query state history
bd list --type=event --target=<entity>
# Boot management
gt dog status boot
gt dog call boot
gt dog prime boot
Events are the source of truth. Labels are the cache.