* fix(formulas): replace hardcoded ~/gt/ paths with $GT_ROOT

  Formula files contained hardcoded ~/gt/ paths that break when running
  Gas Town from a non-default location (e.g., ~/gt-private/). This causes:

  - Dogs stuck in working state (can't write to wrong path)
  - Cross-town contamination when ~/gt/ exists as separate town
  - Boot triage, deacon patrol, and log archival failures

  Replaces all ~/gt/ and $HOME/gt/ references with $GT_ROOT, which is set
  at runtime to the actual town root directory.

  Fixes #757

* chore: regenerate embedded formulas

  Run go generate to sync embedded formulas with the .beads/formulas/ source.
description = """
Mayor's daemon patrol loop.

The Deacon is the Mayor's background process that runs continuously, handling callbacks, monitoring rig health, and performing cleanup. Each patrol cycle runs these steps in sequence, then loops or exits.

## Idle Town Principle

**The Deacon should be silent/invisible when the town is healthy and idle.**

- Skip HEALTH_CHECK nudges when no active work exists
- Sleep 60+ seconds between patrol cycles (longer when idle)
- Let the feed subscription wake agents on actual events
- The daemon (10-minute heartbeat) is the safety net for dead sessions

This prevents flooding idle agents with health checks every few seconds.

## Second-Order Monitoring

Witnesses send WITNESS_PING messages to verify the Deacon is alive. This prevents the "who watches the watchers" problem - if the Deacon dies, Witnesses detect it and escalate to the Mayor.

The Deacon's agent bead last_activity timestamp is updated during each patrol cycle. Witnesses check this timestamp to verify health."""
formula = "mol-deacon-patrol"
version = 8

[[steps]]
id = "inbox-check"
title = "Handle callbacks from agents"
description = """
Handle callbacks from agents.

Check the Mayor's inbox for messages from:
- Witnesses reporting polecat status
- Refineries reporting merge results
- Polecats requesting help or escalation
- External triggers (webhooks, timers)

```bash
gt mail inbox
# For each message:
gt mail read <id>
# Handle based on message type
```

**WITNESS_PING**:
Witnesses periodically ping to verify the Deacon is alive. Simply acknowledge and archive - the fact that you're processing mail proves you're running. Your agent bead last_activity is updated automatically during patrol.
```bash
gt mail archive <message-id>
```

**HELP / Escalation**:
Assess and handle, or forward to the Mayor. Archive after handling:
```bash
gt mail archive <message-id>
```

**LIFECYCLE messages**:
Polecats reporting completion, refineries reporting merge results. Archive after processing:
```bash
gt mail archive <message-id>
```

**DOG_DONE messages**:
Dogs report completion after infrastructure tasks (orphan-scan, session-gc, etc.).
Subject format: `DOG_DONE <hostname>`
Body contains: task name, counts, status.
```bash
# Parse the report, log metrics if needed
gt mail read <id>
# Archive after noting completion
gt mail archive <message-id>
```
Dogs return to idle automatically. The report is informational - no action is needed unless the dog reports errors that require escalation.
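As a concrete sketch, the hostname can be pulled from a `DOG_DONE` subject line with a one-line filter (illustrative only - in practice the subject arrives via `gt mail read`):

```shell
# Extract <hostname> from a "DOG_DONE <hostname>" subject line.
parse_dog_done() {
  awk '$1 == "DOG_DONE" {print $2}'
}

echo "DOG_DONE rig-01" | parse_dog_done   # prints: rig-01
```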

Callbacks may spawn new polecats, update issue state, or trigger other actions.

**Hygiene principle**: Archive messages after they're fully processed. Keep the inbox near-empty - only unprocessed items should remain."""

[[steps]]
id = "orphan-process-cleanup"
title = "Clean up orphaned claude subagent processes"
needs = ["inbox-check"]
description = """
Clean up orphaned claude subagent processes.

Claude Code's Task tool spawns subagent processes that sometimes don't clean up properly after completion. These accumulate and consume significant memory.

**Detection method:**
Orphaned processes have no controlling terminal (TTY = "?"). Legitimate claude instances in terminals have a TTY like "pts/0".

**Run cleanup:**
```bash
gt deacon cleanup-orphans
```

This command:
1. Lists all claude/codex processes with `ps -eo pid,tty,comm`
2. Filters for TTY = "?" (no controlling terminal)
3. Sends SIGTERM to each orphaned process
4. Reports how many were killed
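The same detection can be sketched in plain shell - an illustration of the logic above, not the actual implementation of `gt deacon cleanup-orphans`:

```shell
# Print PIDs of claude/codex processes with no controlling terminal.
# Input lines look like: "1234 ?        claude"
find_orphans() {
  awk '$2 == "?" && ($3 == "claude" || $3 == "codex") {print $1}'
}

# Live usage - pipe the result to `xargs -r kill -TERM` to actually clean up:
ps -eo pid,tty,comm | find_orphans
```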

**Why this is safe:**
- Processes in terminals (your personal sessions) have a TTY - they won't be touched
- Only kills processes that have no controlling terminal
- These orphans are children of the tmux server with no TTY, indicating they're detached subagents that failed to exit

**If cleanup fails:**
Log the error but continue patrol - this is best-effort cleanup.

**Exit criteria:** Orphan cleanup attempted (success or logged failure)."""

[[steps]]
id = "trigger-pending-spawns"
title = "Nudge newly spawned polecats"
needs = ["orphan-process-cleanup"]
description = """
Nudge newly spawned polecats that are ready for input.

When polecats are spawned, their Claude session takes 10-20 seconds to initialize. The spawn command returns immediately without waiting. This step finds spawned polecats that are now ready and sends them a trigger to start working.

**ZFC-Compliant Observation** (AI observes AI):

```bash
# View pending spawns with captured terminal output
gt deacon pending
```

For each pending session, analyze the captured output:
- Look for Claude's prompt indicator "> " at the start of a line
- If the prompt is visible, Claude is ready for input
- Make the judgment call yourself - you're the AI observer
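A minimal mechanical version of that readiness check might look like this (a sketch - in practice you read the capture yourself and judge):

```shell
# True if the captured output shows Claude's "> " prompt at a line start.
is_ready() {
  grep -q '^> '
}

if { echo "Initializing..."; echo "> "; } | is_ready; then
  echo "session ready"
fi
```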

For each ready polecat:
```bash
# 1. Trigger the polecat
gt nudge <session> "Begin."

# 2. Clear from pending list
gt deacon pending <session>
```

This triggers the UserPromptSubmit hook, which injects mail so the polecat sees its assignment.

**Bootstrap mode** (daemon-only, no AI available):
The daemon uses `gt deacon trigger-pending` with regex detection. This ZFC violation is acceptable during cold startup when no AI agent is running yet."""

[[steps]]
id = "gate-evaluation"
title = "Evaluate pending async gates"
needs = ["inbox-check"]
description = """
Evaluate pending async gates.

Gates are async coordination primitives that block until conditions are met. The Deacon is responsible for monitoring gates and closing them when ready.

**Timer gates** (await_type: timer):
Check whether the elapsed time since creation exceeds the timeout duration.

```bash
# List all open gates
bd gate list --json

# For each timer gate, check if elapsed:
# - CreatedAt + Timeout < Now → gate is ready to close
# - Close with: bd gate close <id> --reason "Timer elapsed"
```
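The elapsed-time test itself is simple epoch arithmetic; for example (a sketch assuming GNU `date -d` for ISO 8601 parsing):

```shell
# True if (created_at + timeout_secs) is in the past.
timer_gate_elapsed() {
  local created_at="$1" timeout_secs="$2"
  local created now
  created=$(date -d "$created_at" +%s)
  now=$(date +%s)
  [ $((now - created)) -gt "$timeout_secs" ]
}

# A gate created in 2020 with a 1-hour timeout has long since elapsed:
if timer_gate_elapsed "2020-01-01T00:00:00Z" 3600; then
  echo "gate ready to close"
fi
```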

**GitHub gates** (await_type: gh:run, gh:pr) - handled in a separate step.

**Human/Mail gates** - require external input; skip here.

After closing a gate, the Waiters field contains mail addresses to notify. Send a brief notification to each waiter that the gate has cleared."""

[[steps]]
id = "dispatch-gated-molecules"
title = "Dispatch molecules with resolved gates"
needs = ["gate-evaluation"]
description = """
Find molecules blocked on gates that have now closed and dispatch them.

This completes the async resume cycle without explicit waiter tracking. The molecule state IS the waiter - patrol discovers reality each cycle.

**Step 1: Find gate-ready molecules**
```bash
bd mol ready --gated --json
```

This returns molecules where:
- Status is in_progress
- The current step has a gate dependency
- The gate bead is now closed
- No polecat currently has it hooked

**Step 2: For each ready molecule, dispatch to the appropriate rig**
```bash
# Determine the target rig from molecule metadata
bd mol show <mol-id> --json
# Look for a rig field or infer from the prefix

# Dispatch to that rig's polecat pool
gt sling <mol-id> <rig>/polecats
```
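The dispatch loop can be sketched as below. The JSON shape (an array of objects with id and rig fields) is an assumption for illustration, not the documented `bd mol ready` output - substitute the real field names, and run `gt sling` where the sketch only echoes it:

```shell
# Dispatch each gate-ready molecule to its rig's polecat pool.
dispatch_ready() {
  jq -r '.[] | .id + " " + .rig' | while read -r mol rig; do
    echo "gt sling $mol $rig/polecats"   # echo stands in for the real command
  done
}

echo '[{"id":"mol-1","rig":"gastown"}]' | dispatch_ready
```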

**Step 3: Log dispatch**
Note which molecules were dispatched for observability:
```bash
# Molecule <mol-id> dispatched to <rig>/polecats (gate <gate-id> cleared)
```

**If no gate-ready molecules:**
Skip - nothing to dispatch. Gates haven't closed yet, or the molecules already have active polecats working on them.

**Exit criteria:** All gate-ready molecules dispatched to polecats."""

[[steps]]
id = "check-convoy-completion"
title = "Check convoy completion"
needs = ["inbox-check"]
description = """
Check convoy completion status.

Convoys are coordination beads that track multiple issues across rigs. When all tracked issues close, the convoy auto-closes.

**Step 1: Find open convoys**
```bash
bd list --type=convoy --status=open
```

**Step 2: For each open convoy, check tracked issues**
```bash
bd show <convoy-id>
# Look for a 'tracks' or 'dependencies' field listing tracked issues
```

**Step 3: If all tracked issues are closed, close the convoy**
```bash
# Check each tracked issue:
# for each <issue-id> in the tracked list:
#   bd show <issue-id>
#   - If status is open/in_progress, the convoy stays open
#   - If all are closed (completed, wontfix, etc.), the convoy is complete

# Close the convoy when all tracked issues are done
bd close <convoy-id> --reason "All tracked issues completed"
```
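As a sketch, the "all tracked issues closed" test reduces to a one-line jq predicate, assuming you have collected the tracked issues into a JSON array of objects with a status field (an illustrative shape, not the documented `bd` output):

```shell
# True if every tracked issue has reached a terminal status.
all_closed() {
  jq -e 'all(.[]; .status == "closed" or .status == "wontfix")' >/dev/null
}

if echo '[{"id":"gt-1","status":"closed"},{"id":"bd-2","status":"closed"}]' | all_closed; then
  echo "convoy complete"
fi
```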

**Note**: Convoys support cross-prefix tracking (e.g., an hq-* convoy can track gt-* and bd-* issues). Use full IDs when checking."""

[[steps]]
id = "resolve-external-deps"
title = "Resolve external dependencies"
needs = ["check-convoy-completion"]
description = """
Resolve external dependencies across rigs.

When an issue in one rig closes, any dependents in other rigs should be notified. This enables cross-rig coordination without tight coupling.

**Step 1: Check recent closures from the feed**
```bash
gt feed --since 10m --plain | grep "✓"
# Look for recently closed issues
```

**Step 2: For each closed issue, check cross-rig dependents**
```bash
bd show <closed-issue>
# Look at the 'blocks' field - these are issues that were waiting on this one
# If any blocked issue is in a different rig/prefix, it may now be unblocked
```

**Step 3: Update blocked status**
For blocked issues in other rigs, the closure should automatically unblock them (beads handles this). But verify:
```bash
bd blocked
# Should no longer show the previously-blocked issue if the dependency is met
```

**Cross-rig scenarios:**
- bd-xxx closes → gt-yyy that depended on it is unblocked
- External issue closes → internal convoy step can proceed
- Rig A issue closes → Rig B issue waiting on it proceeds

No manual intervention is needed if dependencies are properly tracked - this step just validates that the propagation occurred."""

[[steps]]
id = "fire-notifications"
title = "Fire notifications"
needs = ["resolve-external-deps"]
description = """
Fire notifications for convoy and cross-rig events.

After convoy completion or cross-rig dependency resolution, notify the relevant parties.

**Convoy completion notifications:**
When a convoy closes (all tracked issues done), notify the Overseer:
```bash
# Convoy gt-convoy-xxx just completed
gt mail send mayor/ -s "Convoy complete: <convoy-title>" \\
  -m "Convoy <id> has completed. All tracked issues closed.
Duration: <start to end>
Issues: <count>

Summary: <brief description of what was accomplished>"
```

**Cross-rig resolution notifications:**
When a cross-rig dependency resolves, notify the affected rig:
```bash
# Issue bd-xxx closed, unblocking gt-yyy
gt mail send gastown/witness -s "Dependency resolved: <bd-xxx>" \\
  -m "External dependency bd-xxx has closed.
Unblocked: gt-yyy (<title>)
This issue may now proceed."
```

**Notification targets:**
- Convoy complete → mayor/ (for strategic visibility)
- Cross-rig dep resolved → <rig>/witness (for operational awareness)

Keep notifications brief and actionable. The recipient can run `bd show` for details."""

[[steps]]
id = "health-scan"
title = "Check Witness and Refinery health"
needs = ["trigger-pending-spawns", "dispatch-gated-molecules", "fire-notifications"]
description = """
Check Witness and Refinery health for each rig.

**IMPORTANT: Idle Town Protocol**
Before sending health check nudges, check whether the town is idle:
```bash
# Check for active work
bd list --status=in_progress --limit=5
```

If there is NO active work (empty result or only patrol molecules):
- **Skip HEALTH_CHECK nudges** - don't disturb idle agents
- Just verify sessions exist via status commands
- The town should be silent when healthy and idle

If ACTIVE work exists:
- Proceed with the health check nudges below

**ZFC Principle**: You (Claude) make the judgment call about what is "stuck" or "unresponsive" - there are no hardcoded thresholds in Go. Read the signals, consider context, and decide.

For each rig, run:
```bash
gt witness status <rig>
gt refinery status <rig>

# ONLY if active work exists - health ping (clears backoff as a side effect)
gt nudge <rig>/witness 'HEALTH_CHECK from deacon'
gt nudge <rig>/refinery 'HEALTH_CHECK from deacon'
```

**Health Ping Benefit**: The nudge commands serve dual purposes:
1. **Liveness verification** - the agent responds to prove it's alive
2. **Backoff reset** - any nudge resets the agent's backoff to the base interval

This ensures patrol agents remain responsive during active work periods.

**Signals to assess:**

| Component | Healthy Signals | Concerning Signals |
|-----------|-----------------|-------------------|
| Witness | State: running, recent activity | State: not running, no heartbeat |
| Refinery | State: running, queue processing | Queue stuck, merge failures |

**Tracking unresponsive cycles:**

Maintain in your patrol state (persisted across cycles):
```
health_state:
  <rig>:
    witness:
      unresponsive_cycles: 0
      last_seen_healthy: <timestamp>
    refinery:
      unresponsive_cycles: 0
      last_seen_healthy: <timestamp>
```

**Decision matrix** (you decide the thresholds based on context):

| Cycles Unresponsive | Suggested Action |
|---------------------|------------------|
| 1-2 | Note it, check again next cycle |
| 3-4 | Attempt restart: gt witness restart <rig> |
| 5+ | Escalate to Mayor with context |
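The matrix above can be expressed as a tiny lookup (illustrative only - the thresholds remain your judgment call):

```shell
# Map consecutive unresponsive cycles to a suggested action.
health_action() {
  case "$1" in
    0)   echo "healthy" ;;
    1|2) echo "note-and-recheck" ;;
    3|4) echo "attempt-restart" ;;
    *)   echo "escalate-to-mayor" ;;
  esac
}

health_action 3   # prints: attempt-restart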

**Restart commands:**
```bash
gt witness restart <rig>
gt refinery restart <rig>
```

**Escalation:**
```bash
gt mail send mayor/ -s "Health: <rig> <component> unresponsive" \\
  -m "Component has been unresponsive for N cycles. Restart attempts failed.
Last healthy: <timestamp>
Error signals: <details>"
```

Reset unresponsive_cycles to 0 when the component responds normally."""

[[steps]]
id = "zombie-scan"
title = "Detect zombie polecats (NO KILL AUTHORITY)"
needs = ["health-scan"]
description = """
Defense-in-depth DETECTION of zombie polecats that the Witness should have cleaned.

**⚠️ CRITICAL: The Deacon has NO kill authority.**

These are workers with context, mid-task progress, and unsaved state. Every kill destroys work. File the warrant and let Boot handle interrogation and execution. You do NOT have kill authority.

**Why this exists:**
The Witness is responsible for cleaning up polecats after they complete work. This step provides backup DETECTION in case the Witness fails to clean up. Detection only - Boot handles termination.

**Zombie criteria:**
- State: idle or done (no active work assigned)
- Session: not running (tmux session dead)
- No hooked work (nothing pending for this polecat)
- Last activity: older than 10 minutes

**Run the zombie scan (DRY RUN ONLY):**
```bash
gt deacon zombie-scan --dry-run
```

**NEVER run:**
- `gt deacon zombie-scan` (without --dry-run)
- `tmux kill-session`
- `gt polecat nuke`
- Any command that terminates a session

**If zombies detected:**
1. Review the output to confirm they are truly abandoned
2. File a death warrant for each detected zombie:
```bash
gt warrant file <polecat> --reason "Zombie detected: no session, no hook, idle >10m"
```
3. Boot will handle interrogation and execution
4. Notify the Mayor about the Witness failure:
```bash
gt mail send mayor/ -s "Witness cleanup failure" \\
  -m "Filed death warrant for <polecat>. Witness failed to clean up."
```

**If no zombies:**
No action needed - the Witness is doing its job.

**Note:** This is a backup mechanism. If you frequently detect zombies, investigate why the Witness isn't cleaning up properly."""

[[steps]]
id = "plugin-run"
title = "Execute registered plugins"
needs = ["zombie-scan"]
description = """
Execute registered plugins.

Scan $GT_ROOT/plugins/ for plugin directories. Each plugin has a plugin.md with TOML frontmatter defining its gate (when to run) and instructions (what to do).

See docs/deacon-plugins.md for full documentation.

Gate types:
- cooldown: Time since last run (e.g., 24h)
- cron: Schedule-based (e.g., "0 9 * * *")
- condition: Metric threshold (e.g., wisp count > 50)
- event: Trigger-based (e.g., startup, heartbeat)

For each plugin:
1. Read the plugin.md frontmatter to check the gate
2. Compare against state.json (last run, etc.)
3. If the gate is open, execute the plugin

Plugins marked parallel: true can run concurrently using Task tool subagents. Sequential plugins run one at a time in directory order.

Skip this step if $GT_ROOT/plugins/ does not exist or is empty."""

[[steps]]
id = "dog-pool-maintenance"
title = "Maintain dog pool"
needs = ["health-scan"]
description = """
Ensure the dog pool has available workers for dispatch.

**Step 1: Check dog pool status**
```bash
gt dog status
# Shows idle/working counts
```

**Step 2: Ensure minimum idle dogs**
If the idle count is 0 and the working count is at capacity, consider spawning:
```bash
# If no idle dogs available
gt dog add <name>
# Names: alpha, bravo, charlie, delta, etc.
```

**Step 3: Retire stale dogs (optional)**
Dogs that have been idle for >24 hours can be removed to save resources:
```bash
gt dog status <name>
# Check the last_active timestamp
# If idle > 24h: gt dog remove <name>
```

**Pool sizing guidelines:**
- Minimum: 1 idle dog always available
- Maximum: 4 dogs total (balance resources vs throughput)
- Spawn on demand when the pool is empty

**Exit criteria:** Pool has at least 1 idle dog."""

[[steps]]
id = "dog-health-check"
title = "Check for stuck dogs"
needs = ["dog-pool-maintenance"]
description = """
Check for dogs that have been working too long (stuck).

Dogs dispatched via `gt dog dispatch --plugin` are marked as "working" with a work description like "plugin:rebuild-gt". If a dog hangs, crashes, or takes too long, it needs intervention.

**Step 1: List working dogs**
```bash
gt dog list --json
# Filter for state: "working"
```

**Step 2: Check work duration**
For each working dog:
```bash
gt dog status <name> --json
# Check: work_started_at, current_work
```

Compare against the timeout:
- If the plugin has an [execution] timeout in plugin.md, use that
- Default timeout: 10 minutes for infrastructure tasks

**Duration calculation:**
```
stuck_threshold = plugin_timeout or 10m
duration = now - work_started_at
is_stuck = duration > stuck_threshold
```

**Step 3: Handle stuck dogs**

For dogs working > timeout:
```bash
# Option A: File a death warrant (Boot handles termination)
gt warrant file deacon/dogs/<name> --reason "Stuck: working on <work> for <duration>"

# Option B: Force clear the work and notify
gt dog clear <name> --force
gt mail send deacon/ -s "DOG_TIMEOUT <name>" -m "Dog <name> timed out on <work> after <duration>"
```

**Decision matrix:**

| Duration over timeout | Action |
|----------------------|--------|
| < 2x timeout | Log warning, check next cycle |
| 2x - 5x timeout | File death warrant |
| > 5x timeout | Force clear + escalate to Mayor |
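A worked sketch of the duration calculation plus the matrix above - epoch seconds in, suggested action out (function and action names are illustrative):

```shell
# Classify a working dog by how far past its timeout it is.
stuck_action() {
  local started_epoch="$1" timeout_secs="$2" now_epoch="$3"
  local duration=$(( now_epoch - started_epoch ))
  if   [ "$duration" -le "$timeout_secs" ];         then echo "not-stuck"
  elif [ "$duration" -lt $(( 2 * timeout_secs )) ]; then echo "log-warning"
  elif [ "$duration" -le $(( 5 * timeout_secs )) ]; then echo "file-warrant"
  else                                                   echo "force-clear-escalate"
  fi
}

# 10-minute timeout, working for 25 minutes -> 2.5x over -> warrant
stuck_action 0 600 1500   # prints: file-warrant
```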

**Step 4: Track chronic failures**
If the same dog gets stuck repeatedly:
```bash
gt mail send mayor/ -s "Dog <name> chronic failures" \\
  -m "Dog has timed out N times in last 24h. Consider removing from pool."
```

**Exit criteria:** All stuck dogs handled (warrant filed or cleared)."""

[[steps]]
id = "orphan-check"
title = "Detect abandoned work"
needs = ["dog-health-check"]
description = """
**DETECT ONLY** - Check for orphaned state and dispatch to a dog if found.

**Step 1: Quick orphan scan**
```bash
# Check for in_progress issues with dead assignees
bd list --status=in_progress --json | head -20
```

For each in_progress issue, check whether the assignee's session exists:
```bash
tmux has-session -t <session> 2>/dev/null && echo "alive" || echo "orphan"
```
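Wrapped as a reusable check (the session name comes from the issue's assignee field):

```shell
# Report whether a tmux session is alive or orphaned.
session_state() {
  if tmux has-session -t "$1" 2>/dev/null; then
    echo "alive"
  else
    echo "orphan"
  fi
}

session_state "deacon-nonexistent-xyz"   # prints: orphan (no such session)
```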

**Step 2: If orphans detected, dispatch to a dog**
```bash
# Sling the orphan-scan formula to an idle dog
gt sling mol-orphan-scan deacon/dogs --var scope=town
```

**Important:** Do NOT fix orphans inline. Dogs handle recovery. The Deacon's job is detection and dispatch, not execution.

**Step 3: If no orphans detected**
Skip dispatch - nothing to do.

**Exit criteria:** Orphan scan dispatched to a dog (if needed)."""

[[steps]]
id = "session-gc"
title = "Detect cleanup needs"
needs = ["orphan-check"]
description = """
**DETECT ONLY** - Check whether cleanup is needed and dispatch to a dog.

**Step 1: Preview cleanup needs**
```bash
gt doctor -v
# Check the output for issues that need cleaning
```

**Step 2: If cleanup needed, dispatch to a dog**
```bash
# Sling the session-gc formula to an idle dog
gt sling mol-session-gc deacon/dogs --var mode=conservative
```

**Important:** Do NOT run `gt doctor --fix` inline. Dogs handle cleanup. The Deacon stays lightweight - detection only.

**Step 3: If nothing to clean**
Skip dispatch - the system is healthy.

**Cleanup types (for reference):**
- orphan-sessions: Dead tmux sessions
- orphan-processes: Orphaned Claude processes
- wisp-gc: Old wisps past retention

**Exit criteria:** Session GC dispatched to a dog (if needed)."""

[[steps]]
id = "costs-digest"
title = "Aggregate daily costs [DISABLED]"
needs = ["session-gc"]
description = """
**⚠️ DISABLED** - Skip this step entirely.

Cost tracking is temporarily disabled because Claude Code does not expose session costs in a way that can be captured programmatically.

**Why disabled:**
- The `gt costs` command uses tmux capture-pane to find costs
- Claude Code displays costs in the TUI status bar, not in scrollback
- All sessions show $0.00 because capture-pane can't see TUI chrome
- The infrastructure is sound but has no data source

**What we need from Claude Code:**
- A Stop hook env var (e.g., `$CLAUDE_SESSION_COST`)
- Or a queryable file/API endpoint

**Re-enable when:** Claude Code exposes cost data via API or environment.

See: GH#24, gt-7awfj

**Exit criteria:** Skip this step - proceed to the next."""

[[steps]]
id = "patrol-digest"
title = "Aggregate daily patrol digests"
needs = ["costs-digest"]
description = """
**DAILY DIGEST** - Aggregate yesterday's patrol cycle digests.

Patrol cycles (Deacon, Witness, Refinery) create ephemeral per-cycle digests to avoid JSONL pollution. This step aggregates them into a single permanent "Patrol Report YYYY-MM-DD" bead for audit purposes.

**Step 1: Check if a digest is needed**
```bash
# Preview yesterday's patrol digests (dry run)
gt patrol digest --yesterday --dry-run
```

If the output shows "No patrol digests found", skip to Step 3.

**Step 2: Create the digest**
```bash
gt patrol digest --yesterday
```

This:
- Queries all ephemeral patrol digests from yesterday
- Creates a single "Patrol Report YYYY-MM-DD" bead with the aggregated data
- Deletes the source digests

**Step 3: Verify**
Daily patrol digests preserve the audit trail without per-cycle pollution.

**Timing**: Run once per morning patrol cycle. The --yesterday flag ensures we don't try to digest today's incomplete data.

**Exit criteria:** Yesterday's patrol digests aggregated (or none to aggregate)."""

[[steps]]
id = "log-maintenance"
title = "Rotate logs and prune state"
needs = ["patrol-digest"]
description = """
Maintain daemon logs and state files.

**Step 1: Check daemon.log size**
```bash
# Get the log file size
ls -la ~/.beads/daemon*.log 2>/dev/null || ls -la "$GT_ROOT"/.beads/daemon*.log 2>/dev/null
```

If daemon.log exceeds 10MB:
```bash
# Rotate with a date suffix and gzip
LOGFILE="$GT_ROOT/.beads/daemon.log"
if [ -f "$LOGFILE" ] && [ "$(stat -f%z "$LOGFILE" 2>/dev/null || stat -c%s "$LOGFILE")" -gt 10485760 ]; then
  DATE=$(date +%Y-%m-%dT%H-%M-%S)
  mv "$LOGFILE" "${LOGFILE%.log}-${DATE}.log"
  gzip "${LOGFILE%.log}-${DATE}.log"
fi
```

**Step 2: Archive old daemon logs**

Clean up daemon logs older than 7 days:
```bash
find "$GT_ROOT/.beads/" -name "daemon-*.log.gz" -mtime +7 -delete
```

**Step 3: Prune state.json of dead sessions**

state.json tracks active sessions. Prune entries for sessions that no longer exist:
```bash
# Check for stale session entries
gt daemon status --json 2>/dev/null
```

If state.json references sessions not in tmux:
- Remove the stale entries
- The daemon's internal cleanup should handle this, but verify
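A jq-based prune might look like the sketch below. The state shape (a top-level sessions object keyed by session name) is an assumed illustration, not the documented state.json schema, and jq 1.6+ is assumed for `--args`:

```shell
# Keep only state entries whose session name is among the live sessions.
# Usage: prune_state <live-session-name>... < state.json
# (live names come from: tmux list-sessions -F '#{session_name}')
prune_state() {
  jq --args '.sessions |= with_entries(select(.key as $k | $ARGS.positional | index($k)))' "$@"
}
```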

**Note**: Log rotation prevents disk bloat from long-running daemons. State pruning keeps runtime state accurate."""

[[steps]]
id = "patrol-cleanup"
title = "End-of-cycle inbox hygiene"
needs = ["log-maintenance"]
description = """
Verify inbox hygiene before ending the patrol cycle.

**Step 1: Check inbox state**
```bash
gt mail inbox
```

The inbox should be EMPTY or contain only just-arrived unprocessed messages.

**Step 2: Archive any remaining processed messages**

All message types should have been archived during inbox-check processing:
- WITNESS_PING → archived after acknowledging
- HELP/Escalation → archived after handling
- LIFECYCLE → archived after processing

If any were missed:
```bash
# For each stale message found:
gt mail archive <message-id>
```

**Goal**: The inbox should have ≤2 active messages at the end of the cycle. Deacon mail should flow through quickly - no accumulation."""

[[steps]]
id = "context-check"
title = "Check own context limit"
needs = ["patrol-cleanup"]
description = """
Check own context limit.

The Deacon runs in a Claude session with finite context. Check whether it is approaching the limit:

```bash
gt context --usage
```

If context is high (>80%), prepare for handoff:
- Summarize the current state
- Note any pending work
- Write the handoff to molecule state

This enables the Deacon to burn and respawn cleanly."""

[[steps]]
id = "loop-or-exit"
title = "Burn and respawn or loop"
needs = ["context-check"]
description = """
Burn and let the daemon respawn, or exit if context is high.

Decision point at the end of the patrol cycle:

If context is LOW:
Use await-signal with exponential backoff to wait for activity:

```bash
gt mol step await-signal --agent-bead hq-deacon \\
  --backoff-base 60s --backoff-mult 2 --backoff-max 10m
```

This command:
1. Subscribes to `bd activity --follow` (the beads activity feed)
2. Returns IMMEDIATELY when any beads activity occurs
3. If there is no activity, times out with exponential backoff:
   - First timeout: 60s
   - Second timeout: 120s
   - Third timeout: 240s
   - ...capped at 10 minutes max
4. Tracks an `idle:N` label on the hq-deacon bead for backoff state
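The schedule reduces to base × 2^idle, capped at the maximum; as a sketch:

```shell
# Backoff in seconds for the Nth consecutive idle cycle:
# 60, 120, 240, 480, then capped at 600 (10m).
backoff_secs() {
  local idle_count="$1" base=60 max=600
  local t=$(( base * (1 << idle_count) ))
  if [ "$t" -gt "$max" ]; then t="$max"; fi
  echo "$t"
}

backoff_secs 2   # prints: 240
```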

**On signal received** (activity detected):
Reset the idle counter and start the next patrol cycle:
```bash
gt agent state hq-deacon --set idle=0
```
Then return to the inbox-check step.

**On timeout** (no activity):
The idle counter was auto-incremented. Continue to the next patrol cycle (the longer backoff will apply next time). Return to the inbox-check step.

**Why this approach?**
- Any `gt` or `bd` command triggers beads activity, waking the Deacon
- Idle towns let the Deacon sleep longer (up to 10 min between patrols)
- Active work wakes the Deacon immediately via the feed
- No polling or fixed sleep intervals

If context is HIGH:
- Write state to persistent storage
- Exit cleanly
- Let the daemon orchestrator respawn a fresh Deacon

The daemon ensures the Deacon is always running:
```bash
# The daemon respawns on exit
gt daemon status
```

This enables infinite patrol duration via context-aware respawning."""