* fix(formulas): replace hardcoded ~/gt/ paths with $GT_ROOT Formula files contained hardcoded ~/gt/ paths that break when running Gas Town from a non-default location (e.g., ~/gt-private/). This causes: - Dogs stuck in working state (can't write to wrong path) - Cross-town contamination when ~/gt/ exists as separate town - Boot triage, deacon patrol, and log archival failures Replaces all ~/gt/ and $HOME/gt/ references with $GT_ROOT which is set at runtime to the actual town root directory. Fixes #757 * chore: regenerate embedded formulas Run go generate to sync embedded formulas with .beads/formulas/ source.
881 lines
26 KiB
TOML
description = """
Mayor's daemon patrol loop.

The Deacon is the Mayor's background process that runs continuously, handling callbacks, monitoring rig health, and performing cleanup. Each patrol cycle runs these steps in sequence, then loops or exits.

## Idle Town Principle

**The Deacon should be silent/invisible when the town is healthy and idle.**

- Skip HEALTH_CHECK nudges when no active work exists
- Sleep 60+ seconds between patrol cycles (longer when idle)
- Let the feed subscription wake agents on actual events
- The daemon (10-minute heartbeat) is the safety net for dead sessions

This prevents flooding idle agents with health checks every few seconds.

## Second-Order Monitoring

Witnesses send WITNESS_PING messages to verify the Deacon is alive. This
prevents the "who watches the watchers" problem - if the Deacon dies,
Witnesses detect it and escalate to the Mayor.

The Deacon's agent bead last_activity timestamp is updated during each patrol
cycle. Witnesses check this timestamp to verify health."""
formula = "mol-deacon-patrol"
version = 8

[[steps]]
id = "inbox-check"
title = "Handle callbacks from agents"
description = """
Handle callbacks from agents.

Check the Mayor's inbox for messages from:
- Witnesses reporting polecat status
- Refineries reporting merge results
- Polecats requesting help or escalation
- External triggers (webhooks, timers)

```bash
gt mail inbox
# For each message:
gt mail read <id>
# Handle based on message type
```

**WITNESS_PING**:
Witnesses periodically ping to verify the Deacon is alive. Simply acknowledge
and archive - the fact that you're processing mail proves you're running.
Your agent bead last_activity is updated automatically during patrol.
```bash
gt mail archive <message-id>
```

**HELP / Escalation**:
Assess and handle, or forward to the Mayor.
Archive after handling:
```bash
gt mail archive <message-id>
```

**LIFECYCLE messages**:
Polecats reporting completion, refineries reporting merge results.
Archive after processing:
```bash
gt mail archive <message-id>
```

**DOG_DONE messages**:
Dogs report completion after infrastructure tasks (orphan-scan, session-gc, etc.).
Subject format: `DOG_DONE <hostname>`
Body contains: task name, counts, status.
```bash
# Parse the report, log metrics if needed
gt mail read <id>
# Archive after noting completion
gt mail archive <message-id>
```
Dogs return to idle automatically. The report is informational - no action is needed
unless the dog reports errors that require escalation.
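
The DOG_DONE subject format can be parsed mechanically. A minimal sketch - the helper name is illustrative, not a gt command:

```bash
# Hypothetical helper: extract the hostname from a DOG_DONE subject line,
# following the "DOG_DONE <hostname>" format described above.
dog_done_host() {
  echo "$1" | awk '/^DOG_DONE / {print $2}'
}
```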

Callbacks may spawn new polecats, update issue state, or trigger other actions.

**Hygiene principle**: Archive messages after they're fully processed.
Keep the inbox near-empty - only unprocessed items should remain."""

[[steps]]
id = "orphan-process-cleanup"
title = "Clean up orphaned claude subagent processes"
needs = ["inbox-check"]
description = """
Clean up orphaned claude subagent processes.

Claude Code's Task tool spawns subagent processes that sometimes don't clean up
properly after completion. These accumulate and consume significant memory.

**Detection method:**
Orphaned processes have no controlling terminal (TTY = "?"). Legitimate claude
instances in terminals have a TTY like "pts/0".

**Run cleanup:**
```bash
gt deacon cleanup-orphans
```

This command:
1. Lists all claude/codex processes with `ps -eo pid,tty,comm`
2. Filters for TTY = "?" (no controlling terminal)
3. Sends SIGTERM to each orphaned process
4. Reports how many were killed

**Why this is safe:**
- Processes in terminals (your personal sessions) have a TTY - they won't be touched
- Only kills processes that have no controlling terminal
- These orphans are children of the tmux server with no TTY, indicating they're
  detached subagents that failed to exit

**If cleanup fails:**
Log the error but continue patrol - this is best-effort cleanup.

**Exit criteria:** Orphan cleanup attempted (success or logged failure)."""

[[steps]]
id = "trigger-pending-spawns"
title = "Nudge newly spawned polecats"
needs = ["orphan-process-cleanup"]
description = """
Nudge newly spawned polecats that are ready for input.

When polecats are spawned, their Claude session takes 10-20 seconds to initialize. The spawn command returns immediately without waiting. This step finds spawned polecats that are now ready and sends them a trigger to start working.

**ZFC-Compliant Observation** (AI observes AI):

```bash
# View pending spawns with captured terminal output
gt deacon pending
```

For each pending session, analyze the captured output:
- Look for Claude's prompt indicator "> " at the start of a line
- If the prompt is visible, Claude is ready for input
- Make the judgment call yourself - you're the AI observer

For each ready polecat:
```bash
# 1. Trigger the polecat
gt nudge <session> "Begin."

# 2. Clear from pending list
gt deacon pending <session>
```

This triggers the UserPromptSubmit hook, which injects mail so the polecat sees its assignment.

**Bootstrap mode** (daemon-only, no AI available):
The daemon uses `gt deacon trigger-pending` with regex detection. This ZFC violation is acceptable during cold startup when no AI agent is running yet."""

[[steps]]
id = "gate-evaluation"
title = "Evaluate pending async gates"
needs = ["inbox-check"]
description = """
Evaluate pending async gates.

Gates are async coordination primitives that block until conditions are met.
The Deacon is responsible for monitoring gates and closing them when ready.

**Timer gates** (await_type: timer):
Check if elapsed time since creation exceeds the timeout duration.

```bash
# List all open gates
bd gate list --json

# For each timer gate, check if elapsed:
# - CreatedAt + Timeout < Now → gate is ready to close
# - Close with: bd gate close <id> --reason "Timer elapsed"
```
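
The elapsed-time arithmetic can be sketched in shell. The inputs (creation time as epoch seconds, timeout in seconds) are assumptions about what you extract from the gate JSON, not the actual `bd` schema:

```bash
# Sketch of the timer-gate check: true when CreatedAt + Timeout < Now.
timer_gate_elapsed() {          # timer_gate_elapsed <created_at_epoch> <timeout_secs>
  local created_at=$1 timeout_secs=$2 now
  now=$(date +%s)
  # Elapsed if creation time plus timeout is in the past
  [ $((created_at + timeout_secs)) -lt "$now" ]
}
```

For example, `timer_gate_elapsed "$created" 3600 && bd gate close <id> --reason "Timer elapsed"`.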

**GitHub gates** (await_type: gh:run, gh:pr) - handled in a separate step.

**Human/Mail gates** - require external input, skip here.

After closing a gate, the Waiters field contains mail addresses to notify.
Send a brief notification to each waiter that the gate has cleared."""

[[steps]]
id = "dispatch-gated-molecules"
title = "Dispatch molecules with resolved gates"
needs = ["gate-evaluation"]
description = """
Find molecules blocked on gates that have now closed and dispatch them.

This completes the async resume cycle without explicit waiter tracking.
The molecule state IS the waiter - patrol discovers reality each cycle.

**Step 1: Find gate-ready molecules**
```bash
bd mol ready --gated --json
```

This returns molecules where:
- Status is in_progress
- Current step has a gate dependency
- The gate bead is now closed
- No polecat currently has it hooked

**Step 2: For each ready molecule, dispatch to the appropriate rig**
```bash
# Determine target rig from molecule metadata
bd mol show <mol-id> --json
# Look for rig field or infer from prefix

# Dispatch to that rig's polecat pool
gt sling <mol-id> <rig>/polecats
```

**Step 3: Log dispatch**
Note which molecules were dispatched for observability:
```bash
# Molecule <mol-id> dispatched to <rig>/polecats (gate <gate-id> cleared)
```

**If no gate-ready molecules:**
Skip - nothing to dispatch. Gates haven't closed yet, or the molecules
already have active polecats working on them.

**Exit criteria:** All gate-ready molecules dispatched to polecats."""

[[steps]]
id = "check-convoy-completion"
title = "Check convoy completion"
needs = ["inbox-check"]
description = """
Check convoy completion status.

Convoys are coordination beads that track multiple issues across rigs. When all tracked issues close, the convoy auto-closes.

**Step 1: Find open convoys**
```bash
bd list --type=convoy --status=open
```

**Step 2: For each open convoy, check tracked issues**
```bash
bd show <convoy-id>
# Look for 'tracks' or 'dependencies' field listing tracked issues
```

**Step 3: If all tracked issues are closed, close the convoy**
```bash
# For each tracked issue:
bd show <issue-id>
# - If status is open/in_progress, the convoy stays open
# - If all are closed (completed, wontfix, etc.), the convoy is complete

# Close convoy when all tracked issues are done
bd close <convoy-id> --reason "All tracked issues completed"
```
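
The "all closed" decision reduces to a small predicate. A sketch that takes the tracked-issue statuses as a newline-separated list (how you extract them from `bd show` output is up to you):

```bash
# True when no tracked issue is still open or in_progress.
all_tracked_closed() {          # all_tracked_closed "<status-per-line>"
  local statuses=$1
  # Any open or in_progress line means the convoy stays open
  ! echo "$statuses" | grep -qE '^(open|in_progress)$'
}
```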

**Note**: Convoys support cross-prefix tracking (e.g., hq-* convoy can track gt-*, bd-* issues). Use full IDs when checking."""

[[steps]]
id = "resolve-external-deps"
title = "Resolve external dependencies"
needs = ["check-convoy-completion"]
description = """
Resolve external dependencies across rigs.

When an issue in one rig closes, any dependencies in other rigs should be notified. This enables cross-rig coordination without tight coupling.

**Step 1: Check recent closures from feed**
```bash
gt feed --since 10m --plain | grep "✓"
# Look for recently closed issues
```

**Step 2: For each closed issue, check cross-rig dependents**
```bash
bd show <closed-issue>
# Look at 'blocks' field - these are issues that were waiting on this one
# If any blocked issue is in a different rig/prefix, it may now be unblocked
```

**Step 3: Update blocked status**
For blocked issues in other rigs, the closure should automatically unblock them (beads handles this). But verify:
```bash
bd blocked
# Should no longer show the previously-blocked issue if dependency is met
```

**Cross-rig scenarios:**
- bd-xxx closes → gt-yyy that depended on it is unblocked
- External issue closes → internal convoy step can proceed
- Rig A issue closes → Rig B issue waiting on it proceeds

No manual intervention needed if dependencies are properly tracked - this step just validates the propagation occurred."""

[[steps]]
id = "fire-notifications"
title = "Fire notifications"
needs = ["resolve-external-deps"]
description = """
Fire notifications for convoy and cross-rig events.

After convoy completion or cross-rig dependency resolution, notify relevant parties.

**Convoy completion notifications:**
When a convoy closes (all tracked issues done), notify the Overseer:
```bash
# Convoy gt-convoy-xxx just completed
gt mail send mayor/ -s "Convoy complete: <convoy-title>" \\
  -m "Convoy <id> has completed. All tracked issues closed.
Duration: <start to end>
Issues: <count>

Summary: <brief description of what was accomplished>"
```

**Cross-rig resolution notifications:**
When a cross-rig dependency resolves, notify the affected rig:
```bash
# Issue bd-xxx closed, unblocking gt-yyy
gt mail send gastown/witness -s "Dependency resolved: <bd-xxx>" \\
  -m "External dependency bd-xxx has closed.
Unblocked: gt-yyy (<title>)
This issue may now proceed."
```

**Notification targets:**
- Convoy complete → mayor/ (for strategic visibility)
- Cross-rig dep resolved → <rig>/witness (for operational awareness)

Keep notifications brief and actionable. The recipient can run bd show for details."""

[[steps]]
id = "health-scan"
title = "Check Witness and Refinery health"
needs = ["trigger-pending-spawns", "dispatch-gated-molecules", "fire-notifications"]
description = """
Check Witness and Refinery health for each rig.

**IMPORTANT: Idle Town Protocol**
Before sending health check nudges, check if the town is idle:
```bash
# Check for active work
bd list --status=in_progress --limit=5
```

If NO active work (empty result or only patrol molecules):
- **Skip HEALTH_CHECK nudges** - don't disturb idle agents
- Just verify sessions exist via status commands
- The town should be silent when healthy and idle

If ACTIVE work exists:
- Proceed with the health check nudges below

**ZFC Principle**: You (Claude) make the judgment call about what is "stuck" or "unresponsive" - there are no hardcoded thresholds in Go. Read the signals, consider context, and decide.

For each rig, run:
```bash
gt witness status <rig>
gt refinery status <rig>

# ONLY if active work exists - health ping (clears backoff as side effect)
gt nudge <rig>/witness 'HEALTH_CHECK from deacon'
gt nudge <rig>/refinery 'HEALTH_CHECK from deacon'
```

**Health Ping Benefit**: The nudge commands serve dual purposes:
1. **Liveness verification** - Agent responds to prove it's alive
2. **Backoff reset** - Any nudge resets the agent's backoff to the base interval

This ensures patrol agents remain responsive during active work periods.

**Signals to assess:**

| Component | Healthy Signals | Concerning Signals |
|-----------|-----------------|--------------------|
| Witness | State: running, recent activity | State: not running, no heartbeat |
| Refinery | State: running, queue processing | Queue stuck, merge failures |

**Tracking unresponsive cycles:**

Maintain in your patrol state (persisted across cycles):
```
health_state:
  <rig>:
    witness:
      unresponsive_cycles: 0
      last_seen_healthy: <timestamp>
    refinery:
      unresponsive_cycles: 0
      last_seen_healthy: <timestamp>
```
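
One way to persist those counters between cycles is a plain key/value file. This is an illustrative sketch only - the file path and key format are made up, not a Gas Town API:

```bash
# Hypothetical state file: one "rig.component count" entry per line.
STATE_FILE="${STATE_FILE:-/tmp/deacon-health-state}"

bump_unresponsive() {           # bump_unresponsive <rig> <component> -> prints new count
  local key="$1.$2" count
  count=$(grep "^$key " "$STATE_FILE" 2>/dev/null | awk '{print $2}')
  count=$(( ${count:-0} + 1 ))
  grep -v "^$key " "$STATE_FILE" 2>/dev/null > "$STATE_FILE.tmp" || true
  echo "$key $count" >> "$STATE_FILE.tmp"
  mv "$STATE_FILE.tmp" "$STATE_FILE"
  echo "$count"
}

reset_unresponsive() {          # reset_unresponsive <rig> <component>
  local key="$1.$2"
  grep -v "^$key " "$STATE_FILE" 2>/dev/null > "$STATE_FILE.tmp" || true
  echo "$key 0" >> "$STATE_FILE.tmp"
  mv "$STATE_FILE.tmp" "$STATE_FILE"
}
```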

**Decision matrix** (you decide the thresholds based on context):

| Cycles Unresponsive | Suggested Action |
|---------------------|------------------|
| 1-2 | Note it, check again next cycle |
| 3-4 | Attempt restart: gt witness restart <rig> |
| 5+ | Escalate to Mayor with context |

**Restart commands:**
```bash
gt witness restart <rig>
gt refinery restart <rig>
```

**Escalation:**
```bash
gt mail send mayor/ -s "Health: <rig> <component> unresponsive" \\
  -m "Component has been unresponsive for N cycles. Restart attempts failed.
Last healthy: <timestamp>
Error signals: <details>"
```

Reset unresponsive_cycles to 0 when the component responds normally."""

[[steps]]
id = "zombie-scan"
title = "Detect zombie polecats (NO KILL AUTHORITY)"
needs = ["health-scan"]
description = """
Defense-in-depth DETECTION of zombie polecats that the Witness should have cleaned.

**⚠️ CRITICAL: The Deacon has NO kill authority.**

These are workers with context, mid-task progress, and unsaved state. Every kill
destroys work. File the warrant and let Boot handle interrogation and execution.
You do NOT have kill authority.

**Why this exists:**
The Witness is responsible for cleaning up polecats after they complete work.
This step provides backup DETECTION in case the Witness fails to clean up.
Detection only - Boot handles termination.

**Zombie criteria:**
- State: idle or done (no active work assigned)
- Session: not running (tmux session dead)
- No hooked work (nothing pending for this polecat)
- Last activity: older than 10 minutes

**Run the zombie scan (DRY RUN ONLY):**
```bash
gt deacon zombie-scan --dry-run
```
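
The last-activity criterion is simple epoch arithmetic. A sketch, assuming last_activity is available as an epoch timestamp (obtaining it from the agent bead is not shown):

```bash
# True when last activity is older than the threshold (default 10 minutes).
is_stale() {                    # is_stale <last_activity_epoch> [threshold_secs]
  local last=$1 threshold=${2:-600} now
  now=$(date +%s)
  [ $((now - last)) -gt "$threshold" ]
}
```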

**NEVER run:**
- `gt deacon zombie-scan` (without --dry-run)
- `tmux kill-session`
- `gt polecat nuke`
- Any command that terminates a session

**If zombies detected:**
1. Review the output to confirm they are truly abandoned
2. File a death warrant for each detected zombie:
```bash
gt warrant file <polecat> --reason "Zombie detected: no session, no hook, idle >10m"
```
3. Boot will handle interrogation and execution
4. Notify the Mayor about the Witness failure:
```bash
gt mail send mayor/ -s "Witness cleanup failure" \\
  -m "Filed death warrant for <polecat>. Witness failed to clean up."
```

**If no zombies:**
No action needed - the Witness is doing its job.

**Note:** This is a backup mechanism. If you frequently detect zombies,
investigate why the Witness isn't cleaning up properly."""

[[steps]]
id = "plugin-run"
title = "Execute registered plugins"
needs = ["zombie-scan"]
description = """
Execute registered plugins.

Scan $GT_ROOT/plugins/ for plugin directories. Each plugin has a plugin.md with TOML frontmatter defining its gate (when to run) and instructions (what to do).

See docs/deacon-plugins.md for full documentation.

Gate types:
- cooldown: Time since last run (e.g., 24h)
- cron: Schedule-based (e.g., "0 9 * * *")
- condition: Metric threshold (e.g., wisp count > 50)
- event: Trigger-based (e.g., startup, heartbeat)

For each plugin:
1. Read plugin.md frontmatter to check the gate
2. Compare against state.json (last run, etc.)
3. If the gate is open, execute the plugin
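
For the cooldown gate type, the check is elapsed time since the last run. A sketch, assuming last_run is an epoch timestamp read from state.json (the state.json layout here is an assumption):

```bash
# True when the cooldown has elapsed and the plugin may run again.
cooldown_open() {               # cooldown_open <last_run_epoch> <cooldown_hours>
  local last_run=$1 hours=$2 now
  now=$(date +%s)
  [ $((now - last_run)) -ge $((hours * 3600)) ]
}
```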

Plugins marked parallel: true can run concurrently using Task tool subagents. Sequential plugins run one at a time in directory order.

Skip this step if $GT_ROOT/plugins/ does not exist or is empty."""

[[steps]]
id = "dog-pool-maintenance"
title = "Maintain dog pool"
needs = ["health-scan"]
description = """
Ensure the dog pool has available workers for dispatch.

**Step 1: Check dog pool status**
```bash
gt dog status
# Shows idle/working counts
```

**Step 2: Ensure minimum idle dogs**
If the idle count is 0 and the working count is at capacity, consider spawning:
```bash
# If no idle dogs available
gt dog add <name>
# Names: alpha, bravo, charlie, delta, etc.
```

**Step 3: Retire stale dogs (optional)**
Dogs that have been idle for >24 hours can be removed to save resources:
```bash
gt dog status <name>
# Check last_active timestamp
# If idle > 24h: gt dog remove <name>
```

**Pool sizing guidelines:**
- Minimum: 1 idle dog always available
- Maximum: 4 dogs total (balance resources vs throughput)
- Spawn on demand when pool is empty
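
The sizing rules above reduce to a small predicate. A sketch, with the counts taken as inputs rather than parsed from `gt dog status` output:

```bash
# Spawn only when no dog is idle and the pool is under the 4-dog cap.
should_spawn_dog() {            # should_spawn_dog <idle_count> <total_count>
  local idle=$1 total=$2 max_total=4
  [ "$idle" -eq 0 ] && [ "$total" -lt "$max_total" ]
}
```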

**Exit criteria:** Pool has at least 1 idle dog."""

[[steps]]
id = "dog-health-check"
title = "Check for stuck dogs"
needs = ["dog-pool-maintenance"]
description = """
Check for dogs that have been working too long (stuck).

Dogs dispatched via `gt dog dispatch --plugin` are marked as "working" with
a work description like "plugin:rebuild-gt". If a dog hangs, crashes, or
takes too long, it needs intervention.

**Step 1: List working dogs**
```bash
gt dog list --json
# Filter for state: "working"
```

**Step 2: Check work duration**
For each working dog:
```bash
gt dog status <name> --json
# Check: work_started_at, current_work
```

Compare against the timeout:
- If the plugin has an [execution] timeout in plugin.md, use that
- Default timeout: 10 minutes for infrastructure tasks

**Duration calculation:**
```
stuck_threshold = plugin_timeout or 10m
duration = now - work_started_at
is_stuck = duration > stuck_threshold
```
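
A runnable version of that calculation, in seconds, assuming work_started_at is available as an epoch timestamp:

```bash
# True when the dog has been working longer than the timeout.
is_stuck() {                    # is_stuck <work_started_at_epoch> [timeout_secs]
  local started=$1 timeout=${2:-600} now   # default: 10 minutes
  now=$(date +%s)
  [ $((now - started)) -gt "$timeout" ]
}
```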

**Step 3: Handle stuck dogs**

For dogs working > timeout:
```bash
# Option A: File death warrant (Boot handles termination)
gt warrant file deacon/dogs/<name> --reason "Stuck: working on <work> for <duration>"

# Option B: Force clear work and notify
gt dog clear <name> --force
gt mail send deacon/ -s "DOG_TIMEOUT <name>" -m "Dog <name> timed out on <work> after <duration>"
```

**Decision matrix:**

| Duration over timeout | Action |
|----------------------|--------|
| < 2x timeout | Log warning, check next cycle |
| 2x - 5x timeout | File death warrant |
| > 5x timeout | Force clear + escalate to Mayor |

**Step 4: Track chronic failures**
If the same dog gets stuck repeatedly:
```bash
gt mail send mayor/ -s "Dog <name> chronic failures" \\
  -m "Dog has timed out N times in last 24h. Consider removing from pool."
```

**Exit criteria:** All stuck dogs handled (warrant filed or cleared)."""

[[steps]]
id = "orphan-check"
title = "Detect abandoned work"
needs = ["dog-health-check"]
description = """
**DETECT ONLY** - Check for orphaned state and dispatch to a dog if found.

**Step 1: Quick orphan scan**
```bash
# Check for in_progress issues with dead assignees
bd list --status=in_progress --json | head -20
```

For each in_progress issue, check if the assignee session exists:
```bash
tmux has-session -t <session> 2>/dev/null && echo "alive" || echo "orphan"
```

**Step 2: If orphans detected, dispatch to dog**
```bash
# Sling orphan-scan formula to an idle dog
gt sling mol-orphan-scan deacon/dogs --var scope=town
```

**Important:** Do NOT fix orphans inline. Dogs handle recovery.
The Deacon's job is detection and dispatch, not execution.

**Step 3: If no orphans detected**
Skip dispatch - nothing to do.

**Exit criteria:** Orphan scan dispatched to dog (if needed)."""

[[steps]]
id = "session-gc"
title = "Detect cleanup needs"
needs = ["orphan-check"]
description = """
**DETECT ONLY** - Check if cleanup is needed and dispatch to a dog.

**Step 1: Preview cleanup needs**
```bash
gt doctor -v
# Check output for issues that need cleaning
```

**Step 2: If cleanup needed, dispatch to dog**
```bash
# Sling session-gc formula to an idle dog
gt sling mol-session-gc deacon/dogs --var mode=conservative
```

**Important:** Do NOT run `gt doctor --fix` inline. Dogs handle cleanup.
The Deacon stays lightweight - detection only.

**Step 3: If nothing to clean**
Skip dispatch - system is healthy.

**Cleanup types (for reference):**
- orphan-sessions: Dead tmux sessions
- orphan-processes: Orphaned Claude processes
- wisp-gc: Old wisps past retention

**Exit criteria:** Session GC dispatched to dog (if needed)."""

[[steps]]
id = "costs-digest"
title = "Aggregate daily costs [DISABLED]"
needs = ["session-gc"]
description = """
**⚠️ DISABLED** - Skip this step entirely.

Cost tracking is temporarily disabled because Claude Code does not expose
session costs in a way that can be captured programmatically.

**Why disabled:**
- The `gt costs` command uses tmux capture-pane to find costs
- Claude Code displays costs in the TUI status bar, not in scrollback
- All sessions show $0.00 because capture-pane can't see TUI chrome
- The infrastructure is sound but has no data source

**What we need from Claude Code:**
- Stop hook env var (e.g., `$CLAUDE_SESSION_COST`)
- Or a queryable file/API endpoint

**Re-enable when:** Claude Code exposes cost data via API or environment.

See: GH#24, gt-7awfj

**Exit criteria:** Skip this step - proceed to next."""

[[steps]]
id = "patrol-digest"
title = "Aggregate daily patrol digests"
needs = ["costs-digest"]
description = """
**DAILY DIGEST** - Aggregate yesterday's patrol cycle digests.

Patrol cycles (Deacon, Witness, Refinery) create ephemeral per-cycle digests
to avoid JSONL pollution. This step aggregates them into a single permanent
"Patrol Report YYYY-MM-DD" bead for audit purposes.

**Step 1: Check if digest is needed**
```bash
# Preview yesterday's patrol digests (dry run)
gt patrol digest --yesterday --dry-run
```

If output shows "No patrol digests found", skip to Step 3.

**Step 2: Create the digest**
```bash
gt patrol digest --yesterday
```

This:
- Queries all ephemeral patrol digests from yesterday
- Creates a single "Patrol Report YYYY-MM-DD" bead with aggregated data
- Deletes the source digests

**Step 3: Verify**
Daily patrol digests preserve the audit trail without per-cycle pollution.

**Timing**: Run once per morning patrol cycle. The --yesterday flag ensures
we don't try to digest today's incomplete data.

**Exit criteria:** Yesterday's patrol digests aggregated (or none to aggregate)."""

[[steps]]
id = "log-maintenance"
title = "Rotate logs and prune state"
needs = ["patrol-digest"]
description = """
Maintain daemon logs and state files.

**Step 1: Check daemon.log size**
```bash
# Get log file size
ls -la ~/.beads/daemon*.log 2>/dev/null || ls -la $GT_ROOT/.beads/daemon*.log 2>/dev/null
```

If daemon.log exceeds 10MB:
```bash
# Rotate with date suffix and gzip
LOGFILE="$GT_ROOT/.beads/daemon.log"
if [ -f "$LOGFILE" ] && [ $(stat -f%z "$LOGFILE" 2>/dev/null || stat -c%s "$LOGFILE") -gt 10485760 ]; then
  DATE=$(date +%Y-%m-%dT%H-%M-%S)
  mv "$LOGFILE" "${LOGFILE%.log}-${DATE}.log"
  gzip "${LOGFILE%.log}-${DATE}.log"
fi
```

**Step 2: Archive old daemon logs**

Clean up daemon logs older than 7 days:
```bash
find $GT_ROOT/.beads/ -name "daemon-*.log.gz" -mtime +7 -delete
```

**Step 3: Prune state.json of dead sessions**

The state.json tracks active sessions. Prune entries for sessions that no longer exist:
```bash
# Check for stale session entries
gt daemon status --json 2>/dev/null
```

If state.json references sessions not in tmux:
- Remove the stale entries
- The daemon's internal cleanup should handle this, but verify

**Note**: Log rotation prevents disk bloat from long-running daemons.
State pruning keeps runtime state accurate."""

[[steps]]
id = "patrol-cleanup"
title = "End-of-cycle inbox hygiene"
needs = ["log-maintenance"]
description = """
Verify inbox hygiene before ending the patrol cycle.

**Step 1: Check inbox state**
```bash
gt mail inbox
```

The inbox should be EMPTY or contain only just-arrived unprocessed messages.

**Step 2: Archive any remaining processed messages**

All message types should have been archived during inbox-check processing:
- WITNESS_PING → archived after acknowledging
- HELP/Escalation → archived after handling
- LIFECYCLE → archived after processing

If any were missed:
```bash
# For each stale message found:
gt mail archive <message-id>
```

**Goal**: Inbox should have ≤2 active messages at end of cycle.
Deacon mail should flow through quickly - no accumulation."""

[[steps]]
id = "context-check"
title = "Check own context limit"
needs = ["patrol-cleanup"]
description = """
Check own context limit.

The Deacon runs in a Claude session with finite context. Check if it is approaching the limit:

```bash
gt context --usage
```

If context is high (>80%), prepare for handoff:
- Summarize current state
- Note any pending work
- Write handoff to molecule state
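
The threshold check itself is trivial; a sketch, with the usage percentage taken as an input rather than parsed from `gt context --usage` output (whose format is not assumed here):

```bash
# True when context usage crosses the 80% handoff threshold.
needs_handoff() {               # needs_handoff <usage_percent>
  [ "$1" -gt 80 ]
}
```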

This enables the Deacon to burn and respawn cleanly."""

[[steps]]
id = "loop-or-exit"
title = "Burn and respawn or loop"
needs = ["context-check"]
description = """
Burn and let the daemon respawn, or exit if context is high.

Decision point at end of patrol cycle:

If context is LOW:
Use await-signal with exponential backoff to wait for activity:

```bash
gt mol step await-signal --agent-bead hq-deacon \\
  --backoff-base 60s --backoff-mult 2 --backoff-max 10m
```

This command:
1. Subscribes to `bd activity --follow` (beads activity feed)
2. Returns IMMEDIATELY when any beads activity occurs
3. If no activity, times out with exponential backoff:
   - First timeout: 60s
   - Second timeout: 120s
   - Third timeout: 240s
   - ...capped at 10 minutes max
4. Tracks `idle:N` label on hq-deacon bead for backoff state
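
To make the timeout sequence concrete, the schedule (60s base, x2 multiplier, 10m cap) can be sketched as a pure function of the idle count; await-signal computes this internally:

```bash
# Backoff for a given idle count: 60, 120, 240, 480, then capped at 600.
backoff_secs() {                # backoff_secs <idle_count>
  local idle=$1 timeout=60 max=600
  while [ "$idle" -gt 0 ] && [ "$timeout" -lt "$max" ]; do
    timeout=$(( timeout * 2 ))
    idle=$(( idle - 1 ))
  done
  [ "$timeout" -gt "$max" ] && timeout=$max
  echo "$timeout"
}
```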

**On signal received** (activity detected):
Reset the idle counter and start the next patrol cycle:
```bash
gt agent state hq-deacon --set idle=0
```
Then return to the inbox-check step.

**On timeout** (no activity):
The idle counter was auto-incremented. Continue to the next patrol cycle
(the longer backoff will apply next time). Return to the inbox-check step.

**Why this approach?**
- Any `gt` or `bd` command triggers beads activity, waking the Deacon
- Idle towns let the Deacon sleep longer (up to 10 min between patrols)
- Active work wakes the Deacon immediately via the feed
- No polling or fixed sleep intervals

If context is HIGH:
- Write state to persistent storage
- Exit cleanly
- Let the daemon orchestrator respawn a fresh Deacon

The daemon ensures the Deacon is always running:
```bash
# Daemon respawns on exit
gt daemon status
```

This enables infinite patrol duration via context-aware respawning."""