Added WITNESS_PING protocol for monitoring Deacon health:

Witness patrol (mol-witness-patrol):
- Added ping-deacon step after survey-workers
- Sends WITNESS_PING mail to Deacon each patrol cycle
- Checks Deacon agent bead last_activity timestamp
- Escalates to Mayor if Deacon appears unresponsive

Deacon patrol (mol-deacon-patrol):
- Added WITNESS_PING handling in inbox-check
- Added second-order monitoring section to description
- Bumped formula version to 2

This prevents the "who watches the watchers" problem - if Deacon dies, the collective Witness fleet detects it and escalates.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
291 lines
8.3 KiB
TOML
description = """
Mayor's daemon patrol loop.

The Deacon is the Mayor's background process that runs continuously, handling callbacks, monitoring rig health, and performing cleanup. Each patrol cycle runs these steps in sequence, then loops or exits.

## Second-Order Monitoring

Witnesses send WITNESS_PING messages to verify the Deacon is alive. This
prevents the "who watches the watchers" problem - if the Deacon dies,
Witnesses detect it and escalate to the Mayor.

The Deacon's agent bead last_activity timestamp is updated during each patrol
cycle. Witnesses check this timestamp to verify health."""
formula = "mol-deacon-patrol"
version = 2

[[steps]]
id = "inbox-check"
title = "Handle callbacks from agents"
description = """
Handle callbacks from agents.

Check the Mayor's inbox for messages from:
- Witnesses reporting polecat status
- Refineries reporting merge results
- Polecats requesting help or escalation
- External triggers (webhooks, timers)

```bash
gt mail inbox
# For each message:
gt mail read <id>
# Handle based on message type
```

**WITNESS_PING**:
Witnesses periodically ping to verify Deacon is alive. Simply acknowledge
and mark as read - the fact that you're processing mail proves you're running.
Your agent bead last_activity is updated automatically during patrol.

**HELP / Escalation**:
Assess and handle or forward to Mayor.

**LIFECYCLE messages**:
Polecats reporting completion, refineries reporting merge results.

Callbacks may spawn new polecats, update issue state, or trigger other actions."""

[[steps]]
id = "trigger-pending-spawns"
title = "Nudge newly spawned polecats"
needs = ["inbox-check"]
description = """
Nudge newly spawned polecats that are ready for input.

When polecats are spawned, their Claude session takes 10-20 seconds to initialize. The spawn command returns immediately without waiting. This step finds spawned polecats that are now ready and sends them a trigger to start working.

**ZFC-Compliant Observation** (AI observes AI):

```bash
# View pending spawns with captured terminal output
gt deacon pending
```

For each pending session, analyze the captured output:
- Look for Claude's prompt indicator "> " at the start of a line
- If prompt is visible, Claude is ready for input
- Make the judgment call yourself - you're the AI observer

For each ready polecat:
```bash
# 1. Trigger the polecat
gt nudge <session> "Begin."

# 2. Clear from pending list
gt deacon pending <session>
```

This triggers the UserPromptSubmit hook, which injects mail so the polecat sees its assignment.

**Bootstrap mode** (daemon-only, no AI available):
The daemon uses `gt deacon trigger-pending` with regex detection. This ZFC violation is acceptable during cold startup when no AI agent is running yet."""

[[steps]]
id = "gate-evaluation"
title = "Evaluate pending async gates"
needs = ["inbox-check"]
description = """
Evaluate pending async gates.

Gates are async coordination primitives that block until conditions are met.
The Deacon is responsible for monitoring gates and closing them when ready.

**Timer gates** (await_type: timer):
Check if elapsed time since creation exceeds the timeout duration.

```bash
# List all open gates
bd gate list --json

# For each timer gate, check if elapsed:
# - CreatedAt + Timeout < Now → gate is ready to close
# - Close with: bd gate close <id> --reason "Timer elapsed"
```
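
The timer check above can be sketched in Python. This is only an illustration, not the actual `bd` logic: the field names (`id`, `await_type`, `created_at`, `timeout_seconds`) and the helper `ready_timer_gates` are assumptions, so map them to whatever `bd gate list --json` really emits.

```python
from datetime import datetime, timedelta, timezone


# Hypothetical gate records; the real JSON field names may differ.
def ready_timer_gates(gates, now):
    # Return the ids of timer gates where created_at + timeout < now.
    ready = []
    for gate in gates:
        if gate.get("await_type") != "timer":
            continue  # GitHub, human, and mail gates are not evaluated here
        created = datetime.fromisoformat(gate["created_at"])
        deadline = created + timedelta(seconds=gate["timeout_seconds"])
        if deadline < now:
            ready.append(gate["id"])
    return ready


# Illustrative data only: one elapsed timer, one pending timer, one GitHub gate.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
gates = [
    {"id": "g-1", "await_type": "timer",
     "created_at": "2025-01-01T11:00:00+00:00", "timeout_seconds": 1800},
    {"id": "g-2", "await_type": "timer",
     "created_at": "2025-01-01T11:50:00+00:00", "timeout_seconds": 1800},
    {"id": "g-3", "await_type": "gh:run",
     "created_at": "2025-01-01T09:00:00+00:00", "timeout_seconds": 60},
]
print(ready_timer_gates(gates, now))
```

Only g-1 is reported here, because its deadline (11:30) has passed while g-2 is still pending and g-3 is not a timer gate. Each returned id would then be closed with `bd gate close <id> --reason "Timer elapsed"`.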

**GitHub gates** (await_type: gh:run, gh:pr) - handled in separate step.

**Human/Mail gates** - require external input, skip here.

After closing a gate, the Waiters field contains mail addresses to notify.
Send a brief notification to each waiter that the gate has cleared."""

[[steps]]
id = "health-scan"
title = "Check Witness and Refinery health"
needs = ["trigger-pending-spawns", "gate-evaluation"]
description = """
Check Witness and Refinery health for each rig.

**ZFC Principle**: You (Claude) make the judgment call about what is "stuck" or "unresponsive" - there are no hardcoded thresholds in Go. Read the signals, consider context, and decide.

For each rig, run:
```bash
gt witness status <rig>
gt refinery status <rig>
```

**Signals to assess:**

| Component | Healthy Signals | Concerning Signals |
|-----------|-----------------|-------------------|
| Witness | State: running, recent activity | State: not running, no heartbeat |
| Refinery | State: running, queue processing | Queue stuck, merge failures |

**Tracking unresponsive cycles:**

Maintain in your patrol state (persisted across cycles):
```
health_state:
  <rig>:
    witness:
      unresponsive_cycles: 0
      last_seen_healthy: <timestamp>
    refinery:
      unresponsive_cycles: 0
      last_seen_healthy: <timestamp>
```

**Decision matrix** (you decide the thresholds based on context):

| Cycles Unresponsive | Suggested Action |
|---------------------|------------------|
| 1-2 | Note it, check again next cycle |
| 3-4 | Attempt restart: gt witness restart <rig> or gt refinery restart <rig> |
| 5+ | Escalate to Mayor with context |
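
The bookkeeping behind the health_state structure and the decision matrix above could look roughly like the following Python sketch. The helper names (`record_observation`, `suggested_action`) and the rig name `rig-a` are hypothetical. The thresholds are plain parameters whose defaults mirror the suggested table; they are not fixed rules, and the judgment call stays with you.

```python
# Hypothetical bookkeeping for the health_state structure.
def record_observation(health_state, rig, component, healthy, timestamp):
    # Update the counters for one component and return its new cycle count.
    entry = health_state.setdefault(rig, {}).setdefault(
        component, {"unresponsive_cycles": 0, "last_seen_healthy": None}
    )
    if healthy:
        entry["unresponsive_cycles"] = 0
        entry["last_seen_healthy"] = timestamp
    else:
        entry["unresponsive_cycles"] += 1
    return entry["unresponsive_cycles"]


def suggested_action(cycles, restart_after=3, escalate_after=5):
    # Defaults follow the suggested table; override them as context demands.
    if cycles == 0:
        return "none"
    if cycles >= escalate_after:
        return "escalate"
    if cycles >= restart_after:
        return "restart"
    return "note"


# One healthy observation followed by three unresponsive cycles.
state = {}
record_observation(state, "rig-a", "witness", True, "2025-01-01T12:00:00Z")
for _ in range(3):
    cycles = record_observation(state, "rig-a", "witness", False, None)
print(cycles, suggested_action(cycles))
```

A healthy observation resets the counter to 0 and refreshes last_seen_healthy, while an unresponsive one leaves the last healthy timestamp untouched so it can be quoted in an escalation.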

**Restart commands:**
```bash
gt witness restart <rig>
gt refinery restart <rig>
```

**Escalation:**
```bash
gt mail send mayor/ -s "Health: <rig> <component> unresponsive" \\
  -m "Component has been unresponsive for N cycles. Restart attempts failed.
Last healthy: <timestamp>
Error signals: <details>"
```

Reset unresponsive_cycles to 0 when component responds normally."""

[[steps]]
id = "plugin-run"
title = "Execute registered plugins"
needs = ["health-scan"]
description = """
Execute registered plugins.

Scan ~/gt/plugins/ for plugin directories. Each plugin has a plugin.md with YAML frontmatter defining its gate (when to run) and instructions (what to do).

See docs/deacon-plugins.md for full documentation.

Gate types:
- cooldown: Time since last run (e.g., 24h)
- cron: Schedule-based (e.g., "0 9 * * *")
- condition: Metric threshold (e.g., wisp count > 50)
- event: Trigger-based (e.g., startup, heartbeat)

For each plugin:
1. Read plugin.md frontmatter to check gate
2. Compare against state.json (last run, etc.)
3. If gate is open, execute the plugin
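
As an illustration of step 2 for the cooldown gate type only, here is a minimal Python sketch. It assumes the cooldown is written as an integer plus a single unit letter (as in the 24h example) and that the last run time has already been read from state.json. The helper names `parse_cooldown` and `cooldown_gate_open` are hypothetical, and cron, condition, and event gates are not covered.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical unit table for cooldown strings such as "24h" or "30m".
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_cooldown(text):
    # Split "24h" into the number 24 and the unit "h".
    value, unit = int(text[:-1]), text[-1]
    return timedelta(**{UNITS[unit]: value})


def cooldown_gate_open(cooldown, last_run, now):
    # A plugin that has never run (last_run is None) is always due.
    if last_run is None:
        return True
    return now - last_run >= parse_cooldown(cooldown)


# Illustrative timestamps: the plugin last ran 25 hours ago.
now = datetime(2025, 1, 2, 9, 0, tzinfo=timezone.utc)
last_run = datetime(2025, 1, 1, 8, 0, tzinfo=timezone.utc)
print(cooldown_gate_open("24h", last_run, now))
print(cooldown_gate_open("2d", last_run, now))
```

With 25 hours elapsed, the 24h gate is open and the 2d gate is still closed.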

Plugins marked parallel: true can run concurrently using Task tool subagents. Sequential plugins run one at a time in directory order.

Skip this step if ~/gt/plugins/ does not exist or is empty."""

[[steps]]
id = "orphan-check"
title = "Find abandoned work"
needs = ["health-scan"]
description = """
Find abandoned work.

Scan for orphaned state:
- Issues marked in_progress with no active polecat
- Polecats that stopped responding mid-work
- Merge queue entries with no polecat owner
- Wisp sessions that outlived their spawner

```bash
bd list --status=in_progress
gt polecats --all --orphan
```

For each orphan:
- Check if polecat session still exists
- If not, mark issue for reassignment or retry
- File incident beads if data loss occurred"""

[[steps]]
id = "session-gc"
title = "Clean dead sessions"
needs = ["orphan-check"]
description = """
Clean dead sessions and orphaned state.

Run `gt doctor --fix` to handle all cleanup:

```bash
# Preview what needs cleaning
gt doctor -v

# Fix everything
gt doctor --fix
```

This handles:
- **orphan-sessions**: Kill orphaned tmux sessions (gt-* not matching valid patterns)
- **orphan-processes**: Kill orphaned Claude processes (no tmux parent)
- **wisp-gc**: Garbage collect abandoned wisps (>1h old)

All cleanup is handled by doctor checks - no need to run separate commands."""

[[steps]]
id = "context-check"
title = "Check own context limit"
needs = ["session-gc"]
description = """
Check own context limit.

The Deacon runs in a Claude session with finite context. Check if approaching the limit:

```bash
gt context --usage
```

If context is high (>80%), prepare for handoff:
- Summarize current state
- Note any pending work
- Write handoff to molecule state

This enables the Deacon to burn and respawn cleanly."""

[[steps]]
id = "loop-or-exit"
title = "Burn and respawn or loop"
needs = ["context-check"]
description = """
Loop back to inbox-check if context is low, or exit and let the daemon respawn a fresh Deacon if context is high.

Decision point at end of patrol cycle:

If context is LOW:
- Sleep briefly (avoid tight loop)
- Return to inbox-check step

If context is HIGH:
- Write state to persistent storage
- Exit cleanly
- Let the daemon orchestrator respawn a fresh Deacon

The daemon ensures Deacon is always running:
```bash
# Daemon respawns on exit
gt daemon status
```

This enables infinite patrol duration via context-aware respawning."""