# gastown/.beads/formulas/mol-witness-patrol.formula.toml

description = """
Per-rig worker monitor patrol loop.

The Witness is the Pit Boss for your rig. You watch polecats, nudge them toward
completion, verify clean git state before kills, and escalate stuck workers.

**You do NOT do implementation work.** Your job is oversight, not coding.

## Design Philosophy

This patrol follows Gas Town principles:
- **Discovery over tracking**: Observe reality each cycle, don't maintain state
- **Events over state**: POLECAT_DONE mail triggers cleanup wisps
- **Cleanup wisps as finalizers**: Pending cleanups are wisps, not queue entries
- **Task tool for parallelism**: Subagents inspect polecats, not molecule arms

## Patrol Shape (Linear, Deacon-style)

```
inbox-check ─► process-cleanups ─► check-refinery ─► survey-workers
                                                            │
         ┌──────────────────────────────────────────────────┘
         ▼
  context-check ─► loop-or-exit
```

No dynamic arms. No fanout gates. No persistent nudge counters.
State is discovered each cycle from reality (tmux, beads, mail)."""
formula = "mol-witness-patrol"
version = 1

[[steps]]
id = "inbox-check"
title = "Process witness mail"
description = """
Check inbox and handle messages.

```bash
gt mail inbox
```

For each message:

**POLECAT_DONE / LIFECYCLE:Shutdown**:
Create a cleanup wisp for this polecat:
```bash
bd create --wisp --title "cleanup:<name>" \\
  --description "Verify and clean up polecat <name>" \\
  --labels cleanup,polecat:<name>
```
The wisp's existence IS the pending cleanup. Process in next step.
Mark mail as read.

**HELP / Blocked**:
Assess the request. Can you help? If not, escalate to Mayor:
```bash
gt mail send mayor/ -s "Escalation: <polecat> needs help" -m "<details>"
```

**HANDOFF**:
Read predecessor context. Continue from where they left off."""

[[steps]]
id = "process-cleanups"
title = "Process pending cleanup wisps"
needs = ["inbox-check"]
description = """
Find and process cleanup wisps (the finalizer pattern).

```bash
# Find all cleanup wisps
bd list --wisp --labels=cleanup --status=open
```

For each cleanup wisp:

1. **Extract polecat name** from wisp title/labels

2. **Pre-kill verification**:
```bash
cd polecats/<name>
git status                    # Must be clean
git log origin/main..HEAD     # No unpushed commits
bd show <assigned-issue>      # Issue closed or deferred
```

3. **Verify productive work** (ZFC - you make the call):
   - Check git log for commits mentioning the issue
   - Legitimate exceptions: already fixed, duplicate, deferred
   - If closing as 'done' with no commits, flag for review

4. **If clean**: Execute cleanup
```bash
gt session kill <rig>/polecats/<name>
# Worktree removal handled by session kill
```
Then burn the cleanup wisp:
```bash
bd close <wisp-id>   # or bd burn <wisp-id>
```

5. **If dirty**: Leave wisp open, log the issue, retry next cycle.

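The git part of the pre-kill verification in item 2 can be sketched as a small pure function. This is a minimal sketch, not part of `gt` or `bd`: `safe_to_kill` is a hypothetical helper name, and it assumes you pass it the captured output of `git status --porcelain` and `git rev-list --count origin/main..HEAD` (both standard git commands) from inside the polecat worktree. The issue check via `bd show` and the productive-work review remain your call.

```bash
# Hypothetical helper: decides whether the git state allows a kill.
#   $1 = output of: git status --porcelain   (empty means clean tree)
#   $2 = output of: git rev-list --count origin/main..HEAD   (0 means nothing unpushed)
safe_to_kill() {
  local porcelain="$1" unpushed="$2"
  if [ -n "$porcelain" ]; then
    echo "dirty: uncommitted changes"
    return 1
  fi
  if [ "$unpushed" != "0" ]; then
    echo "dirty: $unpushed unpushed commit(s)"
    return 1
  fi
  echo "clean"
}

# Usage, run from polecats/<name>:
#   safe_to_kill "$(git status --porcelain)" "$(git rev-list --count origin/main..HEAD)"
```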
**Parallelism**: Use Task tool subagents to process multiple cleanups concurrently.
Each cleanup is independent - perfect for parallel execution."""

[[steps]]
id = "check-refinery"
title = "Ensure refinery is alive"
needs = ["process-cleanups"]
description = """
Ensure the refinery is alive and processing merge requests.

```bash
# Check if refinery session exists
gt session status <rig>/refinery

# Check for pending merge requests
bd list --type=merge-request --status=open
```

If MRs are waiting AND the refinery is not running:
```bash
gt session start <rig>/refinery
gt mail send <rig>/refinery -s "PATROL: Wake up" \\
  -m "Merge requests in queue. Please process."
```

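The wake-up rule above and the staleness rule that follows can be combined into one decision, sketched here under stated assumptions. `refinery_action` is a hypothetical helper, not a `gt` command. Its three inputs are assumed to be derived by you from `gt session status` and `bd list`, whose output formats are not specified here, and measuring staleness as minutes since the queue last made progress is also an assumption.

```bash
# Hypothetical helper: maps observed refinery facts to a patrol action.
#   $1 = "yes" if the refinery session is running, otherwise "no"
#   $2 = number of open merge requests
#   $3 = minutes since the queue last made progress (assumed measure)
refinery_action() {
  local running="$1" pending="$2" stale_min="$3"
  if [ "$pending" -eq 0 ]; then
    echo "none"            # empty queue: nothing to do
  elif [ "$running" != "yes" ]; then
    echo "start-and-wake"  # MRs waiting, refinery not running
  elif [ "$stale_min" -gt 30 ]; then
    echo "nudge"           # running, but queue stale for >30 min
  else
    echo "none"
  fi
}
```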
If the refinery is running but the queue has been stale for more than 30 minutes, send a nudge."""

[[steps]]
id = "survey-workers"
title = "Inspect all active polecats"
needs = ["check-refinery"]
description = """
Survey all polecats using agent beads (ZFC: trust what agents report).

**Step 1: List polecat agent beads**

```bash
bd list --type=agent --json
```

Filter the JSON output for entries where description contains `role_type: polecat`.
Each polecat agent bead has fields in its description:
- `role_type: polecat`
- `rig: <rig-name>`
- `agent_state: running|idle|stuck|done`
- `hook_bead: <current-work-id>`

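Once you have a bead's description text, pulling out one field might look like the sketch below. It assumes each field sits on its own line as plain `key: value` starting at the beginning of the line; `bead_field` is a hypothetical helper, and the sample values are made up for illustration.

```bash
# Hypothetical helper: print the value of one "key: value" field.
#   $1 = field name; the bead description is read from stdin
bead_field() {
  sed -n "s/^$1:[[:space:]]*//p" | head -n 1
}

# Example with an inline description (values are made up):
desc="role_type: polecat
rig: demo-rig
agent_state: stuck
hook_bead: demo-123"

echo "$desc" | bead_field agent_state   # prints: stuck
```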
**Step 2: For each polecat, check agent_state**

| agent_state | Meaning | Action |
|-------------|---------|--------|
| running | Actively working | Check progress (Step 3) |
| idle | No work assigned | Skip (no action needed) |
| stuck | Self-reported stuck | Handle stuck protocol |
| done | Work complete | Verify cleanup triggered |

**Step 3: For running polecats, assess progress**

Check the hook_bead field to see what they're working on:
```bash
bd show <hook_bead>  # See current step/issue
```

You can also verify they're responsive:
```bash
tmux capture-pane -t gt-<rig>-<name> -p | tail -20
```

Look for:
- Recent tool activity → making progress
- Idle at prompt → may need nudge
- Error messages → may need help

**Step 4: Decide action**

| Observation | Action |
|-------------|--------|
| agent_state=running, recent activity | None |
| agent_state=running, idle 5-15 min | Gentle nudge |
| agent_state=running, idle >15 min | Direct nudge with deadline |
| agent_state=stuck | Assess and help or escalate |
| agent_state=done, cleanup pending | Verify cleanup wisp exists |

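The decision table above can be sketched as a single function. This is illustrative only: `decide_action` and its action labels are hypothetical names. It assumes idle time is given in whole minutes, that under 5 minutes counts as recent activity, and that exactly 15 minutes still gets the gentle nudge.

```bash
# Hypothetical helper: maps agent_state plus idle time to a patrol action.
#   $1 = agent_state from the bead (running|idle|stuck|done)
#   $2 = whole minutes since last observed activity (only used when running)
decide_action() {
  local state="$1" idle_min="${2:-0}"
  case "$state" in
    running)
      if [ "$idle_min" -lt 5 ]; then
        echo "none"              # recent activity (assumed: under 5 min)
      elif [ "$idle_min" -le 15 ]; then
        echo "gentle-nudge"      # idle 5-15 min
      else
        echo "direct-nudge"      # idle more than 15 min
      fi ;;
    stuck)  echo "assess-or-escalate" ;;
    "done") echo "verify-cleanup-wisp" ;;
    idle)   echo "skip" ;;
    *)      echo "unknown-state" ;;
  esac
}
```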
**Step 5: Execute nudges**
```bash
gt nudge <rig>/polecats/<name> "How's progress? Need help?"
```

**Step 6: Escalate if needed**
```bash
gt mail send mayor/ -s "Escalation: <polecat> stuck" \\
  -m "Polecat <name> reports stuck. Please intervene."
```

**Parallelism**: Use Task tool subagents to inspect multiple polecats concurrently.

**ZFC Principle**: Trust agent_state from beads. Use tmux output only to gauge responsiveness; don't infer state from PID/tmux."""

[[steps]]
id = "context-check"
title = "Check own context limit"
needs = ["survey-workers"]
description = """
Check own context usage.

If context is HIGH (>80%):
- Ensure any notes are written to handoff mail
- Prepare for session restart

If context is LOW (80% or less):
- Continue patrolling"""

[[steps]]
id = "loop-or-exit"
title = "Loop or exit for respawn"
needs = ["context-check"]
description = """
End of patrol cycle decision.

**If context LOW**:
- Sleep briefly to avoid tight loop (30-60 seconds)
- Return to inbox-check step
- Continue patrolling

**If context HIGH**:
- Write handoff mail to self with any notable observations:
```bash
gt handoff -s "Witness patrol handoff" -m "<observations>"
```
- Exit cleanly (daemon respawns fresh Witness)

The daemon ensures Witness is always running."""