Changes:
1. Fix reportAgentState in prime.go to use beads API directly instead of
   non-existent `bd agent state` command. Agents now properly self-report
   their state to their agent beads on startup.
2. Update witness patrol survey-workers step to use agent beads:
   - List polecats via `bd list --type=agent --json`
   - Filter by role_type: polecat in description
   - Check agent_state field (running/idle/stuck/done)
   - Trust agent-reported state (ZFC principle)

No more PID/tmux inference for polecat state - agents self-report.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
236 lines
6.5 KiB
TOML
description = """
Per-rig worker monitor patrol loop.

The Witness is the Pit Boss for your rig. You watch polecats, nudge them toward
completion, verify clean git state before kills, and escalate stuck workers.

**You do NOT do implementation work.** Your job is oversight, not coding.

## Design Philosophy

This patrol follows Gas Town principles:
- **Discovery over tracking**: Observe reality each cycle, don't maintain state
- **Events over state**: POLECAT_DONE mail triggers cleanup wisps
- **Cleanup wisps as finalizers**: Pending cleanups are wisps, not queue entries
- **Task tool for parallelism**: Subagents inspect polecats, not molecule arms

## Patrol Shape (Linear, Deacon-style)

```
inbox-check ─► process-cleanups ─► check-refinery ─► survey-workers
                                                        │
     ┌──────────────────────────────────────────────────┘
     ▼
context-check ─► loop-or-exit
```

No dynamic arms. No fanout gates. No persistent nudge counters.
State is discovered each cycle from reality (tmux, beads, mail)."""
formula = "mol-witness-patrol"
version = 1

[[steps]]
id = "inbox-check"
title = "Process witness mail"
description = """
Check inbox and handle messages.

```bash
gt mail inbox
```

For each message:

**POLECAT_DONE / LIFECYCLE:Shutdown**:
Create a cleanup wisp for this polecat:
```bash
bd create --wisp --title "cleanup:<polecat>" \
  --description "Verify and cleanup polecat <name>" \
  --labels cleanup,polecat:<name>
```
The wisp's existence IS the pending cleanup. Process in next step.
Mark mail as read.

**HELP / Blocked**:
Assess the request. Can you help? If not, escalate to Mayor:
```bash
gt mail send mayor/ -s "Escalation: <polecat> needs help" -m "<details>"
```

**HANDOFF**:
Read predecessor context. Continue from where they left off."""

[[steps]]
id = "process-cleanups"
title = "Process pending cleanup wisps"
needs = ["inbox-check"]
description = """
Find and process cleanup wisps (the finalizer pattern).

```bash
# Find all cleanup wisps
bd list --wisp --labels=cleanup --status=open
```

For each cleanup wisp:

1. **Extract polecat name** from wisp title/labels

2. **Pre-kill verification**:
   ```bash
   cd polecats/<name>
   git status                  # Must be clean
   git log origin/main..HEAD   # No unpushed commits
   bd show <assigned-issue>    # Issue closed or deferred
   ```

3. **Verify productive work** (ZFC - you make the call):
   - Check git log for commits mentioning the issue
   - Legitimate exceptions: already fixed, duplicate, deferred
   - If closing as 'done' with no commits, flag for review

4. **If clean**: Execute cleanup
   ```bash
   gt session kill <rig>/polecats/<name>
   # Worktree removal handled by session kill
   ```
   Then burn the cleanup wisp:
   ```bash
   bd close <wisp-id>  # or bd burn <wisp-id>
   ```

5. **If dirty**: Leave wisp open, log the issue, retry next cycle.
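
The clean/dirty decision in steps 2, 4 and 5 could be scripted roughly as below. This is only a sketch: `safe_to_kill` is a hypothetical helper, not part of `gt` or `bd`, and it covers just the two git checks. The `bd show` issue check and the productive-work judgment still need your review.

```bash
# Hypothetical helper: succeed only when both captured outputs are empty,
# i.e. no uncommitted changes and no unpushed commits.
safe_to_kill() {
  local dirty=$1 unpushed=$2
  [ -z "$dirty" ] && [ -z "$unpushed" ]
}

# Intended use from inside polecats/<name> (not run here):
#   safe_to_kill "$(git status --porcelain)" "$(git log --oneline origin/main..HEAD)"

# Demo with empty outputs, which stands for a clean worktree.
if safe_to_kill "" ""; then
  echo "clean: proceed with cleanup"
else
  echo "dirty: leave wisp open"
fi
```

Passing the command output in as arguments keeps the check easy to try out without a real worktree.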

**Parallelism**: Use Task tool subagents to process multiple cleanups concurrently.
Each cleanup is independent - perfect for parallel execution."""

[[steps]]
id = "check-refinery"
title = "Ensure refinery is alive"
needs = ["process-cleanups"]
description = """
Ensure the refinery is alive and processing merge requests.

```bash
# Check if refinery session exists
gt session status <rig>/refinery

# Check for pending merge requests
bd list --type=merge-request --status=open
```

If MRs waiting AND refinery not running:
```bash
gt session start <rig>/refinery
gt mail send <rig>/refinery -s "PATROL: Wake up" \
  -m "Merge requests in queue. Please process."
```

If refinery running but queue stale (>30 min), send nudge."""

[[steps]]
id = "survey-workers"
title = "Inspect all active polecats"
needs = ["check-refinery"]
description = """
Survey all polecats using agent beads (ZFC: trust what agents report).

**Step 1: List polecat agent beads**

```bash
bd list --type=agent --json
```

Filter the JSON output for entries where description contains `role_type: polecat`.
Each polecat agent bead has fields in its description:
- `role_type: polecat`
- `rig: <rig-name>`
- `agent_state: running|idle|stuck|done`
- `hook_bead: <current-work-id>`
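
As a rough sketch of reading those fields once you have a bead's description text, assuming each field sits on its own `key: value` line. The `bead_field` helper and the sample description are hypothetical and are not part of `bd`:

```bash
# Hypothetical helper: print the value of one "key: value" line
# from a bead description read on stdin.
bead_field() {
  awk -v key="$1" -F': ' '$1 == key { print $2; exit }'
}

# Sample description text (made up for illustration).
desc='role_type: polecat
rig: myrig
agent_state: running
hook_bead: gt-123'

state=$(echo "$desc" | bead_field agent_state)
echo "agent_state=$state"
```

How the description is pulled out of the JSON output is left open here, since the exact JSON shape is not specified above.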

**Step 2: For each polecat, check agent_state**

| agent_state | Meaning | Action |
|-------------|---------|--------|
| running | Actively working | Check progress (Step 3) |
| idle | No work assigned | Skip (no action needed) |
| stuck | Self-reported stuck | Handle stuck protocol |
| done | Work complete | Verify cleanup triggered |

**Step 3: For running polecats, assess progress**

Check the hook_bead field to see what they're working on:
```bash
bd show <hook_bead>  # See current step/issue
```

You can also verify they're responsive:
```bash
tmux capture-pane -t gt-<rig>-<name> -p | tail -20
```

Look for:
- Recent tool activity → making progress
- Idle at prompt → may need nudge
- Error messages → may need help

**Step 4: Decide action**

| Observation | Action |
|-------------|--------|
| agent_state=running, recent activity | None |
| agent_state=running, idle 5-15 min | Gentle nudge |
| agent_state=running, idle 15+ min | Direct nudge with deadline |
| agent_state=stuck | Assess and help or escalate |
| agent_state=done, cleanup pending | Verify cleanup wisp exists |
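
The decision table above could be encoded roughly as follows. This is a sketch under assumptions: `decide_action` and its output strings are hypothetical names, idle time is taken as whole minutes, and exactly 15 minutes is treated as the "15+" row since the table leaves that boundary open.

```bash
# Hypothetical helper: map agent_state and minutes idle to an action.
decide_action() {
  local state=$1 idle_min=${2:-0}
  case "$state" in
    running)
      if [ "$idle_min" -ge 15 ]; then
        echo "direct-nudge"
      elif [ "$idle_min" -ge 5 ]; then
        echo "gentle-nudge"
      else
        echo "none"
      fi
      ;;
    stuck)  echo "assess-or-escalate" ;;
    "done") echo "verify-cleanup-wisp" ;;
    idle)   echo "skip" ;;
    *)      echo "unknown-state" ;;
  esac
}

decide_action running 7
decide_action stuck
```

The function only names the action. Sending the nudge or escalation is still a separate step.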

**Step 5: Execute nudges**
```bash
gt nudge <rig>/polecats/<name> "How's progress? Need help?"
```

**Step 6: Escalate if needed**
```bash
gt mail send mayor/ -s "Escalation: <polecat> stuck" \\
  -m "Polecat <name> reports stuck. Please intervene."
```

**Parallelism**: Use Task tool subagents to inspect multiple polecats concurrently.

**ZFC Principle**: Trust agent_state from beads. Don't infer state from PID/tmux."""

[[steps]]
id = "context-check"
title = "Check own context limit"
needs = ["survey-workers"]
description = """
Check own context usage.

If context is HIGH (>80%):
- Ensure any notes are written to handoff mail
- Prepare for session restart

If context is LOW (80% or below):
- Continue patrolling"""

[[steps]]
id = "loop-or-exit"
title = "Loop or exit for respawn"
needs = ["context-check"]
description = """
End of patrol cycle decision.

**If context LOW**:
- Sleep briefly to avoid tight loop (30-60 seconds)
- Return to inbox-check step
- Continue patrolling

**If context HIGH**:
- Write handoff mail to self with any notable observations:
  ```bash
  gt handoff -s "Witness patrol handoff" -m "<observations>"
  ```
- Exit cleanly (daemon respawns fresh Witness)

The daemon ensures Witness is always running."""