Simplify Witness patrol: linear + Task tool, no Christmas Ornament (gt-p3v5n)

Design pivot:
- Remove mol-polecat-arm and dynamic bonding pattern
- Replace with linear patrol (Deacon-style) + Task tool parallelism
- Cleanup wisps as finalizers (marker wisp = pending cleanup)
- Discovery over tracking (no persistent nudge counts)

New docs:
- polecat-lifecycle.md: step-based restart model, evolution path
- witness-patrol-design.md: simplified, terse

Closed obsolete issues: gt-p3v5n.1 through gt-p3v5n.4

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Steve Yegge
2025-12-26 14:16:41 -08:00
parent 1df41e23c7
commit 0e90fca49f
7 changed files with 462 additions and 740 deletions

docs/polecat-lifecycle.md Normal file

@@ -0,0 +1,130 @@
# Polecat Lifecycle
> Polecats restart after each molecule step. This is intentional.
## Execution Model
| Phase | What Happens |
|-------|--------------|
| **Spawn** | Worktree created, session started, molecule slung to hook |
| **Step** | Polecat reads hook, executes ONE step, runs `gt mol step done` |
| **Restart** | Session respawns with fresh context, next step on hook |
| **Complete** | Last step done → POLECAT_DONE mail → cleanup wisp created |
| **Cleanup** | Witness verifies git clean, kills session, burns wisp |
```
spawn → step → restart → step → restart → ... → complete → cleanup
        └──────────────────────────────────────┘
               (fresh session each step)
```
## Why Restart Every Step?
| Reason | Explanation |
|--------|-------------|
| **Atomicity** | Each step completes fully or not at all |
| **No wandering** | Polecat can't half-finish and get distracted |
| **Context fresh** | No accumulation of stale context across steps |
| **Crash recovery** | Restart = re-read hook = continue from last completed step |
**Trade-off**: Session restart overhead. Worth it for reliability at current cognition levels.
## Step Packing (Author Responsibility)
Formula authors must size steps appropriately:
| Too Small | Too Large |
|-----------|-----------|
| Restart overhead dominates | Context exhaustion mid-step |
| Thrashing | Partial completion, unreliable |
**Rule of thumb**: A step should use 30-70% of available context. Batch related micro-tasks.
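As a rough sketch, the rule of thumb can be written as a small shell helper. The function name `step_size_verdict` and the idea of feeding it an author-estimated context percentage are assumptions for illustration only; nothing here is an existing `gt` command.

```bash
# Hypothetical helper (not part of gt): classify a step by the share of
# available context its author estimates it will consume.
# Arg 1: integer percentage, 0-100.
step_size_verdict() {
  local pct=$1
  if [ "$pct" -lt 30 ]; then
    echo "too-small"   # restart overhead dominates; batch with neighbors
  elif [ "$pct" -le 70 ]; then
    echo "ok"          # inside the 30-70% band
  else
    echo "too-large"   # risks context exhaustion mid-step; split it
  fi
}

step_size_verdict 45   # prints "ok"
```

The estimate itself remains a judgment call by the formula author; the helper only encodes the band.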
## The `gt mol step done` Command
Canonical way to complete a step:
```bash
gt mol step done <step-id>
```
1. Closes the step in beads
2. Finds next ready step (dependency-aware)
3. Updates hook to next step
4. Respawns pane with fresh session
**Never use `bd close` directly** - it skips the restart logic.
## Cleanup: The Finalizer Pattern
When polecat signals completion:
```
POLECAT_DONE mail → Witness creates cleanup wisp → Witness processes wisp → Burn
```
The wisp's existence IS the pending cleanup. No explicit queue.
| Cleanup Step | Verification |
|--------------|--------------|
| Git status | Must be clean |
| Unpushed commits | None allowed |
| Issue state | Closed or deferred |
| Productive work | Commits reference issue (ZFC - Witness judges) |
Failed cleanup? Leave wisp, retry next cycle.
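The mechanical part of the verification table could look roughly like the sketch below. `cleanup_verdict` and its argument order are hypothetical, and the "productive work" row is left out because it is a Witness judgment rather than a mechanical check.

```bash
# Hypothetical helper (not part of gt): decide what to do with a cleanup wisp.
# Args: dirty-file count, unpushed-commit count, issue state.
# The counts might be gathered in the polecat's worktree with, for example:
#   git status --porcelain | wc -l
#   git log --oneline origin/main.. | wc -l
cleanup_verdict() {
  local dirty=$1 unpushed=$2 issue_state=$3
  if [ "$dirty" -ne 0 ] || [ "$unpushed" -ne 0 ]; then
    echo "retry"       # leave the wisp; try again next cycle
    return
  fi
  case "$issue_state" in
    closed|deferred) echo "burn" ;;    # safe to kill session and burn wisp
    *)               echo "retry" ;;
  esac
}

cleanup_verdict 0 0 closed   # prints "burn"
```

Because a failed check simply leaves the wisp in place, the helper needs no retry counter of its own.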
---
## Evolution Path
Current design will evolve as model cognition improves:
| Phase | Refresh Trigger | Who Decides | Witness Load |
|-------|-----------------|-------------|--------------|
| **Now** | Step boundary | Formula (fixed) | High |
| **Spoon-feeding** | Context % + task size | Witness | Medium |
| **Self-managed** | Self-awareness | Polecat | Low |
### Now (Step-Based Restart)
- Restart every step, guaranteed
- Conservative, reliable
- `gt mol step done` handles everything
### Spoon-feeding (Future)
Requires: Claude Code exposes context usage
```
Polecat completes step
→ Witness checks: 65% context used
→ Next task estimate: 10% context
→ Decision: "send another" or "recycle"
```
Witness becomes supervisor, not babysitter.
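A minimal sketch of that decision, assuming context usage is available as an integer percentage. The function name `next_action` and the default 80% ceiling are assumptions for illustration, not values taken from `gt` or Claude Code.

```bash
# Hypothetical helper: choose between sending another task and recycling.
# Args: context used (%), next task estimate (%), optional ceiling (%).
# The default ceiling of 80 is an assumed threshold.
next_action() {
  local used=$1 estimate=$2 ceiling=${3:-80}
  if [ $((used + estimate)) -le "$ceiling" ]; then
    echo "send another"
  else
    echo "recycle"
  fi
}

next_action 65 10   # 65 + 10 = 75, under the assumed ceiling: "send another"
```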
### Self-Managed (Future)
Requires: Model cognition threshold + Gas Town patterns in training
```
Polecat completes step
→ Self-assesses: "I'm at 80%, should recycle"
→ Runs gt handoff, respawns
```
Polecats become autonomous. Witness becomes auditor.
---
## Key Commands
| Command | Effect |
|---------|--------|
| `gt mol step done <step>` | Complete step, restart for next |
| `gt mol status` | Show what's on hook |
| `gt mol progress <mol>` | Show molecule completion state |
| `gt done` | Signal POLECAT_DONE to Witness |
| `gt handoff` | Write notes, respawn (manual refresh) |


@@ -1,147 +1,94 @@
# Witness Patrol: Theory of Operation
# Witness Patrol Design
## Overview
> The Witness is the Pit Boss. Oversight, not implementation.
The Witness is the per-rig worker monitor. It watches polecats, nudges them toward
completion, verifies clean state before cleanup, and escalates stuck workers.
## Core Responsibilities
**Key principle: Claude-driven execution.** The mol-witness-patrol molecule is a
playbook that Claude reads and executes. There is no Go "runtime" that auto-executes
steps. Claude provides the intelligence; gt/bd commands provide the primitives.
| Duty | Action |
|------|--------|
| Handle POLECAT_DONE | Create cleanup wisp, process later |
| Handle HELP requests | Assess, help or escalate to Mayor |
| Ensure refinery alive | Restart if needed |
| Survey workers | Detect stuck polecats, nudge or escalate |
| Process cleanups | Verify git clean, kill session, burn wisp |
## Architecture
## Patrol Shape (Linear)
```
┌─────────────────────────────────────────────────────────────────┐
│ THE AGENT (Claude) │
│ │
│ Reads molecule steps → Executes commands → Closes atoms │
│ Uses TodoWrite for complex atoms (optional) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ PRIMITIVES (gt, bd CLI) │
│ │
│ gt mail, gt nudge, gt session, bd close, bd show, etc. │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ COORDINATION (Mail) │
│ │
│ Polecats → POLECAT_DONE → Witness inbox │
│ Witness → "You're stuck" → Polecat (via gt nudge) │
│ Witness → Escalation → Mayor inbox │
└─────────────────────────────────────────────────────────────────┘
inbox-check → process-cleanups → check-refinery → survey-workers → context-check → loop
```
## The Patrol Cycle
No dynamic arms. No fanout gates. Simple loop like Deacon.
Each patrol cycle follows the mol-witness-patrol molecule:
## Key Design Principles
### 1. inbox-check
Check mail for lifecycle events:
- **POLECAT_DONE**: Polecat finished work, ready for cleanup
- **Help requests**: Polecat asking for assistance
- **Escalations**: Issues requiring attention
| Principle | Meaning |
|-----------|---------|
| **Discovery over tracking** | Observe reality each cycle, don't maintain state |
| **Events over state** | POLECAT_DONE triggers wisps, not queue updates |
| **Cleanup wisps as finalizers** | Pending cleanup = wisp exists |
| **Task tool for parallelism** | Subagents inspect polecats, not molecule arms |
| **Fresh judgment each cycle** | No persistent nudge counters |
```bash
gt mail inbox
gt mail read <id>
## Cleanup: The Finalizer Pattern
```
POLECAT_DONE arrives
Create wisp: bd create --wisp --title "cleanup:<polecat>" --labels cleanup
(wisp exists = cleanup pending)
Witness process-cleanups step:
- Verify: git status clean, no unpushed, issue closed
- Execute: gt session kill, worktree removed
- Burn wisp
Failed? Leave wisp, retry next cycle
```
### 2. survey-workers
For each polecat in the rig:
## Assessing Stuck Polecats
```bash
gt polecat list <rig>
With step-based restarts, polecats are either:
- **Working a step**: Active tool calls, progress
- **Starting a step**: Just respawned, reading hook
- **Stuck on a step**: No progress, same step for multiple cycles
| Observation | Action |
|-------------|--------|
| Active tool calls | None |
| Just started step (<5 min) | None |
| Idle 5-15 min, same step | Gentle nudge |
| Idle 15+ min, same step | Direct nudge |
| Idle 30+ min despite nudges | Escalate to Mayor |
| Errors visible | Assess, help or escalate |
| Says "done" but no POLECAT_DONE | Nudge to signal completion |
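The time-based rows of the table above can be sketched as a pure function of what the Witness observes in a single cycle. `survey_action` and its three inputs are hypothetical. The "errors visible" and "says done" rows are omitted because they depend on reading the pane output.

```bash
# Hypothetical helper (not part of gt): map one cycle's observations to an action.
# Args: active (1 if tool calls are visible, else 0),
#       idle minutes on the same step,
#       nudged (1 if earlier nudges are visible this cycle, else 0).
survey_action() {
  local active=$1 idle=$2 nudged=$3
  if [ "$active" -eq 1 ] || [ "$idle" -lt 5 ]; then
    echo "none"
  elif [ "$idle" -lt 15 ]; then
    echo "gentle-nudge"
  elif [ "$idle" -ge 30 ] && [ "$nudged" -eq 1 ]; then
    echo "escalate"      # idle 30+ min despite nudges
  else
    echo "direct-nudge"  # idle 15+ min, not yet escalation-worthy
  fi
}

survey_action 0 20 0   # prints "direct-nudge"
```

All three inputs are re-derived each cycle, so the function carries no state between patrols.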
**No persistent nudge counts**. Each cycle: observe reality, make fresh judgment.
"How long stuck on same step" is discoverable from beads timestamps.
## Parallelism via Task Tool
Inspect multiple polecats concurrently using subagents:
```markdown
## survey-workers step
For each polecat, launch Task tool subagent:
- Capture tmux output
- Assess state (working/idle/error/done)
- Check beads for step progress
- Decide and execute action
Task tool handles parallelism. One subagent per polecat.
```
For each polecat:
1. **Capture**: `tmux capture-pane -t gt-<rig>-<name> -p | tail -50`
2. **Assess**: Claude reads output, determines state (working/idle/error/done)
3. **Load history**: Read nudge count from handoff bead
4. **Decide**: Apply escalation matrix (see below)
5. **Execute**: Take action (none, nudge, escalate, cleanup)
## Formula
### 3. save-state
Persist state to handoff bead for next cycle:
- Nudge counts per polecat
- Last nudge timestamps
- Pending actions
See `.beads/formulas/mol-witness-patrol.formula.toml`
### 4. burn-or-loop
- If context low: sleep briefly, loop back to inbox-check
- If context high: exit (daemon respawns fresh Witness)
## Related
## Nudge Escalation Matrix
The Witness applies escalating pressure to idle polecats:
| Idle Time | Nudge Count | Action |
|-----------|-------------|--------|
| <10min | any | none |
| 10-15min | 0 | Gentle: "How's progress?" |
| 15-20min | 1 | Direct: "Please wrap up. What's blocking?" |
| 20+min | 2 | Final: "Will escalate in 5min if no response." |
| any | 3 | Escalate to Mayor |
**Key insight**: Only Claude can assess whether a polecat is truly stuck.
Looking at tmux output requires understanding context:
- "I'm stuck on this error" → needs help
- "Running tests..." → actively working
- Sitting at prompt with no activity → maybe stuck
## State Persistence
The Witness handoff bead tracks:
```yaml
# In handoff bead description
nudges:
toast:
count: 2
last: "2025-12-24T10:30:00Z"
ace:
count: 0
last: null
pending_cleanup:
- nux # received POLECAT_DONE, queued for verification
```
This survives across patrol cycles and context burns.
## Polecat Cleanup Flow
When a polecat signals completion:
1. Polecat runs `gt done` or sends POLECAT_DONE mail
2. Witness receives mail in inbox-check
3. Witness runs pre-kill verification:
```bash
cd polecats/<name>
git status # Must be clean
git log origin/main.. # Check for unpushed
bd show <issue> # Verify closed
```
4. If clean: kill session, remove worktree, delete branch
5. If dirty: send nudge asking polecat to fix state
## What We DON'T Need
- **Go patrol runtime**: Claude executes the playbook
- **Polling for WaitsFor**: Mail tells us when things are ready
- **Automated health checks**: Claude reads tmux, assesses
- **Go nudge logic**: Claude applies the matrix
## What We DO Need
- **mol-witness-patrol**: The playbook (exists)
- **Handoff bead**: State persistence (gt-poxd)
- **CLI primitives**: gt mail, gt nudge, gt session (exist)
- **Molecule tracking**: bd close for step completion (exists)
## Related Issues
- gt-poxd: Create handoff beads for Witness and Refinery roles
- gt-y481: Patrol parity - Witness and Refinery match Deacon sophistication
- gt-tnow: Implement Christmas Ornament pattern for mol-witness-patrol
- [polecat-lifecycle.md](polecat-lifecycle.md) - Step-based execution model
- [molecular-chemistry.md](molecular-chemistry.md) - MEOW stack