Simplify Witness patrol: linear + Task tool, no Christmas Ornament (gt-p3v5n)
Design pivot:
- Remove mol-polecat-arm and dynamic bonding pattern
- Replace with linear patrol (Deacon-style) + Task tool parallelism
- Cleanup wisps as finalizers (marker wisp = pending cleanup)
- Discovery over tracking (no persistent nudge counts)

New docs:
- polecat-lifecycle.md: step-based restart model, evolution path
- witness-patrol-design.md: simplified, terse

Closed obsolete issues: gt-p3v5n.1 through gt-p3v5n.4

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
130
docs/polecat-lifecycle.md
Normal file
@@ -0,0 +1,130 @@
# Polecat Lifecycle

> Polecats restart after each molecule step. This is intentional.

## Execution Model

| Phase | What Happens |
|-------|--------------|
| **Spawn** | Worktree created, session started, molecule slung to hook |
| **Step** | Polecat reads hook, executes ONE step, runs `gt mol step done` |
| **Restart** | Session respawns with fresh context, next step on hook |
| **Complete** | Last step done → POLECAT_DONE mail → cleanup wisp created |
| **Cleanup** | Witness verifies git clean, kills session, burns wisp |

```
spawn → step → restart → step → restart → ... → complete → cleanup
        └──────────────────────────────────────┘
               (fresh session each step)
```

## Why Restart Every Step?

| Reason | Explanation |
|--------|-------------|
| **Atomicity** | Each step completes fully or not at all |
| **No wandering** | Polecat can't half-finish and get distracted |
| **Context fresh** | No accumulation of stale context across steps |
| **Crash recovery** | Restart = re-read hook = continue from last completed step |

**Trade-off**: Session restart overhead. Worth it for reliability at current cognition levels.

## Step Packing (Author Responsibility)

Formula authors must size steps appropriately:

| Too Small | Too Large |
|-----------|-----------|
| Restart overhead dominates | Context exhaustion mid-step |
| Thrashing | Partial completion, unreliable |

**Rule of thumb**: A step should use 30-70% of available context. Batch related micro-tasks.

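The rule of thumb can be written down as a small check. This is a minimal sketch, assuming the author already has a rough estimate of how much context a step consumes (how to obtain that estimate is out of scope here). `classify_step_size` is a hypothetical helper, not part of `gt`; treating both 30 and 70 as inside the acceptable band is also an assumption.

```bash
# Hypothetical helper: classify a step by its estimated share of the context
# window. Input is an integer percentage (0-100). Requires bash for (( )).
classify_step_size() {
  local pct="$1"
  if (( pct < 30 )); then
    echo "too-small"   # restart overhead dominates; batch with related micro-tasks
  elif (( pct > 70 )); then
    echo "too-large"   # risks context exhaustion mid-step; split the step
  else
    echo "ok"
  fi
}

classify_step_size 10   # prints: too-small
classify_step_size 50   # prints: ok
classify_step_size 85   # prints: too-large
```
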
## The `gt mol step done` Command

Canonical way to complete a step:

```bash
gt mol step done <step-id>
```

1. Closes the step in beads
2. Finds next ready step (dependency-aware)
3. Updates hook to next step
4. Respawns pane with fresh session

**Never use `bd close` directly** - it skips the restart logic.

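Of the four actions, the dependency-aware lookup is the least obvious. The sketch below illustrates the idea only; it is not how `gt mol step done` is implemented. Steps are modeled as hypothetical `id:dep1,dep2` strings (the colon is required even when there are no dependencies), and `next_ready_step` is an invented name.

```bash
# Hypothetical model of "find next ready step".
# $1: space-separated list of closed step ids; remaining args: "id:dep1,dep2".
# Prints the first step that is not closed and whose deps are all closed.
next_ready_step() {
  local closed=" $1 "; shift
  local spec id deps dep ready
  for spec in "$@"; do
    id="${spec%%:*}"
    deps="${spec#*:}"
    [[ "$closed" == *" $id "* ]] && continue   # already closed, skip
    ready=yes
    for dep in ${deps//,/ }; do
      [[ "$closed" == *" $dep "* ]] || ready=no   # an open dependency blocks the step
    done
    if [ "$ready" = yes ]; then
      echo "$id"
      return 0
    fi
  done
  return 1   # nothing ready: molecule is complete or blocked
}

next_ready_step "design" "design:" "build:design" "test:build"   # prints: build
```

A non-zero return with no output would correspond to the **Complete** phase (or to a blocked molecule), which is where the POLECAT_DONE path takes over.
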
## Cleanup: The Finalizer Pattern

When polecat signals completion:

```
POLECAT_DONE mail → Witness creates cleanup wisp → Witness processes wisp → Burn
```

The wisp's existence IS the pending cleanup. No explicit queue.

| Cleanup Step | Verification |
|--------------|--------------|
| Git status | Must be clean |
| Unpushed commits | None allowed |
| Issue state | Closed or deferred |
| Productive work | Commits reference issue (ZFC - Witness judges) |

Failed cleanup? Leave wisp, retry next cycle.

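The first two rows of the table are mechanical and can be checked with plain git. A minimal sketch, assuming the polecat's work is compared against `origin/main`; the helper name `worktree_is_clean` is hypothetical. The issue-state and productive-work rows are left out because they need beads data and Witness judgment.

```bash
# Sketch of the git-level checks only. $1 is the path to a polecat worktree.
# Returns 0 and prints "clean" if safe to proceed, otherwise returns 1.
worktree_is_clean() {
  local dir="$1"
  # --porcelain prints one line per changed or untracked file, nothing if clean.
  if [ -n "$(git -C "$dir" status --porcelain)" ]; then
    echo "dirty: uncommitted changes"
    return 1
  fi
  # Count commits reachable from HEAD but not from origin/main (assumed upstream).
  local ahead
  ahead="$(git -C "$dir" rev-list --count origin/main..HEAD)" || return 1
  if [ "$ahead" -ne 0 ]; then
    echo "dirty: $ahead unpushed commit(s)"
    return 1
  fi
  echo "clean"
}
```

A failing check maps onto the rule above: leave the wisp in place and look again next cycle.
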
---

## Evolution Path

Current design will evolve as model cognition improves:

| Phase | Refresh Trigger | Who Decides | Witness Load |
|-------|-----------------|-------------|--------------|
| **Now** | Step boundary | Formula (fixed) | High |
| **Spoon-feeding** | Context % + task size | Witness | Medium |
| **Self-managed** | Self-awareness | Polecat | Low |

### Now (Step-Based Restart)

- Restart every step, guaranteed
- Conservative, reliable
- `gt mol step done` handles everything

### Spoon-feeding (Future)

Requires: Claude Code exposes context usage

```
Polecat completes step
  → Witness checks: 65% context used
  → Next task estimate: 10% context
  → Decision: "send another" or "recycle"
```

Witness becomes supervisor, not babysitter.

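The decision in that flow reduces to a budget comparison. A minimal sketch, assuming context usage is available as an integer percentage and that the Witness recycles once projected usage would exceed a ceiling; the 80% default and the name `next_action` are assumptions, not values defined by this design.

```bash
# Hypothetical spoon-feeding decision.
# $1: context already used (%), $2: estimate for next task (%),
# $3: optional ceiling (%), assumed default 80.
next_action() {
  local used_pct="$1" next_pct="$2" limit_pct="${3:-80}"
  if (( used_pct + next_pct <= limit_pct )); then
    echo "send another"
  else
    echo "recycle"
  fi
}

next_action 65 10   # prints: send another
next_action 65 30   # prints: recycle
```
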
### Self-Managed (Future)

Requires: Model cognition threshold + Gas Town patterns in training

```
Polecat completes step
  → Self-assesses: "I'm at 80%, should recycle"
  → Runs gt handoff, respawns
```

Polecats become autonomous. Witness becomes auditor.

---

## Key Commands

| Command | Effect |
|---------|--------|
| `gt mol step done <step>` | Complete step, restart for next |
| `gt mol status` | Show what's on hook |
| `gt mol progress <mol>` | Show molecule completion state |
| `gt done` | Signal POLECAT_DONE to Witness |
| `gt handoff` | Write notes, respawn (manual refresh) |

@@ -1,147 +1,94 @@
-# Witness Patrol: Theory of Operation
+# Witness Patrol Design

-## Overview
+> The Witness is the Pit Boss. Oversight, not implementation.

-The Witness is the per-rig worker monitor. It watches polecats, nudges them toward
-completion, verifies clean state before cleanup, and escalates stuck workers.
+## Core Responsibilities

-**Key principle: Claude-driven execution.** The mol-witness-patrol molecule is a
-playbook that Claude reads and executes. There is no Go "runtime" that auto-executes
-steps. Claude provides the intelligence; gt/bd commands provide the primitives.
+| Duty | Action |
+|------|--------|
+| Handle POLECAT_DONE | Create cleanup wisp, process later |
+| Handle HELP requests | Assess, help or escalate to Mayor |
+| Ensure refinery alive | Restart if needed |
+| Survey workers | Detect stuck polecats, nudge or escalate |
+| Process cleanups | Verify git clean, kill session, burn wisp |

-## Architecture
+## Patrol Shape (Linear)

 ```
-┌─────────────────────────────────────────────────────────────────┐
-│ THE AGENT (Claude) │
-│ │
-│ Reads molecule steps → Executes commands → Closes atoms │
-│ Uses TodoWrite for complex atoms (optional) │
-└─────────────────────────────────────────────────────────────────┘
-↓
-┌─────────────────────────────────────────────────────────────────┐
-│ PRIMITIVES (gt, bd CLI) │
-│ │
-│ gt mail, gt nudge, gt session, bd close, bd show, etc. │
-└─────────────────────────────────────────────────────────────────┘
-↓
-┌─────────────────────────────────────────────────────────────────┐
-│ COORDINATION (Mail) │
-│ │
-│ Polecats → POLECAT_DONE → Witness inbox │
-│ Witness → "You're stuck" → Polecat (via gt nudge) │
-│ Witness → Escalation → Mayor inbox │
-└─────────────────────────────────────────────────────────────────┘
+inbox-check → process-cleanups → check-refinery → survey-workers → context-check → loop
 ```

-## The Patrol Cycle
+No dynamic arms. No fanout gates. Simple loop like Deacon.

-Each patrol cycle follows the mol-witness-patrol molecule:
+## Key Design Principles

-### 1. inbox-check
-Check mail for lifecycle events:
-- **POLECAT_DONE**: Polecat finished work, ready for cleanup
-- **Help requests**: Polecat asking for assistance
-- **Escalations**: Issues requiring attention
+| Principle | Meaning |
+|-----------|---------|
+| **Discovery over tracking** | Observe reality each cycle, don't maintain state |
+| **Events over state** | POLECAT_DONE triggers wisps, not queue updates |
+| **Cleanup wisps as finalizers** | Pending cleanup = wisp exists |
+| **Task tool for parallelism** | Subagents inspect polecats, not molecule arms |
+| **Fresh judgment each cycle** | No persistent nudge counters |

-```bash
-gt mail inbox
-gt mail read <id>
+## Cleanup: The Finalizer Pattern
+
 ```
+POLECAT_DONE arrives
+↓
+Create wisp: bd create --wisp --title "cleanup:<polecat>" --labels cleanup
+↓
+(wisp exists = cleanup pending)
+↓
+Witness process-cleanups step:
+- Verify: git status clean, no unpushed, issue closed
+- Execute: gt session kill, worktree removed
+- Burn wisp
+↓
+Failed? Leave wisp, retry next cycle
+```

-### 2. survey-workers
-For each polecat in the rig:
+## Assessing Stuck Polecats

-```bash
-gt polecat list <rig>
+With step-based restarts, polecats are either:
+- **Working a step**: Active tool calls, progress
+- **Starting a step**: Just respawned, reading hook
+- **Stuck on a step**: No progress, same step for multiple cycles
+
+| Observation | Action |
+|-------------|--------|
+| Active tool calls | None |
+| Just started step (<5 min) | None |
+| Idle 5-15 min, same step | Gentle nudge |
+| Idle 15+ min, same step | Direct nudge |
+| Idle 30+ min despite nudges | Escalate to Mayor |
+| Errors visible | Assess, help or escalate |
+| Says "done" but no POLECAT_DONE | Nudge to signal completion |
+
+**No persistent nudge counts**. Each cycle: observe reality, make fresh judgment.
+
+"How long stuck on same step" is discoverable from beads timestamps.
+
+## Parallelism via Task Tool
+
+Inspect multiple polecats concurrently using subagents:
+
+```markdown
+## survey-workers step
+
+For each polecat, launch Task tool subagent:
+- Capture tmux output
+- Assess state (working/idle/error/done)
+- Check beads for step progress
+- Decide and execute action
+
+Task tool handles parallelism. One subagent per polecat.
 ```

-For each polecat:
-1. **Capture**: `tmux capture-pane -t gt-<rig>-<name> -p | tail -50`
-2. **Assess**: Claude reads output, determines state (working/idle/error/done)
-3. **Load history**: Read nudge count from handoff bead
-4. **Decide**: Apply escalation matrix (see below)
-5. **Execute**: Take action (none, nudge, escalate, cleanup)
+## Formula

-### 3. save-state
-Persist state to handoff bead for next cycle:
-- Nudge counts per polecat
-- Last nudge timestamps
-- Pending actions
+See `.beads/formulas/mol-witness-patrol.formula.toml`

-### 4. burn-or-loop
-- If context low: sleep briefly, loop back to inbox-check
-- If context high: exit (daemon respawns fresh Witness)
+## Related

-## Nudge Escalation Matrix
-
-The Witness applies escalating pressure to idle polecats:
-
-| Idle Time | Nudge Count | Action |
-|-----------|-------------|--------|
-| <10min | any | none |
-| 10-15min | 0 | Gentle: "How's progress?" |
-| 15-20min | 1 | Direct: "Please wrap up. What's blocking?" |
-| 20+min | 2 | Final: "Will escalate in 5min if no response." |
-| any | 3 | Escalate to Mayor |
-
-**Key insight**: Only Claude can assess whether a polecat is truly stuck.
-Looking at tmux output requires understanding context:
-- "I'm stuck on this error" → needs help
-- "Running tests..." → actively working
-- Sitting at prompt with no activity → maybe stuck
-
-## State Persistence
-
-The Witness handoff bead tracks:
-
-```yaml
-# In handoff bead description
-nudges:
-  toast:
-    count: 2
-    last: "2025-12-24T10:30:00Z"
-  ace:
-    count: 0
-    last: null
-pending_cleanup:
-  - nux  # received POLECAT_DONE, queued for verification
-```
-
-This survives across patrol cycles and context burns.
-
-## Polecat Cleanup Flow
-
-When a polecat signals completion:
-
-1. Polecat runs `gt done` or sends POLECAT_DONE mail
-2. Witness receives mail in inbox-check
-3. Witness runs pre-kill verification:
-   ```bash
-   cd polecats/<name>
-   git status             # Must be clean
-   git log origin/main..  # Check for unpushed
-   bd show <issue>        # Verify closed
-   ```
-4. If clean: kill session, remove worktree, delete branch
-5. If dirty: send nudge asking polecat to fix state
-
-## What We DON'T Need
-
-- **Go patrol runtime**: Claude executes the playbook
-- **Polling for WaitsFor**: Mail tells us when things are ready
-- **Automated health checks**: Claude reads tmux, assesses
-- **Go nudge logic**: Claude applies the matrix
-
-## What We DO Need
-
-- **mol-witness-patrol**: The playbook (exists)
-- **Handoff bead**: State persistence (gt-poxd)
-- **CLI primitives**: gt mail, gt nudge, gt session (exist)
-- **Molecule tracking**: bd close for step completion (exists)
-
-## Related Issues
-
-- gt-poxd: Create handoff beads for Witness and Refinery roles
-- gt-y481: Patrol parity - Witness and Refinery match Deacon sophistication
-- gt-tnow: Implement Christmas Ornament pattern for mol-witness-patrol
+- [polecat-lifecycle.md](polecat-lifecycle.md) - Step-based execution model
+- [molecular-chemistry.md](molecular-chemistry.md) - MEOW stack

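The time-based rows of the new Observation/Action table in the Witness patrol doc can be sketched as a stateless lookup, in keeping with "no persistent nudge counts". This is an illustration under assumptions: `action_for` is a hypothetical name, idle minutes would be derived from beads timestamps, and the boundaries (15 counts as "15+", 30 escalates) resolve overlaps that the table leaves to the Witness. The "errors visible" and "says done" rows are judgment calls and are not modeled.

```bash
# Hypothetical mapping from one cycle's observation to an action.
# $1: "yes" if tool calls are active, anything else otherwise.
# $2: minutes idle on the same step (integer).
action_for() {
  local active="$1" idle_min="$2"
  if [ "$active" = "yes" ] || (( idle_min < 5 )); then
    echo "none"                # working, or just respawned and reading hook
  elif (( idle_min < 15 )); then
    echo "gentle nudge"
  elif (( idle_min < 30 )); then
    echo "direct nudge"
  else
    echo "escalate to Mayor"   # simplification: the table says "despite nudges"
  fi
}

action_for no 10   # prints: gentle nudge
```

Because the function takes only what can be observed in the current cycle, nothing needs to be saved between patrols.
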
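The "one subagent per polecat" fan-out in the survey-workers step has a rough shell analogue: start one background job per polecat, then wait for all of them. The Task tool itself is not a shell feature, so this is only a sketch of the shape. `inspect_polecat` is a placeholder, `survey_workers` is an invented name, and the polecat names are borrowed from the removed handoff-bead example.

```bash
# Placeholder inspection. A real one might start by capturing the pane, e.g.
#   tmux capture-pane -t "gt-<rig>-<name>" -p | tail -50
inspect_polecat() {
  local name="$1"
  echo "inspected $name"
}

# Fan out one background job per polecat, then join.
survey_workers() {
  local name
  for name in "$@"; do
    inspect_polecat "$name" &   # runs concurrently with the others
  done
  wait                          # block until every inspection has finished
}

survey_workers toast ace nux    # output order is not guaranteed
```

Each inspection is independent, which is why the design can drop molecule arms: there is no shared state to coordinate between the parallel branches.
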