docs: add design docs for swarm shutdown, polecat beads, and mayor handoff

- swarm-shutdown-design.md: Worker cleanup, Witness verification, session cycling
- polecat-beads-access-design.md: Per-rig beads config, worker prompting
- mayor-handoff-design.md: Mayor session cycling and handoff protocol

Closes design epics: gt-82y, gt-l3c, gt-u82

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Committed by Steve Yegge on 2025-12-15 20:16:31 -08:00
commit fbb8b1f040 (parent 9385429e4d)
3 changed files with 1495 additions and 0 deletions
# Mayor Session Cycling and Handoff Design
Design for Mayor session management, context cycling, and structured handoff.
**Epic**: gt-u82 (Design: Mayor session cycling and handoff)
## Overview
Mayor coordinates across all rigs and runs for extended periods. Like Witness,
Mayor needs to cycle sessions when context fills, producing structured handoff
notes for the next session.
## Key Differences from Witness
| Aspect | Witness | Mayor |
|--------|---------|-------|
| Scope | Single rig, workers | All rigs, refineries |
| State tracking | Worker status, pending verifications | Active swarms, rig status, escalations |
| Handoff recipient | Self (same rig Witness) | Self (Mayor) |
| Complexity | Medium | Higher (cross-rig coordination) |
| Daemon | Witness daemon respawns | No daemon (manual or cron restart) |
## Design Areas
1. **Session cycling recognition** - When Mayor should cycle
2. **Handoff note format** - Structured state capture
3. **Handoff delivery** - Mail to self
4. **Fresh session startup** - Reading and resuming from handoff
5. **Integration with town commands** - CLI support
---
## 1. Session Cycling Recognition
### When to Cycle
Mayor should cycle when:
- Context is noticeably filling (responses slowing, losing track of state)
- Major phase completed (swarm finished, integration done)
- User requests session end
- Extended idle period with no active work
### Proactive vs Reactive
**Proactive** (preferred):
- Mayor notices context filling and initiates handoff
- Clean state capture while still coherent
**Reactive** (fallback):
- Session times out or crashes
- Less clean, may lose state
### Recognition Cues (for prompting)
```markdown
## Session Cycling
Monitor your context usage throughout the session. Signs you should cycle:
- You've been running for several hours
- You're having trouble remembering earlier conversation context
- You've completed a major phase of work
- Responses are taking longer than usual
- You're about to start a complex new operation
When you notice these signs, proactively initiate handoff rather than
waiting for problems.
```
---
## 2. Handoff Note Format
### Structure
Mayor handoff captures cross-rig state:
```
[HANDOFF_TYPE]: mayor_cycle
[TIMESTAMP]: 2024-01-15T14:30:00Z
[SESSION_DURATION]: 3h 45m
## Active Swarms
<per-rig swarm status>
## Rig Status
<health/state of each rig>
## Pending Escalations
<issues awaiting Mayor decision>
## In-Flight Decisions
<decisions being made, context needed>
## Recent Actions
<last 5-10 significant actions>
## Delegated Work
<work sent to refineries, awaiting response>
## User Requests
<any pending user requests>
## Next Steps
<what the next session should do>
## Warnings/Notes
<anything critical for next session>
```
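Subtask gt-u82.3 calls for parsing utilities for this format; a minimal sketch (function name and return shape are assumptions, not an existing API):

```python
import re

def parse_handoff(text: str):
    """Split a handoff note into [KEY]: value metadata and ## section bodies."""
    metadata, sections, current = {}, {}, None
    for line in text.splitlines():
        m = re.match(r"\[([A-Z_]+)\]:\s*(.*)", line)
        if m and current is None:
            # Bracketed metadata lines appear before the first section
            metadata[m.group(1)] = m.group(2)
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return metadata, {k: "\n".join(v).strip() for k, v in sections.items()}
```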
### Example Handoff Note
```markdown
[HANDOFF_TYPE]: mayor_cycle
[TIMESTAMP]: 2024-01-15T14:30:00Z
[SESSION_DURATION]: 3h 45m
## Active Swarms
### gastown
- Status: Active swarm on auth feature
- Refinery: gastown/refinery coordinating
- Workers: 3 active (Furiosa, Toast, Capable)
- Issues: gt-auth-1, gt-auth-2, gt-auth-3
- Expected completion: Soon (2/3 issues merged)
### beads
- Status: Idle, no active swarm
- Last activity: 2h ago (maintenance work)
## Rig Status
| Rig | Health | Last Contact | Notes |
|-----|--------|--------------|-------|
| gastown | Good | 5min ago | Swarm active |
| beads | Good | 2h ago | Idle |
## Pending Escalations
1. **gastown/Toast stuck** - Witness escalated at 14:15
- Issue: gt-auth-2 has merge conflict
- Awaiting decision: reassign or manual fix?
- Context: Toast tried 3 times, conflict in auth/middleware.go
## In-Flight Decisions
None currently.
## Recent Actions
1. 14:25 - Checked gastown swarm status
2. 14:20 - Received escalation re: Toast
3. 14:00 - Sent status request to beads/refinery
4. 13:30 - Dispatched auth swarm to gastown
5. 13:00 - Session started, read previous handoff
## Delegated Work
- gastown/refinery: Auth feature swarm (dispatched 13:30)
- Expecting completion report when done
## User Requests
- User asked for auth feature implementation (completed dispatch)
- No other pending requests
## Next Steps
1. **Resolve Toast escalation** - Decide on reassign vs manual fix
2. **Monitor gastown swarm** - Should complete soon
3. **Check beads rig** - Been quiet, verify health
## Warnings/Notes
- Toast merge conflict is blocking swarm completion
- Consider waking another polecat if reassignment needed
```
---
## 3. Handoff Delivery
### Mail to Self
Mayor mails handoff to own inbox:
```bash
town mail send mayor/ -s "Session Handoff" -m "<handoff-content>"
```
### Why Mail (not file)?
- Consistent with Witness pattern
- Timestamped and logged
- Works across potential Mayor instances
- Integrates with existing inbox check on startup
### Handoff Template Function
```python
from datetime import datetime
from typing import Dict, List, Optional

def mayor_handoff(
    active_swarms: List[SwarmStatus],
    rig_status: Dict[str, RigStatus],
    pending_escalations: List[Escalation],
    in_flight_decisions: List[Decision],
    recent_actions: List[str],
    delegated_work: List[DelegatedItem],
    user_requests: List[str],
    next_steps: List[str],
    warnings: Optional[str] = None,
    session_duration: Optional[str] = None,
) -> Message:
    """Create Mayor session handoff note."""
    metadata = {
        "template": "MAYOR_HANDOFF",
        "timestamp": datetime.utcnow().isoformat(),
        "session_duration": session_duration,
        "active_swarm_count": len(active_swarms),
        "pending_escalation_count": len(pending_escalations),
    }
    # ... format metadata and sections into `body` ...
    return Message.create(
        sender="mayor/",
        recipient="mayor/",
        subject="Session Handoff",
        body=body,
        priority="high",  # ensure it's seen
    )
```
---
## 4. Fresh Session Startup
### Startup Protocol
When Mayor session starts:
1. **Check for handoff**:
```bash
town inbox | grep "Session Handoff"
```
2. **If handoff exists**:
```bash
# Read most recent handoff
town read <latest-handoff-id>
# Resume from handoff state
# - Address pending escalations first
# - Check on in-flight work
# - Continue with next steps
```
3. **If no handoff** (fresh start):
```bash
# Full system status check
town status
town rigs
bd ready
# Check all rig inboxes for pending items
town inbox
```
### Handoff Processing
```markdown
## On Session Start
1. **Check inbox for handoff**:
```bash
town inbox
```
Look for "Session Handoff" messages.
2. **If handoff found**:
- Read the handoff note
- Process pending escalations (highest priority)
- Check status of noted swarms
- Verify rig health matches notes
- Continue with documented next steps
3. **If no handoff**:
- Do full status check: `town status`
- Check each rig: `town rigs`
- Check inbox for any messages
- Check beads for work: `bd ready`
4. **After processing handoff**:
- Archive or delete the handoff message
- You now own the current state
```
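When several handoffs are queued, selecting the "most recent" one can be sketched as follows (the message shape is assumed to be simple dicts; ISO-8601 timestamps sort lexicographically):

```python
def latest_handoff(messages):
    """Return the newest 'Session Handoff' message by timestamp, or None."""
    handoffs = [m for m in messages if m.get("subject") == "Session Handoff"]
    return max(handoffs, key=lambda m: m["timestamp"], default=None)
```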
---
## 5. Integration with Town Commands
### New Commands (optional, can be deferred)
```bash
# Generate handoff note interactively
town handoff
# Generate and send in one step
town handoff --send
# Check for handoff on startup
town resume
```
### Implementation
For now, the Mayor does this manually via prompting; CLI support can be added later:
```go
// cmd/gt/cmd/handoff.go
var sendFlag bool

var handoffCmd = &cobra.Command{
    Use:   "handoff",
    Short: "Generate session handoff note",
    Run: func(cmd *cobra.Command, args []string) {
        // Gather state
        swarms := gatherActiveSwarms()
        rigs := gatherRigStatus()
        // ... etc.

        // Format handoff
        note := formatHandoffNote(swarms, rigs /* ... */)
        if sendFlag {
            // Send to mayor inbox
            mail.Send("mayor/", "Session Handoff", note)
        } else {
            // Print for review
            fmt.Println(note)
        }
    },
}

func init() {
    handoffCmd.Flags().BoolVar(&sendFlag, "send", false, "mail the handoff to mayor/")
}
```
---
## Subtasks
Based on this design, create these implementation subtasks:
### gt-u82.1: Mayor session cycling prompting
Add to Mayor CLAUDE.md:
- When to cycle recognition
- How to compose handoff note
- Handoff format specification
### gt-u82.2: Mayor startup protocol prompting
Add to Mayor CLAUDE.md:
- Check for handoff on start
- Process handoff content
- Fresh start fallback
### gt-u82.3: Mayor handoff mail template
Add to templates.py:
- MAYOR_HANDOFF template
- Parsing utilities
### gt-u82.4: (Optional) town handoff command
CLI support for handoff generation:
- `town handoff` - generate interactively
- `town handoff --send` - generate and mail
- `town resume` - check for and display handoff
---
## Prompting Additions
### Mayor CLAUDE.md - Session Management Section
```markdown
## Session Management
### Recognizing When to Cycle
Monitor your session health. Cycle proactively when:
- You've been running for several hours
- Context feels crowded (losing track of earlier state)
- Major phase completed (good stopping point)
- About to start complex new work
Don't wait for problems - proactive handoff produces cleaner state.
### Creating Handoff Notes
Before ending your session, capture current state:
1. **Gather information**:
```bash
town status # Overall health
town rigs # Each rig's state
town inbox # Pending messages
bd ready # Work items
```
2. **Compose handoff note** with this structure:
```
[HANDOFF_TYPE]: mayor_cycle
[TIMESTAMP]: <current time>
[SESSION_DURATION]: <how long you've been running>
## Active Swarms
<list each rig with active swarm, workers, progress>
## Rig Status
<table of rig health>
## Pending Escalations
<issues needing your decision>
## In-Flight Decisions
<decisions you were making>
## Recent Actions
<last 5-10 things you did>
## Delegated Work
<work sent to refineries>
## User Requests
<any pending user asks>
## Next Steps
<what next session should do>
## Warnings/Notes
<critical info for next session>
```
3. **Send handoff**:
```bash
town mail send mayor/ -s "Session Handoff" -m "<your handoff note>"
```
4. **End session** - next instance will pick up from handoff.
### On Session Start
1. **Check for handoff**:
```bash
town inbox | grep "Session Handoff"
```
2. **If found, read it**:
```bash
town read <msg-id>
```
3. **Process in priority order**:
- Pending escalations (urgent)
- In-flight decisions (context-dependent)
- Check noted swarm status (may have changed)
- Continue with next steps
4. **If no handoff**:
```bash
town status
town rigs
bd ready
town inbox
```
Build your own picture of current state.
### Handoff Best Practices
- **Be specific** - "Toast has merge conflict in auth/middleware.go" not "Toast is stuck"
- **Include context** - Why decisions are pending, what you were thinking
- **Prioritize next steps** - What's most urgent
- **Note time-sensitive items** - Anything that might have changed since handoff
```
---
## Implementation Checklist
- [ ] Create subtasks (gt-u82.1 through gt-u82.4)
- [ ] Add session management section to Mayor CLAUDE.md template
- [ ] Add MAYOR_HANDOFF template to templates.py
- [ ] Update startup instructions in Mayor prompting
- [ ] (Optional) Implement town handoff command
# Polecat Beads Write Access Design
Design for granting polecats direct beads write access.
**Epic**: gt-l3c (Design: Polecat Beads write access)
## Background
Originally, polecats were read-only for beads to prevent multi-agent conflicts.
With Beads v0.30.0's tombstone-based rearchitecture for deletions, we now have
solid multi-agent support even at high concurrent load.
## Benefits
1. **Simplifies architecture** - No need for mail-based issue filing proxy via Witness
2. **Empowers polecats** - Can file discovered work that's out of their purview
3. **Beads handles work-disavowal** - Workers can close issues they didn't start
4. **Faster feedback** - No round-trip through Witness for issue creation
## Complications
For OSS projects where you're not a maintainer:
- Can't commit to the project's `.beads/` directory
- Need to file beads in a separate repo
- Beads supports this via the `--root` flag
## Subtask Designs
---
### gt-zx3: Per-Rig Beads Configuration
#### Config Location
Per-rig configuration lives in the rig's config:
**Option A: In rig state.json** (simpler)
```
<rig>/config.json (or state.json)
```
**Option B: In town-level rigs.json** (centralized)
```
config/rigs.json
```
Recommend **Option A** - each rig owns its config, easier to manage.
#### Config Schema
```json
// <rig>/config.json
{
  "version": 1,
  "name": "wyvern",
  "git_url": "https://github.com/steveyegge/wyvern",
  "beads": {
    // Where polecats file beads
    // Options: "local" | "<path>" | "<git-url>"
    "repo": "local",
    // Override bd --root (optional, derived from repo if not set)
    "root": null,
    // Issue prefix for this rig (used by bd create)
    "prefix": "wyv"
  }
}
```
#### Repo Options
| Value | Meaning | Use Case |
|-------|---------|----------|
| `"local"` | Use project's `.beads/` | Own projects, full commit access |
| `"<path>"` | Use beads at path | OSS contributions, external beads |
| `"<git-url>"` | Clone and use repo | Team shared beads |
#### Examples
**Local project (default)**:
```json
{
  "beads": {
    "repo": "local",
    "prefix": "wyv"
  }
}
```
**OSS contribution** (can't commit to project):
```json
{
  "beads": {
    "repo": "/home/user/my-beads/react-contributions",
    "prefix": "react"
  }
}
```
**Team shared beads**:
```json
{
  "beads": {
    "repo": "https://github.com/myteam/shared-beads",
    "prefix": "team"
  }
}
```
#### Environment Variable Injection
When spawning polecats, Gas Town sets:
```bash
export BEADS_ROOT="<resolved-path>"
# Polecats use bd normally; it respects BEADS_ROOT
```
Or pass an explicit flag when spawning:
```bash
# Gas Town wraps bd calls internally
bd --root "$BEADS_ROOT" create --title="..."
```
#### Resolution Logic
```go
func ResolveBeadsRoot(rigConfig *RigConfig, rigPath string) (string, error) {
    beads := rigConfig.Beads
    switch {
    case beads.Root != "":
        // Explicit root override
        return beads.Root, nil
    case beads.Repo == "local" || beads.Repo == "":
        // Use project's .beads/
        return filepath.Join(rigPath, ".beads"), nil
    case strings.HasPrefix(beads.Repo, "/") || strings.HasPrefix(beads.Repo, "~"):
        // Absolute path
        return expandPath(beads.Repo), nil
    case strings.Contains(beads.Repo, "://"):
        // Git URL - need to clone
        return cloneAndResolve(beads.Repo)
    default:
        // Relative path from rig
        return filepath.Join(rigPath, beads.Repo), nil
    }
}
```
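For illustration, the same precedence expressed in Python (a sketch mirroring the Go above; the git-URL clone step is elided, and `resolve_beads_root` is not an existing function):

```python
import os

def resolve_beads_root(repo: str, root: str, rig_path: str) -> str:
    """Resolve where bd writes, following the precedence in ResolveBeadsRoot."""
    if root:
        return root                               # explicit override wins
    if repo in ("", "local"):
        return os.path.join(rig_path, ".beads")   # project-local beads
    if repo.startswith(("/", "~")):
        return os.path.expanduser(repo)           # absolute path
    if "://" in repo:
        raise NotImplementedError("git URL: clone, then resolve")
    return os.path.join(rig_path, repo)           # relative to the rig
```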
---
### gt-e1y: Worker Prompting - Beads Write Access
Add to polecat CLAUDE.md template (AGENTS.md.template):
```markdown
## Beads Access
You have **full beads access** - you can create, update, and close issues.
### Quick Reference
```bash
# View available work
bd ready # Issues ready to work (no blockers)
bd list # All open issues
bd show <id> # Issue details
# Create issues
bd create --title="Fix login bug" --type=bug --priority=2
bd create --title="Add dark mode" --type=feature
# Update issues
bd update <id> --status=in_progress # Claim work
bd close <id> # Mark complete
bd close <id> --reason="Duplicate of <other>"
# Sync (required before merge!)
bd sync # Commit beads changes to git
bd sync --status # Check if sync needed
```
### When to Create Issues
Create beads issues when you discover work that:
- Is outside your current task scope
- Would benefit from tracking
- Should be done by someone else (or future you)
**Good examples**:
```bash
# Found a bug while implementing feature
bd create --title="Race condition in auth middleware" --type=bug --priority=1
# Noticed missing documentation
bd create --title="Document API rate limits" --type=task --priority=3
# Tech debt worth tracking
bd create --title="Refactor legacy payment module" --type=task --priority=4
```
**Don't create issues for**:
- Tiny fixes you can do in 2 minutes (just do them)
- Vague "improvements" with no clear scope
- Work that's already tracked elsewhere
### Issue Lifecycle
```
┌─────────┐     ┌─────────────┐     ┌──────────┐
│  open   │────►│ in_progress │────►│  closed  │
└─────────┘     └─────────────┘     └──────────┘
     │                                    ▲
     └────────────────────────────────────┘
                (can close directly)
```
You can close issues without claiming them first.
Useful for quick fixes or discovered duplicates.
### Beads Sync Protocol
**CRITICAL**: Always sync beads before merging to main!
```bash
# Before your final merge
bd sync # Commits beads changes
git status # Should show .beads/ changes
git add .beads/
git commit -m "beads: sync"
# Then proceed with merge to main
```
If you forget to sync, your beads changes will be lost when your session ends.
### Your Beads Repo
Your beads are configured for this rig. You don't need to specify --root.
Just use `bd` commands normally.
To check where your beads go:
```bash
bd config show root
```
```
---
### gt-cjb: Witness Updates - Remove Issue Filing Proxy
Update Witness CLAUDE.md to remove proxy responsibilities:
**REMOVE from Witness prompting**:
```markdown
## Issue Filing Proxy (REMOVED)
The following is NO LONGER your responsibility:
- Processing polecat "file issue" mail requests
- Creating issues on behalf of polecats
- Forwarding issue creation requests
Polecats now have direct beads write access and file their own issues.
```
**KEEP in Witness prompting** (from swarm-shutdown-design.md):
- Monitoring polecat progress
- Nudge protocol
- Pre-kill verification
- Session lifecycle management
**UPDATE**: If Witness receives an old-style "please file issue" request:
```markdown
### Legacy Issue Filing Requests
If you receive a mail asking you to file an issue on a polecat's behalf:
1. **Respond with update**:
```bash
town inject <polecat> "UPDATE: You have direct beads access now. Use 'bd create --title=\"...\" --type=...' to file issues yourself."
```
2. **Don't file the issue yourself** - let the polecat learn the new workflow.
```
---
### gt-082: Worker Cleanup - Beads Sync on Shutdown
This integrates with swarm-shutdown-design.md decommission checklist.
**Update to decommission checklist** (addition to gt-sd6):
```markdown
## Decommission Checklist (Updated)
### Pre-Done Verification
```bash
# 1. Git status - must be clean
git status
# Expected: "nothing to commit, working tree clean"
# 2. Stash list - must be empty
git stash list
# Expected: (empty)
# 3. Beads sync - MUST be synced
bd sync --status
# Expected: "Up to date" or "Nothing to sync"
# If not: run 'bd sync' first!
# 4. Beads committed - verify in git
git status
# Expected: .beads/ should NOT show changes
# If it does: git add .beads/ && git commit -m "beads: sync"
# 5. Branch merged to main
git log main --oneline -1
git log HEAD --oneline -1
# Expected: Same commit
```
### Beads Edge Cases
**Uncommitted beads changes**:
```bash
bd sync # Commits to .beads/
git add .beads/
git commit -m "beads: final sync"
```
**Beads sync conflict** (rare):
```bash
# If bd sync fails with conflict:
git fetch origin main
git checkout main -- .beads/
bd sync --force # Re-apply your changes
git add .beads/
git commit -m "beads: resolve sync conflict"
```
```
**Update to Witness pre-kill verification** (addition to gt-f8v):
```markdown
### Beads-Specific Verification
When capturing worker state, also check beads:
```bash
town capture <polecat> "bd sync --status && git status .beads/"
```
**Check for**:
- `bd sync --status` shows "Up to date"
- `git status .beads/` shows no changes
**If beads not synced**:
```
town inject <polecat> "WITNESS CHECK: Beads not synced. Run 'bd sync' then 'git add .beads/ && git commit -m \"beads: sync\"'. Signal done again when complete."
```
```
---
## Config File Examples
### Rig with local beads (default)
```json
// gastown/config.json
{
  "version": 1,
  "name": "gastown",
  "git_url": "https://github.com/steveyegge/gastown",
  "beads": {
    "repo": "local",
    "prefix": "gt"
  }
}
```
### Rig contributing to OSS project
```json
// react/config.json
{
  "version": 1,
  "name": "react",
  "git_url": "https://github.com/facebook/react",
  "beads": {
    "repo": "/home/steve/my-beads/react",
    "prefix": "react"
  }
}
```
### Rig with team shared beads
```json
// internal-app/config.json
{
  "version": 1,
  "name": "internal-app",
  "git_url": "https://github.com/mycompany/internal-app",
  "beads": {
    "repo": "https://github.com/mycompany/team-beads",
    "prefix": "app"
  }
}
```
---
## Migration Notes
### For Existing Rigs
1. Add `beads` section to rig config.json
2. Default to `"repo": "local"` if not specified
3. Update polecat CLAUDE.md templates
4. Remove Witness proxy code
### Backwards Compatibility
- If `beads` section missing, assume `"repo": "local"`
- Old-style "file issue" mail requests get redirect nudge
- No breaking changes for polecats already using bd read commands
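The backwards-compatibility defaults above can be sketched as follows (the helper name and return shape are illustrative, not existing code):

```python
def beads_settings(rig_config: dict) -> dict:
    """Apply defaults: a missing beads section means the project-local repo."""
    beads = rig_config.get("beads") or {}
    return {
        "repo": beads.get("repo") or "local",
        "root": beads.get("root"),        # optional --root override
        "prefix": beads.get("prefix", ""),
    }
```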
---
## Implementation Checklist
- [ ] Add beads config schema to rig config (gt-zx3)
- [ ] Update polecat CLAUDE.md template with bd write access (gt-e1y)
- [ ] Update Witness CLAUDE.md to remove proxy, add redirect (gt-cjb)
- [ ] Update decommission checklist with beads sync (gt-082)
- [ ] Update Witness verification to check beads sync (gt-082)
- [ ] Add BEADS_ROOT environment injection to spawn logic
# Swarm Shutdown Design
Design for graceful swarm shutdown, worker cleanup, and session cycling.
**Epic**: gt-82y (Design: Swarm shutdown and worker cleanup)
## Key Decisions (from ultrathink)
1. **Pre-kill verification uses model intelligence** - Witness assesses git status output, not framework rules
2. **Witness can request restart** - Mail self handoff notes, exit cleanly when context filling
3. **Mayor NOT involved in per-worker cleanup** - That's Witness's domain
4. **Polecats verify themselves first** - Decommission checklist in prompting, Witness double-checks
## Responsibility Boundaries (gt-gl2)
### Mayor Responsibilities
- Swarm dispatch and strategic planning
- Cross-rig coordination
- Escalation handling (when Witness reports blocked workers)
- Final integration decisions
- **NOT**: Per-worker cleanup, session killing, nudging
### Witness Responsibilities
- Monitor worker health and progress
- Nudge workers toward completion
- Pre-kill verification (capture & assess git status)
- Session lifecycle (kill, restart workers)
- Self session cycling (mail handoff, exit)
- Report blocked workers to Mayor for escalation
- **NOT**: Implementation work, cross-rig coordination
### Polecat Responsibilities
- Complete assigned work
- Self-verify before signaling done (decommission checklist)
- Respond to Witness nudges
- **NOT**: Killing own session, coordinating with other polecats directly
## Subtask Designs
---
### gt-sd6: Enhanced Polecat Decommission Prompting
Add to polecat CLAUDE.md template (AGENTS.md.template):
```markdown
## Decommission Checklist
**CRITICAL**: Before signaling you are done, you MUST complete this checklist.
The Witness will verify each item and bounce you back if anything is dirty.
### Pre-Done Verification
Run these commands and verify ALL are clean:
```bash
# 1. Git status - must be clean (no uncommitted changes)
git status
# Expected: "nothing to commit, working tree clean"
# 2. Stash list - must be empty (no forgotten stashes)
git stash list
# Expected: (empty output)
# 3. Beads sync - must be up to date
bd sync --status
# Expected: "Up to date" or "Nothing to sync"
# 4. Branch merged - your work must be on main
git log main --oneline -1
git log HEAD --oneline -1
# Expected: Same commit (your branch is merged)
```
### If Any Check Fails
- **Uncommitted changes**: Commit them or discard if truly unnecessary
- **Stashes**: Pop and commit, or drop if obsolete
- **Beads out of sync**: Run `bd sync`
- **Branch not merged**: Complete the merge workflow
### Signaling Done
Only after ALL checks pass:
```bash
# Close your issue
bd close <issue-id>
# Final sync
bd sync
# Signal ready for decommission
town mail send <rig>/witness -s "Work Complete" -m "Issue <id> done. Checklist verified."
```
The Witness will capture your git state and verify before killing your session.
If anything is dirty, you'll receive a nudge with specific issues to fix.
```
---
### gt-f8v: Witness Pre-Kill Verification Protocol
Add to Witness CLAUDE.md template:
```markdown
## Pre-Kill Verification Protocol
Before killing any worker session, you MUST verify their workspace is clean.
Use your judgment on the output - don't rely on pattern matching.
### Verification Steps
When a worker signals done:
1. **Capture worker state**:
```bash
# Attach and capture git status
town capture <polecat> "git status && git stash list && git log --oneline -3"
```
2. **Assess the output** (use your judgment):
- Is working tree clean? (no modified/untracked files that matter)
- Is stash list empty? (or only contains intentional stashes)
- Does recent history show their work is committed?
3. **Decision**:
- **CLEAN**: Proceed to kill session
- **DIRTY**: Send nudge with specific issues
### Nudge Templates
**Uncommitted Changes**:
```
town inject <polecat> "WITNESS CHECK: You have uncommitted changes. Please commit or discard: <list files>. Signal done again when clean."
```
**Stash Not Empty**:
```
town inject <polecat> "WITNESS CHECK: You have stashed changes. Please pop and commit, or drop if obsolete: <stash list>. Signal done again when clean."
```
**Work Not Merged**:
```
town inject <polecat> "WITNESS CHECK: Your commits are not on main. Please complete merge workflow. Signal done again when merged."
```
**Multiple Issues**:
```
town inject <polecat> "WITNESS CHECK: Multiple issues found:
1. <issue 1>
2. <issue 2>
Please resolve all and signal done again."
```
### Kill Sequence
Only after verification passes:
```bash
# Log the verification
echo "[$(date)] Verified clean: <polecat>" >> witness/verification.log
# Kill the session
town kill <polecat>
# Update state
town sleep <polecat>
```
### Escalation
If a worker fails verification 3+ times or becomes unresponsive:
```bash
town mail send mayor/ -s "Escalation: <polecat> stuck" -m "Worker <polecat> cannot complete cleanup after 3 attempts. Issues: <list>. Requesting guidance."
```
```
---
### gt-eu9: Witness Session Cycling and Handoff
Add to Witness CLAUDE.md template:
```markdown
## Session Cycling
Your context will fill over long swarms. When you notice significant context usage
or feel you're losing track of state, proactively cycle your session.
### Recognizing When to Cycle
Signs you should cycle:
- You've been running for many hours
- You're losing track of which workers you've checked
- Responses are getting slower or less coherent
- You're about to start a complex operation
### Handoff Protocol
1. **Capture current state**:
```bash
# Check all worker states
town list .
# Check pending verifications
town all beads
# Check your inbox for unprocessed messages
town inbox
```
2. **Compose handoff note**:
```bash
town mail send <rig>/witness -s "Session Handoff" -m "$(cat <<EOF
[HANDOFF_TYPE]: witness_cycle
[TIMESTAMP]: $(date -Iseconds)
[RIG]: <rig>
## Active Workers
<list workers and their current status>
## Pending Verifications
<workers who signaled done but not yet verified>
## Recent Actions
<last 3-5 actions taken>
## Warnings/Notes
<anything the next session should know>
## Next Steps
<what should happen next>
EOF
)"
```
3. **Exit cleanly**:
```bash
# Ensure no pending operations
# Then simply end your session - the daemon will spawn a fresh one
```
### Handoff Note Format
The handoff note uses metadata format for parseability:
```
[HANDOFF_TYPE]: witness_cycle
[TIMESTAMP]: 2024-01-15T10:30:00Z
[RIG]: gastown
## Active Workers
- Furiosa: working on gt-abc1 (spawned 2h ago)
- Toast: idle, awaiting assignment
- Capable: signaled done, pending verification
## Pending Verifications
- Capable: signaled done at 10:25, not yet verified
## Recent Actions
1. Verified and killed Nux (gt-xyz9 complete)
2. Spawned Furiosa on gt-abc1
3. Received done signal from Capable
## Warnings/Notes
- Furiosa has been quiet for 30min, may need nudge
- Integration branch has 3 merged PRs
## Next Steps
1. Verify Capable's workspace
2. Check on Furiosa's progress
3. Report status to Refinery if all workers done
```
### On Fresh Session Start
When you start (or restart after cycling):
1. **Check for handoff**:
```bash
town inbox | grep "Session Handoff"
```
2. **If handoff exists, read it**:
```bash
town read <handoff-msg-id>
```
3. **Resume from handoff state** - pick up pending verifications, check noted workers
4. **If no handoff** - do full status check:
```bash
town list .
town all beads
```
```
---
### gt-gl2: Mayor vs Witness Cleanup Documentation
This goes in the main Gas Town documentation or CLAUDE.md templates.
```markdown
## Cleanup Authority Model
Gas Town uses a clear separation of cleanup responsibilities:
### The Rule
**Witness handles ALL per-worker cleanup. Mayor is never involved.**
### Why This Matters
1. **Separation of concerns**: Mayor thinks strategically, Witness thinks operationally
2. **Reduced coordination overhead**: No back-and-forth for routine cleanup
3. **Faster shutdown**: Witness can kill workers immediately upon verification
4. **Cleaner escalation**: Mayor only hears about problems, not routine operations
### What "Cleanup" Means
Witness handles:
- Verifying worker git state before kill
- Nudging workers to fix dirty state
- Killing worker sessions
- Updating worker state (sleep/wake)
- Logging verification results
Mayor handles:
- Receiving "swarm complete" notifications
- Deciding whether to start new swarms
- Handling escalations (stuck workers after multiple retries)
- Cross-rig coordination if workers need to hand off
### Escalation Path
```
Worker stuck -> Witness nudges (up to 3x) -> Witness escalates to Mayor
-> Mayor decides: force kill, reassign, or human intervention
```
### Anti-Patterns
**DON'T**: Have Mayor ask Witness "is worker X clean?"
**DO**: Have Witness report "swarm complete, all workers verified and killed"
**DON'T**: Have Mayor kill worker sessions directly
**DO**: Have Mayor tell Witness "abort swarm" and let Witness handle cleanup
**DON'T**: Have workers report done to Mayor
**DO**: Have workers report done to Witness, Witness aggregates and reports to Refinery/Mayor
```
---
## Mail Templates (additions to templates.py)
### WORKER_DONE (Worker -> Witness)
```python
def worker_done(
    sender: str,
    rig: str,
    issue_id: str,
    checklist_verified: bool = True,
) -> Message:
    """Worker signals completion to Witness."""
    metadata = {
        "template": "WORKER_DONE",
        "rig": rig,
        "issue": issue_id,
        "checklist_verified": checklist_verified,
    }
    body = f"""Work complete on {issue_id}.
{_format_metadata(metadata)}
Decommission checklist {'verified' if checklist_verified else 'NOT verified - please check'}.
Ready for verification and session termination.
"""
    return Message.create(
        sender=sender,
        recipient=f"{rig}/witness",
        subject=f"Work Complete: {issue_id}",
        body=body,
    )
```
### VERIFICATION_FAILED (Witness -> Worker, via inject)
```python
def verification_failed(
    worker: str,
    issues: List[str],
) -> str:
    """Generate nudge text for failed verification (injected, not mailed)."""
    issues_text = "\n".join(f"  - {issue}" for issue in issues)
    return f"""WITNESS VERIFICATION FAILED
The following issues must be resolved before decommission:
{issues_text}
Please fix these issues and signal done again.
"""
```
### WITNESS_HANDOFF (Witness -> Witness)
```python
def witness_handoff(
    sender: str,
    rig: str,
    active_workers: List[Dict],
    pending_verifications: List[str],
    recent_actions: List[str],
    warnings: Optional[str] = None,
    next_steps: Optional[List[str]] = None,
) -> Message:
    """Witness session handoff note."""
    metadata = {
        "template": "WITNESS_HANDOFF",
        "rig": rig,
        "timestamp": datetime.utcnow().isoformat(),
        "active_worker_count": len(active_workers),
        "pending_verification_count": len(pending_verifications),
    }
    # Format workers
    workers_text = "\n".join(
        f"- {w['name']}: {w['status']}" for w in active_workers
    ) or "None"
    # Format pending
    pending_text = "\n".join(f"- {p}" for p in pending_verifications) or "None"
    # Format actions
    actions_text = "\n".join(f"{i+1}. {a}" for i, a in enumerate(recent_actions[-5:]))
    body = f"""Session handoff for {rig} Witness.
{_format_metadata(metadata)}
## Active Workers
{workers_text}
## Pending Verifications
{pending_text}
## Recent Actions
{actions_text}
## Warnings
{warnings or "None"}
## Next Steps
{chr(10).join(f"- {s}" for s in (next_steps or ["Check pending verifications"]))}
"""
    return Message.create(
        sender=sender,
        recipient=f"{rig}/witness",
        subject="Session Handoff",
        body=body,
    )
```
### ESCALATION (Witness -> Mayor)
```python
def worker_escalation(
    sender: str,
    rig: str,
    worker: str,
    issue_id: str,
    attempts: int,
    unresolved_issues: List[str],
) -> Message:
    """Witness escalates stuck worker to Mayor."""
    metadata = {
        "template": "WORKER_ESCALATION",
        "rig": rig,
        "worker": worker,
        "issue": issue_id,
        "verification_attempts": attempts,
    }
    issues_text = "\n".join(f"  - {i}" for i in unresolved_issues)
    body = f"""Worker {worker} cannot complete cleanup.
{_format_metadata(metadata)}
After {attempts} verification attempts, the following issues remain:
{issues_text}
Requesting guidance:
1. Force kill and abandon changes?
2. Reassign to another worker?
3. Escalate to human?
"""
    return Message.create(
        sender=sender,
        recipient="mayor/",
        subject=f"Escalation: {worker} stuck on {issue_id}",
        body=body,
        priority="high",
    )
```
---
## Implementation Notes
### Verification State Tracking
Witness should track verification attempts in memory (or state.json):
```json
{
  "pending_verifications": {
    "Furiosa": {
      "issue_id": "gt-abc1",
      "signaled_at": "2024-01-15T10:25:00Z",
      "attempts": 1,
      "last_issues": ["uncommitted changes in src/foo.py"]
    }
  }
}
```
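Recording failed attempts against this state, with the three-attempt escalation threshold from gt-f8v, might look like the following sketch (function name is illustrative):

```python
ESCALATE_AFTER = 3

def record_failure(state: dict, worker: str, issues: list) -> bool:
    """Bump the worker's verification attempt count.

    Returns True when the worker has hit the escalation threshold
    and the Witness should mail the Mayor.
    """
    entry = state.setdefault(worker, {"attempts": 0, "last_issues": []})
    entry["attempts"] += 1
    entry["last_issues"] = issues
    return entry["attempts"] >= ESCALATE_AFTER
```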
### Nudge vs Mail
- **Nudge (inject)**: For immediate attention - verification failures, progress checks
- **Mail**: For async communication - handoffs, escalations, status reports
### Timeout Handling
If worker doesn't respond to nudge within reasonable time:
1. First: Re-nudge with more urgency
2. Second: Capture their session state for diagnostics
3. Third: Escalate to Mayor
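The tiered response above could be encoded as (illustrative sketch):

```python
def timeout_action(missed_nudges: int) -> str:
    """Map how many nudges a worker has ignored to the next Witness action."""
    if missed_nudges <= 1:
        return "renudge"    # re-nudge with more urgency
    if missed_nudges == 2:
        return "capture"    # capture session state for diagnostics
    return "escalate"       # hand off to Mayor
```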
---
## Checklist for Implementation
- [ ] Update AGENTS.md.template with decommission checklist (gt-sd6)
- [ ] Create WITNESS_CLAUDE.md template with verification protocol (gt-f8v)
- [ ] Add session cycling to Witness prompting (gt-eu9)
- [ ] Document cleanup authority in main docs (gt-gl2)
- [ ] Add mail templates to templates.py
- [ ] Add verification state to Witness state.json schema