bd sync: 2025-12-18 21:41:14

This commit is contained in:
Steve Yegge
2025-12-18 21:41:14 -08:00
parent 4fb0d55975
commit ce6fecbc57

View File

@@ -60,6 +60,7 @@
{"id":"gt-6z2","title":"Test Epic: GGT MVP Validation","description":"","status":"closed","priority":0,"issue_type":"epic","created_at":"2025-12-16T21:57:37.355269-08:00","updated_at":"2025-12-16T22:06:41.124727-08:00","closed_at":"2025-12-16T22:06:41.124727-08:00","close_reason":"Test issues for MVP validation"}
{"id":"gt-6z2.1","title":"Test Task 1: Add comment to gt.go","description":"","status":"closed","priority":0,"issue_type":"task","created_at":"2025-12-16T21:57:43.554166-08:00","updated_at":"2025-12-16T22:06:41.126547-08:00","closed_at":"2025-12-16T22:06:41.126547-08:00","close_reason":"Test issues for MVP validation","dependencies":[{"issue_id":"gt-6z2.1","depends_on_id":"gt-6z2","type":"parent-child","created_at":"2025-12-16T21:57:43.55748-08:00","created_by":"daemon","metadata":"{}"}]}
{"id":"gt-6z2.2","title":"Test Task 2: Add comment to version.go","description":"","status":"closed","priority":0,"issue_type":"task","created_at":"2025-12-16T21:57:43.693217-08:00","updated_at":"2025-12-16T22:06:41.127126-08:00","closed_at":"2025-12-16T22:06:41.127126-08:00","close_reason":"Test issues for MVP validation","dependencies":[{"issue_id":"gt-6z2.2","depends_on_id":"gt-6z2","type":"parent-child","created_at":"2025-12-16T21:57:43.69361-08:00","created_by":"daemon","metadata":"{}"}]}
{"id":"gt-70b3","title":"detectSender() doesn't recognize crew workers","description":"## Problem\n\n`detectSender()` in mail.go only detects polecats, not crew workers.\n\n## Current Code (line 445)\n\n```go\n// If in a rig's polecats directory, extract address\nif strings.Contains(cwd, \"/polecats/\") {\n // extract rig/polecat\n}\n\n// Default to mayor\nreturn \"mayor/\"\n```\n\n## Symptom\n\nEmma (crew worker at `/Users/stevey/gt/beads/crew/emma`) runs:\n- `gt mail inbox` → checks `mayor/` inbox (wrong!)\n- Should check `beads/emma` or `beads/crew/emma`\n\n## Fix\n\nAdd crew detection:\n```go\n// If in a rig's crew directory, extract address \nif strings.Contains(cwd, \"/crew/\") {\n parts := strings.Split(cwd, \"/crew/\")\n if len(parts) \u003e= 2 {\n rigPath := parts[0]\n crewName := strings.Split(parts[1], \"/\")[0]\n rigName := filepath.Base(rigPath)\n return fmt.Sprintf(\"%s/%s\", rigName, crewName)\n }\n}\n```\n\n## Also Check\n\n- `gt prime` correctly detects role, so there may be another detection function that works\n- Should unify detection logic","status":"open","priority":1,"issue_type":"bug","created_at":"2025-12-18T21:40:26.520559-08:00","updated_at":"2025-12-18T21:40:26.520559-08:00"}
{"id":"gt-71i","title":"Update architecture.md: Engineer role and Beads merge queue","description":"Update docs/architecture.md with recent design decisions:\n\n1. Agent table: Change \"Refinery\" role to \"Engineer\"\n - Refinery = place/module/directory\n - Engineer = role (agent that works in the Refinery)\n\n2. Merge Queue section: Document Beads-native model\n - MRs are beads issues with --type=merge-request\n - gt mq commands (submit, list, next, process, reorder)\n - Ordering via depends-on links\n\n3. CLI section: Add gt mq commands\n\n4. Key Design Decisions: Add decisions for:\n - #15: Merge Queue in Beads\n - #16: Engineer role (distinct from Refinery place)\n - #17: Session restart protocol for Engineer","status":"open","priority":1,"issue_type":"task","created_at":"2025-12-16T23:12:03.616159-08:00","updated_at":"2025-12-16T23:12:03.616159-08:00","dependencies":[{"issue_id":"gt-71i","depends_on_id":"gt-h5n","type":"blocks","created_at":"2025-12-16T23:12:14.92163-08:00","created_by":"daemon","metadata":"{}"}]}
{"id":"gt-7ik","title":"Ephemeral polecats: spawn fresh, delete on completion","description":"## Design Decision\n\nSwitch from pooled/idle polecats to ephemeral model:\n- Spawn creates fresh worktree from main\n- Polecat requests shutdown when done (bottom-up)\n- Witness verifies handoff, kills session, deletes worktree\n- No 'idle' state - polecats exist only while working\n\n## Rationale\n\n1. **Git worktrees are fast** - pooling optimization is obsolete\n2. **Pooling creates maintenance burden:**\n - Git stashes accumulate\n - Untracked artifacts pile up\n - Branches drift from main\n - Beads DB gets stale\n3. **PGT sync problems** came from persistent branches\n4. **Support infrastructure exists** - Witness, Refinery, Mayor handle continuity\n5. **Simpler mental model** - polecat exists = work in progress\n\n## Lifecycle\n\n```\nSpawn:\n gt spawn --issue \u003cid\u003e\n → Creates fresh worktree: git worktree add polecats/\u003cname\u003e -b polecat/\u003cname\u003e\n → Initializes beads in worktree\n → Starts session, assigns work\n\nWorking:\n Polecat does task\n → Pushes to polecat/\u003cname\u003e branch\n → Submits to merge queue when ready\n\nCompletion (POLECAT-INITIATED):\n Polecat runs: gt handoff\n → Verifies git state clean\n → Sends mail to Witness: \"Ready for shutdown\"\n → Marks itself done, waits for termination\n\nCleanup (WITNESS-OWNED):\n Witness receives shutdown request\n → Verifies PR merged or in queue\n → Verifies no uncommitted changes\n → Kills session: gt session stop \u003crig\u003e/\u003cpolecat\u003e\n → Deletes worktree: git worktree remove polecats/\u003cname\u003e\n → Deletes branch: git branch -d polecat/\u003cname\u003e\n → Optionally: Notifies Mayor of completion\n```\n\n## Key Insight: Bottom-Up Shutdown\n\n**Old model (wrong)**: Top-down batch shutdown - \"cancel the swarm\"\n**New model (right)**: Bottom-up individual shutdown - polecat requests, Witness executes\n\nThis enables streaming:\n- Workers come and go continuously\n- No \"swarm end\" to trigger cleanup\n- Each worker manages its own lifecycle\n- Witness is the lifecycle authority\n\n## Implementation\n\n1. Add `gt handoff` command for polecats to request shutdown\n2. Modify gt spawn to always create fresh worktree\n3. Run bd init in new worktree (beads needs initialization)\n4. Add shutdown request handler to Witness\n5. Witness verifies handoff, then cleans up:\n - Kill session\n - Remove worktree\n - Delete branch\n6. Remove 'idle' state from polecat state machine\n7. Simplify gt polecat list (only shows active)\n\n## Impact on Other Tasks\n\n- gt-17r (Zombie cleanup): Becomes trivial - orphan worktrees\n- gt-4my (Worker health): Simpler - no idle/stuck ambiguity\n- gt-f9x.5/f9x.6 (Doctor): Fewer states to validate\n- gt-eu9 (Witness handoff): Witness receives polecat shutdown requests","status":"open","priority":1,"issue_type":"feature","created_at":"2025-12-17T15:44:31.139964-08:00","updated_at":"2025-12-18T11:39:23.202854-08:00"}
{"id":"gt-7lt","title":"gt mail send should tmux-notify recipient","description":"## Problem\n\nWhen mail is sent via gt mail send, the recipient session does not get a tmux notification. In Python Gas Town (PGT), mail delivery triggers a tmux display-message or similar notification so the agent knows mail arrived.\n\n## Expected Behavior\n\nWhen gt mail send \u003caddr\u003e is called:\n1. Mail is delivered to recipient inbox\n2. If recipient has an active tmux session, send notification\n3. Notification should be visible (display-message or bell)\n\n## Current Behavior\n\nMail is delivered but no notification. Agent has to poll inbox to discover new mail.\n\n## Impact\n\n- Agents miss time-sensitive messages\n- Heartbeat pokes from daemon will not wake agents\n- Coordination is slower (polling vs push)\n\n## Implementation Notes\n\nAfter successful mail delivery, check if recipient has active session:\n- gt session list can identify active sessions\n- tmux display-message or send-keys can notify\n- Could inject a visible prompt like \"[MAIL] New message from \u003csender\u003e\"\n\n## Reference\n\nCheck PGT implementation for how it handles this.","status":"closed","priority":1,"issue_type":"bug","created_at":"2025-12-18T12:28:50.142075-08:00","updated_at":"2025-12-18T20:09:53.112902-08:00","closed_at":"2025-12-18T20:09:53.112902-08:00","close_reason":"Implemented tmux notification via display-message for all mail recipients"}
@@ -256,8 +257,10 @@
{"id":"gt-us8","title":"Daemon: configurable heartbeat interval","description":"Heartbeat interval is hardcoded to 60s. Should be configurable via:\n- town.json config\n- Command line flag\n- Environment variable\n\nDefault 60s is reasonable but some deployments may want faster/slower.","status":"open","priority":3,"issue_type":"task","created_at":"2025-12-18T13:38:14.282216-08:00","updated_at":"2025-12-18T13:38:14.282216-08:00","dependencies":[{"issue_id":"gt-us8","depends_on_id":"gt-99m","type":"blocks","created_at":"2025-12-18T13:38:26.704111-08:00","created_by":"daemon"}]}
{"id":"gt-v5k","title":"Design: Failure modes and recovery","description":"Document failure modes and recovery strategies for Gas Town operations.\n\n## Critical Failure Modes\n\n### 1. Agent Crash Mid-Operation\n\n**Scenario**: Polecat crashes while committing, Witness crashes while verifying\n\n**Detection**:\n- Session suddenly gone (tmux check fails)\n- State shows 'working' but no session\n- Heartbeat stops (for Witness)\n\n**Recovery**:\n- Doctor detects via ZombieSessionCheck\n- Capture any recoverable state\n- Reset agent state to 'idle'\n- For Witness: auto-restart via supervisor or manual gt witness start\n\n### 2. Git State Corruption\n\n**Scenario**: Merge conflict, failed rebase, detached HEAD\n\n**Detection**:\n- Git commands fail\n- Dirty state that won't commit\n- Branch diverged from origin\n\n**Recovery**:\n- gt doctor reports git health issues\n- Manual intervention recommended\n- Severe cases: remove clone, re-clone\n\n### 3. Beads Sync Conflict\n\n**Scenario**: Two polecats modify same issue\n\n**Detection**:\n- bd sync fails with conflict\n- Beads tombstone mechanism handles most cases\n\n**Recovery**:\n- Beads has last-write-wins semantics\n- bd sync --force in extreme cases\n- Issues may need manual dedup\n\n### 4. Tmux Failure\n\n**Scenario**: Tmux server crashes, socket issues\n\n**Detection**:\n- All sessions inaccessible\n- \"no server running\" errors\n\n**Recovery**:\n- Kill any orphan processes\n- tmux kill-server \u0026\u0026 tmux start-server\n- All agent states reset to idle\n- Re-spawn active work\n\n### 5. Claude API Issues\n\n**Scenario**: Rate limits, outages, context limits\n\n**Detection**:\n- Sessions hang or produce errors\n- Repeated failure patterns\n\n**Recovery**:\n- Exponential backoff (handled by Claude Code)\n- For context limits: session cycling (mail-to-self)\n- For outages: wait and retry\n\n### 6. Disk Full\n\n**Scenario**: Clones, logs, or beads fill disk\n\n**Detection**:\n- Write operations fail\n- git/bd commands error\n\n**Recovery**:\n- Clean up logs: rm ~/.gastown/logs/*\n- Remove old polecat clones\n- gt doctor --fix can clean some cruft\n\n### 7. Network Failure\n\n**Scenario**: Can't reach GitHub, API servers\n\n**Detection**:\n- git fetch/push fails\n- Claude sessions hang\n\n**Recovery**:\n- Work continues locally\n- Queue pushes for later\n- Sync when connectivity restored\n\n## Recovery Principles\n\n1. **Fail safe**: Prefer stopping over corrupting\n2. **State is recoverable**: Git and beads have recovery mechanisms\n3. **Doctor heals**: gt doctor --fix handles common issues\n4. **Emergency stop**: gt stop --all as last resort\n5. **Human escalation**: Some failures need Overseer intervention\n\n## Implementation\n\n- Document each failure mode in architecture.md\n- Ensure doctor checks cover detection\n- Add recovery hints to error messages\n- Log all failures for debugging","status":"open","priority":1,"issue_type":"task","created_at":"2025-12-15T23:19:07.198289-08:00","updated_at":"2025-12-15T23:19:28.171942-08:00"}
{"id":"gt-vci","title":"Mayor handoff mail template","description":"Add MAYOR_HANDOFF mail template to templates.py.\n\n## Template Function\n\ndef mayor_handoff(\n active_swarms: List[SwarmStatus],\n rig_status: Dict[str, RigStatus],\n pending_escalations: List[Escalation],\n in_flight_decisions: List[Decision],\n recent_actions: List[str],\n delegated_work: List[DelegatedItem],\n user_requests: List[str],\n next_steps: List[str],\n warnings: Optional[str] = None,\n session_duration: Optional[str] = None,\n) -\u003e Message:\n metadata = {\n 'template': 'MAYOR_HANDOFF',\n 'timestamp': datetime.utcnow().isoformat(),\n 'session_duration': session_duration,\n 'active_swarm_count': len(active_swarms),\n 'pending_escalation_count': len(pending_escalations),\n }\n # ... format sections ...\n return Message.create(\n sender='mayor/',\n recipient='mayor/',\n subject='Session Handoff',\n body=body,\n priority='high',\n )\n\n## Metadata Fields\n\n- template: MAYOR_HANDOFF\n- timestamp: ISO format\n- session_duration: Human readable\n- active_swarm_count: Number of active swarms\n- pending_escalation_count: Number of escalations\n\n## Mail Priority\n\nUse priority='high' to ensure handoff is seen on startup.","status":"open","priority":1,"issue_type":"task","created_at":"2025-12-15T20:15:30.26323-08:00","updated_at":"2025-12-15T20:48:59.550689-08:00","dependencies":[{"issue_id":"gt-vci","depends_on_id":"gt-u82","type":"blocks","created_at":"2025-12-15T20:15:39.554108-08:00","created_by":"daemon","metadata":"{}"}]}
{"id":"gt-vdp0","title":"Crew workers getting wrong CLAUDE.md (shows Refinery)","description":"## Problem\n\nCrew workers (emma, dave) have CLAUDE.md that says they're the Refinery.\n\n## Evidence\n\n```\n$ head -5 /Users/stevey/gt/beads/crew/emma/CLAUDE.md\n# Claude: Beads Refinery\nYou are the **Refinery** for the **beads** rig...\n```\n\n## Expected\n\nShould use crew.md.tmpl which correctly says:\n```\n# Crew Worker Context\nYou are a **crew worker** - the overseer's (human's) personal workspace...\n```\n\n## Impact\n\n- Crew workers have wrong identity in static context\n- `gt prime` correctly outputs crew context, but CLAUDE.md conflicts\n- Confusing role information\n\n## Fix\n\nCheck `gt crew create` or whatever populates CLAUDE.md for crew workers.","status":"open","priority":1,"issue_type":"bug","created_at":"2025-12-18T21:40:56.518032-08:00","updated_at":"2025-12-18T21:40:56.518032-08:00"}
{"id":"gt-vjv","title":"Add bulk session stop command (gt session stop --all)","description":"When decommissioning a rig, need to stop multiple sessions one at a time. A --all or --rig flag would allow: gt session stop --rig gastown","status":"closed","priority":3,"issue_type":"feature","created_at":"2025-12-18T11:33:33.394649-08:00","updated_at":"2025-12-18T11:38:51.399298-08:00","closed_at":"2025-12-18T11:38:51.399298-08:00","close_reason":"Wrong model: streaming doesn't need bulk shutdown commands - polecats request their own shutdown"}
{"id":"gt-vjw","title":"Swarm learning: Session cleanup missing from swarm workflow","description":"## Problem\n\nAfter Enders Game swarm completed (18 issues merged), 16 polecat sessions were left running but idle. No automated cleanup occurred.\n\n## What Should Happen\n\n1. Witness detects polecat completed work (idle at prompt)\n2. Witness verifies git state is clean\n3. Witness shuts down session\n4. Witness reports completion to Mayor\n\n## GGT Components\n\n- gt-cxx: Witness context cycling (covers self-cycling)\n- gt-u1j.9: Witness daemon heartbeat loop\n- gt-kmn.6: Witness swarm landing protocol\n\n## Recommendation\n\nAdd to Witness responsibilities:\n- Monitor for 'work complete' signals (DONE keyword, idle detection)\n- Automated session shutdown after verification\n- Swarm completion reporting to Mayor\n\nSee also: architecture.md 'Worker Cleanup (Witness-Owned)' section which describes this but it wasn't implemented in PGT.","status":"closed","priority":1,"issue_type":"task","created_at":"2025-12-16T01:27:52.796587-08:00","updated_at":"2025-12-16T01:51:37.409965-08:00","closed_at":"2025-12-16T01:51:37.409965-08:00","close_reason":"Addressed by gt-u1j.9 (daemon heartbeat) which now includes session lifecycle management and worker shutdown. See architecture.md Key Decision #12."}
{"id":"gt-vnp9","title":"tmux notifications: display-message too subtle, use send-keys instead","description":"## Problem\n\n`tmux display-message` notifications are not visible enough - they appear briefly in the status bar and are easy to miss.\n\n## Current Behavior\n\nrouter.go uses:\n```go\nr.tmux.DisplayMessageDefault(sessionID, notification)\n```\n\n## What Works\n\nSending echo commands directly to the terminal:\n```bash\ntmux send-keys -t \u003csession\u003e \"echo '📬 NEW MAIL from mayor'\" Enter\n```\n\n## Proposed Fix\n\nChange notification method to send visible output to the terminal, perhaps with a box/banner:\n```\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n📬 NEW MAIL from mayor\nSubject: \u003csubject\u003e\nRun: bd mail inbox\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n```\n\n## Considerations\n\n- This interrupts the terminal output (acceptable for important mail)\n- Could check if Claude is mid-response and queue notification\n- Or use tmux popup if available","notes":"Additional issues:\n1. Enter key not sent properly when chained with send-keys\n2. Need to debounce and send Enter separately\n3. Correct pattern: send text, sleep briefly, then send Enter","status":"open","priority":2,"issue_type":"bug","created_at":"2025-12-18T21:35:28.542985-08:00","updated_at":"2025-12-18T21:35:43.867634-08:00"}
{"id":"gt-w3bu","title":"gt spawn: Enter key needs debounce delay after paste","description":"## Problem\n\nWhen spawning polecats via `gt spawn`, instructions are pasted into tmux but workers often sit idle at prompt because the Enter key arrives before the paste is fully processed.\n\n## Observed Behavior\n\n- Instructions pasted via tmux send-keys\n- Enter key sent immediately after\n- Worker session shows prompt with no input (paste not submitted)\n- Requires manual intervention to nudge workers\n\n## Expected Behavior\n\n- Paste completes fully\n- Enter key submits the pasted instructions\n- Worker begins executing immediately\n\n## Root Cause\n\nLikely a race condition between paste buffer processing and keypress handling. Need either:\n1. Debounce delay before sending Enter\n2. Longer delay (Tmax) to ensure paste completes\n3. Alternative submission mechanism\n\n## Impact\n\n- Swarm of 10 workers had multiple stalls at startup\n- Required manual tmux send-keys to unstick them\n- Defeats purpose of automated spawning\n\n## References\n\n- Observed during Mad Max swarm (Dec 18, 2025)\n- Affects gt spawn command in internal/cmd/spawn.go","status":"closed","priority":1,"issue_type":"bug","created_at":"2025-12-18T21:06:50.734123-08:00","updated_at":"2025-12-18T21:08:22.400899-08:00","closed_at":"2025-12-18T21:08:22.400899-08:00","close_reason":"Added debounce delay between paste and Enter in SendKeys. Default 100ms, Inject scales 100-500ms based on message size."}
{"id":"gt-w775","title":"MR: gt-svi.1 (polecat/Furiosa)","description":"branch: polecat/Furiosa\ntarget: main\nsource_issue: gt-svi.1","status":"closed","priority":1,"issue_type":"merge-request","created_at":"2025-12-18T20:21:40.921429-08:00","updated_at":"2025-12-18T20:21:54.163532-08:00","closed_at":"2025-12-18T20:21:54.163532-08:00","close_reason":"Test MRs, cleaning up"}
{"id":"gt-w9o","title":"/restart: Personal slash command for in-place agent restart","description":"Create ~/.claude/commands/restart.md that restarts current Gas Town agent in place.\n\n## Detection\n- Read tmux session name: gt-mayor, gt-witness-*, gt-refinery-*, gt-polecat-*\n- Fallback: check GT_ROLE env var\n\n## Behavior by role\n- mayor: gt mayor restart (sends Ctrl-C, loop respawns)\n- witness: gt witness restart\n- refinery: gt refinery restart \n- polecat: gt polecat restart (or witness-mediated)\n\n## Command format\nUses backticks for inline bash to detect context, then instructs Claude to run appropriate restart.","status":"closed","priority":1,"issue_type":"task","created_at":"2025-12-18T18:32:30.043125-08:00","updated_at":"2025-12-18T18:43:17.182303-08:00","closed_at":"2025-12-18T18:43:17.182303-08:00","close_reason":"Implemented ~/.claude/commands/restart.md"}