vars.mode had both required=true and default="conservative", which
causes formula validation to fail with:
vars.mode: cannot have both required:true and default
This prevented gt doctor from dispatching cleanup work to dogs.
Remove required=true since the default value ensures the variable
is always populated.
Co-authored-by: mayor <ec2-user@ip-172-31-43-79.ec2.internal>
Add three new verification legs to the code-review convoy formula:
- wiring: detects dependencies added but not actually used
- commit-discipline: reviews commit quality and atomicity
- test-quality: verifies tests are meaningful, not just present
Also adds presets for common review modes:
- gate: light review (wiring, security, smells, test-quality)
- full: all 10 legs for comprehensive review
- security-focused: security-heavy for sensitive changes
- refactor: code quality focus
This is a minimal subset of PR #4's Worker → Reviewer pattern,
focusing on the most valuable additions without Go code changes.
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* fix(formulas): replace hardcoded ~/gt/ paths with $GT_ROOT
Formula files contained hardcoded ~/gt/ paths that break when running
Gas Town from a non-default location (e.g., ~/gt-private/). This causes:
- Dogs stuck in working state (can't write to wrong path)
- Cross-town contamination when ~/gt/ exists as separate town
- Boot triage, deacon patrol, and log archival failures
Replaces all ~/gt/ and $HOME/gt/ references with $GT_ROOT which is
set at runtime to the actual town root directory.
Fixes#757
* chore: regenerate embedded formulas
Run go generate to sync embedded formulas with .beads/formulas/ source.
Key fix: Orphan cleanup now skips Claude processes in valid Gas Town
tmux sessions (gt-*/hq-*), preventing false kills of witnesses,
refineries, and deacon during startup.
Updated all component versions:
- gt CLI: 0.3.1 → 0.4.0
- npm package: 0.3.0 → 0.4.0
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Configure types.custom with Gas Town-specific types:
molecule, gate, convoy, merge-request, slot, agent, role, rig, event, message
These types are used by Gas Town infrastructure and will be removed from
beads core built-in types (bd-find4). This allows Gas Town to define its
own types while keeping beads core focused on work types.
Closes: bd-t5o8i
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add mol-backoff-test formula for integration testing exponential backoff
with short intervals (2s base, 10s max) to observe multiple cycles quickly.
Fix await-signal to use --since 1s when subscribing to activity feed.
Without this, historical events would immediately wake the signal,
preventing proper timeout and backoff behavior.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds a workflow formula for Gas Town releases with:
- Workspace preflight checks (uncommitted work, stashes, branches)
- CHANGELOG.md and info.go versionChanges updates
- Version bump via bump-version.sh
- Local install and daemon restart
- Error handling guidance for crew vs polecat execution
* feat: Add automatic orphaned claude process cleanup
Claude Code's Task tool spawns subagent processes that sometimes don't clean up
properly after completion. These accumulate and consume significant memory
(observed: 17 processes using ~6GB RAM).
This change adds automatic cleanup in two places:
1. **Deacon patrol** (primary): New patrol step "orphan-process-cleanup" runs
`gt deacon cleanup-orphans` early in each cycle. More responsive (~30s).
2. **Daemon heartbeat** (fallback): Runs cleanup every 3 minutes as safety net
when deacon is down.
Detection uses TTY column - processes with TTY "?" have no controlling terminal.
This is safe because:
- Processes in terminals (user sessions) have a TTY like "pts/0" - untouched
- Only kills processes with no controlling terminal
- Orphaned subagents are children of tmux server with no TTY
New files:
- internal/util/orphan.go: FindOrphanedClaudeProcesses, CleanupOrphanedClaudeProcesses
- internal/util/orphan_test.go: Tests for orphan detection
New command:
- `gt deacon cleanup-orphans`: Manual/patrol-triggered cleanup
Fixes#587
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(orphan): add Windows build tag and minimum age check
Addresses review feedback on PR #588:
1. Add //go:build !windows to orphan.go and orphan_test.go
- The code uses Unix-specific syscalls (SIGTERM, ESRCH) and
ps command options that don't exist on Windows
2. Add minimum age check (60 seconds) to prevent false positives
- Prevents race conditions with newly spawned subagents
- Addresses reviewer concern about cron/systemd processes
- Uses portable etime format instead of Linux-only etimes
3. Add parseEtime helper with comprehensive tests
- Parses [[DD-]HH:]MM:SS format (works on both Linux and macOS)
- etimes (seconds) is Linux-specific, etime is portable
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(orphan): add proper SIGTERM→SIGKILL escalation with state tracking
Previous approach used process age which doesn't work: a Task subagent
runs without TTY from birth, so a long-running legitimate subagent that
later fails to exit would be immediately SIGKILLed without trying SIGTERM.
New approach uses a state file to track signal history:
1. First encounter → SIGTERM, record PID + timestamp in state file
2. Next cycle (after 60s grace period) → if still alive, SIGKILL
3. Next cycle → if survived SIGKILL, log as unkillable and remove
State file: $XDG_RUNTIME_DIR/gastown-orphan-state (or /tmp/)
Format: "<pid> <signal> <unix_timestamp>" per line
The state file is automatically cleaned up:
- Dead processes removed on load
- Unkillable processes removed after logging
Also updates callers to use new CleanupResult type which includes
the signal sent (SIGTERM, SIGKILL, or UNKILLABLE).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Add commands to find and terminate orphan Claude processes (those with
PPID=1 that survived session termination):
- gt orphans list: Show orphan Claude processes
- gt orphans kill: Kill with confirmation
- gt orphans kill -f: Force kill without confirmation
Detection excludes:
- tmux processes (may contain "claude" in args)
- Claude.app desktop application processes
- Claude Helper processes
The original `gt orphans` functionality for finding orphan git commits
is preserved.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds supervision for dispatched dogs that may get stuck.
The new step (between dog-pool-maintenance and orphan-check):
- Lists dogs in "working" state
- Checks work duration vs plugin timeout (default 10m)
- Decision matrix based on how long overdue:
- < 2x timeout: log warning, check next cycle
- 2x-5x timeout: file death warrant
- > 5x timeout: force clear + escalate to Mayor
- Tracks chronic failures for repeat offenders
This closes the supervision gap where dogs could hang forever
after being dispatched via `gt dog dispatch --plugin`.
Closes: gt-s4dp3
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- shiny.formula.toml: defers to role's git workflow instead of hardcoding PR
- crew.md.tmpl: checks remote origin ownership instead of absolute PR ban
- tmux.go: minor comment fix
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Phase 1 of dynamic priming subsystem:
1. PRIME.md provisioning for all workers (hq-5z76w, hq-ukjrr Part A)
- Added ProvisionPrimeMD to beads package with Gas Town context template
- Provision at rig level in AddRig() so all workers inherit it
- Added fallback provisioning in crew and polecat managers
- Created PRIME.md for existing rigs
2. Post-handoff detection to prevent handoff loop bug (hq-ukjrr Part B)
- Added FileHandoffMarker constant (.runtime/handoff_to_successor)
- gt handoff writes marker before respawn
- gt prime detects marker and outputs "HANDOFF COMPLETE" warning
- Marker cleared after detection to prevent duplicate warnings
3. Priming health checks for gt doctor (hq-5scnt)
- New priming_check.go validates priming subsystem configuration
- Checks: SessionStart hook, gt prime command, PRIME.md presence
- Warns if CLAUDE.md is too large (should be bootstrap pointer)
- Fixable: provisions missing PRIME.md files
This ensures crew workers get Gas Town context (GUPP, hooks, propulsion)
even if the gt prime hook fails, via bd prime fallback.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The Deacon patrol formula now uses `gt mol step await-signal` with
exponential backoff instead of vague "sleep 60s" instructions.
How it works:
- Subscribes to `bd activity --follow` (beads activity feed)
- Returns IMMEDIATELY when any gt/bd command triggers activity
- On timeout, waits exponentially longer: 60s → 120s → 240s → max 10m
- Tracks idle:N label on hq-deacon bead across invocations
This connects the designed-but-unintegrated backoff mechanism to the
actual patrol loop. Idle towns let Deacon sleep; active work wakes it.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat(costs): redesign session cost tracking with wisps and daily digests
Implement the wisp-based cost tracking architecture per gt-cm900:
- gt costs record now creates ephemeral wisps (not exported to JSONL)
to avoid log-in-database pollution with O(sessions/day) events
- gt costs digest aggregates yesterday's session wisps into a single
permanent "Cost Report YYYY-MM-DD" bead for audit purposes
- gt costs query updated: --today queries wisps, --week queries
digest beads + today's wisps
- gt costs migrate closes legacy open session.ended beads
- Deacon patrol formula updated with costs-digest step
The new architecture:
Session ends -> Wisp (fast, N/day) -> Patrol digest -> Bead (1/day)
This preserves audit trail while keeping issues.jsonl clean.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* chore: sync canonical formula with embedded copy
Update .beads/formulas/ with the costs-digest step added to
mol-deacon-patrol.formula.toml. The go:generate copies from
.beads/formulas/ to internal/formula/formulas/.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
The deacon tmux session is named hq-deacon, not gt-deacon. Fix the
incorrect references in mol-boot-triage and mol-gastown-boot formulas.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The formula parser only supports TOML (uses toml.Decode). The JSON
version of mol-gastown-boot was never used - it was likely created
by mistake or for an abandoned experiment.
Changes:
- Remove .beads/formulas/mol-gastown-boot.formula.json
- Remove internal/formula/formulas/mol-gastown-boot.formula.json
- Simplify go:generate to only copy .formula.toml files
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The Deacon patrol formula's zombie-scan step now:
- Only detects zombies via --dry-run, never kills directly
- Files death warrants for Boot to handle interrogation/execution
- Includes psychological weight language about termination gravity
This prevents accidental destruction of worker context, mid-task
progress, and unsaved state. Kill authority belongs to Boot.
Bumped version to 5.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Defines the state machine that Dogs execute for death warrants:
- 3-attempt interrogation with escalating timeouts (60s, 120s, 240s)
- PARDON path when session responds with ALIVE
- EXECUTE path after all attempts exhausted
- EPITAPH step for audit logging
Key design decisions documented:
- Dogs are goroutines, not Claude sessions
- Timeout gates close on timer OR early response detection
- State persisted to ~/gt/deacon/dogs/active/ for crash recovery
Implements specification for gt-cd404.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixes gt-v07fl: Polecat lifecycle cleanup for stale worktrees and git
tracking conflicts.
Changes:
1. Add .claude/ to .gitignore (prevents untracked file accumulation)
2. Add beads runtime state patterns to .gitignore (prevents future tracking)
3. Remove .beads/ runtime state from git tracking (mq/, issues.jsonl, etc.)
- Formulas and config remain tracked (needed for go install)
- Created follow-up gt-mpyuq for formulas refactor
4. Add DetectStalePolecats() to polecat manager for identifying cleanup candidates
5. Add CountCommitsBehind() to git package for staleness detection
6. Add `gt polecat stale <rig>` command for stale polecat detection/cleanup
- Shows polecats without active sessions
- Identifies polecats far behind main (configurable threshold)
- Optional --cleanup flag to auto-nuke stale polecats
The existing `gt polecat gc` command handles branch cleanup.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When patrol agents (witness, refinery, deacon) complete their patrol wisp,
they were left idle because the instructions only said "loop back" without
specifying the command to create a new wisp.
Updated:
- Work loop instructions in prime.go now explicitly tell agents:
* If context LOW: run `bd mol wisp mol-<role>-patrol` to create new wisp
* If context HIGH: use `gt handoff` and exit for daemon respawn
- mol-witness-patrol.formula.toml loop-or-exit step now has clear commands
This ensures patrol agents always either create a new wisp or exit cleanly,
preventing the "session alive but idle" state that caused mail to pile up.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Migrate all role bead references from gt-*-role to hq-*-role using
beads.RoleBeadIDTown() function. Role beads are stored in town beads
(~/gt/.beads/) with the hq- prefix.
Changes:
- internal/cmd/prime.go: Use RoleBeadIDTown() for all roles
- internal/doctor/agent_beads_check.go: Use RoleBeadIDTown() for rig agents
- internal/polecat/manager.go: Use RoleBeadIDTown("polecat")
- internal/cmd/crew_add.go: Use RoleBeadIDTown("crew")
- internal/beads/beads.go: Update comments to document hq- convention
- Templates: Update bd show gt-deacon to bd show hq-deacon
Note: Tmux session names remain as gt-* (runtime identifiers).
Bead IDs use hq-* for town-level agents (persistent storage).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Refinery monitoring to daemon heartbeat (ensureRefineriesRunning)
- Add circuit breaker: agents marked dead by checkStaleAgents() are
now force-killed and restarted, even if Claude process is alive
- Fixes zombie Claude sessions that werent updating their bead state
The circuit breaker flow:
1. Agent gets stuck (stops updating bead state)
2. After 15 minutes: checkStaleAgents() marks bead as dead
3. On next heartbeat: ensure*Running() sees state=dead
4. Force-kill session and restart fresh
Generated with Claude Code
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updates mol-witness-patrol.formula.toml to document the new ephemeral model:
- Added Ephemeral Polecat Model section explaining lifecycle separation
- Updated POLECAT_DONE handling to describe immediate auto-nuke
- Updated process-cleanups to focus on exception handling (dirty polecats)
- Updated survey-workers Step 4a for ephemeral done polecat handling
- Updated patrol-cleanup for ephemeral model expectations
Key principle: Polecat lifecycle is separate from MR lifecycle.
Polecats are recyclable immediately after MR submission.
(gt-si8rq.9)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Session names `gt-mayor` and `gt-deacon` were hardcoded, causing tmux
session name collisions when running multiple towns simultaneously.
Changed to `gt-{town}-mayor` and `gt-{town}-deacon` format (e.g.,
`gt-ai-mayor`) to allow concurrent multi-town operation.
Key changes:
- session.MayorSessionName() and DeaconSessionName() now take townName param
- Added workspace.GetTownName() helper to load town name from config
- Updated all callers in cmd/, daemon/, doctor/, mail/, rig/, templates/
- Updated tests with new session name format
- Bead IDs remain unchanged (already scoped by .beads/ directory)
Fixes#60🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New molecule formula mol-polecat-conflict-resolve defines the workflow for
polecats handling merge conflict resolution tasks:
1. Load task and extract metadata from conflict-resolution bead
2. Acquire merge slot (prevents racing via "Monkey Knife Fight" prevention)
3. Checkout the conflicting branch
4. Rebase onto main and resolve conflicts
5. Run tests to verify resolution
6. Push resolved changes directly to main (bypasses queue)
7. Close original MR bead and source issue
8. Release merge slot for next waiter
9. Clean up and close the conflict-resolution task
This completes the polecat side of the ephemeral merge workflow architecture.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
PROBLEM: Agents were reading formula files directly and manually creating beads
for each step, rather than using the cook→pour→molecule pipeline.
FIXES:
- polecat-CLAUDE.md: Changed "following the formula" to "work through your
pinned molecule" and added explicit anti-pattern warning
- mol-polecat-work.formula.toml: Added note that formula defines template,
use bd ready to find step beads
- docs/molecules.md: Added "Common Mistake" section with WRONG/RIGHT examples
- mol-*.formula.toml (5 files): Changed "execute this formula" to "work
through molecules (poured from this formula)"
The key insight: Formulas are source templates (like source code). You never
read them directly. The cook → pour pipeline creates step beads for you.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change polecat contract from wait-for-termination to ephemeral
- Rename close-issue → prepare-for-review (Refinery closes after merge)
- Rename signal-complete → submit-and-exit (polecat recyclable after gt done)
- Bump formula version to 4
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update mol-refinery-patrol formula v4:
- Rename process-branch to "Mechanical rebase"
- Add robust conflict detection via exit codes
- Create conflict-resolution task on rebase failure
- Track conflict metadata (main SHA, branch SHA)
- Update loop-check with conflict-skip entry path
- Update generate-summary with conflict tracking
On conflicts: abort rebase, create task, preserve branch for resolution.
MR beads remain open until conflicts resolved and reprocessed.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update mol-sync-workspace generate-report step to distinguish:
- Feature branches: 'Commits ahead of main: N'
- Crew on main: 'Unpushed commits: N'
The 'commits ahead of main' was confusing for crew workers who work
directly on main branch. 'Unpushed commits' is clearer for that case.
Also added explicit git commands for checking divergence.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The monolithic handle-dirty-state step handled three distinct concerns.
Split into focused steps for clarity and maintainability:
1. handle-uncommitted: Handle staged/unstaged changes
- Commit if ready (WIP)
- Stash if experimental/incomplete
2. handle-untracked: Handle untracked files
- Decision matrix by file type
- Actions: gitignore, commit, or delete
- Warns against auto-delete
3. handle-stashes: Review and clean stash entries
- Age-based decision matrix
- Drop stale entries (>1 week old)
- Caution about index shifting
Updated cleanup-worktrees to depend on handle-stashes.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Creates a convoy-style formula for structured design exploration with 6
parallel legs:
- api: Interface design, developer ergonomics
- data: Data model, storage, migrations
- ux: User experience, CLI ergonomics
- scale: Performance at scale, bottlenecks
- security: Threat model, attack surface
- integration: System integration, compatibility
Each leg spawns a polecat that analyzes the design problem from its
dimension. The synthesis step combines all analyses into a unified
design document with options and decision points.
Usage:
gt formula run design --problem="Add notification levels to mayor"
(gt-ubqeg.2)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updates mol-deacon-patrol formula to be silent when the town is idle:
- Added "Idle Town Principle" to formula description
- health-scan step now checks for active work before sending nudges
- loop-or-exit step specifies 60+ second sleep (2-5 min when idle)
- Explains that daemon (10-min heartbeat) is the safety net
This prevents flooding idle agents with health checks every few seconds.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements Option A from the issue: multiple refinery workers with locking.
Changes to mrqueue:
- Add ClaimedBy/ClaimedAt fields to MR struct
- Add Claim(id, worker) method with atomic file operations
- Add Release(id) method to unclaim on failure
- Add ListUnclaimed() to find available work
- Add ListClaimedBy(worker) for worker-specific queries
- Claims expire after 10 minutes for crash recovery
New CLI commands:
- gt refinery claim <mr-id> - Claim MR for processing
- gt refinery release <mr-id> - Release claim back to queue
- gt refinery unclaimed - List available MRs
Formula updates:
- queue-scan now uses gt refinery unclaimed
- process-branch claims MR before processing
- handle-failures releases claim on test failure
- Claims prevent double-processing by parallel workers
Worker ID comes from GT_REFINERY_WORKER env var (default: refinery-1).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>