* Improve tmux statusline: sort rigs by activity and add visual grouping
- Sort rigs by running state, then polecat count, then operational state
- Add visual grouping with | separators between state groups
- Show process state with icons (🟢 both running, 🟡 one running, 🅿️ parked, 🛑 docked, ⚫ idle)
- Display polecat counts for active rigs
- Improve icon spacing: 2 spaces after Park emoji, 1 space for others
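The three-key ordering described above can be sketched as a comparator. This is only an illustration: the `rigStatus` type, its field names, the numeric rank for operational state, and the sort direction (most active first) are all assumptions, not the statusline's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// rigStatus is a hypothetical record used only for this sketch.
type rigStatus struct {
	Name     string
	Running  int // tracked processes running: 0, 1, or 2
	Polecats int // active polecat count
	OpState  int // assumed rank: 0 = operational, 1 = parked, 2 = docked
}

// sortRigs orders rigs by running state, then polecat count, then
// operational state, keeping the input order for full ties.
func sortRigs(rigs []rigStatus) {
	sort.SliceStable(rigs, func(i, j int) bool {
		a, b := rigs[i], rigs[j]
		if a.Running != b.Running {
			return a.Running > b.Running
		}
		if a.Polecats != b.Polecats {
			return a.Polecats > b.Polecats
		}
		return a.OpState < b.OpState
	})
}

func main() {
	rigs := []rigStatus{
		{Name: "parked", Running: 0, Polecats: 0, OpState: 1},
		{Name: "busy", Running: 2, Polecats: 3, OpState: 0},
		{Name: "half", Running: 1, Polecats: 0, OpState: 0},
	}
	sortRigs(rigs)
	for _, r := range rigs {
		fmt.Println(r.Name)
	}
}
```

A separator can then be emitted wherever the `Running` value changes between neighbours.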
* Fix golangci-lint warnings
- Check error return from os.Setenv
- Check error return from lock.Unlock
- Mark intentionally unused parameters with _
---------
Co-authored-by: joshuavial <git@codewithjv.com>
When polecats run 'gt done' without --cleanup-status, the witness may
prematurely nuke the worktree before the refinery can merge.
This fix auto-detects git state:
- uncommitted: has uncommitted changes
- stash: has stashed changes
- unpushed: branch not pushed or has unpushed commits
- clean: everything pushed
Uses BranchPushedToRemote() which properly handles polecat branches
that don't have upstream tracking (compares against origin/main).
On error, defaults to 'unpushed' to prevent accidental data loss.
Fixes: #342
Co-authored-by: mayor <mayor@gastown.local>
When running `gt install --wrappers` in an existing Gas Town HQ,
the command now installs wrappers directly without requiring --force
or recreating the entire HQ structure.
Previously, `gt install --wrappers` would fail with "directory is
already a Gas Town HQ" unless --force was used, which would then
unnecessarily reinitialize the entire workspace.
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
When bd --no-daemon show <id> does not find an issue, it incorrectly exits
with code 0 (success) but writes the error to stderr and leaves stdout empty.
This causes JSON parse failures throughout gt when code tries to unmarshal
the empty stdout.
This PR handles the bug defensively in all affected code paths:
- beads.go run(): Detect empty stdout + non-empty stderr as error
- beads.go wrapError(): Add 'no issue found' to ErrNotFound patterns
- sling.go: Check len(out) == 0 in multiple functions
- convoy.go getIssueDetails(): Check stdout.Len() == 0
- prime_molecule.go: Check stdout.Len() == 0
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The verifyFormulaExists function now checks for non-empty output,
so the test stub must output something for formula show commands.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix beads.run() to always explicitly set BEADS_DIR based on the working
directory or explicit override
- This prevents inherited environment variables (e.g., from mayor session
with BEADS_DIR=/home/erik/gt/.beads) from causing prefix mismatch errors
when creating agent beads for rigs
- Update polecat manager to use NewWithBeadsDir for explicitness
- Add comprehensive test coverage for BEADS_DIR routing and validation
- Add SessionLister interface for deterministic orphan session testing
Root cause: When BEADS_DIR was set in the parent environment, all bd
commands used the town database (hq- prefix) instead of the rig database
(gt- prefix), causing "prefix mismatch: database uses 'hq' but you
specified 'gt'" errors during polecat spawn.
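One way to make the override explicit is to strip any inherited value before appending the intended one. The sketch below assumes a hypothetical helper named `withBeadsDir`. The real `beads.run()` may build its environment differently.

```go
package main

import (
	"fmt"
	"strings"
)

// withBeadsDir returns a copy of env in which BEADS_DIR is set exactly once,
// to beadsDir, regardless of what the parent process exported.
func withBeadsDir(env []string, beadsDir string) []string {
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		if strings.HasPrefix(kv, "BEADS_DIR=") {
			continue // drop the inherited value
		}
		out = append(out, kv)
	}
	return append(out, "BEADS_DIR="+beadsDir)
}

func main() {
	parent := []string{"PATH=/usr/bin", "BEADS_DIR=/home/erik/gt/.beads"}
	// The rig path here is made up for illustration.
	fmt.Println(withBeadsDir(parent, "/home/erik/gt/myrig/.beads"))
}
```

The result would typically be assigned to `exec.Cmd.Env` before running `bd`.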
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixes CI lint failures by handling unchecked error returns and marking
unused parameters with blank identifiers.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds mark-read and mark-unread commands that allow marking messages
as read without archiving them. Uses a "read" label to track status.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove references to idle state. Polecats self-nuke after work - there is
no idle state. The Witness handles crash recovery and orphan cleanup.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Issue #336: Consolidate down/shutdown/stop commands
Changes:
- Add `gt down --polecats` flag to stop all polecat sessions
- Deprecate `gt stop` command (prints warning, directs to `gt down --polecats`)
- Update help text to clarify down vs shutdown distinction:
  - down = pause (reversible, keeps worktrees)
  - shutdown = done (permanent cleanup)
- Integrate --polecats with new --dry-run mode from recent PR
Note: The issue proposed renaming --nuke to --tmux, but PR #330 just
landed with --nuke having better safety (GT_NUKE_ACKNOWLEDGED env var),
so keeping --nuke as-is. The new --polecats flag absorbs gt stop
functionality as proposed.
Closes #336
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(down): add refinery shutdown to gt down
Refineries were not being stopped by gt down, causing them to continue
running after shutdown. This adds a refinery shutdown loop before
witnesses, fixing problem P3 from the v2.4 proposal.
Changes:
- Add Phase 1: Stop refineries (gt-<rig>-refinery sessions)
- Renumber existing phases (witnesses now Phase 2, etc.)
- Include refineries in halt event logging
* feat(beads): add StopAllBdProcesses for shutdown
Add functions to stop bd daemon and bd activity processes:
- StopAllBdProcesses(dryRun, force) - main entry point
- CountBdDaemons() - count running bd daemons
- CountBdActivityProcesses() - count running bd activity processes
- stopBdDaemons() - uses bd daemon killall
- stopBdActivityProcesses() - SIGTERM->wait->SIGKILL pattern
This solves problems P1 (bd daemon respawns sessions) and P2 (bd activity
causes instant wakeups) from the v2.4 proposal.
* feat(down): rename --all to --nuke, add new --all and --dry-run flags
BREAKING CHANGE: --all now stops bd processes instead of killing tmux server.
Use --nuke for the old --all behavior (killing the entire tmux server).
New flags:
- --all: Stop bd daemons/activity processes and verify shutdown
- --nuke: Kill entire tmux server (DESTRUCTIVE, with warning)
- --dry-run: Preview what would be stopped without taking action
This solves problem P4 (old --all was too destructive) from the v2.4 proposal.
The --nuke flag now requires GT_NUKE_ACKNOWLEDGED=1 environment variable
to suppress the warning about destroying all tmux sessions.
* feat(down): add shutdown lock to prevent concurrent runs
Add Phase 0 that acquires a file lock before shutdown to prevent race
conditions when multiple gt down commands are run concurrently.
- Uses gofrs/flock for cross-platform file locking
- Lock file stored at ~/gt/daemon/shutdown.lock
- 5 second timeout with 100ms retry interval
- Lock released via defer on successful acquisition
- Dry-run mode skips lock acquisition
This solves problem P6 (concurrent shutdown race) from the v2.4 proposal.
* feat(down): add verification phase for respawn detection
Add Phase 5 that verifies shutdown was complete after stopping all services:
- Waits 500ms for processes to fully terminate
- Checks for respawned bd daemons
- Checks for respawned bd activity processes
- Checks for remaining gt-*/hq-* tmux sessions
- Checks if daemon PID is still running
If anything respawned, warns user and suggests checking systemd/launchd.
This solves problem P5 (no verification) from the v2.4 proposal.
* test(down): add unit tests for shutdown functionality
Add tests for:
- parseBdDaemonCount() - array, object with count, object with daemons, empty, invalid
- CountBdActivityProcesses() - integration test
- CountBdDaemons() - integration test (skipped if bd not installed)
- StopAllBdProcesses() - dry-run mode test
- isProcessRunning() - current process, invalid PID, max PID
These tests cover the core parsing and process detection logic added
in the v2.4 shutdown enhancement.
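The input shapes listed for parseBdDaemonCount() suggest a tolerant parser along these lines. This is a sketch under assumptions: the function name `parseDaemonCount`, the JSON keys `count` and `daemons`, and returning 0 for empty or invalid input are inferred from the test description, not taken from the code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseDaemonCount accepts either a JSON array of daemons, an object with a
// numeric "count" field, or an object with a "daemons" array. Empty or
// invalid input counts as zero daemons.
func parseDaemonCount(data []byte) int {
	if len(data) == 0 {
		return 0
	}
	var arr []json.RawMessage
	if err := json.Unmarshal(data, &arr); err == nil {
		return len(arr)
	}
	var obj struct {
		Count   *int              `json:"count"`
		Daemons []json.RawMessage `json:"daemons"`
	}
	if err := json.Unmarshal(data, &obj); err != nil {
		return 0
	}
	if obj.Count != nil {
		return *obj.Count
	}
	return len(obj.Daemons)
}

func main() {
	fmt.Println(parseDaemonCount([]byte(`[{"pid":1},{"pid":2}]`)))
	fmt.Println(parseDaemonCount([]byte(`{"count":3}`)))
}
```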
* fix(review): add tmux check and pkill fallback for bd shutdown
Address review gaps against proposal v2.4 AC:
- AC1: Add tmux availability check BEFORE acquiring shutdown lock
- AC2: Add pkill fallback for bd daemon when killall incomplete
- AC2: Return remaining count from stop functions for error reporting
- Style: interface{} → any (Go 1.18+)
* fix(prime): add validation for --state flag combination
The --state flag should be standalone and not combined with other flags.
Add validation at start of runPrime to enforce this.
Fixes TestPrimeFlagCombinations test failures.
* fix(review): address bot review critical issues
- isProcessRunning: handle pid<=0 as invalid (return false)
- isProcessRunning: handle EPERM as process exists (return true)
- stopBdDaemons: prevent negative killed count from race conditions
- stopBdActivityProcesses: prevent negative killed count from race conditions
* fix(review): critical fixes from deep review
Platform fixes:
- CountBdActivityProcesses: use sh -c "pgrep | wc -l" for macOS compatibility
(pgrep -c flag not available on BSD/macOS)
Correctness fixes:
- stopSession: return (wasRunning, error) to distinguish "stopped" vs "not running"
- daemon.IsRunning: handle error instead of ignoring with blank identifier
- stopBdDaemons/stopBdActivityProcesses: guard against negative killed counts
Safety fixes:
- --nuke: require GT_NUKE_ACKNOWLEDGED=1, don't just warn and proceed
- pkill patterns: document limitation about broad matching
Code cleanup:
- EnsureBdDaemonHealth: remove unused issues variable
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
The swarm dispatch command now always spawns fresh polecats instead of
searching for idle ones to reuse. With the self-cleaning model, polecats
self-nuke when done - there are no idle polecats to reuse.
Closes: gt-h4yc3
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a polecat runs `gt done` with COMPLETED status, it now nukes its own
worktree before exiting. This is the self-cleaning model - polecats clean
up after themselves, reducing Witness/Deacon cleanup burden.
The self-nuke is:
- Only attempted for polecats (not Mayor/Witness/Deacon/Refinery)
- Only on COMPLETED status (not ESCALATED/DEFERRED)
- Non-fatal: if it fails, Witness will handle cleanup
Closes: gt-fqcst
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
gt done now always exits the session. The --exit flag is removed since
exit is the only sensible behavior - polecats don't stay alive after
signaling completion.
Closes: gt-yrz4k
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The --state flag is meant for quick state checks and cannot be
combined with --hook, --dry-run, or --explain flags.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Extract prime.go into focused files:
- prime_session.go: session ID handling, hooks, persistence
- prime_output.go: all output/rendering functions
- prime_molecule.go: molecule workflow context
- prime_state.go: handoff markers, session state detection
Main prime.go now ~730 lines with core flow visible as "table of contents".
No behavior changes - pure file organization following Go idioms.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fresh installs and rig adds were creating full CLAUDE.md files (285 lines
for mayor, ~100 lines for other roles), causing gt doctor to fail the
priming check immediately.
Per the priming architecture, CLAUDE.md should be a minimal bootstrap
pointer (<30 lines) that tells agents to run gt prime. Full context is
injected ephemerally at session start.
Changes:
- install.go: createMayorCLAUDEmd now writes 12-line bootstrap pointer
- manager.go: createRoleCLAUDEmd now writes role-specific bootstrap pointers
for mayor, refinery, crew, and polecat roles
Note: The AGENTS.md issue mentioned in #316 could not be reproduced - the
code does not appear to create AGENTS.md at rig level. May be from an older
version or different configuration.
Partial fix for #316
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add three new flags to gt prime command:
- --state: Output role state as JSON and exit early (for scripting)
- --dry-run: Skip side effects (persistence, locks, events)
- --explain: Show verbose role detection reasoning
The --state flag is mutually exclusive with all other flags and errors
if combined. The other flags (--dry-run, --explain, --hook) can be
combined freely.
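The exclusivity rule can be expressed as a small validation step. `validatePrimeFlags` and its error text are illustrative assumptions. The actual check lives in runPrime and may be worded differently.

```go
package main

import (
	"errors"
	"fmt"
)

// validatePrimeFlags rejects --state in combination with any other flag.
// The remaining flags may be combined freely.
func validatePrimeFlags(state, hook, dryRun, explain bool) error {
	if state && (hook || dryRun || explain) {
		return errors.New("--state cannot be combined with other flags")
	}
	return nil
}

func main() {
	fmt.Println(validatePrimeFlags(true, false, true, false))
	fmt.Println(validatePrimeFlags(false, true, true, true))
}
```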
Also fixes missing filepath import in beads.go.
Closes: bd-t8ven
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
detectSessionState() and checkSlungWork() both contained identical
logic for finding hooked/in_progress beads assigned to an agent.
Extracted this into findHookedBead() helper function.
Also includes priming subsystem improvements from mayor:
- Add --dry-run flag for testing without side effects
- Add --state flag to output detected state only
- Add --explain flag to show why sections are included
- Add missing filepath import to beads.go
Fixes: bd-hvwnb
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Phase 1 of dynamic priming subsystem:
1. PRIME.md provisioning for all workers (hq-5z76w, hq-ukjrr Part A)
   - Added ProvisionPrimeMD to beads package with Gas Town context template
   - Provision at rig level in AddRig() so all workers inherit it
   - Added fallback provisioning in crew and polecat managers
   - Created PRIME.md for existing rigs
2. Post-handoff detection to prevent handoff loop bug (hq-ukjrr Part B)
   - Added FileHandoffMarker constant (.runtime/handoff_to_successor)
   - gt handoff writes marker before respawn
   - gt prime detects marker and outputs "HANDOFF COMPLETE" warning
   - Marker cleared after detection to prevent duplicate warnings
3. Priming health checks for gt doctor (hq-5scnt)
   - New priming_check.go validates priming subsystem configuration
   - Checks: SessionStart hook, gt prime command, PRIME.md presence
   - Warns if CLAUDE.md is too large (should be bootstrap pointer)
   - Fixable: provisions missing PRIME.md files
This ensures crew workers get Gas Town context (GUPP, hooks, propulsion)
even if the gt prime hook fails, via bd prime fallback.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When `gt formula run` fell back to the default "gastown" rig (because no
rig could be detected), it didn't set rigPath, which meant the default
formula lookup would fail. Now rigPath is properly constructed when we
have townRoot but can't detect a current rig.
Also adds tests for GetDefaultFormula helper.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Allow `gt formula run` to be called without a formula name by configuring
a default in the rig's settings/config.json under workflow.default_formula.
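A rough sketch of how the default might be resolved from the settings JSON. Only the `workflow.default_formula` key comes from the message. The `rigSettings` struct, the `resolveFormula` helper, and the formula names in `main` are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// rigSettings models only the part of settings/config.json used here.
type rigSettings struct {
	Workflow struct {
		DefaultFormula string `json:"default_formula"`
	} `json:"workflow"`
}

// resolveFormula prefers an explicit argument and falls back to the
// configured default when none is given.
func resolveFormula(arg string, settingsJSON []byte) (string, error) {
	if arg != "" {
		return arg, nil
	}
	var s rigSettings
	if err := json.Unmarshal(settingsJSON, &s); err != nil {
		return "", fmt.Errorf("parse settings: %w", err)
	}
	if s.Workflow.DefaultFormula == "" {
		return "", errors.New("no formula given and no workflow.default_formula configured")
	}
	return s.Workflow.DefaultFormula, nil
}

func main() {
	cfg := []byte(`{"workflow": {"default_formula": "example-formula"}}`)
	name, err := resolveFormula("", cfg)
	fmt.Println(name, err)
}
```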
Co-authored-by: Brett VanderVeen <brett.vanderveen@gfs.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Import beads' UX design system into gastown:
- Add internal/ui/ package with Ayu theme colors and semantic styling
  - styles.go: AdaptiveColor definitions for light/dark mode
  - terminal.go: TTY detection, NO_COLOR/CLICOLOR support
  - markdown.go: Glamour rendering with agent mode bypass
  - pager.go: Smart paging with GT_PAGER support
- Add colorized help output (internal/cmd/help.go)
  - Group headers in accent color
  - Command names styled for scannability
  - Flag types and defaults muted
- Add gt thanks command (internal/cmd/thanks.go)
  - Contributor display with same logic as bd thanks
  - Styled with Ayu theme colors
- Update gt doctor to match bd doctor UX
  - Category grouping (Core, Infrastructure, Rig, Patrol, etc.)
  - Semantic icons (✓ ⚠ ✖) with Ayu colors
  - Tree connectors for detail lines
  - Summary line with pass/warn/fail counts
  - Warnings section at end with numbered issues
- Migrate existing styles to use ui package
  - internal/style/style.go uses ui.ColorPass etc.
  - internal/tui/feed/styles.go uses ui package colors
Co-Authored-By: SageOx <ox@sageox.ai>
hq-u0ach: done.go - Add --cleanup-status flag so agents can pass cleanup
status directly. Removes computeCleanupStatus() which violated ZFC by
having Go compute cleanup status from git state.
hq-z0zqw: beads.go - Remove strings.Contains parsing for ErrNotARepo and
ErrSyncConflict. Per ZFC, Go should transport errors to agents, not parse
them to make decisions. IsBeadsRepo() now uses file existence check.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a bead is closed externally via bd close, it could remain on
an agent's hook, causing confusion when running gt hook. Now
gt hook detects closed beads and shows a warning message with
instructions to clear the hook using gt unsling.
Closes: gt-8w0r6
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The gt hook command wasn't finding hooked beads for town-level roles
(mayor, deacon) because of an identity format mismatch:
- When hooking a bead, resolveSelfTarget() sets assignee with trailing
slash (e.g., "mayor/")
- When querying, buildAgentIdentity() returned without slash ("mayor")
This caused the assignee filter to miss the hooked bead since bd does
exact matching on the assignee field.
Fix:
- Update buildAgentIdentity() to return "mayor/" and "deacon/" with
trailing slash, matching the format used when setting assignee
- Update isTownLevelRole() to accept both formats for compatibility
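The mismatch and the compatibility fix can be illustrated with two small helpers. `townIdentity` and `isTownLevel` are simplified stand-ins for buildAgentIdentity() and isTownLevelRole(), whose real signatures are not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// townIdentity always returns the trailing-slash form, matching what is
// written to the assignee field when a bead is hooked.
func townIdentity(role string) string {
	return strings.TrimSuffix(role, "/") + "/"
}

// isTownLevel accepts both "mayor" and "mayor/" (and likewise for deacon).
func isTownLevel(identity string) bool {
	switch strings.TrimSuffix(identity, "/") {
	case "mayor", "deacon":
		return true
	}
	return false
}

func main() {
	fmt.Println(townIdentity("mayor"))
	fmt.Println(isTownLevel("deacon"), isTownLevel("deacon/"))
}
```

Because bd matches the assignee exactly, normalizing at the point where the query is built keeps both sides of the comparison identical.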
Fixes: gt-g6ng2
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two ZFC fixes:
1. Boot marker file (hq-zee5n): Changed IsRunning() to query
tmux.HasSession() directly instead of checking marker file
freshness with TTL. Removed stale marker check from doctor.
2. Branch pattern matching (hq-zwuh6): Replaced hardcoded "polecat/"
strings with constants.BranchPolecatPrefix for consistency.
Also removed 60-second WaitForCommand blocking from crew Start()
which was causing gt crew start to hang.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Extends the --agent flag with a more general --env flag that allows
setting arbitrary environment variables when starting a witness.
Precedence (highest to lowest):
1. CLI --env overrides
2. Role bead env_vars
3. config.AgentEnv() defaults
Examples:
gt witness start greenplace --env ANTHROPIC_MODEL=claude-3-haiku
gt witness restart greenplace --env DEBUG=1 --env VERBOSE=true
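The precedence order above amounts to a layered map merge in which later layers win. The helpers below are a sketch. `mergeEnv` and `parseEnvFlags` are hypothetical names, and the sample values are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeEnv applies layers in order, so later layers override earlier ones.
func mergeEnv(layers ...map[string]string) map[string]string {
	out := make(map[string]string)
	for _, layer := range layers {
		for k, v := range layer {
			out[k] = v
		}
	}
	return out
}

// parseEnvFlags turns repeated KEY=VALUE flag values into a map.
func parseEnvFlags(pairs []string) (map[string]string, error) {
	out := make(map[string]string, len(pairs))
	for _, p := range pairs {
		k, v, ok := strings.Cut(p, "=")
		if !ok || k == "" {
			return nil, fmt.Errorf("invalid --env value %q (want KEY=VALUE)", p)
		}
		out[k] = v
	}
	return out, nil
}

func main() {
	defaults := map[string]string{"GT_ROLE": "witness", "DEBUG": "0"}
	roleBead := map[string]string{"DEBUG": "1", "VERBOSE": "false"}
	cli, err := parseEnvFlags([]string{"VERBOSE=true"})
	if err != nil {
		fmt.Println(err)
		return
	}
	// Lowest precedence first, highest last.
	env := mergeEnv(defaults, roleBead, cli)
	fmt.Println(env["GT_ROLE"], env["DEBUG"], env["VERBOSE"])
}
```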
Co-authored-by: joshuavial <git@codewithjv.com>
Crew workspaces use clones with redirected beads directories, like
polecat and refinery. They should bypass the bd daemon for fresh
data and isolation.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Create centralized AgentEnv function as single source of truth for all
agent environment variables. All agents now consistently receive:
- GT_ROLE, BD_ACTOR, GIT_AUTHOR_NAME (role identity)
- GT_ROOT, BEADS_DIR (workspace paths)
- GT_RIG, GT_POLECAT/GT_CREW (rig-specific identity)
- BEADS_AGENT_NAME, BEADS_NO_DAEMON (beads config)
- CLAUDE_CONFIG_DIR (optional account selection)
Remove RoleEnvVars in favor of AgentEnvSimple wrapper.
Remove IncludeBeadsEnv flag - beads env vars always included.
Update all manager and cmd call sites to use AgentEnv.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds a new `gt doctor` check that verifies tmux session environment
variables match expected values from `config.RoleEnvVars()`.
- Checks all Gas Town sessions (gt-*, hq-*)
- Compares actual tmux env vars against expected for each role
- Reports mismatches with guidance to restart sessions
- Treats no sessions as success (valid when Gas Town is down)
- Skips deacon (doesn't use standard env vars)
Also:
- Adds `tmux.GetAllEnvironment()` to retrieve all session env vars
- Removes redundant gtroot_check (env-vars check covers GT_ROOT)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Introduces config.RoleEnvVars() as the single source of truth for role
identity environment variables (GT_ROLE, GT_RIG, BD_ACTOR, etc.).
CLI improvements:
- Fix getRoleHome paths (witness has no /rig suffix, polecat/crew do)
- Make gt role env read-only (displays current role from env/cwd)
- Add EnvIncomplete handling: fill missing env vars from cwd with warning
- Add cwd mismatch warnings when not in role home directory
- gt role home now validates --polecat requires --rig
Includes comprehensive e2e tests for all role detection scenarios.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Start crew members concurrently instead of sequentially. Previously,
`gt crew start --all` could hang for minutes because each crew member
was started one at a time, with each waiting up to 60 seconds for
Claude to initialize.
With parallel startup, all crew members start simultaneously and
the total wait time is bounded by the slowest individual startup.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add detection for when the installed gt binary is out of date with the
source repository. This helps catch issues where commands fail mysteriously
because the installed binary doesn't have recent fixes.
Changes:
- Add internal/version package with stale binary detection logic
- Add startup warning in PersistentPreRunE when binary is stale
- Add gt doctor check for stale-binary
- Use prefix matching for commit comparison (handles short vs full hash)
The warning is non-blocking and only shows once per shell session via
the GT_STALE_WARNED environment variable.
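Prefix matching between a short and a full hash can be sketched as below. `sameCommit` is a hypothetical helper, and treating empty input as "no match" is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// sameCommit reports whether two commit hashes refer to the same commit
// when one of them may be abbreviated.
func sameCommit(a, b string) bool {
	a, b = strings.TrimSpace(a), strings.TrimSpace(b)
	if a == "" || b == "" {
		return false
	}
	return strings.HasPrefix(a, b) || strings.HasPrefix(b, a)
}

func main() {
	full := "3f2a9c1d8e7b6a5f4e3d2c1b0a9f8e7d6c5b4a39"
	fmt.Println(sameCommit("3f2a9c1", full))
	fmt.Println(sameCommit("deadbee", full))
}
```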
Resolves: gt-ud912
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When routing-based verification (verifyBeadExists) fails due to
routes.jsonl configuration issues, gt sling now falls back to pattern
matching via looksLikeBeadID to accept valid bead ID formats.
The fix ensures:
1. verifyBeadExists is tried first (routing-based lookup)
2. verifyFormulaExists is tried second (formula check)
3. looksLikeBeadID pattern match is used as final fallback
Also improved looksLikeBeadID to accept any 1-5 letter lowercase
prefix followed by hyphen and alphanumeric chars.
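The described pattern could be written as a regular expression like the one below. The exact expression is an assumption, in particular that the part after the hyphen is lowercase alphanumeric with no further separators. `beadIDLike` is a stand-in name for looksLikeBeadID.

```go
package main

import (
	"fmt"
	"regexp"
)

// beadIDPattern: 1-5 lowercase letters, a hyphen, then one or more
// lowercase alphanumeric characters.
var beadIDPattern = regexp.MustCompile(`^[a-z]{1,5}-[a-z0-9]+$`)

// beadIDLike is a format check only; it does not prove the bead exists.
func beadIDLike(s string) bool {
	return beadIDPattern.MatchString(s)
}

func main() {
	for _, id := range []string{"bd-xxx", "gt-9e8s5", "toolong-abc", "gt-"} {
		fmt.Println(id, beadIDLike(id))
	}
}
```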
Fixes: gt sling bd-xxx failing with "not a valid bead or formula"
when the bead exists but routing cannot find it.
Closes: gt-9e8s5
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add three layers of protection to prevent accidental branch switches in
the town root (~/gt), which should always stay on main:
1. Doctor check `town-root-branch`: Verifies town root is on main/master.
Fixable via `gt doctor --fix` to switch back to main.
2. Doctor check `pre-checkout-hook`: Verifies git pre-checkout hook is
installed. The hook blocks checkout from main to any other branch.
Fixable via `gt doctor --fix` or `gt git-init`.
3. Runtime warning in all gt commands: Non-blocking warning if town root
is on wrong branch, with fix instructions.
The root cause of this issue was git commands running in the wrong
directory, switching the town root to a polecat branch. This broke gt
commands because rigs.json and other configs were on main, not the
polecat branch.
Closes: hq-1kwuj
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive crash logging improvements to help diagnose mass session death events:
- Add TypeSessionDeath and TypeMassDeath event types for feed visibility
- Log pre-death events before killing sessions (who killed, why)
- Add mass death detection in daemon (3+ deaths in 30s triggers alert)
- Add macOS crash report check in gt doctor
- Support session death events in townlog and feed curator
Closes hq-kt1o6
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add BeadsCustomTypes constant ("agent,role,rig,convoy,slot") to avoid
hardcoded strings scattered across the codebase
- Add CustomTypesCheck to gt doctor that verifies Gas Town custom types
are registered with beads, with --fix support
- Register custom types during gt init (best-effort, skips if no beads)
- Update install.go, rig_check.go, and rig/manager.go to use the constant
This ensures consistent type registration across all code paths and
catches misconfigured beads databases via gt doctor.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Agents were confused when receiving "gt prime" as their first prompt,
interpreting it as a command to investigate rather than understanding
they were starting a Gas Town session.
Changed crew_at.go, start.go, and handoff.go to use FormatStartupNudge()
which produces a proper beacon like:
[GAS TOWN] george/crew/george <- human • 2026-01-09T10:30 • start
The SessionStart hook (gt prime --hook) still injects context - the
prompt just needs to be something agents recognize as a greeting.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
gt shutdown was not stopping the daemon, which caused it to restart
agents (witnesses, refineries) after shutdown completed. The daemon
heartbeats every 3 minutes and calls ensureWitnessesRunning() and
ensureRefineriesRunning(), which would notice the sessions were dead
and restart them.
This adds daemon stop logic to both runGracefulShutdown (as Phase 6)
and runImmediateShutdown (after polecat cleanup), matching the behavior
that gt down already has.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>