Add tests for:
- extractPatrolRole() - various title format cases
- PatrolDigest struct - date format and field access
- PatrolCycleEntry struct - field access
Covers pure functions; bd-dependent functions would need mocking.
Fixes: gt-bm9nx5
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When messages are sent to a channel, subscribers now receive a copy
in their inbox with [channel:name] prefix in the subject.
Closes: gt-3rldf6
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds three new subcommands to `gt mail channel`:
- subscribe <name>: Subscribe current identity to a channel
- unsubscribe <name>: Unsubscribe current identity from a channel
- subscribers <name>: List all subscribers to a channel
These commands expose the existing beads.SubscribeToChannel and
beads.UnsubscribeFromChannel functions through the CLI.
Closes gt-77334r
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Checks if a 'Patrol Report YYYY-MM-DD' bead already exists before
attempting to create a new one. This prevents confusing output when
the patrol digest runs multiple times per day.
Fixes: gt-budqv9
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
resolveByName() only checked config-based queues/channels, missing
beads-native ones (gt:queue, gt:channel). Added lookup for both.
Also added LookupQueueByName to beads package for parity with
LookupChannelByName.
Fixes: gt-l5qbi3
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Prevents gt broadcast from nudging the sender's own session,
which would interrupt the command mid-execution with exit 137.
Fixes: gt-y5ss
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When repoBase() fails in RemoveWithOptions, the function previously
returned early after removing the directory but without calling
WorktreePrune(). This could leave stale worktree entries in
.git/worktrees/ if the polecat was created before the repo base
became unavailable.
Now we attempt to prune from both possible repo locations (bare repo
and mayor/rig) before the early return. This is a best-effort cleanup
that handles edge cases where the repo base is corrupted but worktree
entries still exist.
Resolves: gt-wisp-618ar
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Polish help text across all agent commands to clarify roles:
- crew: persistent workspaces vs ephemeral polecats
- deacon: town-level watchdog receiving heartbeats
- dog: cross-rig infrastructure workers (cats vs dogs)
- mayor: Chief of Staff for cross-rig coordination
- nudge: universal synchronous messaging API
- polecat: ephemeral one-task workers, self-cleaning
- refinery: merge queue serializer per rig
- witness: per-rig polecat health monitor
Add comprehensive gt nudge documentation to crew template explaining
when to use nudge vs mail, common patterns, and target shortcuts.
Add orphan-process-cleanup step to deacon patrol formula to clean up
claude subagent processes that fail to exit (TTY = "?").
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cost tracking infrastructure works but has no data source:
- Claude Code displays costs in TUI status bar, not scrollback
- tmux capture-pane can't see TUI chrome
- All sessions show $0.00
Changes:
- Mark gt costs command as [DISABLED] with deprecation warnings
- Mark costs-digest patrol step as [DISABLED] with skip instructions
- Document requirement for Claude Code to expose CLAUDE_SESSION_COST
Infrastructure preserved for re-enabling when Claude Code adds support.
Ref: GH#24, gt-7awfjq
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Per-cycle patrol digests were polluting JSONL with O(cycles/day) beads.
Apply the same pattern used for cost digests:
- Make per-cycle squash digests ephemeral (not exported to JSONL)
- Add 'gt patrol digest' command to aggregate into daily summary
- Add patrol-digest step to deacon patrol formula
Daily cadence reduces noise while preserving observability.
Closes: gt-nbmceh
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds a workflow formula for Gas Town releases with:
- Workspace preflight checks (uncommitted work, stashes, branches)
- CHANGELOG.md and info.go versionChanges updates
- Version bump via bump-version.sh
- Local install and daemon restart
- Error handling guidance for crew vs polecat execution
Polecats must use `gt done` which goes through the Refinery merge queue.
The Refinery handles serialization, rebasing, and conflict resolution.
Added explicit "Polecats do NOT" list:
- Push directly to main (WRONG)
- Create pull requests
- Wait around to see if work merges
This addresses the failure mode where polecats push directly to main
instead of using the Refinery, causing merge conflicts that the
Refinery is designed to handle.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updated polecat and crew templates to more explicitly address the
"waiting for approval" anti-pattern. LLMs naturally want to pause
and confirm before taking action, but Gas Town requires autonomous
execution.
Polecat template:
- Added "The Specific Failure Mode" section describing the exact
anti-pattern (complete work, write summary, wait)
- Added "The Self-Cleaning Model" section explaining done=gone
- Strengthened DO NOT list with explicit approval-seeking examples
Crew template:
- Added "The Approval Fallacy" section at the top
- Explains that there is no approval step in Gas Town
- Lists specific anti-patterns to avoid
These changes address the root cause of polecats sitting idle after
completing work instead of running `gt done`.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The pool state file was saving CustomNames even though Load() ignored
them (CustomNames come from settings/config.json). This caused the
state file to have stale/incorrect custom names data.
Changes:
- Create namePoolState struct for persisting only OverflowNext/MaxSize
- Save() now only writes runtime state, not configuration
- Load() uses the same struct for consistency
- Removed redundant runtime pool update from runNamepoolAdd since
the settings file is the source of truth for custom names
Fixes: gt-ofqzwv
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When gt down --all killed all Gas Town sessions, if those were the only
tmux sessions, the server would exit due to tmux's default exit-empty
setting. Users perceived this as gt down --all killed my tmux server.
Fix: Set exit-empty off before killing sessions, ensuring the server
stays running for subsequent gt up commands. The --nuke flag still
explicitly kills the server when requested.
Fixes: gt-kh8w47
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, `gt up` and `gt rig start` would start witnesses and
refineries for parked/docked rigs, bypassing the operational status
protection. Only the daemon respected the wisp config status.
Now both commands check wisp config status before starting agents:
- `gt up` shows "skipped (rig parked)" for parked/docked rigs
- `gt rig start` warns and skips parked/docked rigs
This prevents accidentally bringing parked/docked rigs back online
when running routine commands.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The CloneDivergenceCheck was calling git fetch for each clone without
a timeout, causing gt doctor to hang indefinitely when network or
authentication issues occurred. Removed the fetch - divergence detection
now uses existing local refs (may be stale but won't block).
Fixes: gt-aoklf8
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The bd dep add command was failing with only "exit status 1" shown
because stderr wasn't being captured. Now shows actual error message.
Fixes: gt-g8eqq5
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two issues fixed:
1. Worktree directory cleanup used os.Remove() which only removes empty
directories. Changed to os.RemoveAll() to clean up untracked files
left behind by git worktree remove (overlay files, .beads/, etc.)
2. Branch deletion hardcoded mayor/rig but worktrees are created from
.repo.git when using bare repo architecture. Now checks for bare
repo first to match where the branch was created.
Fixes: gt-6ab3cm
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The gt namepool add command was replacing custom_names instead of
appending because it saved to the runtime state file, but Load()
intentionally ignores CustomNames from that file (expecting config
to come from settings/config.json).
Changes:
- runNamepoolAdd now loads existing settings, appends the new name,
and saves to settings/config.json (the source of truth)
- runNamepoolSet now preserves existing custom names when changing
themes (was passing nil which cleared them)
- Added duplicate check to avoid adding same name twice
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix 'add-issue' command to 'add' with correct syntax including convoy-id
- Add explanation that bead IDs and issue IDs are interchangeable terms
- Standardize convoy command parameters to match actual CLI help
Closes: gt-u7qb6p
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace placeholder issue-123 style IDs with realistic bead ID format
(prefix + 5-char alphanumeric, e.g., gt-abc12). Add explanation of bead
ID format in Beads Integration section. Update command references and
mermaid diagrams to use consistent "bead" terminology.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When running from a crew workspace, BEADS_DIR is set to the rig's beads
directory. This caused auto-convoy creation to fail because bd would use
the rig's database (prefix=bd) instead of discovering the HQ database
(prefix=hq) from the working directory.
The fix clears BEADS_DIR from the environment when running bd commands
for convoy creation, allowing bd to discover the correct database from
the townBeads directory.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixes#525: gt up reports deacon success but session doesn't actually start
Previously, WaitForCommand failures were marked as "non-fatal" in the
manager Start() methods used by gt up. This caused gt up to report
success even when Claude failed to start, because the error was silently
ignored.
Now when WaitForCommand or WaitForRuntimeReady times out:
1. The zombie tmux session is killed
2. An error is returned to the caller
3. gt up properly reports the failure
This aligns the manager Start() behavior with the cmd start functions
(e.g., gt deacon start) which already had fatal WaitForCommand behavior.
Changed files:
- internal/deacon/manager.go
- internal/mayor/manager.go
- internal/witness/manager.go
- internal/refinery/manager.go
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Convoy beads use hq-cv-* IDs for visual distinction from other town beads.
The routes.jsonl entry was being added but allowed_prefixes config was not,
causing bd create --id=hq-cv-xxx to fail prefix validation.
This adds the allowed_prefixes config (hq,hq-cv) during initTownBeads so
convoy creation works out of the box after gt install.
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
The Start() function was returning success even if the pane died during
initialization (e.g., if Claude failed to start). This caused the caller
to get a confusing "getting pane" error when trying to use the session.
Now Start() verifies the session is still running at the end, returning
a clear error message if the session died during startup.
Fixes: gt-0cif0s
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a polecat runs gt done, the worktree is removed but the parent
polecat directory could be left behind containing only .beads/. This
caused gt polecat list to show ghost entries since exists() checks
if the polecatDir exists.
The fix adds explicit cleanup of .beads/ directories:
1. After git worktree remove succeeds, clean up any leftover .beads/
in the clonePath that was not fully removed
2. For new structure polecats, also clean up any .beads/ at the
polecatDir level before trying to remove the parent directory
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds show subcommand to gt bead that delegates to gt show (which
delegates to bd show). This completes gt-zdwy58.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add prominent warnings about the mandatory gt done requirement:
- New 'THE IDLE POLECAT HERESY' section at top of both templates
- Emphasize that sitting idle after completing work is a critical failure
- Add MANDATORY labels to completion protocol sections
- Add final reminder section before metadata block
This addresses the bug where polecats complete work but don't run gt done,
sitting idle and wasting resources instead of properly shutting down.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The gt orphans kill command now performs a unified cleanup that removes
orphaned commits via git gc AND kills orphaned Claude processes in one
operation, with a single confirmation prompt.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pin bd (beads CLI) to v0.47.1 in CI workflows and fix test agent IDs
that trigger bd's isLikelyHash() prefix extraction logic.
Changes:
- Pin bd to v0.47.1 in ci.yml and integration.yml (v0.47.2 has routing
defaults that cause prefix mismatch errors)
- Fix TestCloseAndClearAgentBead_FieldClearing: change agent IDs from
`test-testrig-polecat-0` to `test-testrig-polecat-all_fields_populated`
- Fix TestCloseAndClearAgentBead_ReasonVariations: change agent IDs from
`test-testrig-polecat-reason0` to `test-testrig-polecat-empty_reason`
Root cause: bd v0.47.1's isLikelyHash() treats suffixes of 3-8 chars
(with digits for 4+ chars) as potential git hashes. Patterns like `-0`
(single digit) and `-reason0` (7 chars with digit) caused bd to extract
the wrong prefix from agent IDs.
Using test names as suffixes (e.g., `all_fields_populated`) avoids this
because they're all >8 characters and won't trigger hash detection.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Formula scaffold beads (created when formulas are installed) were
appearing as actionable work items in `gt ready`. These are template
beads, not actual work.
Add filtering to exclude issues whose ID:
- Matches a formula name exactly (e.g., "mol-deacon-patrol")
- Starts with "<formula-name>." (step scaffolds like "mol-deacon-patrol.inbox-check")
The fix reads the formulas directory to get installed formula names
and filters issues accordingly for both town and rig beads.
Fixes: gt-579
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat: Add automatic orphaned claude process cleanup
Claude Code's Task tool spawns subagent processes that sometimes don't clean up
properly after completion. These accumulate and consume significant memory
(observed: 17 processes using ~6GB RAM).
This change adds automatic cleanup in two places:
1. **Deacon patrol** (primary): New patrol step "orphan-process-cleanup" runs
`gt deacon cleanup-orphans` early in each cycle. More responsive (~30s).
2. **Daemon heartbeat** (fallback): Runs cleanup every 3 minutes as safety net
when deacon is down.
Detection uses TTY column - processes with TTY "?" have no controlling terminal.
This is safe because:
- Processes in terminals (user sessions) have a TTY like "pts/0" - untouched
- Only kills processes with no controlling terminal
- Orphaned subagents are children of tmux server with no TTY
New files:
- internal/util/orphan.go: FindOrphanedClaudeProcesses, CleanupOrphanedClaudeProcesses
- internal/util/orphan_test.go: Tests for orphan detection
New command:
- `gt deacon cleanup-orphans`: Manual/patrol-triggered cleanup
Fixes#587
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(orphan): add Windows build tag and minimum age check
Addresses review feedback on PR #588:
1. Add //go:build !windows to orphan.go and orphan_test.go
- The code uses Unix-specific syscalls (SIGTERM, ESRCH) and
ps command options that don't exist on Windows
2. Add minimum age check (60 seconds) to prevent false positives
- Prevents race conditions with newly spawned subagents
- Addresses reviewer concern about cron/systemd processes
- Uses portable etime format instead of Linux-only etimes
3. Add parseEtime helper with comprehensive tests
- Parses [[DD-]HH:]MM:SS format (works on both Linux and macOS)
- etimes (seconds) is Linux-specific, etime is portable
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(orphan): add proper SIGTERM→SIGKILL escalation with state tracking
Previous approach used process age which doesn't work: a Task subagent
runs without TTY from birth, so a long-running legitimate subagent that
later fails to exit would be immediately SIGKILLed without trying SIGTERM.
New approach uses a state file to track signal history:
1. First encounter → SIGTERM, record PID + timestamp in state file
2. Next cycle (after 60s grace period) → if still alive, SIGKILL
3. Next cycle → if survived SIGKILL, log as unkillable and remove
State file: $XDG_RUNTIME_DIR/gastown-orphan-state (or /tmp/)
Format: "<pid> <signal> <unix_timestamp>" per line
The state file is automatically cleaned up:
- Dead processes removed on load
- Unkillable processes removed after logging
Also updates callers to use new CleanupResult type which includes
the signal sent (SIGTERM, SIGKILL, or UNKILLABLE).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Fixes#566
The daemon spawned 812 refinery sessions over 4 days because:
1. Zombie detection was too strict - used IsAgentRunning(session, "node")
but Claude reports pane command as version number (e.g., "2.1.7"),
causing healthy sessions to be killed and recreated every heartbeat.
2. daemon.json patrol config was completely ignored - the daemon never
loaded or checked the enabled flags.
Changes:
- refinery/manager.go: Use IsClaudeRunning() instead of IsAgentRunning()
for robust Claude detection (handles "node", "claude", version patterns)
- daemon/types.go: Add PatrolConfig types and LoadPatrolConfig() to read
mayor/daemon.json
- daemon/daemon.go: Load patrol config at startup, check enabled flags
before calling ensureRefineriesRunning/ensureWitnessesRunning, add
diagnostic logging for "already running" cases
Tested: Verified over multiple heartbeats that refinery shows "already
running, skipping spawn" instead of spawning new sessions.
Co-authored-by: mayor <your-github-email@example.com>
Each rig now gets a deterministic theme based on its name instead of
always defaulting to mad-max. Uses a prime multiplier hash (×31) for
good distribution across themes. Same rig name always gets the same
theme. Users can still override with `gt namepool set`.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: stabilize beads and config tests
* fix: remove t.Parallel() incompatible with t.Setenv()
The test now uses t.Setenv() which cannot be used with t.Parallel() in Go.
This completes the conflict resolution from the rebase.
* style: fix gofmt issue in beads_test.go
Remove extra blank line in comment block.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
- gt hook --clear: alias for 'gt unhook' (gt-eod2iv)
- gt close: wrapper for 'bd close' (gt-msak6o)
- gt bead move: move beads between repos (gt-dzdbr7)
These commands were natural guesses that agents tried but didn't exist.
Following the desire-paths approach to improve agent ergonomics.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When `gt rig add` fails due to GitHub password auth being disabled,
provide a helpful error message that:
- Explains that GitHub no longer supports password authentication
- Suggests the equivalent SSH URL for GitHub/GitLab repos
- Falls back to generic SSH suggestion for other hosts
Also adds tests for the URL conversion function.
Fixes#548
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
When attaching to a session from within tmux, use 'tmux switch-client'
instead of 'tmux attach-session' to avoid the nested session error.
Fixes#603
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(daemon): prevent runaway refinery session spawning
Fixes#566
The daemon spawned 812 refinery sessions over 4 days because:
1. Zombie detection was too strict - used IsAgentRunning(session, "node")
but Claude reports pane command as version number (e.g., "2.1.7"),
causing healthy sessions to be killed and recreated every heartbeat.
2. daemon.json patrol config was completely ignored - the daemon never
loaded or checked the enabled flags.
Changes:
- refinery/manager.go: Use IsClaudeRunning() instead of IsAgentRunning()
for robust Claude detection (handles "node", "claude", version patterns)
- daemon/types.go: Add PatrolConfig types and LoadPatrolConfig() to read
mayor/daemon.json
- daemon/daemon.go: Load patrol config at startup, check enabled flags
before calling ensureRefineriesRunning/ensureWitnessesRunning, add
diagnostic logging for "already running" cases
Tested: Verified over multiple heartbeats that refinery shows "already
running, skipping spawn" instead of spawning new sessions.
* fix: Add grace period to prevent Deacon restart loop
The daemon had a race condition where:
1. ensureDeaconRunning() starts a new Deacon session
2. checkDeaconHeartbeat() runs in same heartbeat cycle
3. Heartbeat file is stale (from before crash)
4. Session is immediately killed
5. Infinite restart loop every 3 minutes
Fix:
- Track when Deacon was last started (deaconLastStarted field)
- Skip heartbeat check during 5-minute grace period
- Add config support for Deacon (consistency with refinery/witness)
After grace period, normal heartbeat checking resumes. Genuinely
stuck sessions (no heartbeat update after 5+ min) are still detected.
Fixes#589
---------
Co-authored-by: mayor <your-github-email@example.com>
When JSON parsing of inbox output fails, the code falls back to plain
text mode. However, the error from the fallback `gt mail inbox` command
was being silently ignored with `_`, masking failures and making
debugging difficult.
This change properly captures and returns the error if the fallback
command fails.
Co-authored-by: Gastown Bot <bot@gastown.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>