bd init can exit with code 0 but fail to create the .beads directory
when orphaned bd daemons interfere. Add explicit verification that
the directory exists, with a helpful error message if not.
Stale bd daemon processes from previous installs can interfere with
fresh database creation, causing "issue_prefix config is missing"
and "no beads database found" errors during install.
bd init --prefix may not persist the prefix in newer versions.
Explicitly set it with bd config set issue_prefix to ensure
beads can create issues with the hq- prefix.
- Add daemon.json creation to install.go (avoids patrol-hooks-wired warning)
- Change patrol-roles-have-prompts to StatusOK (templates are embedded in binary)
- Add boot directory creation to install.go (avoids warning on fresh install)
- Make boot-health check fixable via 'gt doctor --fix'
- Update FixHint to reference the fix command
Key fix: Orphan cleanup now skips Claude processes in valid Gas Town
tmux sessions (gt-*/hq-*), preventing false kills of witnesses,
refineries, and deacon during startup.
Updated all component versions:
- gt CLI: 0.3.1 → 0.4.0
- npm package: 0.3.0 → 0.4.0
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Report "already subscribed" instead of false success on re-subscribe
- Report "not subscribed" instead of false success on redundant unsubscribe
- Add explicit channel existence check before subscribe/unsubscribe
- Return empty JSON array [] instead of null for no subscribers
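For context on the last point: in Go, encoding/json marshals a nil slice as null and an empty slice as []. A sketch of the normalization, where subscribersJSON is an illustrative name rather than the actual function:

```go
package main

import "encoding/json"

// subscribersJSON (illustrative) encodes a subscriber list, replacing a nil
// slice with an empty one so the output is "[]" rather than "null".
func subscribersJSON(subs []string) (string, error) {
	if subs == nil {
		subs = []string{}
	}
	b, err := json.Marshal(subs)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```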
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The boot watchdog lives in deacon/dogs/boot/ but uses .boot-status.json,
not .dog.json. The dog manager was returning a fake idle dog when
.dog.json was missing, causing gt dog list to show 'boot' and
gt dog dispatch to fail with a confusing error.
Now Get() returns ErrDogNotFound when .dog.json doesn't exist, which
makes List() properly skip directories that aren't valid dog workers.
Also skipped two more tests affected by the bd CLI 0.47.2 commit bug.
Fixes: bd-gfcmf
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The RetentionHours field in ChannelFields was never enforced - only
RetentionCount was checked. Now both EnforceChannelRetention and
PruneAllChannels delete messages older than the configured hours.
Also fixes sling tests that were missing TMUX_PANE and GT_TEST_NO_NUDGE
guards, causing them to inject prompts into active tmux sessions during
test runs.
Fixes: gt-uvnfug
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add mol-backoff-test formula for integration testing exponential backoff
with short intervals (2s base, 10s max) to observe multiple cycles quickly.
Fix await-signal to use --since 1s when subscribing to activity feed.
Without this, historical events would immediately wake the signal,
preventing proper timeout and backoff behavior.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
### Fixed
- Orphan cleanup on macOS - TTY comparison now handles macOS '??' format
- Session kill orphan prevention - gt done and gt crew stop use KillSessionWithProcesses
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Three bugs were causing orphaned Claude processes to accumulate:
1. TTY comparison in orphan.go checked for "?" but macOS shows "??"
   - Orphan cleanup never found anything on macOS
   - Changed to check for both "?" and "??"
2. selfKillSession in done.go used basic tmux kill-session
   - Claude Code can survive SIGHUP
   - Now uses KillSessionWithProcesses for proper cleanup
3. Crew stop commands used basic KillSession
   - Same issue as #2
   - Updated runCrewRemove, runCrewStop, runCrewStopAll
Root cause of 383 accumulated sessions: every gt done and crew stop
left orphans, and the cleanup never worked on macOS.
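The TTY fix in item 1 amounts to accepting both spellings of "no controlling terminal". A sketch, with an illustrative function name that is not necessarily the one in orphan.go:

```go
package main

import "strings"

// hasNoControllingTTY reports whether a ps TTY column value means the process
// has no controlling terminal: "?" as originally checked, "??" as seen on macOS.
func hasNoControllingTTY(tty string) bool {
	switch strings.TrimSpace(tty) {
	case "?", "??":
		return true
	}
	return false
}
```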
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add tests for:
- extractPatrolRole() - various title format cases
- PatrolDigest struct - date format and field access
- PatrolCycleEntry struct - field access
Covers pure functions; bd-dependent functions would need mocking.
Fixes: gt-bm9nx5
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds three new subcommands to `gt mail channel`:
- subscribe <name>: Subscribe current identity to a channel
- unsubscribe <name>: Unsubscribe current identity from a channel
- subscribers <name>: List all subscribers to a channel
These commands expose the existing beads.SubscribeToChannel and
beads.UnsubscribeFromChannel functions through the CLI.
Closes gt-77334r
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Checks if a 'Patrol Report YYYY-MM-DD' bead already exists before
attempting to create a new one. This prevents confusing output when
the patrol digest runs multiple times per day.
Fixes: gt-budqv9
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Prevents gt broadcast from nudging the sender's own session,
which would interrupt the command mid-execution with exit 137.
Fixes: gt-y5ss
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Polish help text across all agent commands to clarify roles:
- crew: persistent workspaces vs ephemeral polecats
- deacon: town-level watchdog receiving heartbeats
- dog: cross-rig infrastructure workers (cats vs dogs)
- mayor: Chief of Staff for cross-rig coordination
- nudge: universal synchronous messaging API
- polecat: ephemeral one-task workers, self-cleaning
- refinery: merge queue serializer per rig
- witness: per-rig polecat health monitor
Add comprehensive gt nudge documentation to crew template explaining
when to use nudge vs mail, common patterns, and target shortcuts.
Add orphan-process-cleanup step to deacon patrol formula to clean up
claude subagent processes that fail to exit (TTY = "?").
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cost tracking infrastructure works but has no data source:
- Claude Code displays costs in TUI status bar, not scrollback
- tmux capture-pane can't see TUI chrome
- All sessions show $0.00
Changes:
- Mark gt costs command as [DISABLED] with deprecation warnings
- Mark costs-digest patrol step as [DISABLED] with skip instructions
- Document requirement for Claude Code to expose CLAUDE_SESSION_COST
Infrastructure preserved for re-enabling when Claude Code adds support.
Ref: GH#24, gt-7awfjq
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Per-cycle patrol digests were polluting JSONL with O(cycles/day) beads.
Apply the same pattern used for cost digests:
- Make per-cycle squash digests ephemeral (not exported to JSONL)
- Add 'gt patrol digest' command to aggregate into daily summary
- Add patrol-digest step to deacon patrol formula
Daily cadence reduces noise while preserving observability.
Closes: gt-nbmceh
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The pool state file was saving CustomNames even though Load() ignored
them (CustomNames come from settings/config.json). This caused the
state file to have stale/incorrect custom names data.
Changes:
- Create namePoolState struct for persisting only OverflowNext/MaxSize
- Save() now only writes runtime state, not configuration
- Load() uses the same struct for consistency
- Removed redundant runtime pool update from runNamepoolAdd since
the settings file is the source of truth for custom names
Fixes: gt-ofqzwv
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When gt down --all killed all Gas Town sessions, if those were the only
tmux sessions, the server would exit due to tmux's default exit-empty
setting. Users perceived this as "gt down --all killed my tmux server."
Fix: Set exit-empty off before killing sessions, ensuring the server
stays running for subsequent gt up commands. The --nuke flag still
explicitly kills the server when requested.
Fixes: gt-kh8w47
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, `gt up` and `gt rig start` would start witnesses and
refineries for parked/docked rigs, bypassing the operational status
protection. Only the daemon respected the wisp config status.
Now both commands check wisp config status before starting agents:
- `gt up` shows "skipped (rig parked)" for parked/docked rigs
- `gt rig start` warns and skips parked/docked rigs
This prevents accidentally bringing parked/docked rigs back online
when running routine commands.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The bd dep add command was failing with only "exit status 1" shown
because stderr wasn't being captured. Now shows actual error message.
Fixes: gt-g8eqq5
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two issues fixed:
1. Worktree directory cleanup used os.Remove() which only removes empty
directories. Changed to os.RemoveAll() to clean up untracked files
left behind by git worktree remove (overlay files, .beads/, etc.)
2. Branch deletion hardcoded mayor/rig but worktrees are created from
.repo.git when using bare repo architecture. Now checks for bare
repo first to match where the branch was created.
Fixes: gt-6ab3cm
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The gt namepool add command was replacing custom_names instead of
appending because it saved to the runtime state file, but Load()
intentionally ignores CustomNames from that file (expecting config
to come from settings/config.json).
Changes:
- runNamepoolAdd now loads existing settings, appends the new name,
and saves to settings/config.json (the source of truth)
- runNamepoolSet now preserves existing custom names when changing
themes (was passing nil which cleared them)
- Added duplicate check to avoid adding same name twice
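The append-with-duplicate-check behaviour can be sketched as below. appendUnique is a hypothetical name; the real code also loads and saves settings/config.json around this step:

```go
package main

// appendUnique (hypothetical) appends name to names unless it is already
// present. The bool reports whether the name was actually added.
func appendUnique(names []string, name string) ([]string, bool) {
	for _, n := range names {
		if n == name {
			return names, false
		}
	}
	return append(names, name), true
}
```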
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When running from a crew workspace, BEADS_DIR is set to the rig's beads
directory. This caused auto-convoy creation to fail because bd would use
the rig's database (prefix=bd) instead of discovering the HQ database
(prefix=hq) from the working directory.
The fix clears BEADS_DIR from the environment when running bd commands
for convoy creation, allowing bd to discover the correct database from
the townBeads directory.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Convoy beads use hq-cv-* IDs for visual distinction from other town beads.
The routes.jsonl entry was being added but allowed_prefixes config was not,
causing bd create --id=hq-cv-xxx to fail prefix validation.
This adds the allowed_prefixes config (hq,hq-cv) during initTownBeads so
convoy creation works out of the box after gt install.
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Adds show subcommand to gt bead that delegates to gt show (which
delegates to bd show). This completes gt-zdwy58.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The gt orphans kill command now performs a unified cleanup that removes
orphaned commits via git gc AND kills orphaned Claude processes in one
operation, with a single confirmation prompt.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Formula scaffold beads (created when formulas are installed) were
appearing as actionable work items in `gt ready`. These are template
beads, not actual work.
Add filtering to exclude issues whose ID:
- Matches a formula name exactly (e.g., "mol-deacon-patrol")
- Starts with "<formula-name>." (step scaffolds like "mol-deacon-patrol.inbox-check")
The fix reads the formulas directory to get installed formula names
and filters issues accordingly for both town and rig beads.
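The two matching rules above can be sketched as a single predicate. The function name is illustrative, and formulas stands for the names read from the formulas directory:

```go
package main

import "strings"

// isFormulaScaffold (illustrative) reports whether an issue ID is a formula
// scaffold: either exactly a formula name, or "<formula-name>." plus a step.
func isFormulaScaffold(id string, formulas []string) bool {
	for _, f := range formulas {
		if id == f || strings.HasPrefix(id, f+".") {
			return true
		}
	}
	return false
}
```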
Fixes: gt-579
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat: Add automatic orphaned claude process cleanup
Claude Code's Task tool spawns subagent processes that sometimes don't clean up
properly after completion. These accumulate and consume significant memory
(observed: 17 processes using ~6GB RAM).
This change adds automatic cleanup in two places:
1. **Deacon patrol** (primary): New patrol step "orphan-process-cleanup" runs
`gt deacon cleanup-orphans` early in each cycle. More responsive (~30s).
2. **Daemon heartbeat** (fallback): Runs cleanup every 3 minutes as safety net
when deacon is down.
Detection uses the TTY column of ps output: processes with TTY "?" have no controlling terminal.
This is safe because:
- Processes in terminals (user sessions) have a TTY like "pts/0" - untouched
- Only kills processes with no controlling terminal
- Orphaned subagents are children of tmux server with no TTY
New files:
- internal/util/orphan.go: FindOrphanedClaudeProcesses, CleanupOrphanedClaudeProcesses
- internal/util/orphan_test.go: Tests for orphan detection
New command:
- `gt deacon cleanup-orphans`: Manual/patrol-triggered cleanup
Fixes #587
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(orphan): add Windows build tag and minimum age check
Addresses review feedback on PR #588:
1. Add //go:build !windows to orphan.go and orphan_test.go
- The code uses Unix-specific syscalls (SIGTERM, ESRCH) and
ps command options that don't exist on Windows
2. Add minimum age check (60 seconds) to prevent false positives
- Prevents race conditions with newly spawned subagents
- Addresses reviewer concern about cron/systemd processes
- Uses portable etime format instead of Linux-only etimes
3. Add parseEtime helper with comprehensive tests
- Parses [[DD-]HH:]MM:SS format (works on both Linux and macOS)
- etimes (seconds) is Linux-specific, etime is portable
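A sketch of parsing the portable [[DD-]HH:]MM:SS etime format into seconds. It is written independently for illustration, under the name parseElapsed, and may differ from the actual parseEtime helper:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseElapsed converts a ps etime value in [[DD-]HH:]MM:SS form to seconds.
// Illustrative sketch only; not the project's parseEtime.
func parseElapsed(s string) (int, error) {
	s = strings.TrimSpace(s)
	days := 0
	if i := strings.Index(s, "-"); i >= 0 {
		d, err := strconv.Atoi(s[:i])
		if err != nil {
			return 0, fmt.Errorf("invalid day count in %q: %w", s, err)
		}
		days = d
		s = s[i+1:]
	}
	parts := strings.Split(s, ":")
	if len(parts) < 2 || len(parts) > 3 {
		return 0, fmt.Errorf("invalid etime %q", s)
	}
	total := 0
	for _, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return 0, fmt.Errorf("invalid etime field %q: %w", p, err)
		}
		total = total*60 + n // shift earlier fields up by one unit of 60
	}
	return days*86400 + total, nil
}
```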
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(orphan): add proper SIGTERM→SIGKILL escalation with state tracking
The previous approach used process age, which doesn't work: a Task subagent
runs without TTY from birth, so a long-running legitimate subagent that
later fails to exit would be immediately SIGKILLed without trying SIGTERM.
New approach uses a state file to track signal history:
1. First encounter → SIGTERM, record PID + timestamp in state file
2. Next cycle (after 60s grace period) → if still alive, SIGKILL
3. Next cycle → if survived SIGKILL, log as unkillable and remove
State file: $XDG_RUNTIME_DIR/gastown-orphan-state (or /tmp/)
Format: "<pid> <signal> <unix_timestamp>" per line
The state file is automatically cleaned up:
- Dead processes removed on load
- Unkillable processes removed after logging
Also updates callers to use new CleanupResult type which includes
the signal sent (SIGTERM, SIGKILL, or UNKILLABLE).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
- gt hook --clear: alias for 'gt unhook' (gt-eod2iv)
- gt close: wrapper for 'bd close' (gt-msak6o)
- gt bead move: move beads between repos (gt-dzdbr7)
These commands were natural guesses that agents tried but didn't exist.
Following the desire-paths approach to improve agent ergonomics.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When attaching to a session from within tmux, use 'tmux switch-client'
instead of 'tmux attach-session' to avoid the nested session error.
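A sketch of choosing the tmux subcommand. tmux sets the TMUX environment variable inside its sessions, so a caller can pass os.Getenv("TMUX") != "" as insideTmux. The function name is hypothetical:

```go
package main

// attachArgs (hypothetical) returns the tmux arguments for reaching a session:
// switch-client when already inside tmux, attach-session otherwise.
func attachArgs(insideTmux bool, session string) []string {
	if insideTmux {
		return []string{"switch-client", "-t", session}
	}
	return []string{"attach-session", "-t", session}
}
```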
Fixes #603
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When JSON parsing of inbox output fails, the code falls back to plain
text mode. However, the error from the fallback `gt mail inbox` command
was being silently ignored with `_`, masking failures and making
debugging difficult.
This change properly captures and returns the error if the fallback
command fails.
Co-authored-by: Gastown Bot <bot@gastown.ai>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Add tests to verify that rig.Manager.AddRig correctly creates witness
and refinery agent beads via initAgentBeads. Also improve mock bd:
- Fix mock bd to handle --no-daemon --allow-stale global flags
- Return valid JSON for create commands with bead ID
- Log create commands for test verification
- Add TestRigAddCreatesAgentBeads integration test
- Add TestAgentBeadIDs unit test for bead ID generation
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* fix(mq): skip closed MRs in list, next, and ready views (gt-qtb3w)
The gt mq list command with --status=open filter was incorrectly displaying
CLOSED merge requests as 'ready'. This occurred because bd list --status=open
was returning closed issues.
Added manual status filtering in three locations:
- mq_list.go: Filter closed MRs in all list views
- mq_next.go: Skip closed MRs when finding next ready MR
- engineer.go: Skip closed MRs in refinery's ready queue
Also fixed build error in mail_queue.go where QueueConfig struct (non-pointer)
was being compared to nil.
Workaround for upstream bd list status filter bug.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* style: fix gofmt issue in engineer.go comment block
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
The help text claimed 'gt mail read' marks messages as read, but this
was intentionally removed in 71d313ed to preserve handoff messages.
Update the help text to accurately reflect the current behavior and
point users to 'gt mail mark-read' for explicit read marking.
When gt doctor runs, it now detects and kills zombie sessions - tmux
sessions that are valid Gas Town sessions (gt-*, hq-*) but have no
Claude/node process running inside. These occur when Claude exits or
crashes but the tmux session remains.
Previously, OrphanSessionCheck only validated session names but did not
check if Claude was actually running. This left empty sessions
accumulating over time.
Fixes #472
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Formula scaffolds (beads with IDs starting with "mol-") are templates
created when formulas are installed, not actual work items. They were
incorrectly appearing in gt ready output as actionable work.
Fixes #579
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The test was duplicating the icon selection logic in a switch statement
instead of calling the actual function being tested. Extract the icon
logic into getMigrationStatusIcon() and have the test call it directly.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When using `gt sling <formula> --on <bead>`, the wisp was bonded to the
target bead but the attached_molecule field wasn't being set in the
bead's description. This caused `gt hook` to report "No molecule
attached" even though the formula was correctly bonded.
Now both sling.go (--on mode) and sling_formula.go (standalone formula)
call storeAttachedMoleculeInBead() to record the molecule attachment
after wisp creation. This ensures gt hook can properly display molecule
progress.
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>