When gt done runs inside a tmux session (e.g., after polecat task
completion), calling KillSessionWithProcesses would kill the gt done
process itself before it could complete cleanup operations like writing
handoff state.
Add KillSessionWithProcessesExcluding() function that accepts a list of
PIDs to exclude from the kill sequence. Update selfKillSession to pass
its own PID, ensuring gt done completes before the session is destroyed.
Also fix both Kill*WithProcesses functions to ignore "session not found"
errors from KillSession - when we kill all processes in a session, tmux
may automatically destroy it before we explicitly call KillSession.
Co-authored-by: julianknutsen <julianknutsen@users.noreply.github>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
When starting crew without mayor running, tmux has-session can return
"no current target" if no tmux server exists. This error was not
mapped to ErrNoServer, causing crew start to fail instead of
bootstrapping a new tmux server.
Add "no current target" to the ErrNoServer detection so crew (and
other agents) can start independently without requiring an existing
tmux session.
Reproduction:
- Ensure no tmux server running (tmux kill-server)
- Run: gt crew start <rig>/<name>
- Before fix: "Error: checking session: tmux has-session: no current target"
- After fix: Crew session starts successfully
Co-authored-by: Shaun <TechnicallyShaun@users.noreply.github.com>
When gt down --all killed all Gas Town sessions, if those were the only
tmux sessions, the server would exit due to tmux's default exit-empty
setting. Users perceived this as gt down --all killed my tmux server.
Fix: Set exit-empty off before killing sessions, ensuring the server
stays running for subsequent gt up commands. The --nuke flag still
explicitly kills the server when requested.
Fixes: gt-kh8w47
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
KillSessionWithProcesses was only killing descendant processes,
assuming the session kill would terminate the pane process itself.
However, if the pane process (claude) calls setsid(), it detaches
from the controlling terminal and survives the session kill.
This fix explicitly kills the pane PID after killing descendants,
before killing the tmux session. This catches processes that have
escaped the process tree via setsid().
Fixes#513
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
When multiple agents start simultaneously (e.g., `gt up`), each runs
`gt nudge deacon session-started` in their SessionStart hook. These
nudges arrive concurrently and can interleave in the tmux input buffer,
causing:
1. Text from one nudge mixing with another
2. Enter keys not properly submitting messages
3. Garbled input requiring manual intervention
This fix adds per-session mutex serialization to NudgeSession() and
NudgePane(). Concurrent nudges to the same session now queue and
execute one at a time.
## Root Cause
The NudgeSession pattern sends text, waits 500ms, sends Escape, waits
100ms, then sends Enter. When multiple nudges arrive within this ~800ms
window, their send-keys commands interleave, corrupting the input.
## Alternatives Considered
1. **Delay deacon nudges** - Add sleep before nudge in SessionStart
- Simplest (one-line change)
- But: doesn't prevent concurrent nudges from multiple agents
2. **Debounce session-started** - Deacon ignores rapid-fire nudges
- Medium complexity
- But: only helps session-started, not other nudge types
3. **File-based signaling** - Replace tmux nudges with file watches
- Avoids tmux input issues entirely
- But: significant architectural change
4. **File upstream bug** - Report to Claude Code team
- SessionStart hooks fire async and can interleave
- But: fix timeline unknown, need robustness now
## Tradeoffs
- Concurrent nudges to same session now queue (adds latency)
- Memory: one mutex per unique session name (bounded, acceptable)
- Does not fix Claude Code's underlying async hook behavior
## Testing
- Build passes
- All tmux package tests pass
- Manual testing: started deacon + multiple witnesses concurrently,
nudges processed correctly without garbled input
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* fix(beads): cache version check and add timeout to prevent cli lag
* fix(mail_queue): add nil check for queue config
Prevents potential nil pointer panic when queue config exists
in map but has nil value. Added || queueCfg == nil check to
the queue lookup condition in runMailClaim function.
Fixes potential panic that could occur if a queue entry exists
in config but with a nil value.
* fix(migrate_agents_test): fix icon expectations to match actual output
The printMigrationResult function uses icons with two leading spaces
(" ✓", " ⊘", " ✗") but the test expected icons without spaces.
This fixes the test expectations to match the actual output format.
* fix(hook): handle error from events.LogFeed
Previously the error from LogFeed was silently ignored with _.
Now we log the error to stderr at warning level but don't fail
the operation since the primary hook action succeeded.
* fix(tmux): security and error handling improvements
- Fix unchecked regexp error in IsClaudeRunning (CVE-like)
- Add input sanitization to SetPaneDiedHook to prevent shell injection
- Add session name validation to SetDynamicStatus
- Sanitize mail from/subject in SendNotificationBanner
- Return error on parse failure in GetEnvironment
- Track skipped lines in ListSessionIDs for debuggability
See: tmux.fix for full analysis
* fix(daemon): improve error handling and security
- Capture stderr in syncWorkspace for better debuggability
- Fail fast on git fetch failures to prevent stale code
- Add logging to previously silent bd list errors
- Change notification state file permissions to 0600
- Improve error messages with actual stderr content
This prevents agents from starting with stale code and provides
better visibility into daemon operations.
* perf(tmux): batch session queries in gt down to reduce N+1 subprocess calls
Add SessionSet type to tmux package for O(1) session existence checks.
Instead of calling HasSession() (which spawns a subprocess) for each
rig/session during shutdown, now calls ListSessions() once and uses
in-memory map lookups.
Changes:
- internal/tmux/tmux.go: Add SessionSet type with GetSessionSet() and Has()
- internal/cmd/down.go: Use SessionSet for dry-run checks and session stops
- internal/session/town.go: Add StopTownSessionWithCache() variant
- internal/tmux/tmux_test.go: Add test for SessionSet
With 5 rigs, this reduces subprocess calls from ~15 to 1 during shutdown
preview, saving 60-150ms of execution time.
Closes: gt-xh2bh
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* perf(tmux): optimize SessionSet to avoid intermediate slice allocation
- Build map directly from tmux output instead of calling ListSessions()
- Use strings.IndexByte for efficient newline parsing
- Pre-size map using newline count to avoid rehashing
- Simplify nil checks in Has() and Names()
* fix(sling): restore bd cook directory context for formula-on-bead mode
The bd cook command needs to run from the target rig's directory to
access the correct formula database. This was accidentally removed
in a previous commit, causing TestSlingFormulaOnBeadRoutesBDCommandsToTargetRig
to fail.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
The previous implementation used `pkill -P pid` which only kills direct
children. When Claude spawns subprocesses (like node workers), those
grandchild processes would become orphaned (PPID=1) when their parent
was killed, causing them to survive `gt shutdown -fa`.
The fix recursively finds all descendant processes and kills them in
deepest-first order, ensuring no process becomes orphaned during
shutdown.
Fixes: gt-wd3ce
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Merge polecat/nux-mkd083ff: Updates KillSessionWithProcesses to use
cleaner inline exec.Command style and improved documentation.
Prevents orphan processes that survive tmux kill-session due to
SIGHUP being ignored by Claude processes.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Before calling tmux kill-session, explicitly kill the pane's process tree
using pkill. This ensures claude processes don't survive session termination
due to SIGHUP being caught/ignored.
Implementation:
- Add KillSessionWithProcesses() to tmux.go
- Update killSessionsInOrder() in start.go to use new method
- Update stopSession() in down.go to use new method
Fixes: gt-5r7zr
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
NudgeSession and NudgePane now send Escape key before Enter to exit
vim INSERT mode if enabled. Harmless in normal mode.
Fixes#307
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
IsClaudeRunning now checks for child processes when the pane command is
a shell (bash/zsh). This fixes gt crew start --all killing running crew
members that were started with "export ... && claude ..." commands.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds a new `gt doctor` check that verifies tmux session environment
variables match expected values from `config.RoleEnvVars()`.
- Checks all Gas Town sessions (gt-*, hq-*)
- Compares actual tmux env vars against expected for each role
- Reports mismatches with guidance to restart sessions
- Treats no sessions as success (valid when Gas Town is down)
- Skips deacon (doesn't use standard env vars)
Also:
- Adds `tmux.GetAllEnvironment()` to retrieve all session env vars
- Removes redundant gtroot_check (env-vars check covers GT_ROOT)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Agent sessions would fail on startup because send-keys arrived before the
shell was ready, causing 'bad pattern' and 'command not found' errors.
Fix: Create sessions with the command directly using tmux new-session's
command argument. This runs the agent as the pane's initial process,
avoiding shell readiness timing issues entirely.
Updated all agent managers: mayor, deacon, witness, refinery, polecat, crew.
Also fixes pre-existing build error in polecat/manager.go (polecatPath →
clonePath/newClonePath).
Closes#280
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds EnableMouseMode() and calls it from ConfigureGasTownSession so all
new GT sessions get mouse support. Users can now click panes, scroll with
mouse wheel, and resize by dragging. Hold Shift for terminal text selection.
Closes#33
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds support for alternative AI runtime backends (Codex, OpenCode) alongside
the default Claude backend through a runtime abstraction layer.
- internal/runtime/runtime.go - Runtime-agnostic helper functions
- Extended RuntimeConfig with provider-specific settings
- internal/opencode/ for OpenCode plugin support
- Updated session managers to use runtime abstraction
- Removed unused ensureXxxSession functions
- Fixed daemon.go indentation, updated terminology to runtime
Backward compatible: Claude remains default runtime.
Co-Authored-By: Ben Kraus <ben@cinematicsoftware.com>
Co-Authored-By: Cameron Palmer <cameronmpalmer@users.noreply.github.com>
Claude Code can report its pane command as "node", "claude", or a version
number like "2.0.76". Previously only "node" was detected, causing healthy
sessions to be incorrectly identified as zombies and killed during daemon
heartbeat recovery.
This fix detects all three patterns to prevent witness sessions from being
killed every 3 minutes.
Based on michaellady's work in PR #174.
Co-Authored-By: michaellady <michaellady@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat: add Cursor Agent as compatible agent for Gas Town
Add AgentCursor preset with ProcessNames field for multi-agent detection:
- AgentCursor preset: cursor-agent -p -f (headless + force mode)
- ProcessNames field on AgentPresetInfo for agent detection
- IsAgentRunning(session, processNames) in tmux package
- GetProcessNames(agentName) helper function
Closes: ga-vwr
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor: centralize agent preset list in config.go
Replace hardcoded ["claude", "gemini", "codex"] arrays with calls to
config.ListAgentPresets() to dynamically include all registered agents.
This fixes cursor agent not appearing in `gt config agent list` and
ensures new agent presets are automatically included everywhere.
Also updated doc comments to include "cursor" in example lists.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* test: add comprehensive agent client tests
Add tests for agent detection and command generation:
- TestIsAgentRunning: validates process name detection for all agents
(claude/node, gemini, codex, cursor-agent)
- TestIsAgentRunning_NonexistentSession: edge case handling
- TestIsClaudeRunning: backwards compatibility wrapper
- TestListAgentPresetsMatchesConstants: ensures ListAgentPresets()
returns all AgentPreset constants
- TestAgentCommandGeneration: validates full command line generation
for all supported agents
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat: add Auggie agent, fix Cursor interactive mode
Add Auggie CLI as supported agent:
- Command: auggie
- Args: --allow-indexing
- Supports session resume via --resume flag
Fix Cursor agent configuration:
- Remove -p flag (requires prompt, breaks interactive mode)
- Clear SessionIDEnv (cursor uses --resume with chatId directly)
- Keep -f flag for force/YOLO mode
Updated all test cases for both agents.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat(agents): add Sourcegraph AMP as agent preset
Add AgentAmp constant and builtinPresets entry for Sourcegraph AMP CLI.
Configuration:
- Command: amp
- Args: --dangerously-allow-all --no-ide
- ResumeStyle: subcommand (amp threads continue <threadId>)
- ProcessNames: amp
Closes: ga-guq
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: lint error in cleanBeadsRuntimeFiles
Change function to not return error (was always nil).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: beads v0.46.0 compatibility and test fixes
- Add custom types config (agent,role,rig,convoy,event) after bd init calls
- Fix tmux_test.go to use variadic IsAgentRunning signature
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: update agent documentation for new presets
- README.md: Update agent examples to show cursor/auggie, add built-in presets list
- docs/reference.md: Add cursor, auggie, amp to built-in agents list
- CHANGELOG.md: Add entry for new agent presets under [Unreleased]
Addresses PR #247 review feedback.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
When sending messages to Claude sessions via tmux, the Enter key send
could fail silently. This caused polecats to receive their initial
prompt but never submit it - the message appeared in Claude's input
area but Enter was never pressed.
Add retry logic (up to 3 attempts with 200ms delays) for the Enter
send step in both NudgeSession() and NudgePane(). This ensures message
submission is more reliable even if tmux has transient issues.
Fixes#41
tmux has-session -t does prefix matching by default, so "gt-deacon-boot"
would match when checking for "gt-deacon". This caused gt start to think
the Deacon was running when only a stale gt-deacon-boot session existed.
Using "=" prefix forces exact matching in tmux.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The daemon was failing to restart agents when zombie tmux sessions existed
(session alive but Claude dead). Added EnsureSessionFresh() helper to
tmux package that:
- Checks if session exists
- If exists but Claude not running (zombie), kills the session
- Creates fresh session
Updated all daemon session creation points to use EnsureSessionFresh:
- ensureDeaconRunning()
- ensureWitnessRunning()
- restartPolecatSession()
- restartSession() in lifecycle.go
Added tests for the new helper function. (gt-j1i0r)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When Claude starts with --dangerously-skip-permissions, it shows a warning
dialog requiring Down+Enter to accept. This blocked automated polecat and
agent startup.
Added AcceptBypassPermissionsWarning() to tmux package that:
- Checks if the warning dialog is present by capturing pane content
- Only sends Down+Enter if "Bypass Permissions mode" text is found
- Avoids interfering with sessions that don't show the warning
Updated all Claude startup locations:
- session/manager.go (polecat sessions)
- cmd/up.go (mayor, witness, crew, polecat cold starts)
- daemon/daemon.go (crashed polecat restarts)
- daemon/lifecycle.go (role session starts)
Use conditional bindings (if-shell) to check if session name starts
with "gt-" before running Gas Town commands. For non-GT sessions:
- C-b n/p fall back to default next-window/previous-window
- C-b a shows a help message
This prevents Gas Town from hijacking keybindings in unrelated tmux
sessions that happen to share the same tmux server.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added centralized emoji constants in internal/constants/constants.go
for easy customization. Updated all files that hardcoded role emojis
to use the new constants.
Changes:
- Witness emoji: 👁 (eye) → 🦉 (owl) - less creepy, matches visual identity
- Deacon emoji: 🦉 (owl) → 🐺 (wolf) - matches hierarchy prompt (wolf in engine room)
- Added RoleEmoji() helper function for string-based lookups
- Maintained backwards compatibility with legacy role names
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Boot is a watchdog that the daemon pokes instead of Deacon directly,
centralizing the 'when to wake Deacon' decision in an agent that can
reason about context.
Key changes:
- Add internal/boot package with marker file and status tracking
- Add gt boot commands: status, spawn, triage
- Add mol-boot-triage formula for Boot's triage cycle
- Modify daemon to call ensureBootRunning instead of ensureDeaconRunning
- Add tmux.IsAvailable() for degraded mode detection
- Add boot.md.tmpl role template
Boot lifecycle:
1. Daemon tick spawns Boot (fresh each time)
2. Boot runs triage: observe, decide, act
3. Boot cleans stale handoffs from Deacon inbox
4. Boot exits (or handoffs in non-degraded mode)
In degraded mode (no tmux), Boot runs mechanical triage directly.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Before: gt crew at only looked for tmux sessions with the specific naming
convention gt-<rig>-crew-<name>. If the user started Claude manually or
via a different mechanism, it would create a duplicate session.
After: Before creating a new session, check if any existing tmux session
has Claude running in the crews directory. If found, attach to that
session instead of creating a new one.
Changes:
- Add FindSessionByWorkDir() to internal/tmux/tmux.go to search sessions
by working directory, optionally filtering for Claude (node) running
- Update runCrewAt() to check for existing sessions before creating new
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added constants.SupportedShells for consistent shell list
- Updated 7 usages across start.go, crew_lifecycle.go, crew_helpers.go, tmux.go
- All tests pass
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Extended the unified cycle system to include rig infrastructure sessions:
- Witness ↔ Refinery (per rig) now cycle with C-b n/p
Also moved SetCycleBindings into ConfigureGasTownSession so ALL Gas Town
sessions automatically get the unified cycle bindings. Removed redundant
individual calls from crew, mayor, and deacon startup code.
Cycle groups are now:
- Town: Mayor ↔ Deacon
- Crew (per rig): All crew members in same rig
- Infra (per rig): Witness ↔ Refinery
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The previous implementation set separate bindings for crew vs town
sessions, but tmux key bindings are global. This meant whichever
session type was started last would overwrite the other's bindings.
New approach:
- Add unified `gt cycle next/prev` command that auto-detects session type
- Town sessions (gt-mayor, gt-deacon) cycle within town group
- Crew sessions (gt-*-crew-*) cycle within their rig's crew
- Other sessions (polecats, witness, refinery) do nothing on cycle
The old SetCrewCycleBindings and SetTownCycleBindings are now aliases
for the unified SetCycleBindings function.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Create gt town next/prev commands for cycling between town-level sessions
- Add SetTownCycleBindings() to tmux package
- Wire up bindings when starting mayor and deacon sessions
Now mayor and deacon have the same C-b n/p cycling behavior as crew workers.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, gt doctor --fix would kill workers whose spawning process had
exited, even though the Claude session was still running in tmux.
Now both identity_check.go and CleanStaleLocks check if the tmux session
exists before declaring a lock stale. A lock is only truly stale if BOTH
the PID is dead AND the session does not exist.
Also added ListSessionIDs() to tmux package to handle locks that store
session IDs (%N or $N format) instead of session names.
Replace SendKeys approach with respawn-pane -k when starting Claude
in crew sessions. This gives cleaner exit behavior:
- Before: Claude exits → shell prompt → exit shell → session ends
- After: Claude exits → session ends (no intermediate shell)
Changes:
- Add GetPaneID() to tmux package for pane ID retrieval
- Update crew_at.go to use RespawnPane for both new and restart cases
- Remove unnecessary waits and multi-step Claude startup
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
All 156 instances of _ = error suppression in non-test code now have
explanatory comments documenting why the error is intentionally ignored.
Categories of intentional suppressions:
- non-fatal: session works without these - tmux environment setup
- non-fatal: theming failure does not affect operation - visual styling
- best-effort cleanup - defer cleanup on failure paths
- best-effort notification - mail/notifications that should not block
- best-effort interrupt - graceful shutdown attempts
- crypto/rand.Read only fails on broken system - random ID generation
- output errors non-actionable - fmt.Fprint to io.Writer
This addresses the silent failure and debugging concerns raised in the
issue by making the intentionality explicit in the code.
Generated with Claude Code https://claude.com/claude-code
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds ClearHistory method to tmux package and calls it before
respawn-pane during handoff. This resets copy-mode display
from [0/N] to [0/0] for a clean session start.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When running `gt crew at <other>` from inside tmux, the linked window
would auto-select, causing users to unknowingly switch agents. Add -d
flag to keep user in their current window.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Uses tmux respawn-pane to kill and restart agent sessions in place,
bypassing the handoff/manager flow for quick context cycling.
Supports local recycle (current session) and remote recycle (other
roles) with automatic client switching.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>