Commit Graph

16 Commits

Author SHA1 Message Date
gastown/crew/joe
5823c9fb36 fix(down): prevent tmux server exit when all sessions killed
When gt down --all killed all Gas Town sessions, if those were the only
tmux sessions, the server would exit due to tmux's default exit-empty
setting. Users perceived this as gt down --all killed my tmux server.

Fix: Set exit-empty off before killing sessions, ensuring the server
stays running for subsequent gt up commands. The --nuke flag still
explicitly kills the server when requested.

Fixes: gt-kh8w47

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 01:34:38 -08:00
Keith Wyatt
08755f62cd perf(tmux): batch session queries in gt down (#477)
* perf(tmux): batch session queries in gt down to reduce N+1 subprocess calls

Add SessionSet type to tmux package for O(1) session existence checks.
Instead of calling HasSession() (which spawns a subprocess) for each
rig/session during shutdown, now calls ListSessions() once and uses
in-memory map lookups.

Changes:
- internal/tmux/tmux.go: Add SessionSet type with GetSessionSet() and Has()
- internal/cmd/down.go: Use SessionSet for dry-run checks and session stops
- internal/session/town.go: Add StopTownSessionWithCache() variant
- internal/tmux/tmux_test.go: Add test for SessionSet

With 5 rigs, this reduces subprocess calls from ~15 to 1 during shutdown
preview, saving 60-150ms of execution time.

Closes: gt-xh2bh

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* perf(tmux): optimize SessionSet to avoid intermediate slice allocation

- Build map directly from tmux output instead of calling ListSessions()
- Use strings.IndexByte for efficient newline parsing
- Pre-size map using newline count to avoid rehashing
- Simplify nil checks in Has() and Names()

* fix(sling): restore bd cook directory context for formula-on-bead mode

The bd cook command needs to run from the target rig's directory to
access the correct formula database. This was accidentally removed
in a previous commit, causing TestSlingFormulaOnBeadRoutesBDCommandsToTargetRig
to fail.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-13 22:07:05 -08:00
Johann Dirry
5d96243414 fix: Windows build support with platform-specific process/signal handling
Separate platform-dependent code into build-tagged files:
- process_unix.go / process_windows.go: isProcessRunning() implementation
- signals_unix.go / signals_windows.go: daemon signal handling (Windows lacks SIGUSR1)

Windows implementation uses windows.OpenProcess with PROCESS_QUERY_LIMITED_INFORMATION
and checks exit code against STILL_ACTIVE (259).

Original-PR: #447
Co-Authored-By: Johann Dirry <johann.dirry@microsea.at>
2026-01-13 20:59:15 -08:00
nux
e043f4a16c feat(tmux): add KillSessionWithProcesses for explicit process termination
Before calling tmux kill-session, explicitly kill the pane's process tree
using pkill. This ensures claude processes don't survive session termination
due to SIGHUP being caught/ignored.

Implementation:
- Add KillSessionWithProcesses() to tmux.go
- Update killSessionsInOrder() in start.go to use new method
- Update stopSession() in down.go to use new method

Fixes: gt-5r7zr

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-13 11:53:57 -08:00
gastown/crew/dennis
b9e8be4352 fix(lint): resolve errcheck and unparam violations
Fixes CI lint failures by handling unchecked error returns and marking
unused parameters with blank identifiers.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 18:06:09 -08:00
gastown/crew/max
dab619b3d0 feat(down): add --polecats flag and deprecate gt stop command
Issue #336: Consolidate down/shutdown/stop commands

Changes:
- Add `gt down --polecats` flag to stop all polecat sessions
- Deprecate `gt stop` command (prints warning, directs to `gt down --polecats`)
- Update help text to clarify down vs shutdown distinction:
  - down = pause (reversible, keeps worktrees)
  - shutdown = done (permanent cleanup)
- Integrate --polecats with new --dry-run mode from recent PR

Note: The issue proposed renaming --nuke to --tmux, but PR #330 just
landed with --nuke having better safety (GT_NUKE_ACKNOWLEDGED env var),
so keeping --nuke as-is. The new --polecats flag absorbs gt stop
functionality as proposed.

Closes #336

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 23:04:37 -08:00
Subhrajit Makur
6a705f6210 Feat/gt down tests (#15) (#18) (#330)
* fix(down): add refinery shutdown to gt down

Refineries were not being stopped by gt down, causing them to continue
running after shutdown. This adds a refinery shutdown loop before
witnesses, fixing problem P3 from the v2.4 proposal.

Changes:
- Add Phase 1: Stop refineries (gt-<rig>-refinery sessions)
- Renumber existing phases (witnesses now Phase 2, etc.)
- Include refineries in halt event logging

* feat(beads): add StopAllBdProcesses for shutdown

Add functions to stop bd daemon and bd activity processes:
- StopAllBdProcesses(dryRun, force) - main entry point
- CountBdDaemons() - count running bd daemons
- CountBdActivityProcesses() - count running bd activity processes
- stopBdDaemons() - uses bd daemon killall
- stopBdActivityProcesses() - SIGTERM->wait->SIGKILL pattern

This solves problems P1 (bd daemon respawns sessions) and P2 (bd activity
causes instant wakeups) from the v2.4 proposal.

* feat(down): rename --all to --nuke, add new --all and --dry-run flags

BREAKING CHANGE: --all now stops bd processes instead of killing tmux server.
Use --nuke for the old --all behavior (killing the entire tmux server).

New flags:
- --all: Stop bd daemons/activity processes and verify shutdown
- --nuke: Kill entire tmux server (DESTRUCTIVE, with warning)
- --dry-run: Preview what would be stopped without taking action

This solves problem P4 (old --all was too destructive) from the v2.4 proposal.

The --nuke flag now requires GT_NUKE_ACKNOWLEDGED=1 environment variable
to suppress the warning about destroying all tmux sessions.

* feat(down): add shutdown lock to prevent concurrent runs

Add Phase 0 that acquires a file lock before shutdown to prevent race
conditions when multiple gt down commands are run concurrently.

- Uses gofrs/flock for cross-platform file locking
- Lock file stored at ~/gt/daemon/shutdown.lock
- 5 second timeout with 100ms retry interval
- Lock released via defer on successful acquisition
- Dry-run mode skips lock acquisition

This solves problem P6 (concurrent shutdown race) from the v2.4 proposal.

* feat(down): add verification phase for respawn detection

Add Phase 5 that verifies shutdown was complete after stopping all services:
- Waits 500ms for processes to fully terminate
- Checks for respawned bd daemons
- Checks for respawned bd activity processes
- Checks for remaining gt-*/hq-* tmux sessions
- Checks if daemon PID is still running

If anything respawned, warns user and suggests checking systemd/launchd.

This solves problem P5 (no verification) from the v2.4 proposal.

* test(down): add unit tests for shutdown functionality

Add tests for:
- parseBdDaemonCount() - array, object with count, object with daemons, empty, invalid
- CountBdActivityProcesses() - integration test
- CountBdDaemons() - integration test (skipped if bd not installed)
- StopAllBdProcesses() - dry-run mode test
- isProcessRunning() - current process, invalid PID, max PID

These tests cover the core parsing and process detection logic added
in the v2.4 shutdown enhancement.

* fix(review): add tmux check and pkill fallback for bd shutdown

Address review gaps against proposal v2.4 AC:

- AC1: Add tmux availability check BEFORE acquiring shutdown lock
- AC2: Add pkill fallback for bd daemon when killall incomplete
- AC2: Return remaining count from stop functions for error reporting
- Style: interface{} → any (Go 1.18+)



* fix(prime): add validation for --state flag combination

The --state flag should be standalone and not combined with other flags.
Add validation at start of runPrime to enforce this.

Fixes TestPrimeFlagCombinations test failures.

* fix(review): address bot review critical issues

- isProcessRunning: handle pid<=0 as invalid (return false)
- isProcessRunning: handle EPERM as process exists (return true)
- stopBdDaemons: prevent negative killed count from race conditions
- stopBdActivityProcesses: prevent negative killed count from race conditions



* fix(review): critical fixes from deep review

Platform fixes:
- CountBdActivityProcesses: use sh -c "pgrep | wc -l" for macOS compatibility
  (pgrep -c flag not available on BSD/macOS)

Correctness fixes:
- stopSession: return (wasRunning, error) to distinguish "stopped" vs "not running"
- daemon.IsRunning: handle error instead of ignoring with blank identifier
- stopBdDaemons/stopBdActivityProcesses: guard against negative killed counts

Safety fixes:
- --nuke: require GT_NUKE_ACKNOWLEDGED=1, don't just warn and proceed
- pkill patterns: document limitation about broad matching

Code cleanup:
- EnsureBdDaemonHealth: remove unused issues variable



---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 22:56:33 -08:00
PV
c299f44aea refactor(session): extract town session helpers for DRY shutdown
Previously, town-level session stopping (Mayor, Boot, Deacon) was
implemented inline in gt down with separate code blocks for each
session. The shutdown order (Boot must stop before Deacon to prevent
the watchdog from restarting Deacon) was implicit in the code ordering.

Add session.TownSessions() and session.StopTownSession() to centralize
town-level session management. This provides a single source of truth
for the session list, shutdown order, and graceful/force logic.

Refactor gt down to use these helpers instead of inline logic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 00:17:37 -08:00
max
1b69576573 fix: Address golangci-lint errors (errcheck, gosec) (#76)
Apply PR #76 from dannomayernotabot:

- Add golangci exclusions for internal package false positives
- Tighten file permissions (0644 -> 0600) for sensitive files
- Add ReadHeaderTimeout to HTTP server (slowloris prevention)
- Explicit error ignoring with _ = for intentional cases
- Add //nolint comments with justifications
- Spelling: cancelled -> canceled (US locale)

Co-Authored-By: dannomayernotabot <noreply@github.com>

🤖 Generated with Claude Code
2026-01-03 16:11:55 -08:00
markov-kernel
e7145cfd77 fix: Make Mayor/Deacon session names include town name
Session names `gt-mayor` and `gt-deacon` were hardcoded, causing tmux
session name collisions when running multiple towns simultaneously.

Changed to `gt-{town}-mayor` and `gt-{town}-deacon` format (e.g.,
`gt-ai-mayor`) to allow concurrent multi-town operation.

Key changes:
- session.MayorSessionName() and DeaconSessionName() now take townName param
- Added workspace.GetTownName() helper to load town name from config
- Updated all callers in cmd/, daemon/, doctor/, mail/, rig/, templates/
- Updated tests with new session name format
- Bead IDs remain unchanged (already scoped by .beads/ directory)

Fixes #60

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 21:37:05 +01:00
gastown/polecats/rictus
637e959c48 Add Boot shutdown to gt down command (gt-xyzb5)
Boot is spawned by the daemon as the Deacons watchdog. During shutdown
Boot was never explicitly stopped. Now gt down stops Boot between Mayor
and Deacon (step 3) to ensure clean shutdown.

Generated with Claude Code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-30 22:12:10 -08:00
Steve Yegge
907eba7194 Instrument gt commands to emit events to activity feed (gt-7aw1m)
Add events.LogFeed calls to the following gt commands:
- nudge.go: Log TypeNudge events when nudging agents
- unsling.go: Log TypeUnhook events when removing work from hooks
- up.go: Log TypeBoot events when starting Gas Town services
- down.go: Log TypeHalt events when stopping Gas Town services
- stop.go: Log TypeKill events when stopping polecat sessions
- polecat_spawn.go: Log TypeSpawn events when spawning polecats

Also add helper functions to events package:
- UnhookPayload: Creates payload for unhook events
- KillPayload: Creates payload for kill events
- HaltPayload: Creates payload for halt events

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-30 10:43:34 -08:00
Steve Yegge
34b5a3bb8d Document intentional error suppressions with comments (gt-zn9m)
All 156 instances of _ = error suppression in non-test code now have
explanatory comments documenting why the error is intentionally ignored.

Categories of intentional suppressions:
- non-fatal: session works without these - tmux environment setup
- non-fatal: theming failure does not affect operation - visual styling
- best-effort cleanup - defer cleanup on failure paths
- best-effort notification - mail/notifications that should not block
- best-effort interrupt - graceful shutdown attempts
- crypto/rand.Read only fails on broken system - random ID generation
- output errors non-actionable - fmt.Fprint to io.Writer

This addresses the silent failure and debugging concerns raised in the
issue by making the intentionality explicit in the code.

Generated with Claude Code https://claude.com/claude-code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-25 23:14:29 -08:00
Steve Yegge
f86a73c2f0 feat: Add gcloud-style command grouping to gt help output
Organize 43 commands into 7 logical groups using cobra's built-in
AddGroup/GroupID feature:

- Work Management: spawn, sling, hook, handoff, done, mol, mq, etc.
- Agent Management: mayor, witness, refinery, deacon, polecat, etc.
- Communication: mail, nudge, broadcast, peek
- Services: daemon, start, stop, up, down, shutdown
- Workspace: rig, crew, init, install, git-init, namepool
- Configuration: account, theme, hooks, issue, completion
- Diagnostics: status, doctor, prime, version, help

Also renamed molecule to mol as the primary command name
(molecule is now an alias).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-25 02:32:01 -08:00
Steve Yegge
be5509fb18 gt down --all: also kill tmux server 2025-12-23 02:04:17 -08:00
Steve Yegge
da12531d3d feat: add gt up/down commands and daemon doctor check
New commands:
- `gt up` - Idempotent boot command that brings up all services:
  Daemon, Deacon, Mayor, and Witnesses for all rigs
- `gt down` - Graceful shutdown of all services

Doctor improvements:
- New daemon check verifies daemon is running
- Fixable with `gt doctor --fix` to auto-start daemon

The system can run degraded (any services down) but `gt up` ensures
a fully operational Gas Town with one idempotent command.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 02:26:09 -08:00