Commit Graph

65 Commits

Author SHA1 Message Date
aleiby
b9f5797b9e fix(refinery): use role-specific runtime config for startup (#756) (#847)
Use ResolveRoleAgentConfig instead of LoadRuntimeConfig to properly
resolve role-specific agent settings (like ready_prompt_prefix) when
starting the refinery. This fixes timeout issues when role_agents.refinery
is configured with a non-default agent.

Also move AcceptBypassPermissionsWarning before WaitForRuntimeReady to
avoid a race condition where the bypass dialog can block prompt detection.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 19:11:39 -08:00
furiosa
78ca8bd5bf fix(witness,refinery): remove ZFC-violating state types
Remove Witness and Refinery structs that recorded observable state
(State, PID, StartedAt, etc.) in violation of ZFC and "Discover,
Don't Track" principles.

Changes:
- Remove Witness struct and State type alias from witness/types.go
- Remove Refinery struct and State type alias from refinery/types.go
- Remove deprecated run(*Refinery) method from refinery/manager.go
- Update witness/types_test.go to remove tests for deleted types

The managers already derive running state from tmux sessions
(following the deacon pattern). The deleted types were vestigial
and unused.

Resolves: gt-r5pui

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 20:45:28 -08:00
gastown/crew/mel
5218102f49 refactor(witness,refinery): ZFC-compliant state management
Remove state files from witness and refinery managers, following
the "Discover, Don't Track" principle. Tmux session existence is
now the source of truth for running state (like deacon).

Changes:
- Add IsRunning() that checks tmux HasSession
- Change Status() to return *tmux.SessionInfo
- Remove loadState/saveState/stateManager
- Simplify Start()/Stop() to not use state files
- Update CLI commands (witness/refinery/rig) for new API
- Update tests to be ZFC-compliant

This fixes state file divergence issues where witness/refinery
could show "running" when the actual tmux session was dead.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 20:00:56 -08:00
aleiby
15d1dc8fa8 fix: Make WaitForCommand/WaitForRuntimeReady fatal in manager Start() (#529)
Fixes #525: gt up reports deacon success but session doesn't actually start

Previously, WaitForCommand failures were marked as "non-fatal" in the
manager Start() methods used by gt up. This caused gt up to report
success even when Claude failed to start, because the error was silently
ignored.

Now when WaitForCommand or WaitForRuntimeReady times out:
1. The zombie tmux session is killed
2. An error is returned to the caller
3. gt up properly reports the failure

This aligns the manager Start() behavior with the cmd start functions
(e.g., gt deacon start) which already had fatal WaitForCommand behavior.

Changed files:
- internal/deacon/manager.go
- internal/mayor/manager.go
- internal/witness/manager.go
- internal/refinery/manager.go

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 00:00:53 -08:00
Walter McGivney
29f8dd67e2 fix: Add grace period to prevent Deacon restart loop (#590)
* fix(daemon): prevent runaway refinery session spawning

Fixes #566

The daemon spawned 812 refinery sessions over 4 days because:

1. Zombie detection was too strict - used IsAgentRunning(session, "node")
   but Claude reports pane command as version number (e.g., "2.1.7"),
   causing healthy sessions to be killed and recreated every heartbeat.

2. daemon.json patrol config was completely ignored - the daemon never
   loaded or checked the enabled flags.

Changes:
- refinery/manager.go: Use IsClaudeRunning() instead of IsAgentRunning()
  for robust Claude detection (handles "node", "claude", version patterns)
- daemon/types.go: Add PatrolConfig types and LoadPatrolConfig() to read
  mayor/daemon.json
- daemon/daemon.go: Load patrol config at startup, check enabled flags
  before calling ensureRefineriesRunning/ensureWitnessesRunning, add
  diagnostic logging for "already running" cases

Tested: Verified over multiple heartbeats that refinery shows "already
running, skipping spawn" instead of spawning new sessions.

* fix: Add grace period to prevent Deacon restart loop

The daemon had a race condition where:
1. ensureDeaconRunning() starts a new Deacon session
2. checkDeaconHeartbeat() runs in same heartbeat cycle
3. Heartbeat file is stale (from before crash)
4. Session is immediately killed
5. Infinite restart loop every 3 minutes

Fix:
- Track when Deacon was last started (deaconLastStarted field)
- Skip heartbeat check during 5-minute grace period
- Add config support for Deacon (consistency with refinery/witness)

After grace period, normal heartbeat checking resumes. Genuinely
stuck sessions (no heartbeat update after 5+ min) are still detected.

Fixes #589

---------

Co-authored-by: mayor <your-github-email@example.com>
2026-01-16 15:27:41 -08:00
Julian Knutsen
e7ca4908dc refactor(config): remove BEADS_DIR from agent environment and add doctor check (#455)
* fix(sling_test): update test for cook dir change

The cook command no longer needs database context and runs from cwd,
not the target rig directory. Update test to match this behavior
change from bd2a5ab5.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(tests): skip tests requiring missing binaries, handle --allow-stale

- Add skipIfAgentBinaryMissing helper to skip tests when codex/gemini
  binaries aren't available in the test environment
- Update rig manager test stub to handle --allow-stale flag

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(config): remove BEADS_DIR from agent environment

Stop exporting BEADS_DIR in AgentEnv - agents should use beads redirect
mechanism instead of relying on environment variable. This prevents
prefix mismatches when agents operate across different beads databases.

Changes:
- Remove BeadsDir field from AgentEnvConfig
- Remove BEADS_DIR from env vars set on agent sessions
- Update doctor env_check to not expect BEADS_DIR
- Update all manager Start() calls to not pass BeadsDir

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(doctor): detect BEADS_DIR in tmux session environment

Add a doctor check that warns when BEADS_DIR is set in any Gas Town
tmux session. BEADS_DIR in the environment overrides prefix-based
routing and breaks multi-rig lookups - agents should use the beads
redirect mechanism instead.

The check:
- Iterates over all Gas Town tmux sessions (gt-* and hq-*)
- Checks if BEADS_DIR is set in the session environment
- Returns a warning with fix hint to restart sessions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: julianknutsen <julianknutsen@users.noreply.github>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-13 22:13:57 -08:00
Subhrajit Makur
c61b67eb03 fix(config): implement role_agents support in BuildStartupCommand (#19) (#456)
* fix(config): implement role_agents support in BuildStartupCommand

The role_agents field in TownSettings and RigSettings existed but was
not being used by the startup command builders. All services fell back
to the default agent instead of using role-specific agent assignments.

Changes:
- BuildStartupCommand now extracts GT_ROLE from envVars and uses
  ResolveRoleAgentConfig() for role-based agent selection
- BuildStartupCommandWithAgentOverride follows the same pattern when
  no explicit override is provided
- refinery/manager.go uses ResolveRoleAgentConfig with constants
- cmd/start.go uses ResolveRoleAgentConfig with constants
- Updated comments from hardcoded agent name to generic "agent"
- Added ValidateAgentConfig() to check agent exists and binary is in PATH
- Added lookupAgentConfigIfExists() helper for validation
- ResolveRoleAgentConfig now warns to stderr and falls back to default
  if configured agent is invalid or binary is missing

Resolution priority (now working):
1. Explicit --agent override
2. Rig's role_agents[role] (validated)
3. Town's role_agents[role] (validated)
4. Rig's agent setting
5. Town's default_agent
6. Hardcoded default fallback

Adds tests for:
- TestBuildStartupCommand_UsesRoleAgentsFromTownSettings
- TestBuildStartupCommand_RigRoleAgentsOverridesTownRoleAgents
- TestBuildAgentStartupCommand_UsesRoleAgents
- TestValidateAgentConfig
- TestResolveRoleAgentConfig_FallsBackOnInvalidAgent

Fixes: role_agents configuration not being applied to services

* fix(config): add GT_ROOT to BuildStartupCommandWithAgentOverride

- Fixes missing GT_ROOT and GT_SESSION_ID_ENV exports in
  BuildStartupCommandWithAgentOverride, matching BuildStartupCommand behavior
- Adds test for override priority over role_agents
- Adds test verifying GT_ROOT is included in command

This addresses the Greptile review comment about agents started with
an override not having access to town-level resources.

Co-authored-by: Steve Yegge <steve.yegge@gmail.com>
2026-01-13 13:34:22 -08:00
Will Saults
bda248fb9a feat(refinery,boot): add --agent flag for model selection (#469)
* feat(refinery,boot): add --agent flag for model selection (hq-7d5m)

Add --agent flag to gt refinery start/attach/restart and gt boot spawn
commands for consistent model selection across all agent launch points.

Implementation follows the existing pattern from gt deacon start:
- Add StringVar flag for agent alias
- Pass override to Manager/Boot via SetAgentOverride()
- Use BuildAgentStartupCommandWithAgentOverride when override is set

Files affected:
- cmd/gt/refinery.go: add flags to start/attach/restart commands
- internal/refinery/manager.go: add SetAgentOverride and use in Start()
- cmd/gt/boot.go: add flag to spawn command
- internal/boot/boot.go: add SetAgentOverride and use in spawnTmux()

Closes #438

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(refinery,boot): use parameter-passing pattern for --agent flag

Address PR review feedback:

1. ADD TESTS: Add tests for --agent flag existence following witness_test.go pattern
   - internal/cmd/refinery_test.go: tests for start/attach/restart
   - internal/cmd/boot_test.go: test for spawn

2. ALIGN PATTERN: Change from setter pattern to parameter-passing pattern
   - Manager.Start(foreground, agentOverride) instead of SetAgentOverride + Start
   - Boot.Spawn(agentOverride) instead of SetAgentOverride + Spawn
   - Matches witness.go style: Start(foreground bool, agentOverride string, ...)

Updated all callers to pass empty string for default agent:
- internal/daemon/daemon.go
- internal/cmd/rig.go
- internal/cmd/start.go
- internal/cmd/up.go

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: furiosa <will@saults.io>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-13 13:14:47 -08:00
gastown/refinery
9315248134 fix(mq): persist MR rejection to beads storage
The RejectMR function was modifying the in-memory MR object but never
persisting the change to beads storage. This caused rejected MRs to
continue showing in the queue with status "open".

Fix: Call beads.CloseWithReason() to properly close the MR bead before
updating the in-memory state.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-13 12:54:54 -08:00
george
58207a00ec refactor(mrqueue): remove mrqueue package, use beads for MRs (gt-dqi)
Remove the mrqueue side-channel from gastown. The merge queue now uses
beads merge-request wisps exclusively, not parallel .beads/mq/*.json files.

Changes:
- Delete internal/mrqueue/ package (~830 lines removed)
- Move scoring logic to internal/refinery/score.go
- Update Refinery engineer to query beads via ReadyWithType("merge-request")
- Add MRInfo struct to replace mrqueue.MR
- Add ClaimMR/ReleaseMR methods using beads assignee field
- Update HandleMergeReady to not create duplicate queue entries
- Update gt refinery commands (claim, release, unclaimed) to use beads
- Stub out MQEventSource (no longer needed)

The Refinery now:
- Lists MRs via beads.ReadyWithType("merge-request")
- Claims via beads.Update(..., {Assignee: worker})
- Closes via beads.CloseWithReason("merged", mrID)
- Blocks on conflicts via beads.AddDependency(mrID, taskID)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-12 23:48:56 -08:00
gastown/crew/george
c94d59dca7 fix: ZFC improvements - query tmux directly instead of marker TTL
Two ZFC fixes:

1. Boot marker file (hq-zee5n): Changed IsRunning() to query
   tmux.HasSession() directly instead of checking marker file
   freshness with TTL. Removed stale marker check from doctor.

2. Branch pattern matching (hq-zwuh6): Replaced hardcoded "polecat/"
   strings with constants.BranchPolecatPrefix for consistency.

Also removed 60-second WaitForCommand blocking from crew Start()
which was causing gt crew start to hang.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 22:08:12 -08:00
dennis
0f633be4b1 fix(zfc): remove PID-based agent liveness detection
Replace ProcessExists() checks in witness and refinery managers with
tmux session detection. Agent liveness should be derived from tmux
session state, not PID probing (per ZFC tracking principles).

- Remove util.ProcessExists() from witness/manager.go and refinery/manager.go
- Delete internal/util/process.go and process_test.go (now unused)
- Foreground mode and Stop() now rely solely on tmux HasSession/KillSession

Closes: hq-yxkdr (recentDeaths already removed)
Closes: hq-1sd4o (ProcessExists removed)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 22:02:34 -08:00
julianknutsen
e999ceb1c1 refactor: consolidate agent env vars into config.AgentEnv
Create centralized AgentEnv function as single source of truth for all
agent environment variables. All agents now consistently receive:
- GT_ROLE, BD_ACTOR, GIT_AUTHOR_NAME (role identity)
- GT_ROOT, BEADS_DIR (workspace paths)
- GT_RIG, GT_POLECAT/GT_CREW (rig-specific identity)
- BEADS_AGENT_NAME, BEADS_NO_DAEMON (beads config)
- CLAUDE_CONFIG_DIR (optional account selection)

Remove RoleEnvVars in favor of AgentEnvSimple wrapper.
Remove IncludeBeadsEnv flag - beads env vars always included.
Update all manager and cmd call sites to use AgentEnv.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 21:52:30 -08:00
julianknutsen
7150ce2624 refactor: update managers to use RoleEnvVars
Consolidates all role startup code to use the shared RoleEnvVars()
function, ensuring consistent env vars across tmux SetEnvironment
and Claude startup command exports.

Updated:
- Mayor manager
- Deacon startup (daemon.go)
- Witness manager
- Refinery manager
- Polecat startup (daemon.go)
- BuildPolecatStartupCommand, BuildCrewStartupCommand helpers

This ensures all agents receive the same identity env vars regardless
of startup path.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 21:52:09 -08:00
mayor
ff3d1b2e23 fix(refinery): use mayor/rig fallback for correct git remote
Refinery was using rig.Path which found the town's .git with rig-named
remotes (e.g., 'gastown') instead of 'origin'. This caused refinery to
miss polecat branches when fetching.

Now falls back to mayor/rig (which has 'origin' pointing to the project
repo) when refinery/rig doesn't exist.

Fixes: hq-uvrzt
2026-01-09 12:42:14 -08:00
jack
afff85cdff fix(tmux): use NewSessionWithCommand to avoid send-keys race condition
Agent sessions would fail on startup because send-keys arrived before the
shell was ready, causing 'bad pattern' and 'command not found' errors.

Fix: Create sessions with the command directly using tmux new-session's
command argument. This runs the agent as the pane's initial process,
avoiding shell readiness timing issues entirely.

Updated all agent managers: mayor, deacon, witness, refinery, polecat, crew.

Also fixes pre-existing build error in polecat/manager.go (polecatPath →
clonePath/newClonePath).

Closes #280

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 23:35:31 -08:00
Ben Kraus
38adfa4d8b codex 2026-01-08 12:36:54 -05:00
Julian Knutsen
28a9de64d5 fix(tmux): wait for shell ready before sending keys (#264)
Add WaitForShellReady call before SendKeys in all agent managers
(deacon, mayor, witness, refinery). This prevents intermittent
"can't find pane" errors that occur when the tmux session is
created but the shell isn't ready to receive input yet.

The issue manifests under load (e.g., during `gt up` when multiple
agents start in sequence) where the 200ms delay in SendKeysDelayed
isn't sufficient for the pane to be fully initialized.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 20:49:25 -08:00
julianknutsen
ea8bef2029 refactor: unify agent startup with Manager pattern
- Create mayor.Manager for mayor lifecycle (Start/Stop/IsRunning/Status)
- Create deacon.Manager for deacon lifecycle with respawn loop
- Move session.Manager to polecat.SessionManager (clearer naming)
- Add zombie session detection for mayor/deacon (kills tmux if Claude dead)
- Remove duplicate session startup code from up.go, start.go, mayor.go
- Rename sessMgr -> polecatMgr for consistency
- Make witness/refinery SessionName() public for status display

All agent types now follow the same Manager pattern:
  mgr := agent.NewManager(...)
  mgr.Start(...)
  mgr.Stop()
  mgr.IsRunning()
  mgr.Status()

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-06 22:32:35 -08:00
julianknutsen
72544cc06d Unify agent startup with Manager pattern
Refactors all agent startup paths (witness, refinery, crew, polecat) to use
a consistent Manager interface with Start(), Stop(), IsRunning(), and
SessionName() methods.

Includes:
- Witness manager with GUPP propulsion nudge for startup
- Refinery manager for engineer sessions
- Crew manager for worker agents
- Session/polecat manager updates
- claude_settings_check doctor check for settings validation
- Settings management consolidated from rig/manager.go
- Settings location moved outside source repos to prevent conflicts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-06 21:44:04 -08:00
jv
22693c1dcc feat: runtime-aware tmux agent checks 2026-01-06 19:13:14 -08:00
jv
02ca9e43fa fix: honor rig agent when starting witness/refinery 2026-01-06 19:12:55 -08:00
gastown/crew/joe
c2451b85e7 Merge origin/main into fix/205-address-claude-startup-issues
Resolved conflict in internal/witness/manager.go:
- Kept session import (used by PR code)
- Kept PR's more accurate comment for PID check
- Removed duplicate sessionName method introduced by merge
2026-01-06 19:04:29 -08:00
Julian Knutsen
9d7dcde1e2 feat: Unified beads redirect for tracked and local beads (#222)
* feat: Beads redirect architecture for tracked and local beads

This change implements proper redirect handling so that all rig agents
(Witness, Refinery, Crew, Polecats) can work with both:
- Tracked beads: .beads/ checked into git at mayor/rig/.beads
- Local beads: .beads/ created at rig root during gt rig add

Key changes:

1. SetupRedirect now handles tracked beads by skipping redirect chains.
   The bd CLI doesn't support chains (A→B→C), so worktrees redirect
   directly to the final destination (mayor/rig/.beads for tracked).

2. ResolveBeadsDir is now used consistently in polecat and refinery
   managers instead of hardcoded mayor/rig paths.

3. Rig-level agents (witness, refinery) now use rig beads with rig
   prefix instead of town beads. This follows the architecture where
   town beads are only for Mayor/Deacon.

4. prime.go simplified to always use ../../.beads for crew redirects,
   letting rig-level redirect handle tracked vs local routing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(doctor): Add beads-redirect check for tracked beads

When a repo has .beads/ tracked in git (at mayor/rig/.beads), the rig root
needs a redirect file pointing to that location. This check:

- Detects missing rig-level redirect for tracked beads
- Verifies redirect points to correct location (mayor/rig/.beads)
- Auto-fixes with 'gt doctor --fix'

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: Handle fileLock.Unlock error in daemon

Wrap fileLock.Unlock() return value to satisfy errcheck linter.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-06 12:59:37 -08:00
mayor
31a32c084b Unify agent startup with GUPP propulsion nudge
Witness and Refinery startup was duplicated across cmd/witness.go, cmd/up.go,
cmd/rig.go, and daemon.go. Worse, not all code paths sent the propulsion nudge
(GUPP - Gas Town Universal Propulsion Principle). Now unified in Manager.Start()
which handles everything including nudges.

Changes:
- witness/manager.go: Full rewrite with session creation, env vars, theming,
  WaitForClaudeReady, startup nudge, and propulsion nudge (GUPP)
- refinery/manager.go: Add propulsion nudge sequence after Claude startup
- cmd/witness.go: Simplify to just call mgr.Start(), remove ensureWitnessSession
- cmd/rig.go: Use witness.Manager.Start() instead of inline session creation
- cmd/start.go: Use witness.Manager.Start()
- cmd/up.go: Use witness.Manager.Start(), remove ensureWitness(),
  add EnsureSettingsForRole in ensureSession()
- daemon.go: Use witness.Manager.Start() and refinery.Manager.Start() for
  unified startup with proper nudges

This ensures all agent startup paths (gt witness start, gt rig boot, gt up,
daemon restarts) consistently apply GUPP propulsion nudges.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-06 01:51:25 -08:00
wraith
ef248a1824 refactor: extract ExecWithOutput utility for command execution (gt-vurfr)
Create util.ExecWithOutput and util.ExecRun to consolidate repeated
exec.Command patterns across witness/handlers.go and refinery/manager.go.

Changes:
- Add internal/util/exec.go with ExecWithOutput (returns stdout) and
  ExecRun (runs command without output)
- Refactor witness/handlers.go to use utility functions (7 call sites)
- Refactor refinery/manager.go, removing unused gitRun/gitOutput methods
- Add comprehensive tests in exec_test.go

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 00:18:47 -08:00
gastown/crew/jack
eea4435269 feat(rig): Add --branch flag for custom default branch
Add --branch flag to `gt rig add` to specify a custom default branch
instead of auto-detecting from remote. This supports repositories that
use non-standard default branches like `develop` or `release`.

Changes:
- Add --branch flag to `gt rig add` command
- Store default_branch in rig config.json
- Propagate default branch to refinery, witness, daemon, and all commands
- Rename ensureMainBranch to ensureDefaultBranch for clarity
- Add Rig.DefaultBranch() method for consistent access
- Update crew/manager.go and swarm/manager.go to use rig config

Based on PR #49 by @kustrun - rebased and extended with additional fixes.

Co-authored-by: kustrun <kustrun@users.noreply.github.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 14:01:02 -08:00
max
1b69576573 fix: Address golangci-lint errors (errcheck, gosec) (#76)
Apply PR #76 from dannomayernotabot:

- Add golangci exclusions for internal package false positives
- Tighten file permissions (0644 -> 0600) for sensitive files
- Add ReadHeaderTimeout to HTTP server (slowloris prevention)
- Explicit error ignoring with _ = for intentional cases
- Add //nolint comments with justifications
- Spelling: cancelled -> canceled (US locale)

Co-Authored-By: dannomayernotabot <noreply@github.com>

🤖 Generated with Claude Code
2026-01-03 16:11:55 -08:00
keeper
354219033a feat: Add 'ensure' semantics to witness/refinery start commands
gt witness start and gt refinery start now detect zombie sessions
(tmux alive but Claude dead) and automatically kill and recreate them.

This makes the start commands idempotent:
- If no session exists → create new session
- If session exists and healthy → do nothing (already running)
- If session exists but zombie → kill and recreate

Previously users had to manually run stop then start, or use restart.

Closes: gt-ekc5u

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 18:53:52 -08:00
joe
c3e83a3d09 Fix remaining worker startups for gt seance metadata
- startCrewMember: now uses BuildCrewStartupCommand (was GetRuntimeCommand)
- refinery/manager.go: now uses BuildAgentStartupCommand (was GetRuntimeCommand)

Both now properly inject BD_ACTOR and GT_ROLE so seance can identify
sessions correctly. This completes the seance metadata fix started in
the previous commit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 18:09:42 -08:00
nux
8517ff0650 feat(mq): Add priority-ordered queue display and processing (gt-si8rq.6)
Implements priority scoring for merge queue ordering:

## Changes to gt mq list
- Add SCORE column showing priority score (higher = process first)
- Sort MRs by score descending instead of simple priority
- Add CONVOY column showing convoy ID if tracked

## New gt mq next command
- Returns highest-score MR ready for processing
- Supports --strategy=fifo for FIFO ordering fallback
- Supports --quiet for just printing MR ID
- Supports --json for programmatic access

## Changes to Refinery
- Queue() now sorts by priority score instead of simple priority
- Uses ScoreMR from mrqueue package for consistent scoring

## MR Fields Extended
- Added retry_count, last_conflict_sha, conflict_task_id
- Added convoy_id, convoy_created_at for convoy tracking
- These fields feed into priority scoring function

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 17:18:27 -08:00
mayor
4ea2b719ba fix(review): Address code review feedback for hq-eggh5
- Rename issueToBead → issueToMR (clearer naming)
- Handle empty TargetBranch with fallback to main
- Sort queue by priority (P0 first)
- Remove dead code (discoverWorkBranches, branchToMR)
- Fix parseTime fallback format
2025-12-31 11:36:48 -08:00
mayor
36f17bbadd fix: Refinery queue uses beads MQ as source of truth (hq-eggh5)
The refinery was checking git branches instead of the beads merge queue.
This caused MRs to pile up when branches were deleted but MR beads remained.

- manager.go: Queue() now queries beads for type=merge-request issues
- refinery.md.tmpl: Updated queue-scan to use gt mq list
- mol-refinery-patrol.formula.toml: Updated queue-scan step instructions
2025-12-31 11:36:48 -08:00
gastown/crew/gus
626a24e013 Refactor startup paths to use RuntimeConfig (gt-j0546)
Replaced all hardcoded 'claude --dangerously-skip-permissions' invocations
with configurable helpers from internal/config:

- GetRuntimeCommand(rigPath) - simple command string
- GetRuntimeCommandWithPrompt(rigPath, prompt) - with initial prompt
- BuildAgentStartupCommand(role, bdActor, rigPath, prompt) - generic agent
- BuildPolecatStartupCommand(rigName, polecatName, rigPath, prompt) - polecat
- BuildCrewStartupCommand(rigName, crewName, rigPath, prompt) - crew
- BuildStartupCommand(envVars, rigPath, prompt) - custom env vars

Files updated:
- internal/cmd/start.go (4 locations)
- internal/cmd/crew_lifecycle.go (2 locations)
- internal/cmd/crew_at.go (2 locations)
- internal/cmd/deacon.go
- internal/cmd/witness.go
- internal/cmd/up.go (2 locations)
- internal/cmd/handoff.go (2 locations)
- internal/daemon/daemon.go (3 locations)
- internal/daemon/lifecycle.go
- internal/session/manager.go
- internal/refinery/manager.go
- internal/boot/boot.go

This enables future support for alternative LLM runtimes (aider, etc.)
via rig/town settings configuration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-30 23:48:34 -08:00
Steve Yegge
ede5406d36 ZFC #5: Move merge/conflict decisions from Go to Refinery agent
Remove decision-making logic from Go code and document agent responsibility:

- ProcessMR() now returns error indicating agent handles processing
- Foreground mode deprecated with message directing to background mode
- Retry() no longer calls ProcessMR (agent picks up retried MRs)
- Added ZFC compliance section to refinery role template
- Marked unused helper functions as deprecated (runTests, getMergeConfig, pushWithRetry)

The Refinery agent (Claude) now:
- Runs git commands directly (fetch, checkout, merge, push)
- Detects conflicts and decides: retry, notify polecat, escalate
- Runs tests and decides: proceed, rollback, retry
- Makes all engineering decisions based on command output

Go code provides only primitives: queue listing, status, mail notifications.

(gt-sxa64)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-30 02:12:43 -08:00
Steve Yegge
8e96e12e37 Remove stats from .runtime/*.json observable state (gt-lfi2d)
Remove Stats fields from witness and refinery types that tracked
observable metrics (total counts, today counts). These are now
handled by the activity stream/beads system instead.

Removed:
- WitnessStats type and Stats field from Witness
- RefineryStats type and Stats field from Refinery
- LastCheckAt field from Witness
- Stats display from gt witness status
- Stats display from gt refinery status
- Stats increment code from refinery.completeMR()

Kept minimal process state:
- RigName, State, PID, StartedAt (both)
- MonitoredPolecats, Config, SpawnedIssues (witness)
- CurrentMR, PendingMRs, LastMergeAt (refinery)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-30 02:05:54 -08:00
Steve Yegge
65d4dbe222 Refinery emits activity events for MQ lifecycle (gt-ytsxp)
Add merge queue activity events to the refinery:
- merge_started: When refinery begins processing an MR
- merged: When MR successfully merged to main
- merge_failed: When merge fails (conflict, tests, push, etc.)
- merge_skipped: When MR skipped (superseded)

Events include MR ID, worker, branch, and reason (for failures).
Implemented in both Manager.ProcessMR and Engineer.ProcessMRFromQueue.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 23:48:25 -08:00
Steve Yegge
3421fe9303 Remove dead code: splitLines loop, unused methods (gt-2g130)
- polecat.go: Remove unreachable first loop in splitLines (was using filepath.SplitList incorrectly)
- townlog/logger.go: Remove unused Event.JSON() method and json import
- lock/lock.go: Remove unused ErrStaleLock error variable
- refinery/manager.go: Remove unused getTestCommand() method

Note: witness.StatePaused is actually used by cmd/witness.go, not dead code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 17:13:34 -08:00
Steve Yegge
5e7c5b4728 refactor: Extract processExists to util.ProcessExists (gt-560ge)
Move duplicated processExists function to shared util package:
- Create internal/util/process.go with ProcessExists function
- Add internal/util/process_test.go with basic tests
- Update witness/manager.go to use util.ProcessExists
- Update refinery/manager.go to use util.ProcessExists
- Remove local processExists functions from both files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 16:16:25 -08:00
Steve Yegge
213b3bab20 feat: Add atomic write pattern for state files (gt-wled7)
Prevents data loss from concurrent/interrupted state file writes by using
atomic write pattern (write to .tmp, then rename).

Changes:
- Add internal/util package with AtomicWriteJSON/AtomicWriteFile helpers
- Update witness/manager.go saveState to use atomic writes
- Update refinery/manager.go saveState to use atomic writes
- Update crew/manager.go saveState to use atomic writes
- Update daemon/types.go SaveState to use atomic writes
- Update polecat/namepool.go Save to use atomic writes
- Add comprehensive tests for atomic write utilities

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 15:57:53 -08:00
Steve Yegge
2d6b93f26b fix: Remove PID/tmux state inference (gt-psuw7)
ZFC compliance: daemon becomes pure transport layer, trusting agent beads.

Changes:
- refinery Status(): Simply returns loaded state, no PID/tmux reconciliation
- witness Status(): Simply returns loaded state, no PID inference
- daemon ensureDeaconRunning(): Trusts agent bead state, no tmux fallback
- daemon pokeDeacon(): Trusts agent bead state, no HasSession check

Removed:
- 78 lines of state inference code (PID checks, tmux session parsing)
- "Reconciliation" logic that overwrote agent-reported state

Note: Timeout fallback for dead agents is gt-2hzl4 (separate issue).

Reference: ~/gt/docs/zfc-violations-audit.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 01:56:42 -08:00
Steve Yegge
34b5a3bb8d Document intentional error suppressions with comments (gt-zn9m)
All 156 instances of _ = error suppression in non-test code now have
explanatory comments documenting why the error is intentionally ignored.

Categories of intentional suppressions:
- non-fatal: session works without these - tmux environment setup
- non-fatal: theming failure does not affect operation - visual styling
- best-effort cleanup - defer cleanup on failure paths
- best-effort notification - mail/notifications that should not block
- best-effort interrupt - graceful shutdown attempts
- crypto/rand.Read only fails on broken system - random ID generation
- output errors non-actionable - fmt.Fprint to io.Writer

This addresses the silent failure and debugging concerns raised in the
issue by making the intentionality explicit in the code.

Generated with Claude Code https://claude.com/claude-code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-25 23:14:29 -08:00
Steve Yegge
9afd6c5572 feat: Set BD_ACTOR env var when spawning agents (gt-rhfji)
When gt spawns agents (polecats, crew, patrol roles), it now sets the
BD_ACTOR env var so that bd commands (like `bd hook`) know the agent
identity without coupling to gt.

Updated spawn points:
- gt up (mayor, deacon, witness via ensureSession/ensureWitness)
- gt deacon start
- gt witness start
- gt start refinery
- gt mayor start
- Daemon deacon restart
- Daemon lifecycle restart
- Handoff respawn
- Refinery manager start

BD_ACTOR uses slash format (e.g., gastown/witness, gastown/crew/max)
while GT_ROLE may use dash format internally.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-25 13:26:38 -08:00
Steve Yegge
b685879b63 Refinery as worktree: local MR integration (gt-4u5z)
Architecture changes:
- Refinery created as worktree of mayor clone (shares .git)
- Polecat branches stay local (never pushed to origin)
- MRs stored as wisps in .beads-wisp/mq/ (ephemeral)
- Only main gets pushed to origin after merge

New mrqueue package for wisp-based MR storage.
Updated spawn, done, mq_submit, refinery, molecule templates.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 21:25:01 -08:00
Steve Yegge
46a5e38fa2 feat: add restart subcommands to witness and refinery (gt-9kc2)
- Add `gt witness restart <rig>` command (stop + start)
- Add `gt refinery restart [rig]` command (stop + start)
- Add self-cycling documentation to refinery template
- Clarify that restarts are daemon-managed, not shell loops

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 03:56:31 -08:00
Steve Yegge
59d656470e Add Claude settings templates for autonomous roles (gt-6957)
- Create internal/claude package with embedded settings templates
- settings-autonomous.json: gt prime && gt mail check --inject (SessionStart)
- settings-interactive.json: gt prime only (SessionStart)

- Update witness.go: EnsureSettings before session, remove broken gt prime injection
- Update refinery/manager.go: EnsureSettings before session, remove broken NudgeSession
- Update session/manager.go: EnsureSettings for polecats, remove broken issue injection

All autonomous roles (polecat, witness, refinery) now get proper SessionStart hooks
automatically when their sessions are created. No more timing-based gt prime injection.
2025-12-22 17:51:15 -08:00
Steve Yegge
18152b73ff fix(refinery): Use NudgeSession for reliable gt prime delivery
Wait for Claude to start before sending the prime command, and use
NudgeSession (with 500ms debounce) instead of SendKeysDelayed for
reliable message delivery.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 14:23:16 -08:00
Steve Yegge
97e0535bfe Implement three-tier config architecture (gt-k1lr tasks 1-5)
**Architecture changes:**
- Renamed `.gastown/` → `.runtime/` for runtime state (gitignored)
- Added `settings/` directory for rig behavioral config (git-tracked)
- Added `mayor/config.json` for town-level config (MayorConfig type)
- Separated RigConfig (identity) from RigSettings (behavioral)

**File location changes:**
- Town runtime: `~/.gastown/*` → `~/.runtime/*`
- Rig runtime: `<rig>/.gastown/*` → `<rig>/.runtime/*`
- Rig config: `<rig>/.gastown/config.json` → `<rig>/settings/config.json`
- Namepool state: `namepool.json` → `namepool-state.json`

**New types:**
- MayorConfig: town-level behavioral config
- RigSettings: rig behavioral config (merge_queue, theme, namepool)
- RigConfig now identity-only (name, git_url, beads, created_at)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 01:22:43 -08:00
Steve Yegge
2a8ae43041 refactor(refinery): use io.Writer instead of fmt.Print for output
Add output field (io.Writer) to Manager and Engineer structs with
SetOutput() methods to enable testability and output redirection.

Replace all 30+ fmt.Printf/Println calls with fmt.Fprintf/Fprintln
using the configurable output writer, defaulting to os.Stdout.

This enables:
- Testing output without capturing stdout
- Redirecting output in different contexts
- Following cobra best practices

Closes: gt-cvfg

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 22:18:10 -08:00
Steve Yegge
a9d204aa22 feat(daemon): add lifecycle support for refinery and crew
- Add refinery and crew patterns to identityToSession()
- Add refinery and crew handling to restartSession() with pre-sync
- Add refinery and crew to identityToStateFile()
- Fix gt refinery start to send gt prime after Claude starts
- New syncWorkspace() helper for agents with persistent clones
2025-12-20 17:42:21 -08:00