The agent_state field was recording observable state like "running",
"dead", "idle" which violated the "Discover, Don't Track" principle.
This caused stale state bugs where agents were marked "dead" in beads
but actually running in tmux.
Changes:
- Remove daemon's checkStaleAgents() which marked agents "dead"
- Simplify ensureXxxRunning() to use tmux.IsClaudeRunning() directly
- Remove reportAgentState() calls from gt prime and gt handoff
- Add SetHookBead/ClearHookBead helpers that don't update agent_state
- Use ClearHookBead in gt done and gt unsling
- Simplify gt status to derive state from tmux, not bead
Non-observable states (stuck, awaiting-gate, muted, paused) are still
set because they represent intentional agent decisions that can't be
discovered from tmux state.
Fixes: gt-zecmc
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When the daemon detects that an agent bead state doesn't match tmux
(e.g., bead says stopped but Claude is running), it now:
1. Logs the divergence clearly with STATE DIVERGENCE prefix
2. Nudges the agent with an actionable command to fix its state
3. Still skips the restart (safety - don't kill healthy sessions)
This prevents silent state drift where bead state diverges from reality.
Applied to: Deacon, Witness, Refinery ensure functions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Resolved conflict in internal/witness/manager.go:
- Kept session import (used by PR code)
- Kept PR's more accurate comment for PID check
- Removed duplicate sessionName method introduced by merge
* feat: Beads redirect architecture for tracked and local beads
This change implements proper redirect handling so that all rig agents
(Witness, Refinery, Crew, Polecats) can work with both:
- Tracked beads: .beads/ checked into git at mayor/rig/.beads
- Local beads: .beads/ created at rig root during gt rig add
Key changes:
1. SetupRedirect now handles tracked beads by skipping redirect chains.
The bd CLI doesn't support chains (A→B→C), so worktrees redirect
directly to the final destination (mayor/rig/.beads for tracked).
2. ResolveBeadsDir is now used consistently in polecat and refinery
managers instead of hardcoded mayor/rig paths.
3. Rig-level agents (witness, refinery) now use rig beads with rig
prefix instead of town beads. This follows the architecture where
town beads are only for Mayor/Deacon.
4. prime.go simplified to always use ../../.beads for crew redirects,
letting rig-level redirect handle tracked vs local routing.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat(doctor): Add beads-redirect check for tracked beads
When a repo has .beads/ tracked in git (at mayor/rig/.beads), the rig root
needs a redirect file pointing to that location. This check:
- Detects missing rig-level redirect for tracked beads
- Verifies redirect points to correct location (mayor/rig/.beads)
- Auto-fixes with 'gt doctor --fix'
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: Handle fileLock.Unlock error in daemon
Wrap fileLock.Unlock() return value to satisfy errcheck linter.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
- Handle empty repos in ConfigureSparseCheckout (skip read-tree when no HEAD)
- Fix errcheck: wrap fileLock.Unlock() error in defer
- Fix unparam: remove unused *rig.Rig return from getWitnessManager
- Fix unparam: mark unused agentType parameter with blank identifier
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Witness and Refinery startup was duplicated across cmd/witness.go, cmd/up.go,
cmd/rig.go, and daemon.go. Worse, not all code paths sent the propulsion nudge
(GUPP - Gas Town Universal Propulsion Principle). Now unified in Manager.Start()
which handles everything including nudges.
Changes:
- witness/manager.go: Full rewrite with session creation, env vars, theming,
WaitForClaudeReady, startup nudge, and propulsion nudge (GUPP)
- refinery/manager.go: Add propulsion nudge sequence after Claude startup
- cmd/witness.go: Simplify to just call mgr.Start(), remove ensureWitnessSession
- cmd/rig.go: Use witness.Manager.Start() instead of inline session creation
- cmd/start.go: Use witness.Manager.Start()
- cmd/up.go: Use witness.Manager.Start(), remove ensureWitness(),
add EnsureSettingsForRole in ensureSession()
- daemon.go: Use witness.Manager.Start() and refinery.Manager.Start() for
unified startup with proper nudges
This ensures all agent startup paths (gt witness start, gt rig boot, gt up,
daemon restarts) consistently apply GUPP propulsion nudges.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Per docs/architecture.md, Witness and Refinery are rig-level agents that
should use the rig's configured prefix (e.g., pi- for pixelforge) instead
of hardcoded "gt-".
This extends PR #183's creation fix to also fix all lookup paths:
- internal/rig/manager.go: Create agent beads in rig beads with rig prefix
- internal/daemon/daemon.go: Use rig prefix when looking up agent state
- internal/daemon/lifecycle.go: Use rig prefix for identity-to-bead mapping
- internal/cmd/sling.go: Pass townRoot for prefix lookup
- internal/cmd/unsling.go: Pass townRoot for prefix lookup
- internal/cmd/molecule_status.go: Use rig prefix for agent bead lookups
- internal/cmd/molecule_attach.go: Use rig prefix for agent bead lookups
- internal/config/loader.go: Add GetRigPrefix helper
Without this fix, the daemon would:
- Create pi-gastown-witness but look for gt-gastown-witness
- Report agents as missing/dead when they are running
- Fail to manage agent lifecycle correctly
Based on work by Johann Taberlet in PR #183.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Johann Taberlet <johann.taberlet@gmail.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace Unix-only syscall.Flock with gofrs/flock library for
cross-platform file locking. This enables the daemon to run on
Windows in addition to Unix-like systems.
- Add github.com/gofrs/flock v0.13.0 dependency
- Replace syscall.Flock calls with flock.TryLock/Unlock
- Maintain same non-blocking exclusive lock semantics
(gt-5354h)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When Deacon respawns refinery/witness sessions via LIFECYCLE requests,
the new sessions were starting at the Claude welcome screen without
the propulsion nudge that triggers autonomous execution.
Added StartupNudge and PropulsionNudgeForRole calls to restartSession()
in lifecycle.go, matching the pattern used in ensureRefinerySession()
in start.go. This ensures respawned agents receive the GUPP nudge and
begin autonomous work immediately.
Fixes: gt-01jpg
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Town-level services (Mayor, Deacon) now use hq- prefix instead of gt-:
- hq-mayor (was gt-mayor)
- hq-deacon (was gt-deacon)
This distinguishes town-level sessions from rig-level sessions which
continue to use gt- prefix (gt-gastown-witness, gt-gastown-crew-max, etc).
Changes:
- session.MayorSessionName() returns "hq-mayor"
- session.DeaconSessionName() returns "hq-deacon"
- ParseSessionName() handles both hq- and gt- prefixes
- categorizeSession() handles both prefixes
- categorizeSessions() accepts both prefixes
- Updated all tests and documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add --branch flag to `gt rig add` to specify a custom default branch
instead of auto-detecting from remote. This supports repositories that
use non-standard default branches like `develop` or `release`.
Changes:
- Add --branch flag to `gt rig add` command
- Store default_branch in rig config.json
- Propagate default branch to refinery, witness, daemon, and all commands
- Rename ensureMainBranch to ensureDefaultBranch for clarity
- Add Rig.DefaultBranch() method for consistent access
- Update crew/manager.go and swarm/manager.go to use rig config
Based on PR #49 by @kustrun - rebased and extended with additional fixes.
Co-authored-by: kustrun <kustrun@users.noreply.github.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add syscall.Flock() exclusive lock in daemon.Run() to prevent TOCTOU
race condition where concurrent 'gt daemon start' commands could spawn
multiple daemons. Only the first to acquire the lock succeeds; others
exit cleanly. Lock is per-town (in townRoot/daemon/daemon.lock) so
multiple GT instances from different directories work independently.
Also detect race losers in runDaemonStart() by comparing spawned PID
with PID file, reporting 'already running' instead of false success.
Agent directories (witness/, refinery/, mayor/) contained state.json files
with last_active timestamps that were never updated, making them stale and
misleading. This change removes:
- initAgentStates function that created vestigial state.json files
- AgentState type and related Load/Save functions from config package
- MayorStateValidCheck from doctor checks
- requesting_* lifecycle verification (dead code - flags were never set)
- FileStateJSON constant and MayorStatePath function
Kept intact:
- daemon/state.json (actively used for daemon runtime state)
- crew/<name>/state.json (operational CrewWorker metadata)
- Agent state tracking via beads (the ZFC-compliant approach)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update all callers of deprecated MayorBeadID()/DeaconBeadID() to use
MayorBeadIDTown()/DeaconBeadIDTown() which return hq- prefix IDs for
town-level beads storage.
Changes:
- internal/daemon/lifecycle.go: identityToAgentBeadID and checkStaleAgents
- internal/cmd/prime.go: getAgentBeadID
- internal/cmd/molecule_status.go: buildAgentBeadID
- internal/cmd/prime_test.go: update expected values to hq-*
- Comments updated to reflect hq- prefix for town-level agents
The deprecated functions remain for backward compatibility and are used
by the migration tool (migrate_agents.go).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Reverts the session naming changes from PR #70. Multi-town support
on a single machine is not a real use case - rigs provide project
isolation, and true isolation should use containers/VMs.
Changes:
- MayorSessionName() and DeaconSessionName() no longer take townName parameter
- ParseSessionName() handles simple gt-mayor and gt-deacon formats
- Removed Town field from AgentIdentity and AgentSession structs
- Updated all callers and tests
Generated with Claude Code
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Refinery monitoring to daemon heartbeat (ensureRefineriesRunning)
- Add circuit breaker: agents marked dead by checkStaleAgents() are
now force-killed and restarted, even if Claude process is alive
- Fixes zombie Claude sessions that werent updating their bead state
The circuit breaker flow:
1. Agent gets stuck (stops updating bead state)
2. After 15 minutes: checkStaleAgents() marks bead as dead
3. On next heartbeat: ensure*Running() sees state=dead
4. Force-kill session and restart fresh
Generated with Claude Code
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Session names `gt-mayor` and `gt-deacon` were hardcoded, causing tmux
session name collisions when running multiple towns simultaneously.
Changed to `gt-{town}-mayor` and `gt-{town}-deacon` format (e.g.,
`gt-ai-mayor`) to allow concurrent multi-town operation.
Key changes:
- session.MayorSessionName() and DeaconSessionName() now take townName param
- Added workspace.GetTownName() helper to load town name from config
- Updated all callers in cmd/, daemon/, doctor/, mail/, rig/, templates/
- Updated tests with new session name format
- Bead IDs remain unchanged (already scoped by .beads/ directory)
Fixes#60🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When the daemon checks if Deacon/Witness is running, it first checks the
agent bead state. If this check fails (bead not found, JSON parse error,
or stale state), it would previously attempt to restart the session -
even if the tmux session was perfectly healthy.
This caused "session already exists" errors when:
1. Agent bead state couldn't be read (prefix mismatch, missing bead)
2. But the tmux session was actually running with Claude active
Fix: Add a tmux session health check as fallback before attempting restart.
If the session exists AND Claude is running in it, skip the restart and
log that we're preserving the healthy session despite stale bead state.
This maintains ZFC compliance (still trusts agent bead as primary source)
while adding a defensive check to prevent unnecessary session kills.
Fixes#63🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The daemon was failing to restart agents when zombie tmux sessions existed
(session alive but Claude dead). Added EnsureSessionFresh() helper to
tmux package that:
- Checks if session exists
- If exists but Claude not running (zombie), kills the session
- Creates fresh session
Updated all daemon session creation points to use EnsureSessionFresh:
- ensureDeaconRunning()
- ensureWitnessRunning()
- restartPolecatSession()
- restartSession() in lifecycle.go
Added tests for the new helper function. (gt-j1i0r)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When Claude starts with --dangerously-skip-permissions, it shows a warning
dialog requiring Down+Enter to accept. This blocked automated polecat and
agent startup.
Added AcceptBypassPermissionsWarning() to tmux package that:
- Checks if the warning dialog is present by capturing pane content
- Only sends Down+Enter if "Bypass Permissions mode" text is found
- Avoids interfering with sessions that don't show the warning
Updated all Claude startup locations:
- session/manager.go (polecat sessions)
- cmd/up.go (mayor, witness, crew, polecat cold starts)
- daemon/daemon.go (crashed polecat restarts)
- daemon/lifecycle.go (role session starts)
The 10-minute recovery heartbeat was too slow to detect stuck agents promptly.
Reduced to 3 minutes for better balance between detection speed and overhead.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The hook discovery code was reading hook_bead from the agent bead's
description field (parsed via ParseAgentFieldsFromDescription), but
the slot update code writes to the hook_bead database column via
'bd slot set'. This mismatch caused polecats to see stale hook values
from the description instead of the current value from the database.
Fixed in:
- molecule_status.go: Use agentBead.HookBead instead of parsing description
- status.go: Use issue.HookBead directly
- lifecycle.go: Update all GUPP and orphan detection to read from
database columns instead of parsing description
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add heartbeat checking to Boot degraded triage: detects stale Deacon
heartbeat (>15min nudges, >30min restarts)
- Add checkDeaconHeartbeat to daemon heartbeat cycle as fallback
- This ensures the Deacon is monitored continuously, not just at startup
The mol-deacon-patrol formula was also updated separately to use
gt deacon health-check instead of ephemeral context memory tracking.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Instead of hardcoding the rig list (gastown, beads), now loads rigs
from mayor/rigs.json using config.LoadRigsConfig. Falls back gracefully
to just checking global agents if the config cannot be loaded.
(gt-mmp0q)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove TouchTownActivity() calls from root.go PersistentPreRun hook
- Remove ReadTownActivity() and activity-based backoff from daemon.go
- Delete TouchTownActivity/ReadTownActivity functions from keepalive.go
- Replace dynamic backoff with fixed 10-min recovery heartbeat
Normal wake is now handled by feed subscription (bd activity --follow).
The daemon is a safety net for dead sessions, GUPP violations, and orphaned work.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Boot is a watchdog that the daemon pokes instead of Deacon directly,
centralizing the 'when to wake Deacon' decision in an agent that can
reason about context.
Key changes:
- Add internal/boot package with marker file and status tracking
- Add gt boot commands: status, spawn, triage
- Add mol-boot-triage formula for Boot's triage cycle
- Modify daemon to call ensureBootRunning instead of ensureDeaconRunning
- Add tmux.IsAvailable() for degraded mode detection
- Add boot.md.tmpl role template
Boot lifecycle:
1. Daemon tick spawns Boot (fresh each time)
2. Boot runs triage: observe, decide, act
3. Boot cleans stale handoffs from Deacon inbox
4. Boot exits (or handoffs in non-degraded mode)
In degraded mode (no tmux), Boot runs mechanical triage directly.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements role-based lifecycle configuration where agent types self-register
via role beads instead of hardcoded identity string parsing in the daemon.
Changes:
- Add RoleConfig struct with lifecycle fields (session_pattern, work_dir_pattern,
needs_pre_sync, start_command, env_vars)
- Add ParseRoleConfig/FormatRoleConfig/ExpandRolePattern to beads package
- Add role bead ID helpers (RoleBeadID, MayorRoleBeadID, etc.)
- Refactor daemon to use single parseIdentity function as ONLY place where
identity strings are parsed
- Daemon now looks up role beads to get lifecycle config, with fallback to
defaults when role bead is missing or has no config
- Updated all role beads (mayor, deacon, witness, refinery, crew, polecat)
with structured lifecycle configuration fields
- Add comprehensive unit tests for RoleConfig parsing and expansion
This makes the daemon ZFC-compliant by trusting what agents self-report in
their role beads rather than encoding agent-specific knowledge in Go code.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements the feed daemon goroutine that:
- Tails ~/gt/.events.jsonl for raw events
- Filters by visibility tag (keeps feed/both, drops audit)
- Deduplicates repeated done events from same actor
- Aggregates sling events when multiple occur in window
- Writes curated events to ~/gt/.feed.jsonl with summaries
The curator runs as a goroutine in the gt daemon, starting when
the daemon starts and stopping gracefully on shutdown.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This fix addresses the issue where polecat sessions terminate unexpectedly
during work without recovery:
Changes:
- Add `checkPolecatSessionHealth()` to daemon heartbeat loop
- Proactively validates tmux sessions are alive for polecats
- Detects crashed polecats that have work-on-hook
- Auto-restarts crashed polecats with proper environment setup
- Notifies Witness if restart fails as fallback
- Add polecat support to lifecycle identity mapping
- `identityToSession()` now handles polecat identities
- `restartSession()` can restart crashed polecat sessions
- `identityToStateFile()` handles polecat state files
- `identityToAgentBeadID()` handles polecat agent beads
- `identityToBDActor()` handles polecat BD_ACTOR conversion
- Add `gt session check` command for manual health checking
- Validates tmux sessions exist for all polecats
- Shows summary of healthy vs not-running sessions
- Useful for debugging session issues
This provides faster recovery (within heartbeat interval) compared to
waiting for GUPP violation timeout (30 min) or Witness detection.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Change daemon from wake-focused to recovery-focused:
Before: Daemon pokes agents every 5-60min as primary wake
After: Daemon only checks for edge cases that feed-wake cannot handle
Recovery checks:
- Dead sessions that need restart (ensureDeaconRunning, ensureWitnessesRunning)
- Stale agents that crashed without updating state (checkStaleAgents)
- GUPP violations: agents with work-on-hook not progressing (checkGUPPViolations)
- Orphaned work: work assigned to dead agents (checkOrphanedWork)
Removed:
- pokeDeacon() - no longer sending HEARTBEAT messages
- pokeWitness()/pokeWitnesses() - no longer sending HEARTBEAT messages
- MOTD message arrays - only used by poke functions
Normal agent wake is now handled by feed subscription (bd activity --follow).
The daemon is the safety net for edge cases, not the primary propulsion.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements canonical naming convention for agent bead IDs:
- Town-level: gt-mayor, gt-deacon (unchanged)
- Rig-level: gt-<rig>-witness, gt-<rig>-refinery (was gt-witness-<rig>)
- Named: gt-<rig>-crew-<name>, gt-<rig>-polecat-<name> (was gt-crew-<rig>-<name>)
Changes:
- Added AgentBeadID helper functions to internal/beads/beads.go
- Updated all ID generation call sites to use helpers
- Fixed session parsing in theme.go, statusline.go, agents.go
- Updated doctor check and fix to use canonical format
- Updated tests for new format
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Prevents data loss from concurrent/interrupted state file writes by using
atomic write pattern (write to .tmp, then rename).
Changes:
- Add internal/util package with AtomicWriteJSON/AtomicWriteFile helpers
- Update witness/manager.go saveState to use atomic writes
- Update refinery/manager.go saveState to use atomic writes
- Update crew/manager.go saveState to use atomic writes
- Update daemon/types.go SaveState to use atomic writes
- Update polecat/namepool.go Save to use atomic writes
- Add comprehensive tests for atomic write utilities
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Strip (gt-xxx), (bd-xxx) suffixes from code comments. The descriptions
remain meaningful without the ephemeral issue IDs.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds witness patrol to daemon heartbeat cycle:
- ensureWitnessesRunning(): Start witness for each rig if not running
- pokeWitnesses(): Send heartbeat messages to all witnesses
- getKnownRigs(): Read rig list from mayor/rigs.json
The daemon now ensures both Deacon and Witnesses are running
and sends periodic heartbeats to trigger their patrol molecules.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add checkStaleAgents() to detect agents reporting "running" but not updating
- Add markAgentDead() to update agent bead state to "dead"
- Integrate stale agent check into heartbeat cycle
- DeadAgentTimeout set to 15 minutes
This is a safety mechanism for agents that crash without updating their state.
The daemon now marks them as dead so they can be restarted.
Also fixes duplicate AgentFields declaration - now uses beads.go version with
ParseAgentFieldsFromDescription alias in fields.go.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add AgentFields and ParseAgentFieldsFromDescription to internal/beads/fields.go
- Update daemon/lifecycle.go to use shared parsing
- Update cmd/molecule_status.go to use shared parsing
- Remove duplicate parsing code and unused isAgentRunningByBead function
This consolidates agent bead field parsing in one place, following the pattern
established for AttachmentFields and MRFields.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add getAgentBeadState() and getAgentBeadInfo() to read agent state from beads
- Add identityToAgentBeadID() to map daemon identities to agent bead IDs
- Update ensureDeaconRunning() to check agent bead state first (ZFC compliant)
- Add agent bead state logging in executeLifecycleAction()
This is the first step toward ZFC-compliant state detection. Dependent tasks:
- gt-psuw7: Remove PID/tmux state inference
- gt-2hzl4: Add timeout fallback for dead agents
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix crew next/prev: Pass session name via key binding to avoid run-shell context issue
- Add TouchTownActivity() for town-level activity signaling
- Implement daemon exponential backoff based on activity.json:
- 0-5 min idle → 5 min heartbeat
- 5-15 min idle → 10 min heartbeat
- 15-45 min idle → 30 min heartbeat
- 45+ min idle → 60 min heartbeat (max)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Defense in depth: Even if message deletion fails, stale lifecycle
requests (>6 hours old) are now ignored and deleted without execution.
This prevents ancient LIFECYCLE messages from killing sessions if
they somehow survive in the inbox.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
P0 bug: Crew workers were being killed every 5 minutes due to stale
LIFECYCLE messages from days ago being reprocessed on each heartbeat.
Root cause: closeMessage() was only called after successful action
execution. Failed actions left messages in inbox to be reprocessed.
Fix: "Claim then execute" pattern - delete message FIRST, before
attempting the action. This guarantees stale messages are cleaned up
regardless of action outcome. If a lifecycle action is needed, the
sender must re-request.
Also improved closeMessage() to capture and log actual error output.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
All 156 instances of _ = error suppression in non-test code now have
explanatory comments documenting why the error is intentionally ignored.
Categories of intentional suppressions:
- non-fatal: session works without these - tmux environment setup
- non-fatal: theming failure does not affect operation - visual styling
- best-effort cleanup - defer cleanup on failure paths
- best-effort notification - mail/notifications that should not block
- best-effort interrupt - graceful shutdown attempts
- crypto/rand.Read only fails on broken system - random ID generation
- output errors non-actionable - fmt.Fprint to io.Writer
This addresses the silent failure and debugging concerns raised in the
issue by making the intentionality explicit in the code.
Generated with Claude Code https://claude.com/claude-code
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>