* fix(daemon): prevent runaway refinery session spawning
Fixes #566
The daemon spawned 812 refinery sessions over 4 days because:
1. Zombie detection was too strict: it used IsAgentRunning(session, "node"),
but Claude reports its pane command as a version number (e.g., "2.1.7"),
so healthy sessions were killed and recreated on every heartbeat.
2. daemon.json patrol config was completely ignored - the daemon never
loaded or checked the enabled flags.
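The mismatch in cause 1 can be illustrated with a minimal sketch. `looksLikeClaude` and `versionPattern` are hypothetical names, not the actual IsClaudeRunning() implementation, and the exact version pattern the real check accepts is an assumption here.

```go
package main

import (
	"fmt"
	"regexp"
)

// versionPattern matches a pane command that is a bare version such as "2.1.7".
// Assumption: three dot-separated numbers; the real pattern may differ.
var versionPattern = regexp.MustCompile(`^\d+\.\d+\.\d+$`)

// looksLikeClaude is a hypothetical stand-in for the pane-command check.
// It accepts the literal names "node" and "claude" plus version strings.
func looksLikeClaude(paneCommand string) bool {
	switch paneCommand {
	case "node", "claude":
		return true
	}
	return versionPattern.MatchString(paneCommand)
}

func main() {
	for _, cmd := range []string{"node", "claude", "2.1.7", "bash"} {
		fmt.Printf("%q -> %v\n", cmd, looksLikeClaude(cmd))
	}
}
```

A check that only compares against "node" rejects "2.1.7", which is how a healthy session could be classified as a zombie.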
Changes:
- refinery/manager.go: Use IsClaudeRunning() instead of IsAgentRunning()
for robust Claude detection (handles "node", "claude", version patterns)
- daemon/types.go: Add PatrolConfig types and LoadPatrolConfig() to read
mayor/daemon.json
- daemon/daemon.go: Load patrol config at startup, check enabled flags
before calling ensureRefineriesRunning/ensureWitnessesRunning, add
diagnostic logging for "already running" cases
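The enabled-flag check described above might look roughly like the following sketch. The JSON layout, the `patrolSketch` and `patrolsFile` types, and the default-to-enabled behavior are all assumptions for illustration; they are not taken from the real daemon.json schema or from LoadPatrolConfig().

```go
package main

import (
	"encoding/json"
	"fmt"
)

// patrolSketch is a hypothetical per-patrol entry with an enabled flag.
type patrolSketch struct {
	Enabled bool `json:"enabled"`
}

// patrolsFile is an assumed top-level shape for the config file.
type patrolsFile struct {
	Patrols map[string]patrolSketch `json:"patrols"`
}

// parsePatrols decodes the assumed JSON layout.
func parsePatrols(data []byte) (*patrolsFile, error) {
	var cfg patrolsFile
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

// patrolEnabled reports whether a patrol should run.
// Assumption: a missing config or missing entry means "enabled".
func patrolEnabled(cfg *patrolsFile, name string) bool {
	if cfg == nil {
		return true
	}
	p, ok := cfg.Patrols[name]
	if !ok {
		return true
	}
	return p.Enabled
}

func main() {
	raw := []byte(`{"patrols": {"refinery": {"enabled": false}, "witness": {"enabled": true}}}`)
	cfg, err := parsePatrols(raw)
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"refinery", "witness", "deacon"} {
		// The daemon would skip its ensure*Running call when this is false.
		fmt.Printf("%s enabled: %v\n", name, patrolEnabled(cfg, name))
	}
}
```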
Tested: Verified over multiple heartbeats that refinery shows "already
running, skipping spawn" instead of spawning new sessions.
* fix: Add grace period to prevent Deacon restart loop
The daemon had a race condition where:
1. ensureDeaconRunning() starts a new Deacon session
2. checkDeaconHeartbeat() runs in same heartbeat cycle
3. Heartbeat file is stale (from before crash)
4. Session is immediately killed
5. Infinite restart loop every 3 minutes
Fix:
- Track when Deacon was last started (deaconLastStarted field)
- Skip heartbeat check during 5-minute grace period
- Add config support for Deacon (consistency with refinery/witness)
After grace period, normal heartbeat checking resumes. Genuinely
stuck sessions (no heartbeat update after 5+ min) are still detected.
Fixes #589
---------
Co-authored-by: mayor <your-github-email@example.com>