Implement Boot: daemon entry point dog for Deacon triage (gt-rwd5j)

Boot is a watchdog that the daemon pokes instead of Deacon directly,
centralizing the 'when to wake Deacon' decision in an agent that can
reason about context.

Key changes:
- Add internal/boot package with marker file and status tracking
- Add gt boot commands: status, spawn, triage
- Add mol-boot-triage formula for Boot's triage cycle
- Modify daemon to call ensureBootRunning instead of ensureDeaconRunning
- Add tmux.IsAvailable() for degraded mode detection
- Add boot.md.tmpl role template

Boot lifecycle:
1. Daemon tick spawns Boot (fresh each time)
2. Boot runs triage: observe, decide, act
3. Boot cleans stale handoffs from Deacon inbox
4. Boot exits (or handoffs in non-degraded mode)

In degraded mode (no tmux), Boot runs mechanical triage directly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
Steve Yegge
2025-12-30 16:14:54 -08:00
parent 3099d99424
commit 2112804aba
6 changed files with 991 additions and 2 deletions

View File

@@ -0,0 +1,228 @@
description = """
Boot triage cycle - the daemon's watchdog for Deacon health.
Boot is spawned fresh on each daemon tick to decide whether to start/wake/nudge/interrupt
the Deacon, or do nothing. This centralizes the "when to wake" decision in an agent that
can reason about context rather than relying on mechanical thresholds.
Boot lifecycle:
1. Observe (wisps, mail, git state, tmux panes)
2. Decide (start/wake/nudge/interrupt/nothing)
3. Act
4. Clean inbox (discard stale handoffs)
5. Exit (or handoff in non-degraded mode)
Boot is always fresh - no persistent state between invocations.
Handoff mail provides continuity for the next Boot instance.
"""
formula = "mol-boot-triage"
version = 1
[[steps]]
id = "observe"
title = "Observe system state"
description = """
Observe the current system state to inform triage decisions.
**Step 1: Check Deacon state**
```bash
# Is Deacon session alive?
tmux has-session -t gt-deacon 2>/dev/null && echo "alive" || echo "dead"
# If alive, what's the pane output showing?
gt peek deacon --lines 20
```
**Step 2: Check agent bead state**
```bash
bd show gt-deacon 2>/dev/null
# Look for:
# - state: running/working/idle
# - last_activity: when was last update?
```
**Step 3: Check recent activity**
```bash
# Recent feed events
gt feed --since 10m --plain | head -20
# Recent wisps (operational state)
ls -lt ~/gt/.beads-wisp/*.wisp.json 2>/dev/null | head -5
```
**Step 4: Check Deacon mail**
```bash
# Does Deacon have unread mail?
gt mail inbox deacon 2>/dev/null | head -10
```
Record observations for the decide step:
- deacon_alive: true/false
- pane_activity: active/idle/stuck
- last_activity_age: duration since last activity
- pending_mail: count of unread messages
- error_signals: any errors observed
"""
[[steps]]
id = "decide"
title = "Decide on action"
needs = ["observe"]
description = """
Analyze observations and decide what action to take.
**Decision Matrix**
| Deacon State | Pane Activity | Action |
|--------------|---------------|--------|
| Dead session | N/A | START |
| Alive, active output | N/A | NOTHING |
| Alive, idle < 5 min | N/A | NOTHING |
| Alive, idle 5-15 min | No mail | NOTHING |
| Alive, idle 5-15 min | Has mail | NUDGE |
| Alive, idle > 15 min | Any | WAKE |
| Alive, stuck (errors) | Any | INTERRUPT |
**Judgment Guidance**
Agents may take several minutes on legitimate work. Ten minutes or more in edge cases.
Don't be too aggressive - false positives are disruptive.
Signs of stuck:
- Same error repeated in pane
- Tool prompt waiting indefinitely
- Silence with pending mail
- Agent reporting issues but not progressing
Signs of working:
- Tool calls in progress
- File reads/writes happening
- Recent commits or beads updates
**Output**: Record decision as one of:
- NOTHING: Let Deacon continue
- NUDGE: Gentle wake signal (gt nudge)
- WAKE: Stronger wake (escape + message)
- INTERRUPT: Force restart needed
- START: Session is dead, start fresh
"""
[[steps]]
id = "act"
title = "Execute decided action"
needs = ["decide"]
description = """
Execute the action decided in the previous step.
**NOTHING**
No action needed. Log observation and exit.
**NUDGE**
```bash
gt nudge deacon "Boot check-in: you have pending work"
```
**WAKE**
```bash
# Send escape to break any tool waiting
tmux send-keys -t gt-deacon Escape
# Brief pause
sleep 1
# Send wake message
gt nudge deacon "Boot wake: please check your inbox and pending work"
```
**INTERRUPT**
```bash
# This is more aggressive - signals Deacon to restart
gt mail send deacon -s "INTERRUPT: Boot detected stuck state" \
-m "Boot observed stuck state. Please check your context and consider handoff.
Observations:
- <summary of what was observed>
If you're making progress, please update your agent bead to reflect activity."
```
**START**
```bash
# Deacon is dead - daemon will restart it
# Just log that we detected this
echo "Boot detected dead Deacon session - daemon will restart"
```
Record action taken for status update.
"""
[[steps]]
id = "cleanup"
title = "Clean stale handoffs"
needs = ["act"]
description = """
Clean up stale handoff messages from Deacon's inbox.
Handoff messages older than 1 hour are likely stale - the intended recipient
either processed them or crashed before seeing them.
**Step 1: List Deacon inbox**
```bash
gt mail inbox deacon --json 2>/dev/null
```
**Step 2: Archive stale handoffs**
For each message:
- Check if subject contains "HANDOFF" or "handoff"
- Check if age > 1 hour
- If both: archive it
```bash
# For each stale handoff:
gt mail archive <message-id>
```
**Step 3: Archive Boot's own old mail**
Boot doesn't need persistent inbox. Archive anything processed:
```bash
gt mail inbox boot --json 2>/dev/null
# Archive any messages older than current session
```
Keep the system clean - old handoffs just add noise.
"""
[[steps]]
id = "exit"
title = "Exit or handoff"
needs = ["cleanup"]
description = """
Complete this Boot cycle.
**In degraded mode (GT_DEGRADED=true)**
Exit directly - no handoff needed:
```bash
# Log completion
echo "Boot triage complete: <action taken>"
exit 0
```
**In normal mode**
Write brief handoff for next Boot instance:
```bash
gt mail send boot -s "Boot handoff" -m "Completed triage cycle.
Action: <action taken>
Observations: <brief summary>
Time: $(date)"
```
Then exit. The next daemon tick will spawn a fresh Boot.
**Update status file**
```bash
# The gt boot command handles this automatically
# Status is written to ~/gt/deacon/dogs/boot/.boot-status.json
```
Boot is ephemeral by design. Each instance runs fresh.
"""