docs: design Deacon molecule health-scan step (gt-gaxo.5)
ZFC cleanup: health checking belongs in Deacon molecule, not Go code.

Updated health-scan step with:
- Specific commands: gt witness status, gt refinery status
- Signal assessment table for Claude to interpret
- Cycle tracking for persistent unresponsive state
- Decision matrix with suggested (not hardcoded) thresholds
- Restart and escalation workflows

Key ZFC principle: Claude makes the judgment call about what is stuck or unresponsive - no hardcoded time.Duration in Go.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
+954
-962
File diff suppressed because it is too large
@@ -58,22 +58,62 @@ Timeout: 60 seconds per polecat. If not ready, try again next cycle.
 Needs: inbox-check
 
 ## Step: health-scan
-Ping Witnesses and Refineries.
+Check Witness and Refinery health for each rig.
 
-For each rig, verify:
-- Witness is responsive
-- Refinery is processing queue
-- No stalled operations
+**ZFC Principle**: You (Claude) make the judgment call about what is "stuck" or
+"unresponsive" - there are no hardcoded thresholds in Go. Read the signals,
+consider context, and decide.
 
+For each rig, run:
 ` + "```" + `bash
-gt status --health
-# Check each rig
-for rig in $(gt rigs); do
-  gt rig status $rig
-done
+gt witness status <rig>
+gt refinery status <rig>
 ` + "```" + `
 
-Report any issues found. Restart unresponsive components if needed.
+**Signals to assess:**
+
+| Component | Healthy Signals | Concerning Signals |
+|-----------|-----------------|-------------------|
+| Witness | State: running, recent activity | State: not running, no heartbeat |
+| Refinery | State: running, queue processing | Queue stuck, merge failures |
+
+**Tracking unresponsive cycles:**
+
+Maintain in your patrol state (persisted across cycles):
+` + "```" + `
+health_state:
+  <rig>:
+    witness:
+      unresponsive_cycles: 0
+      last_seen_healthy: <timestamp>
+    refinery:
+      unresponsive_cycles: 0
+      last_seen_healthy: <timestamp>
+` + "```" + `
+
+**Decision matrix** (you decide the thresholds based on context):
+
+| Cycles Unresponsive | Suggested Action |
+|---------------------|------------------|
+| 1-2 | Note it, check again next cycle |
+| 3-4 | Attempt restart: gt witness restart <rig> |
+| 5+ | Escalate to Mayor with context |
+
+**Restart commands:**
+` + "```" + `bash
+gt witness restart <rig>
+gt refinery restart <rig>
+` + "```" + `
+
+**Escalation:**
+` + "```" + `bash
+gt mail send mayor/ -s "Health: <rig> <component> unresponsive" \
+  -m "Component has been unresponsive for N cycles. Restart attempts failed.
+Last healthy: <timestamp>
+Error signals: <details>"
+` + "```" + `
+
+Reset unresponsive_cycles to 0 when component responds normally.
 Needs: trigger-pending-spawns
 
 ## Step: plugin-run
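Review note: the cycle-tracking and escalation rules in the new health-scan text can be made concrete with a small bookkeeping sketch. This is only an illustration, not part of the commit: in the design above Claude keeps this state itself, and the function names (`new_rig_state`, `record_observation`, `escalation_argv`) and the "never observed" placeholder are hypothetical. It deliberately contains no thresholds or decisions. The caller passes in its own healthy/unhealthy judgment, which keeps the ZFC principle intact.

```python
from datetime import datetime, timezone

# Components tracked per rig, as in the health_state layout from the diff.
COMPONENTS = ("witness", "refinery")


def new_rig_state():
    """Fresh entry for one rig: zeroed counters, no healthy sighting yet."""
    return {c: {"unresponsive_cycles": 0, "last_seen_healthy": None} for c in COMPONENTS}


def record_observation(health_state, rig, component, healthy, now=None):
    """Update one component's counters after a patrol cycle.

    `healthy` is the judgment already made by whoever read the status output;
    this function only does the bookkeeping (increment, or reset to 0).
    """
    now = now or datetime.now(timezone.utc)
    entry = health_state.setdefault(rig, new_rig_state())[component]
    if healthy:
        entry["unresponsive_cycles"] = 0
        entry["last_seen_healthy"] = now.isoformat()
    else:
        entry["unresponsive_cycles"] += 1
    return entry


def escalation_argv(rig, component, entry, details):
    """Build the argument list for the `gt mail send` call in the Escalation block."""
    subject = f"Health: {rig} {component} unresponsive"
    body = (
        f"Component has been unresponsive for {entry['unresponsive_cycles']} cycles. "
        "Restart attempts failed.\n"
        # "never observed" is a hypothetical placeholder for a missing timestamp.
        f"Last healthy: {entry['last_seen_healthy'] or 'never observed'}\n"
        f"Error signals: {details}"
    )
    return ["gt", "mail", "send", "mayor/", "-s", subject, "-m", body]
```

Returning an argument list rather than a shell string sidesteps quoting of the multi-line message body. Whether and when to send it remains the judgment call described in the decision matrix.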