chore(beads): Sync beads from main
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
@@ -257,33 +257,26 @@ For each rig, run:
|
|||||||
```bash
|
```bash
|
||||||
gt witness status <rig>
|
gt witness status <rig>
|
||||||
gt refinery status <rig>
|
gt refinery status <rig>
|
||||||
```
|
|
||||||
|
|
||||||
**IMPORTANT: Do NOT send health pings on every patrol cycle.**
|
# Health ping (clears backoff as side effect)
|
||||||
Health check nudges disturb idle agents and waste their context. Only send
|
|
||||||
a HEALTH_CHECK nudge when you have **concerning signals** (see table below).
|
|
||||||
|
|
||||||
**When to send health pings** (only if concerned):
|
|
||||||
```bash
|
|
||||||
# Only if agent appears unresponsive or status check fails:
|
|
||||||
gt nudge <rig>/witness 'HEALTH_CHECK from deacon'
|
gt nudge <rig>/witness 'HEALTH_CHECK from deacon'
|
||||||
gt nudge <rig>/refinery 'HEALTH_CHECK from deacon'
|
gt nudge <rig>/refinery 'HEALTH_CHECK from deacon'
|
||||||
```
|
```
|
||||||
|
|
||||||
|
**Health Ping Benefit**: The nudge commands serve dual purposes:
|
||||||
|
1. **Liveness verification** - Agent responds to prove it's alive
|
||||||
|
2. **Backoff reset** - Any nudge resets agent's backoff to base interval
|
||||||
|
|
||||||
|
This ensures patrol agents remain responsive even during quiet periods when the
|
||||||
|
feed has no mutations. Deacon patrols every ~1-2 minutes, so maximum backoff
|
||||||
|
is bounded by the ping interval.
|
||||||
|
|
||||||
**Signals to assess:**
|
**Signals to assess:**
|
||||||
|
|
||||||
| Component | Healthy Signals | Concerning Signals |
|
| Component | Healthy Signals | Concerning Signals |
|
||||||
|-----------|-----------------|-------------------|
|
|-----------|-----------------|-------------------|
|
||||||
| Witness | State: running, recent activity | State: not running, no heartbeat, or status check hangs |
|
| Witness | State: running, recent activity | State: not running, no heartbeat |
|
||||||
| Refinery | State: running, queue processing | Queue stuck, merge failures, or not responding |
|
| Refinery | State: running, queue processing | Queue stuck, merge failures |
|
||||||
|
|
||||||
**When NOT to ping:**
|
|
||||||
- Agent reports state=running with recent activity
|
|
||||||
- No pending work in queues
|
|
||||||
- Status check returns promptly with healthy output
|
|
||||||
|
|
||||||
The status commands (`gt witness status`, `gt refinery status`) provide
|
|
||||||
sufficient liveness information. Reserve nudges for actual problems.
|
|
||||||
|
|
||||||
**Tracking unresponsive cycles:**
|
**Tracking unresponsive cycles:**
|
||||||
|
|
||||||
@@ -303,7 +296,7 @@ health_state:
|
|||||||
|
|
||||||
| Cycles Unresponsive | Suggested Action |
|
| Cycles Unresponsive | Suggested Action |
|
||||||
|---------------------|------------------|
|
|---------------------|------------------|
|
||||||
| 1-2 | Note it, send HEALTH_CHECK nudge |
|
| 1-2 | Note it, check again next cycle |
|
||||||
| 3-4 | Attempt restart: gt witness restart <rig> |
|
| 3-4 | Attempt restart: gt witness restart <rig> |
|
||||||
| 5+ | Escalate to Mayor with context |
|
| 5+ | Escalate to Mayor with context |
|
||||||
|
|
||||||
|
|||||||
Reference in New Issue
Block a user