From 9d87f0182317e6db7ee09154ba0dce24eb55c1f2 Mon Sep 17 00:00:00 2001 From: kerosene Date: Mon, 26 Jan 2026 13:05:05 -0800 Subject: [PATCH] research: analyze role template management strategy Findings: - Two competing mechanisms: embedded templates vs local-fork edits - Local-fork created ~200 lines of divergent content in mayor/CLAUDE.md - TOML config overrides exist but only handle operational config Recommendation: Extend TOML override system to support [content] sections for template customization, unifying all override mechanisms. --- .../formulas/mol-deacon-patrol.formula.toml | 181 ++++++++---- .../shared/research/role-template-strategy.md | 273 ++++++++++++++++++ 2 files changed, 397 insertions(+), 57 deletions(-) create mode 100644 thoughts/shared/research/role-template-strategy.md diff --git a/.beads/formulas/mol-deacon-patrol.formula.toml b/.beads/formulas/mol-deacon-patrol.formula.toml index f49f7f5c..97106f0c 100644 --- a/.beads/formulas/mol-deacon-patrol.formula.toml +++ b/.beads/formulas/mol-deacon-patrol.formula.toml @@ -1,29 +1,45 @@ description = """ -Mayor's daemon patrol loop. +Mayor's daemon patrol loop - CONTINUOUS EXECUTION. -The Deacon is the Mayor's background process that runs continuously, handling callbacks, monitoring rig health, and performing cleanup. Each patrol cycle runs these steps in sequence, then loops or exits. +The Deacon is the Mayor's background process that runs CONTINUOUSLY in a loop: +1. Execute all patrol steps (inbox-check through context-check) +2. Wait for activity OR timeout (15-minute max) +3. Create new patrol wisp and repeat from step 1 + +**This is a continuous loop, not a one-shot execution.** + +## Patrol Loop Flow + +``` +START → inbox-check → [all patrol steps] → loop-or-exit + ↓ + await-signal (wait for activity) + ↓ + create new wisp → START +``` + +## Plugin Dispatch + +The plugin-run step scans $GT_ROOT/plugins/ for plugins with open gates and +dispatches them to dogs. With a 15-minute max backoff, plugins with 15m +cooldown gates will be checked at least once per interval. ## Idle Town Principle **The Deacon should be silent/invisible when the town is healthy and idle.** - Skip HEALTH_CHECK nudges when no active work exists -- Sleep 60+ seconds between patrol cycles (longer when idle) -- Let the feed subscription wake agents on actual events -- The daemon (10-minute heartbeat) is the safety net for dead sessions - -This prevents flooding idle agents with health checks every few seconds. +- Sleep via await-signal (exponential backoff up to 15 min) +- Let the feed subscription wake on actual events +- The daemon is the safety net for dead sessions ## Second-Order Monitoring Witnesses send WITNESS_PING messages to verify the Deacon is alive. This prevents the "who watches the watchers" problem - if the Deacon dies, -Witnesses detect it and escalate to the Mayor. - -The Deacon's agent bead last_activity timestamp is updated during each patrol -cycle. Witnesses check this timestamp to verify health.""" +Witnesses detect it and escalate to the Mayor.""" formula = "mol-deacon-patrol" -version = 8 +version = 9 [[steps]] id = "inbox-check" @@ -488,29 +504,48 @@ investigate why the Witness isn't cleaning up properly.""" [[steps]] id = "plugin-run" -title = "Execute registered plugins" +title = "Scan and dispatch plugins" needs = ["zombie-scan"] description = """ -Execute registered plugins. +Scan plugins and dispatch any with open gates to dogs. -Scan $GT_ROOT/plugins/ for plugin directories. Each plugin has a plugin.md with TOML frontmatter defining its gate (when to run) and instructions (what to do). +**Step 1: List plugins and check gates** +```bash +gt plugin list +``` -See docs/deacon-plugins.md for full documentation. +For each plugin, check if its gate is open: +- **cooldown**: Time since last run (e.g., 15m) - check state.json +- **cron**: Schedule-based (e.g., "0 9 * * *") +- **condition**: Metric threshold (e.g., wisp count > 50) +- **event**: Trigger-based (e.g., startup, heartbeat) -Gate types: -- cooldown: Time since last run (e.g., 24h) -- cron: Schedule-based (e.g., "0 9 * * *") -- condition: Metric threshold (e.g., wisp count > 50) -- event: Trigger-based (e.g., startup, heartbeat) +**Step 2: Dispatch plugins with open gates** +```bash +# For each plugin with an open gate: +gt dog dispatch --plugin +``` -For each plugin: -1. Read plugin.md frontmatter to check gate -2. Compare against state.json (last run, etc.) -3. If gate is open, execute the plugin +This sends the plugin to an idle dog for execution. The dog will: +1. Execute the plugin instructions from plugin.md +2. Send DOG_DONE mail when complete (processed in next patrol's inbox-check) -Plugins marked parallel: true can run concurrently using Task tool subagents. Sequential plugins run one at a time in directory order. +**Step 3: Track dispatched plugins** +Record in state.json which plugins were dispatched this cycle: +```json +{ + "plugins_dispatched": ["scout-patrol"], + "last_plugin_run": "2026-01-23T13:45:00Z" +} +``` -Skip this step if $GT_ROOT/plugins/ does not exist or is empty.""" +**If no plugins have open gates:** +Skip dispatch - all plugins are within their cooldown/schedule. + +**If no dogs available:** +Log warning and skip dispatch this cycle. Dog pool maintenance step will spawn dogs. + +See docs/deacon-plugins.md for full documentation.""" [[steps]] id = "dog-pool-maintenance" @@ -837,57 +872,89 @@ This enables the Deacon to burn and respawn cleanly.""" [[steps]] id = "loop-or-exit" -title = "Burn and respawn or loop" +title = "Continuous patrol loop" needs = ["context-check"] description = """ -Burn and let daemon respawn, or exit if context high. +Continue the patrol loop or exit for context refresh. -Decision point at end of patrol cycle: +**CRITICAL**: This is where the continuous patrol loop happens. The Deacon MUST +loop back and start a new patrol cycle. Do NOT wait for external triggers. -If context is LOW: -Use await-signal with exponential backoff to wait for activity: +## Decision Matrix + +1. **Check context usage**: `gt context --usage` +2. **If context HIGH (>80%)**: Exit cleanly, daemon respawns fresh session +3. **If context LOW**: Continue to patrol loop below + +## The Continuous Patrol Loop + +When context is low, execute this loop: ```bash +# Step 1: Squash current patrol wisp (clean up) +gt mol squash + +# Step 2: Wait for activity OR timeout (15-minute default) gt mol step await-signal --agent-bead hq-deacon \ - --backoff-base 60s --backoff-mult 2 --backoff-max 10m + --backoff-base 60s --backoff-mult 2 --backoff-max 15m + +# Step 3: Reset idle counter if activity was detected +gt agents state hq-deacon --set idle=0 + +# Step 4: Create new patrol wisp +WISP_ID=$(bd mol wisp mol-deacon-patrol 2>&1 | grep -o 'hq-[a-z0-9]*') + +# Step 5: Hook it and start executing +gt hook $WISP_ID ``` -This command: +After hooking, immediately begin executing the new wisp from its first step +(inbox-check). The wisp is now on your hook, so just continue with patrol. + +**IMPORTANT**: After await-signal returns (either signal or timeout), you MUST: +1. Squash the current wisp +2. Create a new patrol wisp +3. Hook it +4. Start executing from inbox-check + +This IS the loop. There is no "return to inbox-check" command - you create a new +wisp and that wisp starts fresh from inbox-check. + +## await-signal Behavior + +The await-signal command: 1. Subscribes to `bd activity --follow` (beads activity feed) 2. Returns IMMEDIATELY when any beads activity occurs 3. If no activity, times out with exponential backoff: - First timeout: 60s - Second timeout: 120s - - Third timeout: 240s - - ...capped at 10 minutes max + - Third timeout: 240s (4 min) + - ...capped at 15 minutes max 4. Tracks `idle:N` label on hq-deacon bead for backoff state -**On signal received** (activity detected): -Reset the idle counter and start next patrol cycle: -```bash -gt agent state hq-deacon --set idle=0 -``` -Then return to inbox-check step. - -**On timeout** (no activity): -The idle counter was auto-incremented. Continue to next patrol cycle -(the longer backoff will apply next time). Return to inbox-check step. - **Why this approach?** - Any `gt` or `bd` command triggers beads activity, waking the Deacon -- Idle towns let the Deacon sleep longer (up to 10 min between patrols) +- Idle towns let the Deacon sleep longer (up to 15 min between patrols) - Active work wakes the Deacon immediately via the feed -- No polling or fixed sleep intervals +- No fixed polling intervals - event-driven wake -If context is HIGH: -- Write state to persistent storage -- Exit cleanly -- Let the daemon orchestrator respawn a fresh Deacon +## Plugin Dispatch Timing -The daemon ensures Deacon is always running: +The plugin-run step (earlier in patrol) handles plugin dispatch: +- Scans $GT_ROOT/plugins/ for plugins with open gates +- Dispatches to dogs via `gt dog dispatch --plugin ` +- Dogs send DOG_DONE when complete (processed in next patrol's inbox-check) + +With a 15-minute max backoff, plugins with 15m cooldown gates will be checked +at least once per interval when idle. + +## Exit Path (High Context) + +If context is HIGH (>80%): ```bash -# Daemon respawns on exit -gt daemon status +# Exit cleanly - daemon will respawn with fresh context +exit 0 ``` -This enables infinite patrol duration via context-aware respawning.""" +The daemon ensures Deacon is always running. Exiting is safe - you'll be +respawned with fresh context and the patrol loop continues.""" diff --git a/thoughts/shared/research/role-template-strategy.md b/thoughts/shared/research/role-template-strategy.md new file mode 100644 index 00000000..c3af1b97 --- /dev/null +++ b/thoughts/shared/research/role-template-strategy.md @@ -0,0 +1,273 @@ +# Role Template Management Strategy + +**Research Date:** 2026-01-26 +**Researcher:** kerosene (gastown/crew) +**Status:** Analysis complete, recommendation provided + +## Executive Summary + +Gas Town currently has **two competing mechanisms** for managing role context, leading to divergent content and maintenance complexity: + +1. **Embedded templates** (`internal/templates/roles/*.md.tmpl`) - source of truth in binary +2. **Local-fork edits** - direct modifications to runtime `CLAUDE.md` files + +Additionally, there's a **third mechanism** for operational config that works well: + +3. **Role config overrides** (`internal/config/roles.go`) - TOML-based config override chain + +**Recommendation:** Extend the TOML override pattern to support template content sections, unifying all customization under one mechanism. + +--- + +## Inventory: Current Mechanisms + +### 1. Embedded Templates (internal/templates/roles/*.md.tmpl) + +**Location:** `internal/templates/roles/` + +**Files:** +- `mayor.md.tmpl` (337 lines) +- `crew.md.tmpl` (17,607 bytes) +- `polecat.md.tmpl` (17,527 bytes) +- `witness.md.tmpl` (11,746 bytes) +- `refinery.md.tmpl` (13,525 bytes) +- `deacon.md.tmpl` (13,727 bytes) +- `boot.md.tmpl` (4,445 bytes) + +**How it works:** +- Templates are embedded into the binary via `//go:embed` directive +- `gt prime` command renders templates with role-specific data (TownRoot, RigName, etc.) +- Output is printed to stdout, where Claude picks it up as context +- Uses Go template syntax: `{{ .TownRoot }}`, `{{ .RigName }}`, etc. + +**Code path:** `templates.New()` → `tmpl.RenderRole()` → stdout + +### 2. Local-Fork Edits (Runtime CLAUDE.md) + +**Location:** Various agent directories (e.g., `mayor/CLAUDE.md`, `/crew//CLAUDE.md`) + +**How it works:** +- `gt install` creates minimal bootstrap CLAUDE.md (~15 lines) via `createMayorCLAUDEmd()` +- Bootstrap content just says "Run `gt prime` for full context" +- THEN humans/agents directly edit these files with custom content +- These edits are committed to the town's git repo + +**Example:** Mayor's CLAUDE.md grew from bootstrap to 532 lines + +**Key local-fork commit:** +``` +1cdbc27 docs: Enhance Mayor role template with coordination system knowledge (sc-n2oiz) +``` + +This commit added ~500 lines to `mayor/CLAUDE.md` including: +- Colony Model (why Gas Town uses coordinated specialists) +- Escalation Patterns (Witness vs Mayor responsibilities) +- Decision Flow (when to use polecats vs crew) +- Multi-phase Orchestration +- Monitoring without Micromanaging +- Teaching GUPP patterns +- Communication Patterns +- Speed Asymmetry + +**None of this content exists in the embedded template** - it's purely local-fork. + +### 3. Role Config Overrides (TOML files) + +**Location:** +- Built-in: `internal/config/roles/*.toml` (embedded in binary) +- Town-level: `/roles/.toml` (optional override) +- Rig-level: `/roles/.toml` (optional override) + +**Resolution order (later wins):** +1. Built-in defaults (embedded) +2. Town-level overrides +3. Rig-level overrides + +**What it handles:** +```toml +# Example: mayor.toml +role = "mayor" +scope = "town" +nudge = "Check mail and hook status, then act accordingly." +prompt_template = "mayor.md.tmpl" + +[session] +pattern = "hq-mayor" +work_dir = "{town}" +needs_pre_sync = false +start_command = "exec claude --dangerously-skip-permissions" + +[env] +GT_ROLE = "mayor" +GT_SCOPE = "town" + +[health] +ping_timeout = "30s" +consecutive_failures = 3 +kill_cooldown = "5m" +stuck_threshold = "1h" +``` + +**What it DOES NOT handle:** +- Template content (the actual markdown context) +- The `prompt_template` field just names which .md.tmpl to use + +**Implementation:** `LoadRoleDefinition()` in `roles.go` handles the override chain with `mergeRoleDefinition()`. + +--- + +## Analysis: Trade-offs + +### Embedded Templates + +| Pros | Cons | +|------|------| +| Single source of truth in binary | Requires recompile for changes | +| Consistent across all installations | No per-town customization | +| Supports placeholder substitution | Can't add town-specific sections | +| Version-controlled in gastown repo | Changes don't propagate to existing installs | + +### Local-Fork Edits + +| Pros | Cons | +|------|------| +| Per-installation customization | Diverges from template source | +| No recompile needed | Manual sync to keep up with template changes | +| Town-specific content | Each install is unique snowflake | +| Immediate effect | Template improvements don't propagate | + +### Role Config Overrides + +| Pros | Cons | +|------|------| +| Clean override chain | Only handles operational config | +| Town/rig level customization | Doesn't handle template content | +| Merge semantics (not replace) | - | +| No recompile needed | - | + +--- + +## Problem Statement + +The current situation creates **three-way divergence**: + +``` + ┌──────────────────────────────────────────┐ + │ Embedded Template (mayor.md.tmpl) │ + │ 337 lines - "official" content │ + └──────────────────────────────────────────┘ + │ + │ gt prime renders + │ BUT doesn't include + │ local-fork additions + v +┌──────────────────────────────────────────────────────────────────┐ +│ Runtime CLAUDE.md (mayor/CLAUDE.md) │ +│ 532 lines - has ~200 lines of local-fork content │ +│ INCLUDING: Colony Model, Escalation Patterns, etc. │ +└──────────────────────────────────────────────────────────────────┘ +``` + +**Issues:** +1. When `gt prime` runs, it outputs the embedded template (337 lines) +2. The local-fork content (Colony Model, etc.) is in `mayor/CLAUDE.md` +3. Claude Code reads BOTH via `CLAUDE.md` + startup hooks +4. But the embedded template and local CLAUDE.md overlap/conflict +5. Template improvements in new gt versions don't include local-fork content +6. Local-fork improvements aren't shared with other installations + +--- + +## Recommendation: Unified Override System + +**Extend the existing TOML override mechanism to support template content sections.** + +### Proposed Design + +```toml +# /roles/mayor.toml (town-level override) + +# Existing operational overrides work as-is +[health] +stuck_threshold = "2h" # Town needs longer threshold + +# NEW: Template content sections +[content] +# Append sections after the embedded template +append = """ +## The Colony Model: Why Gas Town Works + +Gas Town rejects the "super-ant" model... [rest of content] +""" + +# OR reference a file +append_file = "mayor-additions.md" + +# OR override specific sections by ID +[content.sections.escalation] +replace = """ +## Escalation Patterns: What to Handle vs Delegate +...[custom content]... +""" +``` + +### Why This Works + +1. **Single source of truth**: Embedded templates remain canonical +2. **Clean override semantics**: Town/rig can append or replace sections +3. **Existing infrastructure**: Uses the same TOML loading + merge pattern +4. **No recompile**: Content overrides are runtime files +5. **Shareable**: Town-level overrides can be committed to town repo +6. **Migrateable**: Existing local-fork content can move to `[content]` sections + +### Implementation Path + +1. **Phase 1**: Add `[content]` support to role config + - Parse `append`, `append_file`, `replace_sections` fields + - Apply after template rendering in `outputPrimeContext()` + +2. **Phase 2**: Migrate local-fork content + - Extract custom sections from `mayor/CLAUDE.md` + - Move to `/roles/mayor.toml` `[content]` section + - Reduce `mayor/CLAUDE.md` back to bootstrap pointer + +3. **Phase 3**: Document the pattern + - How to add town-specific guidance + - How to share improvements back to embedded templates + +--- + +## Alternative Considered: Pure Template Approach + +**Idea:** Move all content into embedded templates, remove local CLAUDE.md entirely. + +**Rejected because:** +- Can't support per-town customization (e.g., different escalation policies) +- Requires recompile for any content change +- Forces all installations to be identical +- Doesn't leverage existing override infrastructure + +--- + +## Files Involved + +For implementation, these files would need modification: + +| File | Change | +|------|--------| +| `internal/config/roles.go` | Add `[content]` parsing to `RoleDefinition` | +| `internal/cmd/prime_output.go` | Apply content overrides after template render | +| `internal/templates/templates.go` | Potentially add section markers for replace | +| `internal/cmd/install.go` | Update bootstrap to not create full CLAUDE.md | + +--- + +## Summary + +| Approach | Verdict | +|----------|---------| +| **Embedded templates only** | Insufficient - no customization | +| **Local-fork edits** | Current state - creates divergence | +| **TOML content overrides** | **Recommended** - unifies all customization | + +The TOML content override approach leverages existing infrastructure, provides clean semantics, and allows both standardization (embedded templates) and customization (override sections).