Auto-detect and restart daemon on version mismatch (bd-89)

Implements automatic daemon version detection and restart when client
and daemon versions are incompatible. Eliminates need for manual
'bd daemon --stop' after upgrades.

Changes:
- Check daemon version during health check in PersistentPreRun
- Auto-restart mismatched daemon or fall back to direct mode
- Check version when starting daemon, auto-stop old daemon if incompatible
- Robust restart logic: sets working dir, cleans stale sockets, reaps processes
- Uses waitForSocketReadiness helper for reliable startup detection
- Updated AGENTS.md with version management documentation

Closes bd-89

Amp-Thread-ID: https://ampcode.com/threads/T-231a3701-c9c8-49e4-a1b0-e67c94e5c365
Co-authored-by: Amp <amp@ampcode.com>
This commit is contained in:
Steve Yegge
2025-10-23 23:40:13 -07:00
parent ae4869bd3b
commit f30544e148
4 changed files with 215 additions and 14 deletions

View File

@@ -85,4 +85,6 @@
{"id":"bd-86","title":"Add transaction support for atomicity in merge operations","description":"The merge operation in cmd/bd/merge.go should use transactions to ensure atomicity. Currently marked as TODO at line 143.\n\nThis would prevent partial merges if an error occurs partway through the operation.","status":"closed","priority":2,"issue_type":"feature","created_at":"2025-10-23T19:33:34.549858-07:00","updated_at":"2025-10-23T19:35:40.620329-07:00","closed_at":"2025-10-23T19:35:40.620329-07:00"}
{"id":"bd-87","title":"Add RPC support for epic commands in daemon mode","description":"Epic status and close-eligible commands currently error out in daemon mode with a message to use --no-daemon. These commands should work with daemon RPC like other commands.\n\nLocations:\n- cmd/bd/epic.go:26 (epic status command)\n- cmd/bd/epic.go:106 (epic close-eligible command)","status":"closed","priority":2,"issue_type":"feature","created_at":"2025-10-23T19:33:34.552261-07:00","updated_at":"2025-10-23T21:56:33.732039-07:00","closed_at":"2025-10-23T21:56:33.732039-07:00"}
{"id":"bd-88","title":"bd import reports \"0 created, 0 updated\" when successfully importing issues","description":"The `bd import` command successfully imported 125 issues but reported \"0 created, 0 updated\" in the output. The import actually worked, but the success message is incorrect/misleading.\n\nThis appears to be a bug in the reporting logic that counts and displays the number of issues created/updated during import.","status":"closed","priority":2,"issue_type":"bug","created_at":"2025-10-23T22:28:40.391453-07:00","updated_at":"2025-10-23T23:05:57.413177-07:00","closed_at":"2025-10-23T23:05:57.413177-07:00"}
{"id":"bd-89","title":"Auto-detect and kill old daemon versions","description":"When the client version doesn't match the daemon version, we get confusing behavior (auto-flush race conditions, stale data, etc.). The client should automatically detect version mismatches and handle them gracefully.\n\n**Current behavior:**\n- `bd version --daemon` shows mismatch but requires manual intervention\n- Old daemons keep running after binary upgrades\n- MCP server may connect to old daemon\n- Results in dirty working tree after commits, stale data\n\n**Proposed solution:**\n\nKey lifecycle points to check/restart daemon:\n1. **On first command after version mismatch**: Check daemon version, auto-restart if incompatible\n2. **On daemon start**: Check for existing daemons, kill old ones before starting\n3. **After brew upgrade/install**: Add post-install hook to kill old daemons\n4. **On `bd init`**: Ensure fresh daemon\n\n**Detection logic:**\n```go\n// PersistentPreRun: check daemon version\nif daemonVersion != clientVersion {\n log.Warn(\"Daemon version mismatch, restarting...\")\n killDaemon()\n startDaemon()\n}\n```\n\n**Considerations:**\n- Should we be aggressive (always kill mismatched) or conservative (warn first)?\n- What about multiple workspaces with different bd versions?\n- Should this be opt-in via config flag?\n- How to handle graceful shutdown vs force kill?\n\n**Related issues:**\n- Race condition with auto-flush (see bd-89)\n- Version mismatch confusion for users\n- Stale daemon after upgrades","notes":"## Implementation Summary\n\nImplemented automatic daemon version detection and restart in v0.16.0.\n\n### Changes Made\n\n**1. Auto-restart on version mismatch (main.go PersistentPreRun)**\n- Check daemon version during health check\n- If incompatible, automatically stop old daemon and start new one\n- Falls back to direct mode if restart fails\n- Transparent to users - no manual intervention needed\n\n**2. Auto-stop old daemon on startup (daemon.go)**\n- When starting daemon, check if existing daemon has compatible version\n- If versions are incompatible, auto-stop old daemon before starting new one\n- Prevents \"daemon already running\" errors after upgrades\n\n**3. Robust restart implementation**\n- Sets correct working directory so daemon finds right database\n- Cleans up stale socket files after force kill\n- Properly reaps child process to avoid zombies\n- Uses waitForSocketReadiness helper for reliable startup detection\n- 5-second readiness timeout\n\n### Key Features\n\n- **Automatic**: No user action required after upgrading bd\n- **Transparent**: Works with both MCP server and CLI\n- **Safe**: Falls back to direct mode if restart fails\n- **Tested**: All existing tests pass\n\n### Related\n- Addresses race conditions mentioned in bd-90\n- Uses semver compatibility checking from internal/rpc/server.go","status":"closed","priority":1,"issue_type":"feature","created_at":"2025-10-23T23:15:59.764705-07:00","updated_at":"2025-10-23T23:28:06.611221-07:00","closed_at":"2025-10-23T23:28:06.611221-07:00"}
{"id":"bd-9","title":"Test issue 2","description":"","status":"closed","priority":1,"issue_type":"task","created_at":"2025-10-21T23:53:44.31362-07:00","updated_at":"2025-10-23T19:33:21.099891-07:00","closed_at":"2025-10-21T22:06:41.257019-07:00","labels":["test-label"]}
{"id":"bd-90","title":"Race condition between git commit and auto-flush debounce","description":"When using MCP/daemon mode, operations trigger a 5-second debounced auto-flush to JSONL. This creates a race condition with git commits, leaving the working tree dirty.\n\n**Example scenario:**\n1. User closes issue via MCP → daemon schedules flush (5 sec delay)\n2. User commits code changes → JSONL appears clean\n3. Daemon flush fires → JSONL modified after commit\n4. Result: dirty working tree showing JSONL changes\n\n**Root cause:**\n- Auto-flush uses 5-second debounce to batch changes\n- Git commits happen immediately\n- No coordination between flush schedule and git operations\n\n**Possible solutions:**\n\n1. **Immediate flush before git operations**\n - Detect git commands (commit, status, push)\n - Force immediate flush if pending\n - Pros: Clean working tree guaranteed\n - Cons: Requires hooking git, may be slow\n\n2. **Commit includes pending flushes**\n - Add `bd sync` to commit workflow\n - Wait for flush to complete before committing\n - Pros: Simple, explicit\n - Cons: Requires user discipline\n\n3. **Git hooks integration**\n - pre-commit hook: `bd sync --wait`\n - Ensures JSONL is up-to-date before commit\n - Pros: Automatic, reliable\n - Cons: Requires hook installation\n\n4. **Reduce debounce delay**\n - Lower from 5s to 1s or 500ms\n - Pros: Faster sync, less likely to race\n - Cons: More frequent I/O, doesn't eliminate race\n\n5. **Lock-based coordination**\n - Daemon holds lock while flush pending\n - Git operations wait for lock\n - Pros: Guarantees ordering\n - Cons: Complex, may block operations\n\n**Recommended approach:**\nCombine #2 and #3:\n- Add `bd sync` command to explicitly flush\n- Provide git hooks in `examples/git-hooks/`\n- Document workflow in AGENTS.md\n- Keep 5s debounce for normal operations\n\n**Related:**\n- bd-89 (daemon version detection)","status":"open","priority":1,"issue_type":"bug","created_at":"2025-10-23T23:16:29.502191-07:00","updated_at":"2025-10-23T23:16:29.502191-07:00"}