- RPC Phase 1: Protocol, server, client implementation - Updated renumber.go with proper text reference updates (3-phase approach) - Clean database exported: 344 issues (bd-1 to bd-344) - Added DAEMON_DESIGN.md documentation - Updated go.mod/go.sum for RPC dependencies Amp-Thread-ID: https://ampcode.com/threads/T-456af77c-8b7f-4004-9027-c37b95e10ea5 Co-authored-by: Amp <amp@ampcode.com>
6.4 KiB
BD Daemon Architecture for Concurrent Access
Problem Statement
Multiple AI agents running concurrently (via beads-mcp) cause:
- SQLite write corruption: Counter stuck, UNIQUE constraint failures
- Git index.lock contention: All agents auto-export → all try to commit simultaneously
- Data loss risk: Concurrent SQLite writers without coordination
- Poor performance: Redundant exports, 4x git operations for same changes
Current Architecture (Broken)
Agent 1 → beads-mcp 1 → bd CLI → SQLite DB (direct write)
Agent 2 → beads-mcp 2 → bd CLI → SQLite DB (direct write) ← RACE CONDITIONS
Agent 3 → beads-mcp 3 → bd CLI → SQLite DB (direct write)
Agent 4 → beads-mcp 4 → bd CLI → SQLite DB (direct write)
↓
4x concurrent git export/commit
Proposed Architecture (Daemon-Mediated)
Agent 1 → beads-mcp 1 → bd client ──┐
Agent 2 → beads-mcp 2 → bd client ──┼──> bd daemon → SQLite DB
Agent 3 → beads-mcp 3 → bd client ──┤ (single writer) ↓
Agent 4 → beads-mcp 4 → bd client ──┘ git export
(batched,
serialized)
Key Changes
- bd daemon becomes mandatory for multi-agent scenarios
- All bd commands become RPC clients when daemon is running
- Daemon owns SQLite - single writer, no races
- Daemon batches git operations - one export cycle per interval
- Unix socket IPC - simple, fast, local-only
Implementation Plan
Phase 1: RPC Infrastructure
New files:
internal/rpc/protocol.go- Request/response typesinternal/rpc/server.go- Unix socket server in daemoninternal/rpc/client.go- Client detection & dispatch
Operations to support:
type Request struct {
Operation string // "create", "update", "list", "close", etc.
Args json.RawMessage // Operation-specific args
}
type Response struct {
Success bool
Data json.RawMessage // Operation result
Error string
}
Socket location: ~/.beads/bd.sock or .beads/bd.sock (per-repo)
Phase 2: Client Auto-Detection
bd command behavior:
- Check if daemon socket exists & responsive
- If yes: Send RPC request, print response
- If no: Run command directly (backward compatible)
Example:
func main() {
if client := rpc.TryConnect(); client != nil {
// Daemon is running - use RPC
resp := client.Execute(cmd, args)
fmt.Println(resp)
return
}
// No daemon - run directly (current behavior)
executeLocally(cmd, args)
}
Phase 3: Daemon SQLite Ownership
Daemon startup:
- Open SQLite connection (exclusive)
- Start RPC server on Unix socket
- Start git sync loop (existing functionality)
- Process RPC requests serially
Git operations:
- Batch exports every 5 seconds (not per-operation)
- Single commit with all changes
- Prevent concurrent git operations entirely
Phase 4: Atomic Operations
ID generation:
// In daemon process only
func (d *Daemon) generateID(prefix string) (string, error) {
d.mu.Lock()
defer d.mu.Unlock()
// No races - daemon is single writer
return d.storage.NextID(prefix)
}
Transaction support:
// RPC can request multi-operation transactions
type BatchRequest struct {
Operations []Request
Atomic bool // All-or-nothing
}
Migration Strategy
Stage 1: Opt-In (v0.10.0)
- Daemon RPC code implemented
- bd commands detect daemon, fall back to direct
- Users can
bd daemon startfor multi-agent scenarios - No breaking changes - direct mode still works
Stage 2: Recommended (v0.11.0)
- Document multi-agent workflow requires daemon
- MCP server README says "start daemon for concurrent agents"
- Detection warning: "Multiple bd processes detected, consider using daemon"
Stage 3: Required for Multi-Agent (v1.0.0)
- bd detects concurrent access patterns
- Refuses to run without daemon if lock contention detected
- Error: "Concurrent access detected. Start daemon:
bd daemon start"
Benefits
✅ No SQLite corruption - single writer ✅ No git lock contention - batched, serialized operations ✅ Atomic ID generation - no counter corruption ✅ Better performance - fewer redundant exports ✅ Backward compatible - graceful fallback to direct mode ✅ Simple protocol - Unix sockets, JSON payloads
Trade-offs
⚠️ Daemon must be running for multi-agent workflows
⚠️ One more process to manage (bd daemon start/stop)
⚠️ Complexity - RPC layer adds code & maintenance
⚠️ Single point of failure - if daemon crashes, all agents blocked
Open Questions
-
Per-repo or global daemon?
- Per-repo:
.beads/bd.sock(supports multiple repos) - Global:
~/.beads/bd.sock(simpler, but only one repo at a time) - Recommendation: Per-repo, use
--dbpath to determine socket location
- Per-repo:
-
Daemon crash recovery?
- Client auto-starts daemon if socket missing?
- Or require manual
bd daemon start? - Recommendation: Auto-start with exponential backoff
-
Concurrent read optimization?
- Reads could bypass daemon (SQLite supports concurrent readers)
- But complex: need to detect read-only vs read-write commands
- Recommendation: Start simple, all ops through daemon
-
Transaction API for clients?
- MCP tools often do multi-step operations
- Would benefit from
BEGIN/COMMITstyle transactions - Recommendation: Phase 4 feature, not MVP
Success Metrics
- ✅ 4 concurrent agents can run without errors
- ✅ No UNIQUE constraint failures on ID generation
- ✅ No git index.lock errors
- ✅ SQLite counter stays in sync with actual issues
- ✅ Graceful fallback when daemon not running
Related Issues
- bd-668: Git index.lock contention (root cause)
- bd-670: ID generation retry on UNIQUE constraint
- bd-654: Concurrent tmp file collisions (already fixed)
- bd-477: Phase 1 daemon command (git sync only - now expanded)
- bd-279: Tests for concurrent scenarios
- bd-271: Epic for multi-device support
Next Steps
- Ultrathink: Validate this design with user
- File epic: Create bd-??? for daemon RPC architecture
- Break down work: Phase 1 subtasks (protocol, server, client)
- Start implementation: Begin with protocol.go