Commit Graph

9 Commits

Author SHA1 Message Date
Steve Yegge
34cf361b2b Add telemetry and observability to daemon (bd-153)
Implement comprehensive metrics collection for the daemon with zero-overhead design:

Features:
- Request metrics: counts, latency percentiles (p50, p95, p99), error rates
- Cache metrics: hit/miss ratios, eviction counts, database connections
- Connection metrics: total, active, rejected connections
- System metrics: memory usage, goroutine count, uptime

Implementation:
- New internal/rpc/metrics.go with Metrics collector
- OpMetrics RPC operation for programmatic access
- 'bd daemon --metrics' command (human-readable and JSON output)
- Lock-free atomic operations for cache/connection metrics
- Copy-and-compute pattern in Snapshot to minimize lock contention
- Deferred metrics recording ensures all requests are tracked

Improvements from code review:
- JSON types use float64 for ms/seconds (not time.Duration)
- Snapshot copies data under short lock, computes outside
- Union of operations from counts and errors maps
- Defensive clamping in percentile calculation
- Defer pattern ensures metrics recorded even on early returns

Documentation updated in README.md with usage examples.

Closes bd-153

Amp-Thread-ID: https://ampcode.com/threads/T-20213187-65c7-47f7-ba21-5234c9e52e26
Co-authored-by: Amp <amp@ampcode.com>
2025-10-19 15:55:55 -07:00
Steve Yegge
c71da4c267 Add resource limits to daemon (bd-152)
Implemented connection limiting, request timeouts, and memory pressure detection:

- Connection limiting with semaphore pattern (default 100 max connections)
- Request timeout enforcement on read/write (default 30s)
- Memory pressure detection with aggressive cache eviction (default 500MB threshold)
- Configurable via environment variables:
  - BEADS_DAEMON_MAX_CONNS
  - BEADS_DAEMON_REQUEST_TIMEOUT
  - BEADS_DAEMON_MEMORY_THRESHOLD_MB
- Health endpoint now exposes active/max connections and memory usage
- Comprehensive test coverage for all limits

This prevents resource exhaustion under heavy load or attack scenarios.

Amp-Thread-ID: https://ampcode.com/threads/T-44d1817a-3709-4f1d-a27a-78bb2fa4d3dc
Co-authored-by: Amp <amp@ampcode.com>
2025-10-19 13:22:23 -07:00
Steve Yegge
22daa12665 Add daemon fallback visibility and version compatibility checks
Implemented bd-150: Improve daemon fallback visibility and user feedback
- Added DaemonStatus struct to track connection state
- Enhanced BD_DEBUG logging with detailed diagnostics and timing
- Added BD_VERBOSE mode with actionable warnings when falling back
- Implemented health checks before using daemon
- Clear fallback reasons: connect_failed, health_failed, auto_start_disabled, auto_start_failed, flag_no_daemon
- Updated documentation

Implemented bd-151: Add version compatibility checks for daemon RPC protocol
- Added ClientVersion field to RPC Request struct
- Client sends version (0.9.10) in all requests
- Server validates version compatibility using semver:
  - Major version must match
  - Daemon >= client for backward compatibility
  - Clear error messages with directional hints (upgrade daemon vs upgrade client)
- Added ClientVersion and Compatible fields to HealthResponse
- Implemented 'bd version --daemon' command to check compatibility
- Fixed batch operations to propagate ClientVersion for proper checks
- Updated documentation with version compatibility section

Code review improvements:
- Propagate ClientVersion in batch sub-requests
- Directional error messages based on which side is older
- Made ServerVersion a var for future unification

Amp-Thread-ID: https://ampcode.com/threads/T-b5fe36b8-c065-44a9-a55b-582573671609
Co-authored-by: Amp <amp@ampcode.com>
2025-10-19 08:04:48 -07:00
Steve Yegge
9e2ee1889f Add daemon health check endpoint (bd-146)
- Add OpHealth RPC operation to protocol
- Implement handleHealth() with DB ping and 1s timeout
- Returns status (healthy/degraded/unhealthy), uptime, cache metrics
- Update TryConnect() to use health check instead of ping
- Add 'bd daemon --health' CLI command with JSON output
- Track cache hits/misses for metrics
- Unhealthy daemon triggers automatic fallback to direct mode
- Health check completes in <2 seconds

Amp-Thread-ID: https://ampcode.com/threads/T-1a4889f3-77cf-433a-a704-e1c383929f48
Co-authored-by: Amp <amp@ampcode.com>
2025-10-18 13:41:06 -07:00
Steve Yegge
fb9b5864af feat: Add bd repos multi-repo commands and fix bd ready for in_progress issues
- Add 'bd repos' command for multi-repository management (bd-123)
  - bd repos list: show all cached repositories
  - bd repos ready: aggregate ready work across repos
  - bd repos stats: combined statistics across repos
  - bd repos clear-cache: clear repository cache
  - Requires global daemon (bd daemon --global)

- Fix bd ready to show in_progress issues (bd-165)
  - bd ready now shows both 'open' and 'in_progress' issues with no blockers
  - Allows epics/tasks ready to close to appear in ready work
  - Critical P0 bug fix for workflow

- Apply code review improvements to repos implementation
  - Use strongly typed RPC responses (remove interface{})
  - Fix clear-cache lock handling (close connections outside lock)
  - Add error collection for per-repo failures
  - Add context timeouts (1-2s) to prevent hangs
  - Add lock strategy comments

- Update documentation (README.md, AGENTS.md)
- Add comprehensive tests for both features

Amp-Thread-ID: https://ampcode.com/threads/T-1de989a1-1890-492c-9847-a34144259e0f
Co-authored-by: Amp <amp@ampcode.com>
2025-10-18 00:37:27 -07:00
Steve Yegge
b40de9bc41 Implement daemon RPC with per-request context routing (bd-115)
- Added per-request storage routing in daemon server
  - Server now supports Cwd field in requests for database discovery
  - Tree-walking to find .beads/*.db from any working directory
  - Storage caching for performance across requests

- Created Python daemon client (bd_daemon_client.py)
  - RPC over Unix socket communication
  - Implements full BdClientBase interface
  - Auto-discovery of daemon socket from working directory

- Refactored bd_client.py with abstract interface
  - BdClientBase abstract class for common interface
  - BdCliClient for CLI-based operations (renamed from BdClient)
  - create_bd_client() factory with daemon/CLI fallback
  - Backwards-compatible BdClient alias

Next: Update MCP server to use daemon client when available
2025-10-17 16:28:29 -07:00
Steve Yegge
15b60b4ad0 Phase 4: Atomic operations and stress testing (bd-114, bd-110)
Completes daemon architecture implementation:

Features:
- Batch/transaction API (OpBatch) for multi-step atomic operations
- Request timeout and cancellation support (30s default, configurable)
- Comprehensive stress tests (4-10 concurrent agents, 800-1000 ops)
- Performance benchmarks (daemon 2x faster than direct mode)

Results:
- Zero ID collisions across 1000+ concurrent creates
- All acceptance criteria validated for bd-110
- Create: 2.4ms (daemon) vs 4.7ms (direct)
- Update/List: similar 2x improvement

Tests Added:
- TestStressConcurrentAgents (8 agents, 800 creates)
- TestStressBatchOperations (4 agents, 400 batch ops)
- TestStressMixedOperations (6 agents, mixed read/write)
- TestStressNoUniqueConstraintViolations (10 agents, 1000 creates)
- BenchmarkDaemonCreate/Update/List/Latency
- Fixed flaky TestConcurrentRequests (shared client issue)

Files:
- internal/rpc/protocol.go - Added OpBatch, BatchArgs, BatchResponse
- internal/rpc/server.go - Implemented handleBatch with stop-on-failure
- internal/rpc/client.go - Added SetTimeout and Batch methods
- internal/rpc/stress_test.go - All stress tests
- internal/rpc/bench_test.go - Performance benchmarks
- DAEMON_STRESS_TEST.md - Complete documentation

Closes bd-114, bd-110

Amp-Thread-ID: https://ampcode.com/threads/T-1c07c140-0420-49fe-add1-b0b83b1bdff5
Co-authored-by: Amp <amp@ampcode.com>
2025-10-16 23:46:12 -07:00
Steve Yegge
5c0fac6e17 feat: Phase 1 RPC protocol infrastructure for daemon architecture (bd-111)
Implemented Unix socket RPC foundation to enable daemon-based concurrent access:

New files:
- internal/rpc/protocol.go: Request/Response types with 13 operations
- internal/rpc/server.go: Unix socket server with storage adapter
- internal/rpc/client.go: Client with auto-detection and typed methods
- internal/rpc/rpc_test.go: Integration tests

Features:
- JSON-based protocol over Unix sockets
- Adapter pattern for context/actor propagation to storage API
- Ping/health checks for daemon detection
- All core operations: create, update, close, list, show, ready, stats, deps, labels
- Graceful socket cleanup and signal handling
- Concurrent request support

Tests: 49.3% coverage, all passing

Related issues:
- bd-110: Daemon architecture epic
- bd-111: Phase 1 (completed)
- bd-112: Phase 2 (client auto-detection)
- bd-113: Phase 3 (daemon command)
- bd-114: Phase 4 (atomic operations)

Amp-Thread-ID: https://ampcode.com/threads/T-796c62e6-93b6-41c7-9cb5-8acc4a35ba9a
Co-authored-by: Amp <amp@ampcode.com>
2025-10-16 22:49:19 -07:00
Steve Yegge
872f203c57 Add RPC infrastructure and updated database
- RPC Phase 1: Protocol, server, client implementation
- Updated renumber.go with proper text reference updates (3-phase approach)
- Clean database exported: 344 issues (bd-1 to bd-344)
- Added DAEMON_DESIGN.md documentation
- Updated go.mod/go.sum for RPC dependencies

Amp-Thread-ID: https://ampcode.com/threads/T-456af77c-8b7f-4004-9027-c37b95e10ea5
Co-authored-by: Amp <amp@ampcode.com>
2025-10-16 20:36:23 -07:00