Add telemetry and observability to daemon (bd-153)

Implement comprehensive metrics collection for the daemon with zero-overhead design:

Features:
- Request metrics: counts, latency percentiles (p50, p95, p99), error rates
- Cache metrics: hit/miss ratios, eviction counts, database connections
- Connection metrics: total, active, rejected connections
- System metrics: memory usage, goroutine count, uptime

Implementation:
- New internal/rpc/metrics.go with Metrics collector
- OpMetrics RPC operation for programmatic access
- 'bd daemon --metrics' command (human-readable and JSON output)
- Lock-free atomic operations for cache/connection metrics
- Copy-and-compute pattern in Snapshot to minimize lock contention
- Deferred metrics recording ensures all requests are tracked

Improvements from code review:
- JSON types use float64 for ms/seconds (not time.Duration)
- Snapshot copies data under short lock, computes outside
- Union of operations from counts and errors maps
- Defensive clamping in percentile calculation
- Defer pattern ensures metrics recorded even on early returns

Documentation updated in README.md with usage examples.

Closes bd-153

Amp-Thread-ID: https://ampcode.com/threads/T-20213187-65c7-47f7-ba21-5234c9e52e26
Co-authored-by: Amp <amp@ampcode.com>
This commit is contained in:
Steve Yegge
2025-10-19 15:55:55 -07:00
parent 932c8e292f
commit 34cf361b2b
7 changed files with 458 additions and 19 deletions

View File

@@ -940,6 +940,8 @@ bd daemon --auto-commit # Auto-commit changes
bd daemon --auto-push # Auto-push commits (requires auto-commit)
bd daemon --log /var/log/bd.log # Custom log file path
bd daemon --status # Show daemon status
bd daemon --health # Check daemon health
bd daemon --metrics # Show detailed performance metrics
bd daemon --stop # Stop running daemon
bd daemon --global # Run as global daemon (see below)
bd daemon --migrate-to-global # Migrate from local to global daemon
@@ -962,6 +964,29 @@ The daemon is ideal for:
The daemon gracefully shuts down on SIGTERM and maintains a PID file at `.beads/daemon.pid` for process management.
##### Monitoring & Observability
Check daemon health and performance with built-in metrics:
```bash
# Quick health check
bd daemon --health
# Detailed performance metrics
bd daemon --metrics
# JSON output for programmatic access
bd daemon --metrics --json
```
Metrics include:
- **Request metrics**: Operation counts, latency percentiles (p50, p95, p99), error rates
- **Cache metrics**: Hit/miss ratios, eviction counts, active database connections
- **Connection metrics**: Total connections, active connections, rejected connections
- **System metrics**: Memory usage, goroutine count, uptime
All metrics are collected with zero overhead using lock-free atomic operations and efficient ring buffers for latency tracking.
#### Global Daemon for Multiple Projects
**New in v0.9.11:** Run a single daemon to serve all your projects system-wide: