Add health checks and reconnection logic for stale daemon sockets (bd-137)

- Add ping() and health() methods to BdDaemonClient for connection verification
- Implement _health_check_client() to verify cached client connections
- Add _reconnect_client() with exponential backoff (0.1s, 0.2s, 0.4s, max 3 retries)
- Update _get_client() to health-check before returning cached clients
- Automatically detect and remove stale connections from the pool
- Add comprehensive test suite with 14 tests covering all scenarios
- Handle daemon restarts, upgrades, and long-idle connections gracefully
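
The health-check-then-reconnect flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `FakeClient`, `health_check_client`, `reconnect_client`, `get_client`, the `factory` callable, and the injectable `sleep` parameter are all hypothetical names, and the sketch assumes the backoff delay is applied before each retry (0.1s, 0.2s, 0.4s).

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class FakeClient:
    """Hypothetical stand-in for BdDaemonClient; only ping() is modelled."""

    def __init__(self, alive: bool = True) -> None:
        self.alive = alive

    async def ping(self) -> Dict[str, Any]:
        if not self.alive:
            raise ConnectionError("stale socket")
        return {}  # the real response shape is not shown in this commit


async def health_check_client(client: Any) -> bool:
    """Return True if the client still answers a ping, False otherwise."""
    try:
        await client.ping()
        return True
    except Exception:
        return False


async def reconnect_client(
    factory: Callable[[], Awaitable[Any]],
    max_retries: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Any:
    """Try factory() up to max_retries times, waiting 0.1s, 0.2s, 0.4s first."""
    last_error = None
    for attempt in range(max_retries):
        # Assumption: the delay doubles on each attempt and precedes the retry.
        await sleep(base_delay * (2 ** attempt))
        try:
            client = await factory()
            if await health_check_client(client):
                return client
            last_error = ConnectionError("new client failed health check")
        except Exception as exc:
            last_error = exc
    raise ConnectionError(
        f"reconnect failed after {max_retries} attempts"
    ) from last_error


async def get_client(
    pool: Dict[str, Any],
    key: str,
    factory: Callable[[], Awaitable[Any]],
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Any:
    """Return a healthy cached client, replacing a stale one if needed."""
    cached = pool.get(key)
    if cached is not None and await health_check_client(cached):
        return cached
    if cached is None:
        client = await factory()  # first use: no backoff needed
    else:
        pool.pop(key, None)  # drop the stale entry before reconnecting
        client = await reconnect_client(factory, sleep=sleep)
    pool[key] = client
    return client
```

Passing `sleep` as a parameter is only a convenience for testing the backoff schedule without real delays; a real implementation could call `asyncio.sleep` directly.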

Amp-Thread-ID: https://ampcode.com/threads/T-2366ef1b-389c-4293-8145-7613037c9dfa
Co-authored-by: Amp <amp@ampcode.com>
Steve Yegge
2025-10-25 17:39:21 -07:00
parent a91467d2fb
commit 744563e87f
4 changed files with 405 additions and 4 deletions

@@ -200,10 +200,33 @@ class BdDaemonClient(BdClientBase):
        Raises:
            DaemonNotRunningError: If daemon is not running
            DaemonConnectionError: If connection fails
            DaemonError: If request fails
        """
        data = await self._send_request("ping", {})
        return json.loads(data) if isinstance(data, str) else data

    async def health(self) -> Dict[str, Any]:
        """Get daemon health status.

        Returns:
            Dict with health info including:
            - status: "healthy" | "degraded" | "unhealthy"
            - version: daemon version string
            - uptime: uptime in seconds
            - cache_size: number of cached databases
            - db_response_time_ms: database ping time
            - active_connections: number of active connections
            - memory_bytes: memory usage

        Raises:
            DaemonNotRunningError: If daemon is not running
            DaemonConnectionError: If connection fails
            DaemonError: If request fails
        """
        data = await self._send_request("health", {})
        return json.loads(data) if isinstance(data, str) else data

    async def init(self, params: Optional[InitParams] = None) -> str:
        """Initialize new beads database (not typically used via daemon).