feat: Add automatic orphaned claude process cleanup (#588)
* feat: Add automatic orphaned claude process cleanup Claude Code's Task tool spawns subagent processes that sometimes don't clean up properly after completion. These accumulate and consume significant memory (observed: 17 processes using ~6GB RAM). This change adds automatic cleanup in two places: 1. **Deacon patrol** (primary): New patrol step "orphan-process-cleanup" runs `gt deacon cleanup-orphans` early in each cycle. More responsive (~30s). 2. **Daemon heartbeat** (fallback): Runs cleanup every 3 minutes as safety net when deacon is down. Detection uses TTY column - processes with TTY "?" have no controlling terminal. This is safe because: - Processes in terminals (user sessions) have a TTY like "pts/0" - untouched - Only kills processes with no controlling terminal - Orphaned subagents are children of tmux server with no TTY New files: - internal/util/orphan.go: FindOrphanedClaudeProcesses, CleanupOrphanedClaudeProcesses - internal/util/orphan_test.go: Tests for orphan detection New command: - `gt deacon cleanup-orphans`: Manual/patrol-triggered cleanup Fixes #587 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(orphan): add Windows build tag and minimum age check Addresses review feedback on PR #588: 1. Add //go:build !windows to orphan.go and orphan_test.go - The code uses Unix-specific syscalls (SIGTERM, ESRCH) and ps command options that don't exist on Windows 2. Add minimum age check (60 seconds) to prevent false positives - Prevents race conditions with newly spawned subagents - Addresses reviewer concern about cron/systemd processes - Uses portable etime format instead of Linux-only etimes 3. Add parseEtime helper with comprehensive tests - Parses [[DD-]HH:]MM:SS format (works on both Linux and macOS) - etimes (seconds) is Linux-specific, etime is portable Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(orphan): add proper SIGTERM→SIGKILL escalation with state tracking Previous approach used process age which doesn't work: a Task subagent runs without TTY from birth, so a long-running legitimate subagent that later fails to exit would be immediately SIGKILLed without trying SIGTERM. New approach uses a state file to track signal history: 1. First encounter → SIGTERM, record PID + timestamp in state file 2. Next cycle (after 60s grace period) → if still alive, SIGKILL 3. Next cycle → if survived SIGKILL, log as unkillable and remove State file: $XDG_RUNTIME_DIR/gastown-orphan-state (or /tmp/) Format: "<pid> <signal> <unix_timestamp>" per line The state file is automatically cleaned up: - Dead processes removed on load - Unkillable processes removed after logging Also updates callers to use new CleanupResult type which includes the signal sent (SIGTERM, SIGKILL, or UNKILLABLE). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
@@ -84,10 +84,46 @@ Callbacks may spawn new polecats, update issue state, or trigger other actions.
|
||||
**Hygiene principle**: Archive messages after they're fully processed.
|
||||
Keep inbox near-empty - only unprocessed items should remain."""
|
||||
|
||||
[[steps]]
|
||||
id = "orphan-process-cleanup"
|
||||
title = "Clean up orphaned claude subagent processes"
|
||||
needs = ["inbox-check"]
|
||||
description = """
|
||||
Clean up orphaned claude subagent processes.
|
||||
|
||||
Claude Code's Task tool spawns subagent processes that sometimes don't clean up
|
||||
properly after completion. These accumulate and consume significant memory.
|
||||
|
||||
**Detection method:**
|
||||
Orphaned processes have no controlling terminal (TTY = "?"). Legitimate claude
|
||||
instances in terminals have a TTY like "pts/0".
|
||||
|
||||
**Run cleanup:**
|
||||
```bash
|
||||
gt deacon cleanup-orphans
|
||||
```
|
||||
|
||||
This command:
|
||||
1. Lists all claude/codex processes with `ps -eo pid,tty,comm`
|
||||
2. Filters for TTY = "?" (no controlling terminal)
|
||||
3. Sends SIGTERM to each orphaned process
|
||||
4. Reports how many were killed
|
||||
|
||||
**Why this is safe:**
|
||||
- Processes in terminals (your personal sessions) have a TTY - they won't be touched
|
||||
- Only kills processes that have no controlling terminal
|
||||
- These orphans are children of the tmux server with no TTY, indicating they're
|
||||
detached subagents that failed to exit
|
||||
|
||||
**If cleanup fails:**
|
||||
Log the error but continue patrol - this is best-effort cleanup.
|
||||
|
||||
**Exit criteria:** Orphan cleanup attempted (success or logged failure)."""
|
||||
|
||||
[[steps]]
|
||||
id = "trigger-pending-spawns"
|
||||
title = "Nudge newly spawned polecats"
|
||||
needs = ["inbox-check"]
|
||||
needs = ["orphan-process-cleanup"]
|
||||
description = """
|
||||
Nudge newly spawned polecats that are ready for input.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user