Audit of test coverage identifying: - 10 packages with 0 test coverage (2,452 lines) - Priority list for new tests (internal/lock is P0) - 1 flaky test candidate (feed/curator_test.go) - Test quality analysis and recommendations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
5.2 KiB
5.2 KiB
Test Coverage and Quality Review
Reviewed by: polecat/gus Date: 2026-01-04 Issue: gt-a02fj.9
Executive Summary
- 80 test files covering 32 out of 42 packages (76% package coverage)
- 631 test functions with 192 subtests (30% use table-driven pattern)
- 10 packages with 0 test coverage (2,452 lines)
- 1 confirmed flaky test candidate
- Test quality is generally good with moderate mocking
Coverage Gap Inventory
Packages Without Tests (Priority Order)
| Priority | Package | Lines | Risk | Notes |
|---|---|---|---|---|
| P0 | internal/lock |
402 | CRITICAL | Multi-agent lock management. Bugs cause worker collisions. Already has execCommand mockable for testing. |
| P1 | internal/events |
295 | HIGH | Event bus for audit trail. Mutex-protected writes. Core observability. |
| P1 | internal/boot |
242 | HIGH | Boot watchdog lifecycle. Spawns tmux sessions. |
| P1 | internal/checkpoint |
216 | HIGH | Session crash recovery. Critical for polecat continuity. |
| P2 | internal/tui/convoy |
601 | MEDIUM | TUI component. Harder to test but user-facing. |
| P2 | internal/constants |
221 | LOW | Mostly configuration constants. Low behavioral risk. |
| P3 | internal/style |
331 | LOW | Output formatting. Visual only. |
| P3 | internal/claude |
80 | LOW | Claude settings parsing. |
| P3 | internal/wisp |
52 | LOW | Ephemeral molecule I/O. Small surface. |
| P4 | cmd/gt |
12 | TRIVIAL | Main entry point. Minimal code. |
Total untested lines: 2,452
Flaky Test Candidates
Confirmed: internal/feed/curator_test.go
Issue: Uses time.Sleep() for synchronization (lines 59, 71, 119, 138)
// Give curator time to start
time.Sleep(50 * time.Millisecond)
...
// Wait for processing
time.Sleep(300 * time.Millisecond)
Risk: Flaky under load, CI delays, or slow machines.
Fix: Replace with channel-based synchronization or polling with timeout:
// Wait for condition with timeout
deadline := time.Now().Add(time.Second)
for time.Now().Before(deadline) {
if conditionMet() {
break
}
time.Sleep(10 * time.Millisecond)
}
Test Quality Analysis
Strengths
- Table-driven tests: 30% of tests use
t.Run()(192/631) - Good isolation: Only 2 package-level test variables
- Dedicated integration tests: 15 files with explicit integration/e2e naming
- Error handling: 316 uses of
if err != nilin tests - No random data: No
rand.usage in tests (deterministic) - Environment safety: Uses
t.Setenv()for clean env var handling
Areas for Improvement
testing.Short(): Only 1 usage. Long-running tests should check this.- External dependencies: 26 tests skip when
bdortmuxunavailable - consider mocking more. - time.Sleep usage: Found in
curator_test.go- should be eliminated.
Test Smells (Minor)
| Smell | Location | Severity | Notes |
|---|---|---|---|
| Sleep-based sync | feed/curator_test.go |
HIGH | See flaky section |
| External dep skips | Multiple files | LOW | Reasonable for integration tests |
| Skip-heavy file | tmux/tmux_test.go |
LOW | Acceptable - tmux not always available |
Priority List for New Tests
Immediate (P0)
internal/lock- Critical path- Test
Acquire()with stale lock cleanup - Test
Check()with live/dead PIDs - Test
CleanStaleLocks()with mock tmux sessions - Test
DetectCollisions() - Test concurrent lock acquisition (race detection)
- Test
High Priority (P1)
-
internal/events- Test
Log()file creation and append - Test
write()mutex behavior - Test payload helpers
- Test graceful handling when not in workspace
- Test
-
internal/boot- Test
IsRunning()with stale markers - Test
AcquireLock()/ReleaseLock()cycle - Test
SaveStatus()/LoadStatus()round-trip - Test degraded mode path
- Test
-
internal/checkpoint- Test
Read()/Write()round-trip - Test
Capture()git state extraction - Test
IsStale()with various durations - Test
Summary()output
- Test
Medium Priority (P2)
internal/tui/convoy- Consider golden file tests for view outputinternal/constants- Test any validation logic
Missing Test Types
| Type | Current State | Recommendation |
|---|---|---|
| Unit tests | Good coverage where present | Add for P0-P1 packages |
| Integration tests | 15 dedicated files | Adequate |
| E2E tests | browser_e2e_test.go |
Consider more CLI E2E |
| Fuzz tests | None | Consider for parsers (formula/parser.go) |
| Benchmark tests | None visible | Add for hot paths (lock, events) |
Actionable Next Steps
- Fix flaky test: Refactor
feed/curator_test.goto use channels/polling - Add lock tests: Highest priority - bugs here break multi-agent
- Add events tests: Core observability must be tested
- Add checkpoint tests: Session recovery is critical path
- Run with race detector:
go test -race ./...to catch data races - Consider
-shortflag: Addtesting.Short()checks to slow tests