DOCTOR IMPROVEMENTS: visual improvements/grouping + add comprehensive tests + fix gosec warnings (#656)

* test(doctor): add comprehensive tests for fix and check functions Add edge case tests, e2e tests, and improve test coverage for: - database_test.go: database integrity and sync checks - git_test.go: git hooks, merge driver, sync branch tests - gitignore_test.go: gitignore validation - prefix_test.go: ID prefix handling - fix/fix_test.go: fix operations - fix/e2e_test.go: end-to-end fix scenarios - fix/fix_edge_cases_test.go: edge case handling * docs: add testing philosophy and anti-patterns guide - Create TESTING_PHILOSOPHY.md covering test pyramid, priority matrix, what NOT to test, and 5 anti-patterns with code examples - Add cross-reference from README_TESTING.md - Document beads-specific guidance (well-covered areas vs gaps) - Include target metrics (test-to-code ratio, execution time targets) * chore: revert .beads/ to upstream/main state * refactor(doctor): add category grouping and Ayu theme colors - Add Category field to DoctorCheck for organizing checks by type - Define category constants: Core, Git, Runtime, Data, Integration, Metadata - Update thanks command to use shared Ayu color palette from internal/ui - Simplify test fixtures by removing redundant test cases * fix(doctor): prevent test fork bomb and fix test failures - Add ErrTestBinary guard in getBdBinary() to prevent tests from recursively executing the test binary when calling bd subcommands - Update claude_test.go to use new check names (CLI Availability, Prime Documentation) - Fix syncbranch test path comparison by resolving symlinks (/var vs /private/var on macOS) - Fix permissions check to use exact comparison instead of bitmask - Fix UntrackedJSONL to use git commit --only to preserve staged changes - Fix MergeDriver edge case test by making both .git dir and config read-only - Add skipIfTestBinary helper for E2E tests that need real bd binary * test(doctor): skip read-only config test in CI environments GitHub Actions containers may have CAP_DAC_OVERRIDE or similar capabilities that allow writing to read-only files, causing the test to fail. Skip the test when CI=true or GITHUB_ACTIONS=true.
2025-12-20 03:10:06 -08:00
parent f3b0fb780f
commit 3c08e5eb9d
23 changed files with 6248 additions and 404 deletions
--- a/docs/TESTING_PHILOSOPHY.md
+++ b/docs/TESTING_PHILOSOPHY.md
@@ -0,0 +1,260 @@
+# Testing Philosophy
+
+This document covers **what to test** and **what not to test**. For how to run tests, see [TESTING.md](TESTING.md).
+
+## The Test Pyramid
+
+```
+                  ┌─────────────────┐
+                  │   E2E Tests     │  ← PR/Deploy only (slow, expensive)
+                  │   ~5% of tests  │
+                  └────────┬────────┘
+                           │
+            ┌──────────────┴──────────────┐
+            │     Integration Tests       │  ← PR gate (moderate)
+            │       ~15% of tests         │
+            └──────────────┬──────────────┘
+                           │
+  ┌────────────────────────┴────────────────────────┐
+  │              Unit Tests (Fast)                  │  ← Every save/commit
+  │                 ~80% of tests                   │
+  └─────────────────────────────────────────────────┘
+```
+
+### Tier 1: Fast Tests (< 5 seconds total)
+
+**When**: Every file save, pre-commit hooks, continuous during development
+
+- Pure function tests (no I/O)
+- In-memory data structure tests
+- Business logic validation
+- Mock all external dependencies
+
+**In beads**: Core logic tests using `newTestStore()` with in-memory SQLite
+
+### Tier 2: Integration Tests (< 30 seconds)
+
+**When**: Pre-push, PR checks
+
+- Real file system operations
+- Git operations with temp repos
+- Config file parsing
+- CLI argument handling
+
+**In beads**: Tests tagged with `//go:build integration`, daemon tests
+
+### Tier 3: E2E / Smoke Tests (1-5 minutes)
+
+**When**: PR merge, pre-deploy, nightly
+
+- Full `bd init` → `bd doctor` → `bd doctor --fix` workflow
+- Real API calls (to staging)
+- Cross-platform verification
+
+---
+
+## What Makes a Test "Right"
+
+A good test:
+
+1. **Catches a bug you'd actually ship** - not theoretical edge cases
+2. **Documents expected behavior** - serves as living documentation
+3. **Runs fast enough to not skip** - slow tests get disabled
+4. **Isn't duplicated elsewhere** - tests one thing, one way
+
+---
+
+## What to Test (Priority Matrix)
+
+| Priority | What | Why | Examples in beads |
+|----------|------|-----|-------------------|
+| **High** | Core business logic | This is what users depend on | `sync`, `doctor`, `export`, `import` |
+| **High** | Error paths that could corrupt data | Data loss is catastrophic | Config handling, git operations, JSONL integrity |
+| **Medium** | Edge cases from production bugs | Discovered through real issues | Orphan handling, ID collision detection |
+| **Low** | Display/formatting | Visual output, can be manually verified | Table formatting, color output |
+
+---
+
+## What NOT to Test Extensively
+
+### Simple utility functions
+Trust the language. Don't test that `strings.TrimSpace` works.
+
+### Every permutation of inputs
+Use table-driven tests with representative cases instead of exhaustive permutations.
+
+```go
+// BAD: 10 separate test functions
+func TestPriority0(t *testing.T) { ... }
+func TestPriority1(t *testing.T) { ... }
+func TestPriority2(t *testing.T) { ... }
+
+// GOOD: One table-driven test
+func TestPriorityMapping(t *testing.T) {
+    cases := []struct{ in, want int }{
+        {0, 4}, {1, 0}, {5, 3}, // includes boundary
+    }
+    for _, tc := range cases {
+        t.Run(fmt.Sprintf("priority_%d", tc.in), func(t *testing.T) {
+            got := mapPriority(tc.in)
+            if got != tc.want { t.Errorf(...) }
+        })
+    }
+}
+```
+
+### Obvious behavior
+Don't test "if file exists, return true" - trust the implementation.
+
+### Same logic through different entry points
+If you test a function directly, don't also test it through every caller.
+
+---
+
+## Anti-Patterns to Avoid
+
+### 1. Trivial Assertions
+
+Testing obvious happy paths that would pass with trivial implementations.
+
+```go
+// BAD: What bug would this catch?
+func TestValidateBeadsWorkspace(t *testing.T) {
+    dir := setupTestWorkspace(t)
+    if err := validateBeadsWorkspace(dir); err != nil {
+        t.Errorf("expected no error, got: %v", err)
+    }
+}
+
+// GOOD: Test the interesting error cases
+func TestValidateBeadsWorkspace(t *testing.T) {
+    cases := []struct{
+        name    string
+        setup   func(t *testing.T) string
+        wantErr string
+    }{
+        {"missing .beads dir", setupNoBeadsDir, "not a beads workspace"},
+        {"corrupted db", setupCorruptDB, "database is corrupted"},
+        {"permission denied", setupNoReadAccess, "permission denied"},
+    }
+    // ...
+}
+```
+
+### 2. Duplicate Error Path Testing
+
+Testing the same logic multiple ways instead of once with table-driven tests.
+
+```go
+// BAD: Repetitive individual assertions
+if config.PriorityMap["0"] != 4 { t.Errorf(...) }
+if config.PriorityMap["1"] != 0 { t.Errorf(...) }
+if config.PriorityMap["2"] != 1 { t.Errorf(...) }
+
+// GOOD: Table-driven
+for k, want := range expectedMap {
+    if got := config.PriorityMap[k]; got != want {
+        t.Errorf("PriorityMap[%q] = %d, want %d", k, got, want)
+    }
+}
+```
+
+### 3. I/O Heavy Tests Without Mocking
+
+Unit tests that execute real commands or heavy I/O when they could mock.
+
+```go
+// BAD: Actually executes bd killall in unit test
+func TestDaemonFix(t *testing.T) {
+    exec.Command("bd", "killall").Run()
+    // ...
+}
+
+// GOOD: Mock the execution or use integration test tag
+func TestDaemonFix(t *testing.T) {
+    executor := &mockExecutor{}
+    fix := NewDaemonFix(executor)
+    // ...
+}
+```
+
+### 4. Testing Implementation, Not Behavior
+
+Tests that break when you refactor, even though behavior is unchanged.
+
+```go
+// BAD: Tests internal state
+if len(daemon.connectionPool) != 3 { t.Error(...) }
+
+// GOOD: Tests observable behavior
+if resp, err := daemon.HandleRequest(req); err != nil { t.Error(...) }
+```
+
+### 5. Missing Boundary Tests
+
+Testing known-good values but not boundaries and invalid inputs.
+
+```go
+// BAD: Only tests middle values
+TestPriority(1)  // works
+TestPriority(2)  // works
+
+// GOOD: Tests boundaries and invalid
+TestPriority(-1) // invalid - expect error
+TestPriority(0)  // boundary - min valid
+TestPriority(4)  // boundary - max valid
+TestPriority(5)  // boundary - first invalid
+```
+
+---
+
+## Target Metrics
+
+| Metric | Target | Current (beads) | Status |
+|--------|--------|-----------------|--------|
+| Test-to-code ratio | 0.5:1 - 1.5:1 | 0.85:1 | Healthy |
+| Fast test suite | < 5 seconds | 3.8 seconds | Good |
+| Integration tests | < 30 seconds | ~15 seconds | Good |
+| Compilation overhead | Minimize | 180 seconds | Bottleneck |
+
+### Interpretation
+
+- **0.5:1** - Light coverage, fast iteration (acceptable for utilities)
+- **1:1** - Solid coverage for most projects (our target)
+- **1.5:1** - Heavy coverage for critical systems
+- **2:1+** - Over-engineered, maintenance burden
+
+---
+
+## Beads-Specific Guidance
+
+### Well-Covered (Maintain)
+
+| Area | Why It's Well-Tested |
+|------|---------------------|
+| Sync/Export/Import | Data integrity critical - comprehensive edge cases |
+| SQLite transactions | Rollback safety, atomicity guarantees |
+| Merge operations | 3-way merge with conflict resolution |
+| Daemon locking | Prevents corruption from multiple instances |
+
+### Needs Attention
+
+| Area | Gap | Priority |
+|------|-----|----------|
+| Daemon lifecycle | Shutdown/signal handling | Medium |
+| Concurrent operations | Stress testing under load | Medium |
+| Boundary validation | Edge inputs in mapping functions | Low |
+
+### Skip These
+
+- Display formatting tests (manually verify)
+- Simple getters/setters
+- Tests that duplicate SQLite's guarantees
+
+---
+
+## Related Docs
+
+- [TESTING.md](TESTING.md) - How to run tests
+- [README_TESTING.md](README_TESTING.md) - Fast vs integration test strategy
+- [dev-notes/TEST_SUITE_AUDIT.md](dev-notes/TEST_SUITE_AUDIT.md) - Test refactoring progress