Implement Tier 1 compaction logic (bd-257)

- Add Compactor with CompactTier1 and CompactTier1Batch methods
- Support single-issue compaction and batch compaction; batches run with 5 concurrent workers
- Dry-run mode for testing without API calls
- Smart size checking: keeps original if summary is longer
- Improve Haiku prompts to emphasize compression
- Add ApplyCompaction method for setting compaction metadata
- Add comprehensive tests, including API integration tests
- All tests passing
Steve Yegge
2025-10-15 23:31:43 -07:00
parent 5f6aac5fb1
commit 0da81371b4
5 changed files with 683 additions and 5 deletions

@@ -172,7 +172,7 @@
{"id":"bd-254","title":"Implement candidate identification queries","description":"Write SQL queries to identify issues eligible for Tier 1 and Tier 2 compaction based on closure time and dependency status.","design":"Create `internal/storage/sqlite/compact.go` with:\n\n```go\ntype CompactionCandidate struct {\n IssueID string\n ClosedAt time.Time\n OriginalSize int\n EstimatedSize int\n DependentCount int\n}\n\nfunc (s *SQLiteStorage) GetTier1Candidates(ctx context.Context) ([]*CompactionCandidate, error)\nfunc (s *SQLiteStorage) GetTier2Candidates(ctx context.Context) ([]*CompactionCandidate, error)\nfunc (s *SQLiteStorage) CheckEligibility(ctx context.Context, issueID string, tier int) (bool, string, error)\n```\n\nUse recursive CTE for dependency depth checking (similar to ready_issues view).","acceptance_criteria":"- Tier 1 query filters by days and dependency depth\n- Tier 2 query includes commit/issue count checks\n- Dependency checking handles circular deps gracefully\n- Performance: \u003c100ms for 10,000 issue database\n- Tests cover edge cases (no deps, circular deps, mixed status)","status":"closed","priority":1,"issue_type":"task","created_at":"2025-10-15T21:51:23.225835-07:00","updated_at":"2025-10-15T22:16:45.517562-07:00","closed_at":"2025-10-15T22:16:45.517562-07:00"}
{"id":"bd-255","title":"Create Haiku client and prompt templates","description":"Implement Claude Haiku API client with template-based prompts for Tier 1 and Tier 2 summarization.","design":"Create `internal/compact/haiku.go`:\n\n```go\ntype HaikuClient struct {\n client *anthropic.Client\n model string\n}\n\nfunc NewHaikuClient(apiKey string) (*HaikuClient, error)\nfunc (h *HaikuClient) SummarizeTier1(ctx context.Context, issue *types.Issue) (string, error)\nfunc (h *HaikuClient) SummarizeTier2(ctx context.Context, issue *types.Issue) (string, error)\n```\n\nUse text/template for prompt rendering.\n\nTier 1 output format:\n```\n**Summary:** [2-3 sentences]\n**Key Decisions:** [bullet points]\n**Resolution:** [outcome]\n```\n\nTier 2 output format:\n```\nSingle paragraph ≤150 words covering what was built, why it mattered, lasting impact.\n```","acceptance_criteria":"- API key from env var or config (env takes precedence)\n- Prompts render correctly with templates\n- Rate limiting handled gracefully (exponential backoff)\n- Network errors retry up to 3 times\n- Mock tests for API calls","status":"closed","priority":1,"issue_type":"task","created_at":"2025-10-15T21:51:23.229702-07:00","updated_at":"2025-10-15T22:32:51.491798-07:00","closed_at":"2025-10-15T22:32:51.491798-07:00"}
{"id":"bd-256","title":"Implement snapshot creation and restoration","description":"Implement snapshot creation before compaction and restoration capability to undo compaction.","design":"Add to `internal/storage/sqlite/compact.go`:\n\n```go\nfunc (s *SQLiteStorage) CreateSnapshot(ctx context.Context, issue *types.Issue, level int) error\nfunc (s *SQLiteStorage) RestoreFromSnapshot(ctx context.Context, issueID string, level int) error\nfunc (s *SQLiteStorage) GetSnapshots(ctx context.Context, issueID string) ([]*Snapshot, error)\n```\n\nSnapshot JSON structure:\n```json\n{\n \"description\": \"...\",\n \"design\": \"...\",\n \"notes\": \"...\",\n \"acceptance_criteria\": \"...\",\n \"title\": \"...\"\n}\n```","acceptance_criteria":"- Snapshot created atomically with compaction\n- Restore returns exact original content\n- Multiple snapshots per issue supported (Tier 1 → Tier 2)\n- JSON encoding handles UTF-8 and special characters\n- Size calculation is accurate (UTF-8 bytes)","status":"closed","priority":1,"issue_type":"task","created_at":"2025-10-15T21:51:23.231906-07:00","updated_at":"2025-10-15T23:11:31.076796-07:00","closed_at":"2025-10-15T23:11:31.076796-07:00"}
-{"id":"bd-257","title":"Implement Tier 1 compaction logic","description":"Implement the core Tier 1 compaction process: snapshot → summarize → update.","design":"Add to `internal/compact/compactor.go`:\n\n```go\ntype Compactor struct {\n store storage.Storage\n haiku *HaikuClient\n config *CompactConfig\n}\n\nfunc New(store storage.Storage, apiKey string, config *CompactConfig) (*Compactor, error)\nfunc (c *Compactor) CompactTier1(ctx context.Context, issueID string) error\nfunc (c *Compactor) CompactTier1Batch(ctx context.Context, issueIDs []string) error\n```\n\nProcess:\n1. Verify eligibility\n2. Calculate original size\n3. Create snapshot\n4. Call Haiku for summary\n5. Update issue (description=summary, clear design/notes/criteria)\n6. Set compaction_level=1, compacted_at=now, original_size\n7. Record EventCompacted\n8. Mark dirty for export","acceptance_criteria":"- Single issue compaction works end-to-end\n- Batch processing with parallel workers (5 concurrent)\n- Errors don't corrupt database (transaction rollback)\n- EventCompacted includes size savings\n- Dry-run mode (identify + size estimate only, no API calls)","status":"open","priority":1,"issue_type":"task","created_at":"2025-10-15T21:51:23.23391-07:00","updated_at":"2025-10-15T21:51:23.23391-07:00"}
+{"id":"bd-257","title":"Implement Tier 1 compaction logic","description":"Implement the core Tier 1 compaction process: snapshot → summarize → update.","design":"Add to `internal/compact/compactor.go`:\n\n```go\ntype Compactor struct {\n store storage.Storage\n haiku *HaikuClient\n config *CompactConfig\n}\n\nfunc New(store storage.Storage, apiKey string, config *CompactConfig) (*Compactor, error)\nfunc (c *Compactor) CompactTier1(ctx context.Context, issueID string) error\nfunc (c *Compactor) CompactTier1Batch(ctx context.Context, issueIDs []string) error\n```\n\nProcess:\n1. Verify eligibility\n2. Calculate original size\n3. Create snapshot\n4. Call Haiku for summary\n5. Update issue (description=summary, clear design/notes/criteria)\n6. Set compaction_level=1, compacted_at=now, original_size\n7. Record EventCompacted\n8. Mark dirty for export","acceptance_criteria":"- Single issue compaction works end-to-end\n- Batch processing with parallel workers (5 concurrent)\n- Errors don't corrupt database (transaction rollback)\n- EventCompacted includes size savings\n- Dry-run mode (identify + size estimate only, no API calls)","status":"closed","priority":1,"issue_type":"task","created_at":"2025-10-15T21:51:23.23391-07:00","updated_at":"2025-10-15T23:30:31.967874-07:00","closed_at":"2025-10-15T23:30:31.967874-07:00"}
{"id":"bd-258","title":"Implement Tier 2 compaction logic","description":"Implement Tier 2 ultra-compression: more aggressive summarization and optional event pruning.","design":"Add to `internal/compact/compactor.go`:\n\n```go\nfunc (c *Compactor) CompactTier2(ctx context.Context, issueID string) error\nfunc (c *Compactor) CompactTier2Batch(ctx context.Context, issueIDs []string) error\n```\n\nProcess:\n1. Verify issue is at compaction_level = 1\n2. Check Tier 2 eligibility (days, deps, commits/issues)\n3. Create Tier 2 snapshot\n4. Call Haiku with ultra-compression prompt\n5. Update issue (description = single paragraph, clear all other fields)\n6. Set compaction_level = 2\n7. Optionally prune events (keep created/closed, archive rest to snapshot)","acceptance_criteria":"- Requires existing Tier 1 compaction\n- Git commit counting works (with fallback to issue counter)\n- Events optionally pruned (config: compact_events_enabled)\n- Archived events stored in snapshot JSON\n- Size reduction 90-95%","status":"open","priority":2,"issue_type":"task","created_at":"2025-10-15T21:51:23.23586-07:00","updated_at":"2025-10-15T21:51:23.23586-07:00"}
{"id":"bd-259","title":"Add `bd compact` CLI command","description":"Implement the `bd compact` command with dry-run, batch processing, and progress reporting.","design":"Create `cmd/bd/compact.go`:\n\n```go\nvar compactCmd = \u0026cobra.Command{\n Use: \"compact\",\n Short: \"Compact old closed issues to save space\",\n}\n\nFlags:\n --dry-run Preview without compacting\n --tier int Compaction tier (1 or 2, default: 1)\n --all Process all candidates\n --id string Compact specific issue\n --force Force compact (bypass checks, requires --id)\n --batch-size int Issues per batch\n --workers int Parallel workers\n --json JSON output\n```","acceptance_criteria":"- `--dry-run` shows accurate preview with size estimates\n- `--all` processes all candidates\n- `--id` compacts single issue\n- `--force` bypasses eligibility checks (only with --id)\n- Progress bar for batches (e.g., [████████] 47/47)\n- JSON output with `--json`\n- Exit codes: 0=success, 1=error\n- Shows summary: count, size saved, cost, time","status":"open","priority":1,"issue_type":"task","created_at":"2025-10-15T21:51:23.238373-07:00","updated_at":"2025-10-15T21:51:23.238373-07:00"}
{"id":"bd-26","title":"Optimize reference updates to avoid loading all issues into memory","description":"In updateReferences(), we call SearchIssues with no filter to get ALL issues for updating references. For large databases (10k+ issues), this loads everything into memory. Options: 1) Use batched processing with LIMIT/OFFSET, 2) Use SQL UPDATE with REPLACE() directly, 3) Stream results instead of loading all at once. Located in collision.go:266","status":"open","priority":2,"issue_type":"task","created_at":"2025-10-14T14:43:06.911497-07:00","updated_at":"2025-10-15T16:27:22.001829-07:00"}