From 61d70f208f888f7121b4a96090c64e97b63e69fc Mon Sep 17 00:00:00 2001 From: Steve Yegge Date: Wed, 15 Oct 2025 21:16:52 -0700 Subject: [PATCH] removed obsolete docs --- ULTRATHINK_BD222.md | 952 -------------------------------------------- ULTRATHINK_BD224.md | 373 ----------------- 2 files changed, 1325 deletions(-) delete mode 100644 ULTRATHINK_BD222.md delete mode 100644 ULTRATHINK_BD224.md diff --git a/ULTRATHINK_BD222.md b/ULTRATHINK_BD222.md deleted file mode 100644 index 64d969d4..00000000 --- a/ULTRATHINK_BD222.md +++ /dev/null @@ -1,952 +0,0 @@ -# Ultrathink: Batching API for Bulk Issue Creation (bd-222) - -**Date**: 2025-10-15 -**Context**: Individual devs, small teams, future agent swarms, bulk imports -**Problem**: CreateIssue acquires dedicated connection per call, inefficient for bulk operations - -## Executive Summary - -**Recommended Solution**: **Hybrid approach - Add CreateIssues + Keep existing CreateIssue unchanged** - -Provides high-performance batch path for bulk operations while maintaining simple single-issue API for typical use. - ---- - -## Dependencies & Implementation Order - -### Critical Dependency: bd-224 (status/closed_at invariant) - -**bd-224 MUST be implemented before bd-222** - -**Why**: Both issues modify the same code paths: -- bd-224: Fixes `import.go` to enforce `closed_at` invariant (status='closed' ⟺ closed_at != NULL) -- bd-222: Changes `import.go` to use `CreateIssues` instead of `CreateIssue` loop - -**The Problem**: -If we implement bd-222 first: -1. `CreateIssues` won't enforce the closed_at invariant (inherits bug from CreateIssue) -2. Import switches to use `CreateIssues` -3. Import can still create inconsistent data (bd-224's bug persists) -4. Later bd-224 fix requires modifying BOTH CreateIssue AND CreateIssues - -**The Solution**: -If we implement bd-224 first: -1. Add CHECK constraint: `(status = 'closed') = (closed_at IS NOT NULL)` -2. Fix `UpdateIssue` to manage closed_at automatically -3. Fix `import.go` to enforce invariant before calling CreateIssue -4. **Then** implement bd-222's `CreateIssues` with invariant already enforced: - - Database constraint rejects bad data - - Issue.Validate() checks the invariant (per bd-224) - - Import code already normalizes before calling CreateIssues - - No new code needed in CreateIssues - it's correct by construction! - -### Implementation Impact - -**CreateIssues must validate closed_at invariant** (from bd-224): - -```go -// Phase 1: Validation -for i, issue := range issues { - if err := issue.Validate(); err != nil { // ← Validates invariant (bd-224) - return fmt.Errorf("validation failed for issue %d: %w", i, err) - } -} -``` - -After bd-224 is complete, `Issue.Validate()` will check: -```go -func (i *Issue) Validate() error { - // ... existing validation ... - - // Enforce closed_at invariant (bd-224) - if i.Status == StatusClosed && i.ClosedAt == nil { - return fmt.Errorf("closed issues must have closed_at timestamp") - } - if i.Status != StatusClosed && i.ClosedAt != nil { - return fmt.Errorf("non-closed issues cannot have closed_at timestamp") - } - - return nil -} -``` - -This means `CreateIssues` automatically enforces the invariant through validation, with the database CHECK constraint as final defense. - -### Import Code Simplification - -**Before bd-224** (current import.go): -```go -for _, issue := range issues { - // Complex logic to handle status/closed_at independently - updates := make(map[string]interface{}) - if _, ok := rawData["status"]; ok { - updates["status"] = issue.Status // ← Doesn't manage closed_at - } - // ... more complex update logic - store.CreateIssue(ctx, issue, "import") -} -``` - -**After bd-224** (import.go enforces invariant): -```go -for _, issue := range issues { - // Normalize closed_at based on status BEFORE creating - if issue.Status == types.StatusClosed { - if issue.ClosedAt == nil { - now := time.Now() - issue.ClosedAt = &now - } - } else { - issue.ClosedAt = nil // ← Clear if not closed - } - store.CreateIssue(ctx, issue, "import") -} -``` - -**After bd-222** (import.go uses batch): -```go -// Normalize all issues -for _, issue := range issues { - if issue.Status == types.StatusClosed { - if issue.ClosedAt == nil { - now := time.Now() - issue.ClosedAt = &now - } - } else { - issue.ClosedAt = nil - } -} - -// Single batch call (5-15x faster!) -store.CreateIssues(ctx, issues, "import") -``` - -Much simpler: normalize once, call batch API, database constraint enforces correctness. - -### Recommended Implementation Sequence - -1. ✅ **Implement bd-224 first** (P1 bug fix) - - Add database CHECK constraint - - Add validation to `Issue.Validate()` - - Fix `UpdateIssue` to auto-manage closed_at - - Fix `import.go` to normalize closed_at before creating - -2. ✅ **Then implement bd-222** (P2 performance enhancement) - - Add `CreateIssues` method (inherits bd-224's validation) - - Update `import.go` to use `CreateIssues` - - Import code is simpler (no per-issue loop, just normalize + batch) - -3. ✅ **Benefits of this order**: - - bd-224 fixes data integrity bug (higher priority) - - bd-222 builds on correct foundation - - No duplicate invariant enforcement code - - Database constraint + validation = defense in depth - - CreateIssues is correct by construction - ---- - -## Current State Analysis - -### How CreateIssue Works (sqlite.go:315-453) - -```go -func (s *SQLiteStorage) CreateIssue(ctx, issue, actor) error { - // 1. Acquire dedicated connection - conn, err := s.db.Conn(ctx) - defer conn.Close() - - // 2. BEGIN IMMEDIATE transaction (acquires write lock) - conn.ExecContext(ctx, "BEGIN IMMEDIATE") - - // 3. Generate ID atomically if needed - // - Query issue_counters - // - Update counter with MAX(existing, calculated) + 1 - - // 4. Insert issue - // 5. Record creation event - // 6. Mark dirty for export - // 7. COMMIT -} -``` - -### Performance Characteristics - -**Single Issue Creation**: -- Connection acquisition: ~1ms -- BEGIN IMMEDIATE: ~1-5ms (lock acquisition) -- ID generation: ~2-3ms (subquery + update) -- Insert + event + dirty: ~2-3ms -- COMMIT: ~1-2ms -- **Total: ~7-14ms per issue** - -**Bulk Creation (100 issues, sequential)**: -- 100 connections: ~100ms -- 100 transactions: ~100-500ms (lock contention!) -- 100 ID generations: ~200-300ms -- 100 inserts: ~200-300ms -- **Total: ~600ms-1.2s** - -**With Batching (estimated)**: -- 1 connection: ~1ms -- 1 transaction: ~1-5ms -- ID generation batch: ~10-20ms (one query for range) -- Bulk insert: ~50-100ms (prepared stmt, multiple VALUES) -- **Total: ~60-130ms (5-10x faster)** - -### When Does This Matter? - -**Low Impact** (current approach is fine): -- Interactive CLI use: `bd create "Fix bug"` -- Individual agent creating 1-5 issues -- Typical development workflow - -**High Impact** (batching helps): -- ✅ Bulk import from JSONL (10-1000+ issues) -- ✅ Agent workflows generating issue decompositions (10-50 issues) -- ✅ Migrating from other systems (100-10000+ issues) -- ✅ Template instantiation (creating epic + subtasks) -- ✅ Test data generation - ---- - -## Solution Options - -### Option A: Simple All-or-Nothing Batch ⭐ **RECOMMENDED** - -```go -// CreateIssues creates multiple issues atomically in a single transaction -func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error -``` - -**Semantics**: -- All issues created, or none created (atomicity) -- Single transaction, single connection -- Returns error if ANY issue fails validation or insertion -- IDs generated atomically as a range - -**Pros**: -- ✅ Simple mental model (atomic batch) -- ✅ Clear error handling (one error = whole batch fails) -- ✅ Matches database transaction semantics -- ✅ Easy to implement (similar to CreateIssue) -- ✅ No partial state in database -- ✅ Safe for concurrent access (IMMEDIATE transaction) -- ✅ **5-10x faster for bulk operations** - -**Cons**: -- ⚠️ If one issue is invalid, whole batch fails -- ⚠️ Caller must retry entire batch on error -- ⚠️ No indication of WHICH issue failed - -**Mitigation**: Add validation-only mode to pre-check batch - -**Verdict**: Best for most use cases (import, migrations, agent workflows) - -### Option B: Partial Success with Error Details - -```go -type CreateResult struct { - ID string - Error error -} - -func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) ([]CreateResult, error) -``` - -**Semantics**: -- Best-effort creation -- Returns results for each issue (ID or error) -- Transaction commits even if some issues fail -- Complex rollback semantics - -**Pros**: -- ✅ Caller knows exactly which issues failed -- ✅ Partial progress on errors -- ✅ Good for unreliable input data - -**Cons**: -- ❌ **Complex transaction semantics**: Which failures abort transaction? -- ❌ **Partial state in database**: Caller must track what succeeded -- ❌ **ID generation complexity**: Skip failed issues in counter? -- ❌ **Dirty tracking complexity**: Which issues to mark dirty? -- ❌ **Event recording**: Record events for succeeded issues only? -- ❌ More complex API for common case -- ❌ Caller must handle partial state - -**Verdict**: Too complex, doesn't match database atomicity model - -### Option C: Batch with Configurable Strategy - -```go -type BatchOptions struct { - FailFast bool // Stop on first error (default) - ContinueOnError bool // Best effort - ValidateOnly bool // Dry run -} - -func (s *SQLiteStorage) CreateIssues(ctx, issues, actor, opts) ([]CreateResult, error) -``` - -**Pros**: -- ✅ Flexible for different use cases -- ✅ Can support both atomic and partial modes - -**Cons**: -- ❌ **Too much complexity** for the benefit -- ❌ Multiple code paths = more bugs -- ❌ Unclear which mode to use when -- ❌ Doesn't solve the core problem (connection overhead) - -**Verdict**: Over-engineered for current needs - -### Option D: Internal Optimization Only (No API Change) - -Optimize CreateIssue internally to batch operations without changing API. - -**Approach**: -- Connection pooling improvements -- Prepared statement caching -- WAL optimization - -**Pros**: -- ✅ No API changes -- ✅ Benefits all callers automatically - -**Cons**: -- ❌ **Can't eliminate transaction overhead** (still N transactions) -- ❌ **Can't eliminate ID generation overhead** (still N counter updates) -- ❌ **Limited improvement** (maybe 20-30% faster, not 5-10x) -- ❌ Doesn't address root cause - -**Verdict**: Good to do anyway, but doesn't solve the problem - ---- - -## Recommended Solution: **Simple All-or-Nothing Batch (Option A)** - -### API Design - -```go -// CreateIssues creates multiple issues atomically in a single transaction. -// All issues are created or none are created. Returns error if any issue -// fails validation or insertion. -// -// Performance: ~10x faster than calling CreateIssue in a loop for large batches. -// Use this for bulk imports, migrations, or agent workflows creating many issues. -// -// Issues with empty IDs will have IDs generated atomically. Issues with -// explicit IDs are used as-is (caller responsible for avoiding collisions). -func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error -``` - -### Implementation Strategy - -#### Phase 1: Validation - -```go -// Validate all issues first (fail-fast) -for i, issue := range issues { - if err := issue.Validate(); err != nil { - return fmt.Errorf("validation failed for issue %d: %w", i, err) - } -} -``` - -#### Phase 2: Connection & Transaction - -```go -// Acquire dedicated connection (same as CreateIssue) -conn, err := s.db.Conn(ctx) -if err != nil { - return fmt.Errorf("failed to acquire connection: %w", err) -} -defer conn.Close() - -// BEGIN IMMEDIATE (same as CreateIssue) -if _, err := conn.ExecContext(ctx, "BEGIN IMMEDIATE"); err != nil { - return fmt.Errorf("failed to begin immediate transaction: %w", err) -} - -committed := false -defer func() { - if !committed { - conn.ExecContext(context.Background(), "ROLLBACK") - } -}() -``` - -#### Phase 3: Batch ID Generation - -**Key Insight**: Generate ID range atomically, then assign sequentially - -```go -// Count how many issues need IDs -needIDCount := 0 -for _, issue := range issues { - if issue.ID == "" { - needIDCount++ - } -} - -// Generate ID range atomically (if needed) -var nextID int -var prefix string -if needIDCount > 0 { - // Get prefix from config - err := conn.QueryRowContext(ctx, - `SELECT value FROM config WHERE key = ?`, - "issue_prefix").Scan(&prefix) - if err == sql.ErrNoRows || prefix == "" { - prefix = "bd" - } else if err != nil { - return fmt.Errorf("failed to get config: %w", err) - } - - // Atomically reserve ID range: [nextID, nextID+needIDCount) - // This is the KEY optimization - one counter update instead of N - err = conn.QueryRowContext(ctx, ` - INSERT INTO issue_counters (prefix, last_id) - SELECT ?, COALESCE(MAX(CAST(substr(id, LENGTH(?) + 2) AS INTEGER)), 0) + ? - FROM issues - WHERE id LIKE ? || '-%' - AND substr(id, LENGTH(?) + 2) GLOB '[0-9]*' - ON CONFLICT(prefix) DO UPDATE SET - last_id = MAX( - last_id, - (SELECT COALESCE(MAX(CAST(substr(id, LENGTH(?) + 2) AS INTEGER)), 0) - FROM issues - WHERE id LIKE ? || '-%' - AND substr(id, LENGTH(?) + 2) GLOB '[0-9]*') - ) + ? - RETURNING last_id - `, prefix, prefix, needIDCount, prefix, prefix, prefix, prefix, prefix, needIDCount).Scan(&nextID) - if err != nil { - return fmt.Errorf("failed to generate ID range: %w", err) - } - - // Assign IDs sequentially - currentID := nextID - needIDCount + 1 - for i := range issues { - if issues[i].ID == "" { - issues[i].ID = fmt.Sprintf("%s-%d", prefix, currentID) - currentID++ - } - } -} -``` - -#### Phase 4: Bulk Insert Issues - -**Two approaches**: - -**Approach A: Prepared Statement + Loop** (simpler, still fast) -```go -stmt, err := conn.PrepareContext(ctx, ` - INSERT INTO issues ( - id, title, description, design, acceptance_criteria, notes, - status, priority, issue_type, assignee, estimated_minutes, - created_at, updated_at, closed_at, external_ref - ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) -`) -if err != nil { - return fmt.Errorf("failed to prepare statement: %w", err) -} -defer stmt.Close() - -now := time.Now() -for _, issue := range issues { - issue.CreatedAt = now - issue.UpdatedAt = now - - _, err = stmt.ExecContext(ctx, - issue.ID, issue.Title, issue.Description, issue.Design, - issue.AcceptanceCriteria, issue.Notes, issue.Status, - issue.Priority, issue.IssueType, issue.Assignee, - issue.EstimatedMinutes, issue.CreatedAt, issue.UpdatedAt, - issue.ClosedAt, issue.ExternalRef, - ) - if err != nil { - return fmt.Errorf("failed to insert issue %s: %w", issue.ID, err) - } -} -``` - -**Approach B: Multi-VALUE INSERT** (fastest, more complex) -```go -// Build multi-value INSERT -// INSERT INTO issues VALUES (...), (...), (...) -// More complex string building but ~2x faster for large batches -// Defer to performance testing phase -``` - -**Decision**: Start with Approach A (prepared statement), optimize to Approach B if benchmarks show need - -#### Phase 5: Bulk Record Events - -```go -// Prepare event statement -eventStmt, err := conn.PrepareContext(ctx, ` - INSERT INTO events (issue_id, event_type, actor, new_value) - VALUES (?, ?, ?, ?) -`) -if err != nil { - return fmt.Errorf("failed to prepare event statement: %w", err) -} -defer eventStmt.Close() - -for _, issue := range issues { - eventData, err := json.Marshal(issue) - if err != nil { - eventData = []byte(fmt.Sprintf(`{"id":"%s","title":"%s"}`, issue.ID, issue.Title)) - } - - _, err = eventStmt.ExecContext(ctx, issue.ID, types.EventCreated, actor, string(eventData)) - if err != nil { - return fmt.Errorf("failed to record event for %s: %w", issue.ID, err) - } -} -``` - -#### Phase 6: Bulk Mark Dirty - -```go -// Bulk insert dirty markers -dirtyStmt, err := conn.PrepareContext(ctx, ` - INSERT INTO dirty_issues (issue_id, marked_at) - VALUES (?, ?) - ON CONFLICT (issue_id) DO UPDATE SET marked_at = excluded.marked_at -`) -if err != nil { - return fmt.Errorf("failed to prepare dirty statement: %w", err) -} -defer dirtyStmt.Close() - -dirtyTime := time.Now() -for _, issue := range issues { - _, err = dirtyStmt.ExecContext(ctx, issue.ID, dirtyTime) - if err != nil { - return fmt.Errorf("failed to mark dirty %s: %w", issue.ID, err) - } -} -``` - -#### Phase 7: Commit - -```go -if _, err := conn.ExecContext(ctx, "COMMIT"); err != nil { - return fmt.Errorf("failed to commit transaction: %w", err) -} -committed = true -return nil -``` - ---- - -## Design Decisions & Tradeoffs - -### Decision 1: All-or-Nothing Atomicity ✅ - -**Rationale**: Matches database transaction semantics, simpler mental model - -**Tradeoff**: Batch fails if ANY issue is invalid -- **Mitigation**: Pre-validate all issues before starting transaction -- **Alternative**: Caller can retry with smaller batches or individual issues - -### Decision 2: Same Transaction Semantics as CreateIssue ✅ - -Use BEGIN IMMEDIATE, not DEFERRED or EXCLUSIVE - -**Rationale**: -- Consistency with existing CreateIssue -- Prevents race conditions in ID generation -- Serializes batch operations (which is fine - they're rare) - -**Tradeoff**: Batches serialize (only one concurrent batch writer) -- **Impact**: Low - batch operations are rare (import, migration) -- **Benefit**: Simple, correct, no race conditions - -### Decision 3: Atomic ID Range Reservation ✅ - -Generate range [nextID, nextID+N) in single counter update - -**Rationale**: KEY optimization - avoids N counter updates - -**Implementation**: -```sql --- Old approach (CreateIssue): N updates -UPDATE issue_counters SET last_id = last_id + 1 RETURNING last_id; -- N times - --- New approach (CreateIssues): 1 update -UPDATE issue_counters SET last_id = last_id + N RETURNING last_id; -- Once -``` - -**Correctness**: Safe because BEGIN IMMEDIATE serializes batches - -### Decision 4: Support Mixed ID Assignment ✅ - -Some issues can have explicit IDs, others auto-generated - -**Use Case**: Import with some external IDs, some new issues - -```go -issues := []*Issue{ - {ID: "ext-123", Title: "External issue"}, // Keep ID - {ID: "", Title: "New issue"}, // Generate ID - {ID: "bd-999", Title: "Explicit ID"}, // Keep ID -} -``` - -**Rationale**: Flexible for import scenarios - -**Complexity**: Low - just count issues needing IDs - -### Decision 5: Prepared Statements Over Multi-VALUE INSERT ✅ - -Start with prepared statement loop, optimize later if needed - -**Rationale**: -- Simpler implementation -- Still much faster than N transactions (5-10x) -- Multi-VALUE INSERT only ~2x faster than prepared stmt -- Can optimize later if profiling shows need - -### Decision 6: Keep CreateIssue Unchanged ✅ - -Don't modify existing CreateIssue implementation - -**Rationale**: -- Backward compatibility -- No risk to existing callers -- Additive change only -- Different use cases (single vs batch) - ---- - -## When to Use Which API - -### Use CreateIssue (existing) - -- ✅ Interactive CLI: `bd create "Title"` -- ✅ Single issue creation -- ✅ Agent creating 1-3 issues -- ✅ When simplicity matters -- ✅ When you want per-issue error handling - -### Use CreateIssues (new) - -- ✅ Bulk import from JSONL (10-1000+ issues) -- ✅ Migration from other systems (100-10000+ issues) -- ✅ Agent decomposing work into 10-50 issues -- ✅ Template instantiation (epic + subtasks) -- ✅ Test data generation -- ✅ When performance matters - -**Rule of Thumb**: Use CreateIssues for N > 5 issues - ---- - -## Implementation Checklist - -### Phase 1: Core Implementation ✅ -- [ ] Add `CreateIssues` to Storage interface (storage/storage.go) -- [ ] Implement SQLiteStorage.CreateIssues (storage/sqlite/sqlite.go) -- [ ] Add comprehensive unit tests -- [ ] Add concurrency tests (multiple batch writers) -- [ ] Add performance benchmarks - -### Phase 2: CLI Integration -- [ ] Add `bd create-batch` command (or internal use only?) -- [ ] Update import.go to use CreateIssues for bulk imports -- [ ] Test with real JSONL imports - -### Phase 3: Documentation -- [ ] Document CreateIssues API (godoc) -- [ ] Add batch import example -- [ ] Update EXTENDING.md with batch usage -- [ ] Performance notes in README - -### Phase 4: Optimization (if needed) -- [ ] Profile CreateIssues with 100, 1000, 10000 issues -- [ ] Optimize to multi-VALUE INSERT if needed -- [ ] Consider batch size limits (split large batches) - ---- - -## Testing Strategy - -### Unit Tests - -```go -func TestCreateIssues_Empty(t *testing.T) -func TestCreateIssues_Single(t *testing.T) -func TestCreateIssues_Multiple(t *testing.T) -func TestCreateIssues_WithExplicitIDs(t *testing.T) -func TestCreateIssues_MixedIDs(t *testing.T) -func TestCreateIssues_ValidationError(t *testing.T) -func TestCreateIssues_DuplicateID(t *testing.T) -func TestCreateIssues_RollbackOnError(t *testing.T) -``` - -### Concurrency Tests - -```go -func TestCreateIssues_Concurrent(t *testing.T) { - // 10 goroutines each creating 100 issues - // Verify no ID collisions - // Verify all issues created -} - -func TestCreateIssues_MixedWithCreateIssue(t *testing.T) { - // Concurrent CreateIssue + CreateIssues - // Verify no ID collisions -} -``` - -### Performance Benchmarks - -```go -func BenchmarkCreateIssue_Sequential(b *testing.B) -func BenchmarkCreateIssues_Batch(b *testing.B) - -// Expected results (100 issues): -// CreateIssue x100: ~600-1200ms -// CreateIssues: ~60-130ms -// Speedup: 5-10x -``` - -### Integration Tests - -```go -func TestImport_LargeJSONL(t *testing.T) { - // Import 1000 issues from JSONL - // Verify all created correctly - // Verify performance < 1s -} -``` - ---- - -## Migration Plan - -### Step 1: Add Interface Method (Non-Breaking) - -```go -// storage/storage.go -type Storage interface { - CreateIssue(ctx context.Context, issue *types.Issue, actor string) error - CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error // NEW - // ... rest unchanged -} -``` - -### Step 2: Implement SQLiteStorage.CreateIssues - -Follow implementation strategy above - -### Step 3: Add Tests - -Comprehensive unit + concurrency + benchmark tests - -### Step 4: Update Import (Optional) - -```go -// cmd/bd/import.go - replace loop with batch -func importIssues(store Storage, issues []*Issue) error { - // Old: - // for _, issue := range issues { - // store.CreateIssue(ctx, issue, "import") - // } - - // New: - return store.CreateIssues(ctx, issues, "import") -} -``` - -**Note**: Start with internal use (import), expose CLI later if needed - -### Step 5: Performance Testing - -```bash -# Generate test JSONL -bd export > backup.jsonl - -# Duplicate 100x for stress test -cat backup.jsonl backup.jsonl ... > large_test.jsonl - -# Test import performance -time bd import large_test.jsonl -``` - ---- - -## Future Enhancements (NOT for bd-222) - -### Batch Size Limits - -If very large batches cause memory issues: - -```go -func (s *SQLiteStorage) CreateIssues(ctx, issues, actor) error { - const maxBatchSize = 1000 - - for i := 0; i < len(issues); i += maxBatchSize { - end := min(i+maxBatchSize, len(issues)) - batch := issues[i:end] - - if err := s.createIssuesBatch(ctx, batch, actor); err != nil { - return fmt.Errorf("batch %d-%d failed: %w", i, end, err) - } - } - return nil -} -``` - -**Decision**: Don't implement until we see issues with large batches (>1000) - -### Validation-Only Mode - -Pre-validate batch without creating: - -```go -func (s *SQLiteStorage) ValidateIssues(ctx, issues) error -``` - -**Use Case**: Dry-run before bulk import - -**Decision**: Add if import workflows request it - -### Progress Callbacks - -Report progress for long-running batches: - -```go -type BatchProgress func(completed, total int) - -func (s *SQLiteStorage) CreateIssuesWithProgress(ctx, issues, actor, progress) error -``` - -**Decision**: Add if agent workflows request it (likely for 1000+ issue batches) - ---- - -## Performance Analysis - -### Baseline (CreateIssue loop) - -For 100 issues: -``` -Connection overhead: 100ms (1ms × 100) -Transaction overhead: 300ms (3ms × 100, with lock contention) -ID generation: 250ms (2.5ms × 100) -Insert + event: 250ms (2.5ms × 100) -Total: 900ms -``` - -### With CreateIssues - -For 100 issues: -``` -Connection overhead: 1ms (1 connection) -Transaction overhead: 5ms (1 transaction) -ID range generation: 15ms (1 query, more complex) -Bulk insert (prep): 50ms (prepared stmt × 100) -Bulk events (prep): 30ms (prepared stmt × 100) -Bulk dirty (prep): 20ms (prepared stmt × 100) -Commit: 5ms -Total: 126ms (7x faster) -``` - -### Scalability - -| Issues | CreateIssue Loop | CreateIssues | Speedup | -|--------|------------------|--------------|---------| -| 10 | 90ms | 30ms | 3x | -| 100 | 900ms | 126ms | 7x | -| 1000 | 9s | 800ms | 11x | -| 10000 | 90s | 6s | 15x | - -**Key Insight**: Speedup increases with batch size due to fixed overhead amortization - ---- - -## Why This Solution Wins - -### For Individual Devs & Small Teams -- **Zero impact on normal workflow**: CreateIssue unchanged -- **Fast imports**: 1000 issues in <1s instead of 10s -- **Simple mental model**: All-or-nothing batch -- **No new concepts**: Same semantics as CreateIssue, just faster - -### For Agent Swarms -- **Efficient decomposition**: Agent creates 50 subtasks in one call -- **Atomic work generation**: All issues created or none -- **No connection exhaustion**: One connection per batch -- **Safe concurrency**: BEGIN IMMEDIATE prevents races - -### For New Codebase -- **Non-breaking change**: Additive API only -- **Performance win**: 5-15x faster for bulk operations -- **Simple implementation**: ~200 LOC, similar to CreateIssue -- **Battle-tested pattern**: Same transaction semantics as CreateIssue - ---- - -## Alternatives Considered and Rejected - -### Alternative 1: Auto-Batch in CreateIssue - -Automatically detect rapid CreateIssue calls and batch them. - -**Why Rejected**: -- ❌ Magical behavior (implicit batching) -- ❌ Complex implementation (goroutine + timer + coordination) -- ❌ Race conditions and edge cases -- ❌ Unpredictable performance (when does batch trigger?) -- ❌ Can't guarantee atomicity across auto-batch boundary - -### Alternative 2: Separate Import API - -Add ImportIssues specifically for JSONL import, not general-purpose. - -**Why Rejected**: -- ❌ Limits use cases (what about agent workflows?) -- ❌ Name doesn't match behavior (it's just batch create) -- ❌ CreateIssues is more discoverable and general - -### Alternative 3: Streaming API - -```go -type IssueStream interface { - Send(*Issue) error - CloseAndCommit() error -} -func (s *SQLiteStorage) CreateIssueStream(ctx, actor) (IssueStream, error) -``` - -**Why Rejected**: -- ❌ More complex API (stateful stream object) -- ❌ Error handling complexity (partial writes?) -- ❌ Doesn't match Go/SQL idioms -- ❌ Caller must manage stream lifecycle -- ❌ Simple slice is easier to work with - ---- - -## Conclusion - -The **simple all-or-nothing batch API** (CreateIssues) is the best solution because: - -1. **Significant performance win**: 5-15x faster for bulk operations -2. **Simple API**: Just like CreateIssue but with slice -3. **Safe**: Atomic transaction, no partial state -4. **Non-breaking**: Existing CreateIssue unchanged -5. **Flexible**: Supports mixed ID assignment (auto + explicit) -6. **Proven pattern**: Same transaction semantics as CreateIssue - -The key insight is **atomic ID range reservation** - updating the counter once for N issues instead of N times. Combined with a single transaction and prepared statements, this provides major performance improvements without complexity. - -This aligns perfectly with beads' goals: simple for individual devs, efficient for bulk operations, robust for agent swarms. - -**Implementation size**: ~200 LOC + ~400 LOC tests = manageable, low-risk change -**Expected performance**: 5-15x faster for bulk operations (N > 10) -**Risk**: Low (additive API, comprehensive tests) diff --git a/ULTRATHINK_BD224.md b/ULTRATHINK_BD224.md deleted file mode 100644 index 5d2ce958..00000000 --- a/ULTRATHINK_BD224.md +++ /dev/null @@ -1,373 +0,0 @@ -# Ultrathink: Solving status/closed_at Inconsistency (bd-224) - -**Date**: 2025-10-15 -**Context**: Individual devs, small teams, future agent swarms, new codebase -**Problem**: Data model allows `status='open'` with `closed_at != NULL` (liminal state) - -## Executive Summary - -**Recommended Solution**: **Hybrid approach - Database CHECK constraint + Application enforcement** - -This provides defense-in-depth perfect for agent swarms while keeping the model simple for individual developers. - ---- - -## Current State Analysis - -### Where closed_at is Used - -1. **GetIssue**: Returns it to callers -2. **CloseIssue**: Sets `closed_at = now()` when closing -3. **SearchIssues**: Includes in results -4. **GetStatistics** ⚠️ **CRITICAL**: - ```sql - SELECT AVG((julianday(closed_at) - julianday(created_at)) * 24) - FROM issues - WHERE closed_at IS NOT NULL - ``` - Uses `closed_at IS NOT NULL` NOT `status='closed'` for lead time calculation! - -### Impact of Inconsistency - -- **Statistics are wrong**: Issues with `status='open'` but `closed_at != NULL` pollute lead time metrics -- **User confusion**: bd ready shows "closed" issues -- **Agent workflows**: Unpredictable behavior when agents query by status vs closed_at -- **Data integrity**: Can't trust the data model - -### Root Cause: Import & Update Don't Manage Invariant - -**Import** (cmd/bd/import.go:206-207): -```go -if _, ok := rawData["status"]; ok { - updates["status"] = issue.Status // ← Updates status -} -// ⚠️ Does NOT clear/set closed_at -``` - -**UpdateIssue** (internal/storage/sqlite/sqlite.go:509-624): -- Updates any field in the map -- Does NOT automatically manage closed_at when status changes -- Records EventClosed but doesn't enforce the invariant - -**Concrete example (bd-89)**: -1. Issue closed properly: `status='closed'`, `closed_at='2025-10-15 08:13:08'` -2. JSONL had old state: `status='open'`, `closed_at='2025-10-14 02:58:22'` (inconsistent!) -3. Auto-import updated status to 'open' but left closed_at set -4. Result: Inconsistent state in database - ---- - -## Solution Options - -### Option A: Database CHECK Constraint ⭐ **RECOMMENDED FOUNDATION** - -```sql -ALTER TABLE issues ADD CONSTRAINT chk_closed_at_status - CHECK ((status = 'closed') = (closed_at IS NOT NULL)); -``` - -**Pros:** -- ✅ Enforces invariant at database level (most robust) -- ✅ Catches bugs in ANY code path (future-proof) -- ✅ Works across all clients (CLI, MCP, future integrations) -- ✅ Simple to understand for developers -- ✅ **Perfect for agent swarms**: Can't break it with buggy code -- ✅ Prevents inconsistent exports (can't write bad data to JSONL) - -**Cons:** -- ⚠️ Requires migration -- ⚠️ Need to fix existing inconsistent data first -- ⚠️ Update operations must manage closed_at (but we should do this anyway!) - -**Migration complexity**: LOW - few users, can break things - -### Option B: Application-Level Enforcement Only - -Make UpdateIssue and Import smart about status changes. - -**Pros:** -- ✅ No schema change needed -- ✅ Flexible for edge cases - -**Cons:** -- ❌ Easy to forget in new code paths -- ❌ Doesn't protect against direct SQL manipulation -- ❌ Multiple places to maintain (import, update, close, etc.) -- ❌ **Bad for agent swarms**: One buggy agent breaks the model -- ❌ Still allows export of inconsistent data - -**Verdict**: Not robust enough alone - -### Option C: Add Explicit Reopened Support - -Add `bd reopen` command that uses EventReopened and manages closed_at. - -**Pros:** -- ✅ Makes reopening explicit and trackable -- ✅ EventReopened already defined (types.go:150) but unused - -**Cons:** -- ⚠️ Doesn't solve the fundamental invariant problem -- ⚠️ Still need to decide: clear closed_at or keep it? -- ⚠️ More complex model if we keep historical closed_at - -**Verdict**: Good addition, but doesn't solve root cause - -### Option D: Remove closed_at Entirely - -Make events table the single source of truth. - -**Pros:** -- ✅ Simplest data model -- ✅ No invariant to maintain -- ✅ Events are authoritative - -**Cons:** -- ❌ **Performance**: Lead time calculation requires JOIN + subquery - ```sql - SELECT AVG( - (julianday(e.created_at) - julianday(i.created_at)) * 24 - ) - FROM issues i - JOIN events e ON i.id = e.issue_id - WHERE e.event_type = 'closed' - ``` -- ❌ Events could be missing/corrupted (no referential integrity on event_type) -- ❌ More complex queries throughout codebase -- ❌ **Statistics would be slower** (critical for dashboard UIs) - -**Verdict**: Too much complexity/performance cost for the benefit - ---- - -## Recommended Solution: **Hybrid Approach** - -Combine **Option A (DB constraint)** + **Application enforcement** + **Option C (reopen command)** - -### Part 1: Database Constraint (Foundation) - -```sql --- Migration: First clean up existing inconsistent data -UPDATE issues -SET closed_at = NULL -WHERE status != 'closed' AND closed_at IS NOT NULL; - -UPDATE issues -SET closed_at = CURRENT_TIMESTAMP -WHERE status = 'closed' AND closed_at IS NULL; - --- Add the constraint -ALTER TABLE issues ADD CONSTRAINT chk_closed_at_status - CHECK ((status = 'closed') = (closed_at IS NOT NULL)); -``` - -### Part 2: UpdateIssue Smart Status Management - -Modify `internal/storage/sqlite/sqlite.go:509-624`: - -```go -func (s *SQLiteStorage) UpdateIssue(ctx context.Context, id string, updates map[string]interface{}, actor string) error { - // ... existing validation ... - - // Smart closed_at management based on status changes - if statusVal, ok := updates["status"]; ok { - newStatus := statusVal.(string) - - if newStatus == string(types.StatusClosed) { - // Changing to closed: ensure closed_at is set - if _, hasClosedAt := updates["closed_at"]; !hasClosedAt { - updates["closed_at"] = time.Now() - } - } else { - // Changing from closed to something else: clear closed_at - if oldIssue.Status == types.StatusClosed { - updates["closed_at"] = nil // This will set it to NULL - eventType = types.EventReopened - } - } - } - - // ... rest of existing code ... -} -``` - -### Part 3: Import Enforcement - -Modify `cmd/bd/import.go:206-231`: - -```go -if _, ok := rawData["status"]; ok { - updates["status"] = issue.Status - - // Enforce closed_at invariant - if issue.Status == types.StatusClosed { - // Status is closed: ensure closed_at is set - if issue.ClosedAt == nil { - now := time.Now() - updates["closed_at"] = now - } else { - updates["closed_at"] = *issue.ClosedAt - } - } else { - // Status is not closed: ensure closed_at is NULL - updates["closed_at"] = nil - } -} -``` - -### Part 4: Add Reopen Command - -Create `cmd/bd/reopen.go`: - -```go -var reopenCmd = &cobra.Command{ - Use: "reopen [id...]", - Short: "Reopen one or more closed issues", - Args: cobra.MinimumNArgs(1), - Run: func(cmd *cobra.Command, args []string) { - ctx := context.Background() - reason, _ := cmd.Flags().GetString("reason") - if reason == "" { - reason = "Reopened" - } - - for _, id := range args { - // Use UpdateIssue which now handles closed_at automatically - updates := map[string]interface{}{ - "status": "open", - } - if err := store.UpdateIssue(ctx, id, updates, getUser()); err != nil { - fmt.Fprintf(os.Stderr, "Error reopening %s: %v\n", id, err) - continue - } - - // Add comment explaining why - if reason != "" { - store.AddComment(ctx, id, getUser(), reason) - } - } - - markDirtyAndScheduleFlush() - }, -} - -func init() { - reopenCmd.Flags().StringP("reason", "r", "", "Reason for reopening") - rootCmd.AddCommand(reopenCmd) -} -``` - ---- - -## Why This Solution Wins - -### For Individual Devs & Small Teams -- **Simple mental model**: `closed_at` is set ⟺ issue is closed -- **Hard to break**: DB constraint catches mistakes -- **Explicit reopen**: `bd reopen bd-89` is clearer than `bd update bd-89 --status open` -- **No manual management**: Don't think about closed_at, it's automatic - -### For Agent Swarms -- **Robust**: DB constraint prevents any agent from creating inconsistent state -- **Race-safe**: Constraint is atomic, checked at commit time -- **Self-healing**: UpdateIssue automatically fixes the relationship -- **Import-safe**: Can't import inconsistent JSONL (constraint rejects it) - -### For New Codebase -- **Can break things**: Migration is easy with few users -- **Sets precedent**: Shows we value data integrity -- **Future-proof**: New features can't violate the invariant -- **Performance**: No query changes needed, closed_at stays fast - ---- - -## Migration Plan - -### Step 1: Clean Existing Data -```sql --- Find inconsistent issues -SELECT id, status, closed_at FROM issues -WHERE (status = 'closed') != (closed_at IS NOT NULL); - --- Fix them (choose one strategy) --- Strategy A: Trust status field -UPDATE issues SET closed_at = NULL -WHERE status != 'closed' AND closed_at IS NOT NULL; - -UPDATE issues SET closed_at = CURRENT_TIMESTAMP -WHERE status = 'closed' AND closed_at IS NULL; - --- Strategy B: Trust closed_at field -UPDATE issues SET status = 'closed' -WHERE status != 'closed' AND closed_at IS NOT NULL; - -UPDATE issues SET status = 'open' -WHERE status = 'closed' AND closed_at IS NULL; -``` - -**Recommendation**: Use Strategy A (trust status) since status is more often correct. - -### Step 2: Add Constraint -```sql --- Test first -SELECT id FROM issues -WHERE (status = 'closed') != (closed_at IS NOT NULL); --- Should return 0 rows - --- Add constraint -ALTER TABLE issues ADD CONSTRAINT chk_closed_at_status - CHECK ((status = 'closed') = (closed_at IS NOT NULL)); -``` - -### Step 3: Update Code -1. Modify UpdateIssue (sqlite.go) -2. Modify Import (import.go) -3. Add reopen command -4. Add migration function to schema.go - -### Step 4: Test -```bash -# Test the constraint rejects bad writes -sqlite3 .beads/bd.db "UPDATE issues SET closed_at = NULL WHERE id = 'bd-1' AND status = 'closed';" -# Should fail with constraint violation - -# Test update handles it automatically -bd update bd-89 --status open -bd show bd-89 --json | jq '{status, closed_at}' -# Should show: {"status": "open", "closed_at": null} - -# Test reopen -bd create "Test issue" -p 1 -bd close bd-226 -bd reopen bd-226 --reason "Need more work" -# Should work without errors -``` - ---- - -## Alternative Considered: Soft Close with closed_at History - -Keep closed_at as "first time closed" and use events for current state. - -**Why rejected**: -- More complex model (two sources of truth) -- Unclear semantics (what does closed_at mean?) -- Lead time calculation becomes ambiguous (first close? most recent?) -- Doesn't simplify the problem - ---- - -## Conclusion - -The **hybrid approach** (DB constraint + smart UpdateIssue + import enforcement + reopen command) is the best solution because: - -1. **Defense in depth**: Multiple layers prevent inconsistency -2. **Hard to break**: Perfect for agent swarms -3. **Simple for users**: Automatic management of closed_at -4. **Low migration cost**: Can break things in new codebase -5. **Clear semantics**: closed_at = "currently closed" (not historical) -6. **Performance**: No query changes needed - -The DB constraint is the key insight: it makes the system robust against future bugs, new code paths, and agent mistakes. Combined with smart application code, it creates a self-healing system that's hard to misuse. - -This aligns perfectly with beads' goals: simple for individual devs, robust for agent swarms.