removed obsolete docs

2025-10-15 21:16:52 -07:00
parent dfc62054f1
commit 61d70f208f
2 changed files with 0 additions and 1325 deletions
@@ -1,952 +0,0 @@
-# Ultrathink: Batching API for Bulk Issue Creation (bd-222)
-
-**Date**: 2025-10-15
-**Context**: Individual devs, small teams, future agent swarms, bulk imports
-**Problem**: CreateIssue acquires dedicated connection per call, inefficient for bulk operations
-
-## Executive Summary
-
-**Recommended Solution**: **Hybrid approach - Add CreateIssues + Keep existing CreateIssue unchanged**
-
-Provides high-performance batch path for bulk operations while maintaining simple single-issue API for typical use.
-
---
-
-## Dependencies & Implementation Order
-
-### Critical Dependency: bd-224 (status/closed_at invariant)
-
-**bd-224 MUST be implemented before bd-222**
-
-**Why**: Both issues modify the same code paths:
- bd-224: Fixes `import.go` to enforce `closed_at` invariant (status='closed' ⟺ closed_at != NULL)
- bd-222: Changes `import.go` to use `CreateIssues` instead of `CreateIssue` loop
-
-**The Problem**:
-If we implement bd-222 first:
-1. `CreateIssues` won't enforce the closed_at invariant (inherits bug from CreateIssue)
-2. Import switches to use `CreateIssues`
-3. Import can still create inconsistent data (bd-224's bug persists)
-4. Later bd-224 fix requires modifying BOTH CreateIssue AND CreateIssues
-
-**The Solution**:
-If we implement bd-224 first:
-1. Add CHECK constraint: `(status = 'closed') = (closed_at IS NOT NULL)`
-2. Fix `UpdateIssue` to manage closed_at automatically
-3. Fix `import.go` to enforce invariant before calling CreateIssue
-4. **Then** implement bd-222's `CreateIssues` with invariant already enforced:
-   - Database constraint rejects bad data
-   - Issue.Validate() checks the invariant (per bd-224)
-   - Import code already normalizes before calling CreateIssues
-   - No new code needed in CreateIssues - it's correct by construction!
-
-### Implementation Impact
-
-**CreateIssues must validate closed_at invariant** (from bd-224):
-
-```go
-// Phase 1: Validation
-for i, issue := range issues {
-    if err := issue.Validate(); err != nil {  // ← Validates invariant (bd-224)
-        return fmt.Errorf("validation failed for issue %d: %w", i, err)
-    }
-}
-```
-
-After bd-224 is complete, `Issue.Validate()` will check:
-```go
-func (i *Issue) Validate() error {
-    // ... existing validation ...
-
-    // Enforce closed_at invariant (bd-224)
-    if i.Status == StatusClosed && i.ClosedAt == nil {
-        return fmt.Errorf("closed issues must have closed_at timestamp")
-    }
-    if i.Status != StatusClosed && i.ClosedAt != nil {
-        return fmt.Errorf("non-closed issues cannot have closed_at timestamp")
-    }
-
-    return nil
-}
-```
-
-This means `CreateIssues` automatically enforces the invariant through validation, with the database CHECK constraint as final defense.
-
-### Import Code Simplification
-
-**Before bd-224** (current import.go):
-```go
-for _, issue := range issues {
-    // Complex logic to handle status/closed_at independently
-    updates := make(map[string]interface{})
-    if _, ok := rawData["status"]; ok {
-        updates["status"] = issue.Status  // ← Doesn't manage closed_at
-    }
-    // ... more complex update logic
-    store.CreateIssue(ctx, issue, "import")
-}
-```
-
-**After bd-224** (import.go enforces invariant):
-```go
-for _, issue := range issues {
-    // Normalize closed_at based on status BEFORE creating
-    if issue.Status == types.StatusClosed {
-        if issue.ClosedAt == nil {
-            now := time.Now()
-            issue.ClosedAt = &now
-        }
-    } else {
-        issue.ClosedAt = nil  // ← Clear if not closed
-    }
-    store.CreateIssue(ctx, issue, "import")
-}
-```
-
-**After bd-222** (import.go uses batch):
-```go
-// Normalize all issues
-for _, issue := range issues {
-    if issue.Status == types.StatusClosed {
-        if issue.ClosedAt == nil {
-            now := time.Now()
-            issue.ClosedAt = &now
-        }
-    } else {
-        issue.ClosedAt = nil
-    }
-}
-
-// Single batch call (5-15x faster!)
-store.CreateIssues(ctx, issues, "import")
-```
-
-Much simpler: normalize once, call batch API, database constraint enforces correctness.
-
-### Recommended Implementation Sequence
-
-1. ✅ **Implement bd-224 first** (P1 bug fix)
-   - Add database CHECK constraint
-   - Add validation to `Issue.Validate()`
-   - Fix `UpdateIssue` to auto-manage closed_at
-   - Fix `import.go` to normalize closed_at before creating
-
-2. ✅ **Then implement bd-222** (P2 performance enhancement)
-   - Add `CreateIssues` method (inherits bd-224's validation)
-   - Update `import.go` to use `CreateIssues`
-   - Import code is simpler (no per-issue loop, just normalize + batch)
-
-3. ✅ **Benefits of this order**:
-   - bd-224 fixes data integrity bug (higher priority)
-   - bd-222 builds on correct foundation
-   - No duplicate invariant enforcement code
-   - Database constraint + validation = defense in depth
-   - CreateIssues is correct by construction
-
---
-
-## Current State Analysis
-
-### How CreateIssue Works (sqlite.go:315-453)
-
-```go
-func (s *SQLiteStorage) CreateIssue(ctx, issue, actor) error {
-    // 1. Acquire dedicated connection
-    conn, err := s.db.Conn(ctx)
-    defer conn.Close()
-
-    // 2. BEGIN IMMEDIATE transaction (acquires write lock)
-    conn.ExecContext(ctx, "BEGIN IMMEDIATE")
-
-    // 3. Generate ID atomically if needed
-    //    - Query issue_counters
-    //    - Update counter with MAX(existing, calculated) + 1
-
-    // 4. Insert issue
-    // 5. Record creation event
-    // 6. Mark dirty for export
-    // 7. COMMIT
-}
-```
-
-### Performance Characteristics
-
-**Single Issue Creation**:
- Connection acquisition: ~1ms
- BEGIN IMMEDIATE: ~1-5ms (lock acquisition)
- ID generation: ~2-3ms (subquery + update)
- Insert + event + dirty: ~2-3ms
- COMMIT: ~1-2ms
- **Total: ~7-14ms per issue**
-
-**Bulk Creation (100 issues, sequential)**:
- 100 connections: ~100ms
- 100 transactions: ~100-500ms (lock contention!)
- 100 ID generations: ~200-300ms
- 100 inserts: ~200-300ms
- **Total: ~600ms-1.2s**
-
-**With Batching (estimated)**:
- 1 connection: ~1ms
- 1 transaction: ~1-5ms
- ID generation batch: ~10-20ms (one query for range)
- Bulk insert: ~50-100ms (prepared stmt, multiple VALUES)
- **Total: ~60-130ms (5-10x faster)**
-
-### When Does This Matter?
-
-**Low Impact** (current approach is fine):
- Interactive CLI use: `bd create "Fix bug"`
- Individual agent creating 1-5 issues
- Typical development workflow
-
-**High Impact** (batching helps):
- ✅ Bulk import from JSONL (10-1000+ issues)
- ✅ Agent workflows generating issue decompositions (10-50 issues)
- ✅ Migrating from other systems (100-10000+ issues)
- ✅ Template instantiation (creating epic + subtasks)
- ✅ Test data generation
-
---
-
-## Solution Options
-
-### Option A: Simple All-or-Nothing Batch ⭐ **RECOMMENDED**
-
-```go
-// CreateIssues creates multiple issues atomically in a single transaction
-func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error
-```
-
-**Semantics**:
- All issues created, or none created (atomicity)
- Single transaction, single connection
- Returns error if ANY issue fails validation or insertion
- IDs generated atomically as a range
-
-**Pros**:
- ✅ Simple mental model (atomic batch)
- ✅ Clear error handling (one error = whole batch fails)
- ✅ Matches database transaction semantics
- ✅ Easy to implement (similar to CreateIssue)
- ✅ No partial state in database
- ✅ Safe for concurrent access (IMMEDIATE transaction)
- ✅ **5-10x faster for bulk operations**
-
-**Cons**:
- ⚠️ If one issue is invalid, whole batch fails
- ⚠️ Caller must retry entire batch on error
- ⚠️ No indication of WHICH issue failed
-
-**Mitigation**: Add validation-only mode to pre-check batch
-
-**Verdict**: Best for most use cases (import, migrations, agent workflows)
-
-### Option B: Partial Success with Error Details
-
-```go
-type CreateResult struct {
-    ID      string
-    Error   error
-}
-
-func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) ([]CreateResult, error)
-```
-
-**Semantics**:
- Best-effort creation
- Returns results for each issue (ID or error)
- Transaction commits even if some issues fail
- Complex rollback semantics
-
-**Pros**:
- ✅ Caller knows exactly which issues failed
- ✅ Partial progress on errors
- ✅ Good for unreliable input data
-
-**Cons**:
- ❌ **Complex transaction semantics**: Which failures abort transaction?
- ❌ **Partial state in database**: Caller must track what succeeded
- ❌ **ID generation complexity**: Skip failed issues in counter?
- ❌ **Dirty tracking complexity**: Which issues to mark dirty?
- ❌ **Event recording**: Record events for succeeded issues only?
- ❌ More complex API for common case
- ❌ Caller must handle partial state
-
-**Verdict**: Too complex, doesn't match database atomicity model
-
-### Option C: Batch with Configurable Strategy
-
-```go
-type BatchOptions struct {
-    FailFast        bool  // Stop on first error (default)
-    ContinueOnError bool  // Best effort
-    ValidateOnly    bool  // Dry run
-}
-
-func (s *SQLiteStorage) CreateIssues(ctx, issues, actor, opts) ([]CreateResult, error)
-```
-
-**Pros**:
- ✅ Flexible for different use cases
- ✅ Can support both atomic and partial modes
-
-**Cons**:
- ❌ **Too much complexity** for the benefit
- ❌ Multiple code paths = more bugs
- ❌ Unclear which mode to use when
- ❌ Doesn't solve the core problem (connection overhead)
-
-**Verdict**: Over-engineered for current needs
-
-### Option D: Internal Optimization Only (No API Change)
-
-Optimize CreateIssue internally to batch operations without changing API.
-
-**Approach**:
- Connection pooling improvements
- Prepared statement caching
- WAL optimization
-
-**Pros**:
- ✅ No API changes
- ✅ Benefits all callers automatically
-
-**Cons**:
- ❌ **Can't eliminate transaction overhead** (still N transactions)
- ❌ **Can't eliminate ID generation overhead** (still N counter updates)
- ❌ **Limited improvement** (maybe 20-30% faster, not 5-10x)
- ❌ Doesn't address root cause
-
-**Verdict**: Good to do anyway, but doesn't solve the problem
-
---
-
-## Recommended Solution: **Simple All-or-Nothing Batch (Option A)**
-
-### API Design
-
-```go
-// CreateIssues creates multiple issues atomically in a single transaction.
-// All issues are created or none are created. Returns error if any issue
-// fails validation or insertion.
-//
-// Performance: ~10x faster than calling CreateIssue in a loop for large batches.
-// Use this for bulk imports, migrations, or agent workflows creating many issues.
-//
-// Issues with empty IDs will have IDs generated atomically. Issues with
-// explicit IDs are used as-is (caller responsible for avoiding collisions).
-func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error
-```
-
-### Implementation Strategy
-
-#### Phase 1: Validation
-
-```go
-// Validate all issues first (fail-fast)
-for i, issue := range issues {
-    if err := issue.Validate(); err != nil {
-        return fmt.Errorf("validation failed for issue %d: %w", i, err)
-    }
-}
-```
-
-#### Phase 2: Connection & Transaction
-
-```go
-// Acquire dedicated connection (same as CreateIssue)
-conn, err := s.db.Conn(ctx)
-if err != nil {
-    return fmt.Errorf("failed to acquire connection: %w", err)
-}
-defer conn.Close()
-
-// BEGIN IMMEDIATE (same as CreateIssue)
-if _, err := conn.ExecContext(ctx, "BEGIN IMMEDIATE"); err != nil {
-    return fmt.Errorf("failed to begin immediate transaction: %w", err)
-}
-
-committed := false
-defer func() {
-    if !committed {
-        conn.ExecContext(context.Background(), "ROLLBACK")
-    }
-}()
-```
-
-#### Phase 3: Batch ID Generation
-
-**Key Insight**: Generate ID range atomically, then assign sequentially
-
-```go
-// Count how many issues need IDs
-needIDCount := 0
-for _, issue := range issues {
-    if issue.ID == "" {
-        needIDCount++
-    }
-}
-
-// Generate ID range atomically (if needed)
-var nextID int
-var prefix string
-if needIDCount > 0 {
-    // Get prefix from config
-    err := conn.QueryRowContext(ctx,
-        `SELECT value FROM config WHERE key = ?`,
-        "issue_prefix").Scan(&prefix)
-    if err == sql.ErrNoRows || prefix == "" {
-        prefix = "bd"
-    } else if err != nil {
-        return fmt.Errorf("failed to get config: %w", err)
-    }
-
-    // Atomically reserve ID range: [nextID, nextID+needIDCount)
-    // This is the KEY optimization - one counter update instead of N
-    err = conn.QueryRowContext(ctx, `
-        INSERT INTO issue_counters (prefix, last_id)
-        SELECT ?, COALESCE(MAX(CAST(substr(id, LENGTH(?) + 2) AS INTEGER)), 0) + ?
-        FROM issues
-        WHERE id LIKE ? || '-%'
-          AND substr(id, LENGTH(?) + 2) GLOB '[0-9]*'
-        ON CONFLICT(prefix) DO UPDATE SET
-            last_id = MAX(
-                last_id,
-                (SELECT COALESCE(MAX(CAST(substr(id, LENGTH(?) + 2) AS INTEGER)), 0)
-                 FROM issues
-                 WHERE id LIKE ? || '-%'
-                   AND substr(id, LENGTH(?) + 2) GLOB '[0-9]*')
-            ) + ?
-        RETURNING last_id
-    `, prefix, prefix, needIDCount, prefix, prefix, prefix, prefix, prefix, needIDCount).Scan(&nextID)
-    if err != nil {
-        return fmt.Errorf("failed to generate ID range: %w", err)
-    }
-
-    // Assign IDs sequentially
-    currentID := nextID - needIDCount + 1
-    for i := range issues {
-        if issues[i].ID == "" {
-            issues[i].ID = fmt.Sprintf("%s-%d", prefix, currentID)
-            currentID++
-        }
-    }
-}
-```
-
-#### Phase 4: Bulk Insert Issues
-
-**Two approaches**:
-
-**Approach A: Prepared Statement + Loop** (simpler, still fast)
-```go
-stmt, err := conn.PrepareContext(ctx, `
-    INSERT INTO issues (
-        id, title, description, design, acceptance_criteria, notes,
-        status, priority, issue_type, assignee, estimated_minutes,
-        created_at, updated_at, closed_at, external_ref
-    ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
-`)
-if err != nil {
-    return fmt.Errorf("failed to prepare statement: %w", err)
-}
-defer stmt.Close()
-
-now := time.Now()
-for _, issue := range issues {
-    issue.CreatedAt = now
-    issue.UpdatedAt = now
-
-    _, err = stmt.ExecContext(ctx,
-        issue.ID, issue.Title, issue.Description, issue.Design,
-        issue.AcceptanceCriteria, issue.Notes, issue.Status,
-        issue.Priority, issue.IssueType, issue.Assignee,
-        issue.EstimatedMinutes, issue.CreatedAt, issue.UpdatedAt,
-        issue.ClosedAt, issue.ExternalRef,
-    )
-    if err != nil {
-        return fmt.Errorf("failed to insert issue %s: %w", issue.ID, err)
-    }
-}
-```
-
-**Approach B: Multi-VALUE INSERT** (fastest, more complex)
-```go
-// Build multi-value INSERT
-// INSERT INTO issues VALUES (...), (...), (...)
-// More complex string building but ~2x faster for large batches
-// Defer to performance testing phase
-```
-
-**Decision**: Start with Approach A (prepared statement), optimize to Approach B if benchmarks show need
-
-#### Phase 5: Bulk Record Events
-
-```go
-// Prepare event statement
-eventStmt, err := conn.PrepareContext(ctx, `
-    INSERT INTO events (issue_id, event_type, actor, new_value)
-    VALUES (?, ?, ?, ?)
-`)
-if err != nil {
-    return fmt.Errorf("failed to prepare event statement: %w", err)
-}
-defer eventStmt.Close()
-
-for _, issue := range issues {
-    eventData, err := json.Marshal(issue)
-    if err != nil {
-        eventData = []byte(fmt.Sprintf(`{"id":"%s","title":"%s"}`, issue.ID, issue.Title))
-    }
-
-    _, err = eventStmt.ExecContext(ctx, issue.ID, types.EventCreated, actor, string(eventData))
-    if err != nil {
-        return fmt.Errorf("failed to record event for %s: %w", issue.ID, err)
-    }
-}
-```
-
-#### Phase 6: Bulk Mark Dirty
-
-```go
-// Bulk insert dirty markers
-dirtyStmt, err := conn.PrepareContext(ctx, `
-    INSERT INTO dirty_issues (issue_id, marked_at)
-    VALUES (?, ?)
-    ON CONFLICT (issue_id) DO UPDATE SET marked_at = excluded.marked_at
-`)
-if err != nil {
-    return fmt.Errorf("failed to prepare dirty statement: %w", err)
-}
-defer dirtyStmt.Close()
-
-dirtyTime := time.Now()
-for _, issue := range issues {
-    _, err = dirtyStmt.ExecContext(ctx, issue.ID, dirtyTime)
-    if err != nil {
-        return fmt.Errorf("failed to mark dirty %s: %w", issue.ID, err)
-    }
-}
-```
-
-#### Phase 7: Commit
-
-```go
-if _, err := conn.ExecContext(ctx, "COMMIT"); err != nil {
-    return fmt.Errorf("failed to commit transaction: %w", err)
-}
-committed = true
-return nil
-```
-
---
-
-## Design Decisions & Tradeoffs
-
-### Decision 1: All-or-Nothing Atomicity ✅
-
-**Rationale**: Matches database transaction semantics, simpler mental model
-
-**Tradeoff**: Batch fails if ANY issue is invalid
- **Mitigation**: Pre-validate all issues before starting transaction
- **Alternative**: Caller can retry with smaller batches or individual issues
-
-### Decision 2: Same Transaction Semantics as CreateIssue ✅
-
-Use BEGIN IMMEDIATE, not DEFERRED or EXCLUSIVE
-
-**Rationale**:
- Consistency with existing CreateIssue
- Prevents race conditions in ID generation
- Serializes batch operations (which is fine - they're rare)
-
-**Tradeoff**: Batches serialize (only one concurrent batch writer)
- **Impact**: Low - batch operations are rare (import, migration)
- **Benefit**: Simple, correct, no race conditions
-
-### Decision 3: Atomic ID Range Reservation ✅
-
-Generate range [nextID, nextID+N) in single counter update
-
-**Rationale**: KEY optimization - avoids N counter updates
-
-**Implementation**:
-```sql
-- Old approach (CreateIssue): N updates
-UPDATE issue_counters SET last_id = last_id + 1 RETURNING last_id;  -- N times
-
-- New approach (CreateIssues): 1 update
-UPDATE issue_counters SET last_id = last_id + N RETURNING last_id;  -- Once
-```
-
-**Correctness**: Safe because BEGIN IMMEDIATE serializes batches
-
-### Decision 4: Support Mixed ID Assignment ✅
-
-Some issues can have explicit IDs, others auto-generated
-
-**Use Case**: Import with some external IDs, some new issues
-
-```go
-issues := []*Issue{
-    {ID: "ext-123", Title: "External issue"},  // Keep ID
-    {ID: "", Title: "New issue"},               // Generate ID
-    {ID: "bd-999", Title: "Explicit ID"},      // Keep ID
-}
-```
-
-**Rationale**: Flexible for import scenarios
-
-**Complexity**: Low - just count issues needing IDs
-
-### Decision 5: Prepared Statements Over Multi-VALUE INSERT ✅
-
-Start with prepared statement loop, optimize later if needed
-
-**Rationale**:
- Simpler implementation
- Still much faster than N transactions (5-10x)
- Multi-VALUE INSERT only ~2x faster than prepared stmt
- Can optimize later if profiling shows need
-
-### Decision 6: Keep CreateIssue Unchanged ✅
-
-Don't modify existing CreateIssue implementation
-
-**Rationale**:
- Backward compatibility
- No risk to existing callers
- Additive change only
- Different use cases (single vs batch)
-
---
-
-## When to Use Which API
-
-### Use CreateIssue (existing)
-
- ✅ Interactive CLI: `bd create "Title"`
- ✅ Single issue creation
- ✅ Agent creating 1-3 issues
- ✅ When simplicity matters
- ✅ When you want per-issue error handling
-
-### Use CreateIssues (new)
-
- ✅ Bulk import from JSONL (10-1000+ issues)
- ✅ Migration from other systems (100-10000+ issues)
- ✅ Agent decomposing work into 10-50 issues
- ✅ Template instantiation (epic + subtasks)
- ✅ Test data generation
- ✅ When performance matters
-
-**Rule of Thumb**: Use CreateIssues for N > 5 issues
-
---
-
-## Implementation Checklist
-
-### Phase 1: Core Implementation ✅
- [ ] Add `CreateIssues` to Storage interface (storage/storage.go)
- [ ] Implement SQLiteStorage.CreateIssues (storage/sqlite/sqlite.go)
- [ ] Add comprehensive unit tests
- [ ] Add concurrency tests (multiple batch writers)
- [ ] Add performance benchmarks
-
-### Phase 2: CLI Integration
- [ ] Add `bd create-batch` command (or internal use only?)
- [ ] Update import.go to use CreateIssues for bulk imports
- [ ] Test with real JSONL imports
-
-### Phase 3: Documentation
- [ ] Document CreateIssues API (godoc)
- [ ] Add batch import example
- [ ] Update EXTENDING.md with batch usage
- [ ] Performance notes in README
-
-### Phase 4: Optimization (if needed)
- [ ] Profile CreateIssues with 100, 1000, 10000 issues
- [ ] Optimize to multi-VALUE INSERT if needed
- [ ] Consider batch size limits (split large batches)
-
---
-
-## Testing Strategy
-
-### Unit Tests
-
-```go
-func TestCreateIssues_Empty(t *testing.T)
-func TestCreateIssues_Single(t *testing.T)
-func TestCreateIssues_Multiple(t *testing.T)
-func TestCreateIssues_WithExplicitIDs(t *testing.T)
-func TestCreateIssues_MixedIDs(t *testing.T)
-func TestCreateIssues_ValidationError(t *testing.T)
-func TestCreateIssues_DuplicateID(t *testing.T)
-func TestCreateIssues_RollbackOnError(t *testing.T)
-```
-
-### Concurrency Tests
-
-```go
-func TestCreateIssues_Concurrent(t *testing.T) {
-    // 10 goroutines each creating 100 issues
-    // Verify no ID collisions
-    // Verify all issues created
-}
-
-func TestCreateIssues_MixedWithCreateIssue(t *testing.T) {
-    // Concurrent CreateIssue + CreateIssues
-    // Verify no ID collisions
-}
-```
-
-### Performance Benchmarks
-
-```go
-func BenchmarkCreateIssue_Sequential(b *testing.B)
-func BenchmarkCreateIssues_Batch(b *testing.B)
-
-// Expected results (100 issues):
-// CreateIssue x100:  ~600-1200ms
-// CreateIssues:      ~60-130ms
-// Speedup:           5-10x
-```
-
-### Integration Tests
-
-```go
-func TestImport_LargeJSONL(t *testing.T) {
-    // Import 1000 issues from JSONL
-    // Verify all created correctly
-    // Verify performance < 1s
-}
-```
-
---
-
-## Migration Plan
-
-### Step 1: Add Interface Method (Non-Breaking)
-
-```go
-// storage/storage.go
-type Storage interface {
-    CreateIssue(ctx context.Context, issue *types.Issue, actor string) error
-    CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error  // NEW
-    // ... rest unchanged
-}
-```
-
-### Step 2: Implement SQLiteStorage.CreateIssues
-
-Follow implementation strategy above
-
-### Step 3: Add Tests
-
-Comprehensive unit + concurrency + benchmark tests
-
-### Step 4: Update Import (Optional)
-
-```go
-// cmd/bd/import.go - replace loop with batch
-func importIssues(store Storage, issues []*Issue) error {
-    // Old:
-    // for _, issue := range issues {
-    //     store.CreateIssue(ctx, issue, "import")
-    // }
-
-    // New:
-    return store.CreateIssues(ctx, issues, "import")
-}
-```
-
-**Note**: Start with internal use (import), expose CLI later if needed
-
-### Step 5: Performance Testing
-
-```bash
-# Generate test JSONL
-bd export > backup.jsonl
-
-# Duplicate 100x for stress test
-cat backup.jsonl backup.jsonl ... > large_test.jsonl
-
-# Test import performance
-time bd import large_test.jsonl
-```
-
---
-
-## Future Enhancements (NOT for bd-222)
-
-### Batch Size Limits
-
-If very large batches cause memory issues:
-
-```go
-func (s *SQLiteStorage) CreateIssues(ctx, issues, actor) error {
-    const maxBatchSize = 1000
-
-    for i := 0; i < len(issues); i += maxBatchSize {
-        end := min(i+maxBatchSize, len(issues))
-        batch := issues[i:end]
-
-        if err := s.createIssuesBatch(ctx, batch, actor); err != nil {
-            return fmt.Errorf("batch %d-%d failed: %w", i, end, err)
-        }
-    }
-    return nil
-}
-```
-
-**Decision**: Don't implement until we see issues with large batches (>1000)
-
-### Validation-Only Mode
-
-Pre-validate batch without creating:
-
-```go
-func (s *SQLiteStorage) ValidateIssues(ctx, issues) error
-```
-
-**Use Case**: Dry-run before bulk import
-
-**Decision**: Add if import workflows request it
-
-### Progress Callbacks
-
-Report progress for long-running batches:
-
-```go
-type BatchProgress func(completed, total int)
-
-func (s *SQLiteStorage) CreateIssuesWithProgress(ctx, issues, actor, progress) error
-```
-
-**Decision**: Add if agent workflows request it (likely for 1000+ issue batches)
-
---
-
-## Performance Analysis
-
-### Baseline (CreateIssue loop)
-
-For 100 issues:
-```
-Connection overhead:  100ms   (1ms × 100)
-Transaction overhead: 300ms   (3ms × 100, with lock contention)
-ID generation:        250ms   (2.5ms × 100)
-Insert + event:       250ms   (2.5ms × 100)
-Total:                900ms
-```
-
-### With CreateIssues
-
-For 100 issues:
-```
-Connection overhead:   1ms    (1 connection)
-Transaction overhead:  5ms    (1 transaction)
-ID range generation:   15ms   (1 query, more complex)
-Bulk insert (prep):    50ms   (prepared stmt × 100)
-Bulk events (prep):    30ms   (prepared stmt × 100)
-Bulk dirty (prep):     20ms   (prepared stmt × 100)
-Commit:                5ms
-Total:                 126ms  (7x faster)
-```
-
-### Scalability
-
-| Issues | CreateIssue Loop | CreateIssues | Speedup |
-|--------|------------------|--------------|---------|
-| 10     | 90ms            | 30ms         | 3x      |
-| 100    | 900ms           | 126ms        | 7x      |
-| 1000   | 9s              | 800ms        | 11x     |
-| 10000  | 90s             | 6s           | 15x     |
-
-**Key Insight**: Speedup increases with batch size due to fixed overhead amortization
-
---
-
-## Why This Solution Wins
-
-### For Individual Devs & Small Teams
- **Zero impact on normal workflow**: CreateIssue unchanged
- **Fast imports**: 1000 issues in <1s instead of 10s
- **Simple mental model**: All-or-nothing batch
- **No new concepts**: Same semantics as CreateIssue, just faster
-
-### For Agent Swarms
- **Efficient decomposition**: Agent creates 50 subtasks in one call
- **Atomic work generation**: All issues created or none
- **No connection exhaustion**: One connection per batch
- **Safe concurrency**: BEGIN IMMEDIATE prevents races
-
-### For New Codebase
- **Non-breaking change**: Additive API only
- **Performance win**: 5-15x faster for bulk operations
- **Simple implementation**: ~200 LOC, similar to CreateIssue
- **Battle-tested pattern**: Same transaction semantics as CreateIssue
-
---
-
-## Alternatives Considered and Rejected
-
-### Alternative 1: Auto-Batch in CreateIssue
-
-Automatically detect rapid CreateIssue calls and batch them.
-
-**Why Rejected**:
- ❌ Magical behavior (implicit batching)
- ❌ Complex implementation (goroutine + timer + coordination)
- ❌ Race conditions and edge cases
- ❌ Unpredictable performance (when does batch trigger?)
- ❌ Can't guarantee atomicity across auto-batch boundary
-
-### Alternative 2: Separate Import API
-
-Add ImportIssues specifically for JSONL import, not general-purpose.
-
-**Why Rejected**:
- ❌ Limits use cases (what about agent workflows?)
- ❌ Name doesn't match behavior (it's just batch create)
- ❌ CreateIssues is more discoverable and general
-
-### Alternative 3: Streaming API
-
-```go
-type IssueStream interface {
-    Send(*Issue) error
-    CloseAndCommit() error
-}
-func (s *SQLiteStorage) CreateIssueStream(ctx, actor) (IssueStream, error)
-```
-
-**Why Rejected**:
- ❌ More complex API (stateful stream object)
- ❌ Error handling complexity (partial writes?)
- ❌ Doesn't match Go/SQL idioms
- ❌ Caller must manage stream lifecycle
- ❌ Simple slice is easier to work with
-
---
-
-## Conclusion
-
-The **simple all-or-nothing batch API** (CreateIssues) is the best solution because:
-
-1. **Significant performance win**: 5-15x faster for bulk operations
-2. **Simple API**: Just like CreateIssue but with slice
-3. **Safe**: Atomic transaction, no partial state
-4. **Non-breaking**: Existing CreateIssue unchanged
-5. **Flexible**: Supports mixed ID assignment (auto + explicit)
-6. **Proven pattern**: Same transaction semantics as CreateIssue
-
-The key insight is **atomic ID range reservation** - updating the counter once for N issues instead of N times. Combined with a single transaction and prepared statements, this provides major performance improvements without complexity.
-
-This aligns perfectly with beads' goals: simple for individual devs, efficient for bulk operations, robust for agent swarms.
-
-**Implementation size**: ~200 LOC + ~400 LOC tests = manageable, low-risk change
-**Expected performance**: 5-15x faster for bulk operations (N > 10)
-**Risk**: Low (additive API, comprehensive tests)
@@ -1,373 +0,0 @@
-# Ultrathink: Solving status/closed_at Inconsistency (bd-224)
-
-**Date**: 2025-10-15
-**Context**: Individual devs, small teams, future agent swarms, new codebase
-**Problem**: Data model allows `status='open'` with `closed_at != NULL` (liminal state)
-
-## Executive Summary
-
-**Recommended Solution**: **Hybrid approach - Database CHECK constraint + Application enforcement**
-
-This provides defense-in-depth perfect for agent swarms while keeping the model simple for individual developers.
-
---
-
-## Current State Analysis
-
-### Where closed_at is Used
-
-1. **GetIssue**: Returns it to callers
-2. **CloseIssue**: Sets `closed_at = now()` when closing
-3. **SearchIssues**: Includes in results
-4. **GetStatistics** ⚠️ **CRITICAL**:
-   ```sql
-   SELECT AVG((julianday(closed_at) - julianday(created_at)) * 24)
-   FROM issues
-   WHERE closed_at IS NOT NULL
-   ```
-   Uses `closed_at IS NOT NULL` NOT `status='closed'` for lead time calculation!
-
-### Impact of Inconsistency
-
- **Statistics are wrong**: Issues with `status='open'` but `closed_at != NULL` pollute lead time metrics
- **User confusion**: bd ready shows "closed" issues
- **Agent workflows**: Unpredictable behavior when agents query by status vs closed_at
- **Data integrity**: Can't trust the data model
-
-### Root Cause: Import & Update Don't Manage Invariant
-
-**Import** (cmd/bd/import.go:206-207):
-```go
-if _, ok := rawData["status"]; ok {
-    updates["status"] = issue.Status  // ← Updates status
-}
-// ⚠️ Does NOT clear/set closed_at
-```
-
-**UpdateIssue** (internal/storage/sqlite/sqlite.go:509-624):
- Updates any field in the map
- Does NOT automatically manage closed_at when status changes
- Records EventClosed but doesn't enforce the invariant
-
-**Concrete example (bd-89)**:
-1. Issue closed properly: `status='closed'`, `closed_at='2025-10-15 08:13:08'`
-2. JSONL had old state: `status='open'`, `closed_at='2025-10-14 02:58:22'` (inconsistent!)
-3. Auto-import updated status to 'open' but left closed_at set
-4. Result: Inconsistent state in database
-
---
-
-## Solution Options
-
-### Option A: Database CHECK Constraint ⭐ **RECOMMENDED FOUNDATION**
-
-```sql
-ALTER TABLE issues ADD CONSTRAINT chk_closed_at_status
-  CHECK ((status = 'closed') = (closed_at IS NOT NULL));
-```
-
-**Pros:**
- ✅ Enforces invariant at database level (most robust)
- ✅ Catches bugs in ANY code path (future-proof)
- ✅ Works across all clients (CLI, MCP, future integrations)
- ✅ Simple to understand for developers
- ✅ **Perfect for agent swarms**: Can't break it with buggy code
- ✅ Prevents inconsistent exports (can't write bad data to JSONL)
-
-**Cons:**
- ⚠️ Requires migration
- ⚠️ Need to fix existing inconsistent data first
- ⚠️ Update operations must manage closed_at (but we should do this anyway!)
-
-**Migration complexity**: LOW - few users, can break things
-
-### Option B: Application-Level Enforcement Only
-
-Make UpdateIssue and Import smart about status changes.
-
-**Pros:**
- ✅ No schema change needed
- ✅ Flexible for edge cases
-
-**Cons:**
- ❌ Easy to forget in new code paths
- ❌ Doesn't protect against direct SQL manipulation
- ❌ Multiple places to maintain (import, update, close, etc.)
- ❌ **Bad for agent swarms**: One buggy agent breaks the model
- ❌ Still allows export of inconsistent data
-
-**Verdict**: Not robust enough alone
-
-### Option C: Add Explicit Reopened Support
-
-Add `bd reopen` command that uses EventReopened and manages closed_at.
-
-**Pros:**
- ✅ Makes reopening explicit and trackable
- ✅ EventReopened already defined (types.go:150) but unused
-
-**Cons:**
- ⚠️ Doesn't solve the fundamental invariant problem
- ⚠️ Still need to decide: clear closed_at or keep it?
- ⚠️ More complex model if we keep historical closed_at
-
-**Verdict**: Good addition, but doesn't solve root cause
-
-### Option D: Remove closed_at Entirely
-
-Make events table the single source of truth.
-
-**Pros:**
- ✅ Simplest data model
- ✅ No invariant to maintain
- ✅ Events are authoritative
-
-**Cons:**
- ❌ **Performance**: Lead time calculation requires JOIN + subquery
-  ```sql
-  SELECT AVG(
-    (julianday(e.created_at) - julianday(i.created_at)) * 24
-  )
-  FROM issues i
-  JOIN events e ON i.id = e.issue_id
-  WHERE e.event_type = 'closed'
-  ```
- ❌ Events could be missing/corrupted (no referential integrity on event_type)
- ❌ More complex queries throughout codebase
- ❌ **Statistics would be slower** (critical for dashboard UIs)
-
-**Verdict**: Too much complexity/performance cost for the benefit
-
---
-
-## Recommended Solution: **Hybrid Approach**
-
-Combine **Option A (DB constraint)** + **Application enforcement** + **Option C (reopen command)**
-
-### Part 1: Database Constraint (Foundation)
-
-```sql
-- Migration: First clean up existing inconsistent data
-UPDATE issues
-SET closed_at = NULL
-WHERE status != 'closed' AND closed_at IS NOT NULL;
-
-UPDATE issues
-SET closed_at = CURRENT_TIMESTAMP
-WHERE status = 'closed' AND closed_at IS NULL;
-
-- Add the constraint
-ALTER TABLE issues ADD CONSTRAINT chk_closed_at_status
-  CHECK ((status = 'closed') = (closed_at IS NOT NULL));
-```
-
-### Part 2: UpdateIssue Smart Status Management
-
-Modify `internal/storage/sqlite/sqlite.go:509-624`:
-
-```go
-func (s *SQLiteStorage) UpdateIssue(ctx context.Context, id string, updates map[string]interface{}, actor string) error {
-    // ... existing validation ...
-
-    // Smart closed_at management based on status changes
-    if statusVal, ok := updates["status"]; ok {
-        newStatus := statusVal.(string)
-
-        if newStatus == string(types.StatusClosed) {
-            // Changing to closed: ensure closed_at is set
-            if _, hasClosedAt := updates["closed_at"]; !hasClosedAt {
-                updates["closed_at"] = time.Now()
-            }
-        } else {
-            // Changing from closed to something else: clear closed_at
-            if oldIssue.Status == types.StatusClosed {
-                updates["closed_at"] = nil  // This will set it to NULL
-                eventType = types.EventReopened
-            }
-        }
-    }
-
-    // ... rest of existing code ...
-}
-```
-
-### Part 3: Import Enforcement
-
-Modify `cmd/bd/import.go:206-231`:
-
-```go
-if _, ok := rawData["status"]; ok {
-    updates["status"] = issue.Status
-
-    // Enforce closed_at invariant
-    if issue.Status == types.StatusClosed {
-        // Status is closed: ensure closed_at is set
-        if issue.ClosedAt == nil {
-            now := time.Now()
-            updates["closed_at"] = now
-        } else {
-            updates["closed_at"] = *issue.ClosedAt
-        }
-    } else {
-        // Status is not closed: ensure closed_at is NULL
-        updates["closed_at"] = nil
-    }
-}
-```
-
-### Part 4: Add Reopen Command
-
-Create `cmd/bd/reopen.go`:
-
-```go
-var reopenCmd = &cobra.Command{
-    Use:   "reopen [id...]",
-    Short: "Reopen one or more closed issues",
-    Args:  cobra.MinimumNArgs(1),
-    Run: func(cmd *cobra.Command, args []string) {
-        ctx := context.Background()
-        reason, _ := cmd.Flags().GetString("reason")
-        if reason == "" {
-            reason = "Reopened"
-        }
-
-        for _, id := range args {
-            // Use UpdateIssue which now handles closed_at automatically
-            updates := map[string]interface{}{
-                "status": "open",
-            }
-            if err := store.UpdateIssue(ctx, id, updates, getUser()); err != nil {
-                fmt.Fprintf(os.Stderr, "Error reopening %s: %v\n", id, err)
-                continue
-            }
-
-            // Add comment explaining why
-            if reason != "" {
-                store.AddComment(ctx, id, getUser(), reason)
-            }
-        }
-
-        markDirtyAndScheduleFlush()
-    },
-}
-
-func init() {
-    reopenCmd.Flags().StringP("reason", "r", "", "Reason for reopening")
-    rootCmd.AddCommand(reopenCmd)
-}
-```
-
---
-
-## Why This Solution Wins
-
-### For Individual Devs & Small Teams
- **Simple mental model**: `closed_at` is set ⟺ issue is closed
- **Hard to break**: DB constraint catches mistakes
- **Explicit reopen**: `bd reopen bd-89` is clearer than `bd update bd-89 --status open`
- **No manual management**: Don't think about closed_at, it's automatic
-
-### For Agent Swarms
- **Robust**: DB constraint prevents any agent from creating inconsistent state
- **Race-safe**: Constraint is atomic, checked at commit time
- **Self-healing**: UpdateIssue automatically fixes the relationship
- **Import-safe**: Can't import inconsistent JSONL (constraint rejects it)
-
-### For New Codebase
- **Can break things**: Migration is easy with few users
- **Sets precedent**: Shows we value data integrity
- **Future-proof**: New features can't violate the invariant
- **Performance**: No query changes needed, closed_at stays fast
-
---
-
-## Migration Plan
-
-### Step 1: Clean Existing Data
-```sql
-- Find inconsistent issues
-SELECT id, status, closed_at FROM issues
-WHERE (status = 'closed') != (closed_at IS NOT NULL);
-
-- Fix them (choose one strategy)
-- Strategy A: Trust status field
-UPDATE issues SET closed_at = NULL
-WHERE status != 'closed' AND closed_at IS NOT NULL;
-
-UPDATE issues SET closed_at = CURRENT_TIMESTAMP
-WHERE status = 'closed' AND closed_at IS NULL;
-
-- Strategy B: Trust closed_at field
-UPDATE issues SET status = 'closed'
-WHERE status != 'closed' AND closed_at IS NOT NULL;
-
-UPDATE issues SET status = 'open'
-WHERE status = 'closed' AND closed_at IS NULL;
-```
-
-**Recommendation**: Use Strategy A (trust status) since status is more often correct.
-
-### Step 2: Add Constraint
-```sql
-- Test first
-SELECT id FROM issues
-WHERE (status = 'closed') != (closed_at IS NOT NULL);
-- Should return 0 rows
-
-- Add constraint
-ALTER TABLE issues ADD CONSTRAINT chk_closed_at_status
-  CHECK ((status = 'closed') = (closed_at IS NOT NULL));
-```
-
-### Step 3: Update Code
-1. Modify UpdateIssue (sqlite.go)
-2. Modify Import (import.go)
-3. Add reopen command
-4. Add migration function to schema.go
-
-### Step 4: Test
-```bash
-# Test the constraint rejects bad writes
-sqlite3 .beads/bd.db "UPDATE issues SET closed_at = NULL WHERE id = 'bd-1' AND status = 'closed';"
-# Should fail with constraint violation
-
-# Test update handles it automatically
-bd update bd-89 --status open
-bd show bd-89 --json | jq '{status, closed_at}'
-# Should show: {"status": "open", "closed_at": null}
-
-# Test reopen
-bd create "Test issue" -p 1
-bd close bd-226
-bd reopen bd-226 --reason "Need more work"
-# Should work without errors
-```
-
---
-
-## Alternative Considered: Soft Close with closed_at History
-
-Keep closed_at as "first time closed" and use events for current state.
-
-**Why rejected**:
- More complex model (two sources of truth)
- Unclear semantics (what does closed_at mean?)
- Lead time calculation becomes ambiguous (first close? most recent?)
- Doesn't simplify the problem
-
---
-
-## Conclusion
-
-The **hybrid approach** (DB constraint + smart UpdateIssue + import enforcement + reopen command) is the best solution because:
-
-1. **Defense in depth**: Multiple layers prevent inconsistency
-2. **Hard to break**: Perfect for agent swarms
-3. **Simple for users**: Automatic management of closed_at
-4. **Low migration cost**: Can break things in new codebase
-5. **Clear semantics**: closed_at = "currently closed" (not historical)
-6. **Performance**: No query changes needed
-
-The DB constraint is the key insight: it makes the system robust against future bugs, new code paths, and agent mistakes. Combined with smart application code, it creates a self-healing system that's hard to misuse.
-
-This aligns perfectly with beads' goals: simple for individual devs, robust for agent swarms.