beads/ULTRATHINK_BD222.md
Steve Yegge ab809c5baf Create issue structure for bd-222 and bd-224
- Added design documents (ULTRATHINK_BD222.md, ULTRATHINK_BD224.md)
- Created bd-224 epic with 6 child issues for status/closed_at invariant fix
- Created bd-222 epic with 7 child issues for batching API
- Set up dependencies: bd-224 blocks bd-222 (must fix invariant first)
- Dependencies enable max parallelism while ensuring correct order
2025-10-15 14:27:10 -07:00


Ultrathink: Batching API for Bulk Issue Creation (bd-222)

Date: 2025-10-15
Context: Individual devs, small teams, future agent swarms, bulk imports
Problem: CreateIssue acquires dedicated connection per call, inefficient for bulk operations

Executive Summary

Recommended Solution: Hybrid approach - Add CreateIssues + Keep existing CreateIssue unchanged

Provides high-performance batch path for bulk operations while maintaining simple single-issue API for typical use.


Dependencies & Implementation Order

Critical Dependency: bd-224 (status/closed_at invariant)

bd-224 MUST be implemented before bd-222

Why: Both issues modify the same code paths:

  • bd-224: Fixes import.go to enforce closed_at invariant (status='closed' ⟺ closed_at != NULL)
  • bd-222: Changes import.go to use CreateIssues instead of CreateIssue loop

The Problem: If we implement bd-222 first:

  1. CreateIssues won't enforce the closed_at invariant (inherits bug from CreateIssue)
  2. Import switches to use CreateIssues
  3. Import can still create inconsistent data (bd-224's bug persists)
  4. Later bd-224 fix requires modifying BOTH CreateIssue AND CreateIssues

The Solution: If we implement bd-224 first:

  1. Add CHECK constraint: (status = 'closed') = (closed_at IS NOT NULL)
  2. Fix UpdateIssue to manage closed_at automatically
  3. Fix import.go to enforce invariant before calling CreateIssue
  4. Then implement bd-222's CreateIssues with invariant already enforced:
    • Database constraint rejects bad data
    • Issue.Validate() checks the invariant (per bd-224)
    • Import code already normalizes before calling CreateIssues
    • No new code needed in CreateIssues - it's correct by construction!

Implementation Impact

CreateIssues picks up the closed_at invariant check (from bd-224) in its validation phase, with no invariant-specific code of its own:

// Phase 1: Validation
for i, issue := range issues {
    if err := issue.Validate(); err != nil {  // ← Validates invariant (bd-224)
        return fmt.Errorf("validation failed for issue %d: %w", i, err)
    }
}

After bd-224 is complete, Issue.Validate() will check:

func (i *Issue) Validate() error {
    // ... existing validation ...

    // Enforce closed_at invariant (bd-224)
    if i.Status == StatusClosed && i.ClosedAt == nil {
        return fmt.Errorf("closed issues must have closed_at timestamp")
    }
    if i.Status != StatusClosed && i.ClosedAt != nil {
        return fmt.Errorf("non-closed issues cannot have closed_at timestamp")
    }

    return nil
}

This means CreateIssues automatically enforces the invariant through validation, with the database CHECK constraint as final defense.

Import Code Simplification

Before bd-224 (current import.go):

for _, issue := range issues {
    // Complex logic to handle status/closed_at independently
    updates := make(map[string]interface{})
    if _, ok := rawData["status"]; ok {
        updates["status"] = issue.Status  // ← Doesn't manage closed_at
    }
    // ... more complex update logic
    store.CreateIssue(ctx, issue, "import")
}

After bd-224 (import.go enforces invariant):

for _, issue := range issues {
    // Normalize closed_at based on status BEFORE creating
    if issue.Status == types.StatusClosed {
        if issue.ClosedAt == nil {
            now := time.Now()
            issue.ClosedAt = &now
        }
    } else {
        issue.ClosedAt = nil  // ← Clear if not closed
    }
    store.CreateIssue(ctx, issue, "import")
}

After bd-222 (import.go uses batch):

// Normalize all issues
for _, issue := range issues {
    if issue.Status == types.StatusClosed {
        if issue.ClosedAt == nil {
            now := time.Now()
            issue.ClosedAt = &now
        }
    } else {
        issue.ClosedAt = nil
    }
}

// Single batch call (5-15x faster!)
store.CreateIssues(ctx, issues, "import")

Much simpler: normalize once, call batch API, database constraint enforces correctness.
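
The normalization loop above could be factored into a helper so that import.go and any other batch caller share one copy. This is a minimal sketch, not the project's code: Status, Issue, and normalizeClosedAt are pared-down, hypothetical stand-ins for the real types package.

```go
package main

import "time"

// Status and Issue are simplified stand-ins for the project's types package.
type Status string

const (
	StatusOpen   Status = "open"
	StatusClosed Status = "closed"
)

type Issue struct {
	ID       string
	Status   Status
	ClosedAt *time.Time
}

// normalizeClosedAt mutates issues in place so that each one satisfies
// status == closed <=> ClosedAt != nil.
func normalizeClosedAt(issues []*Issue) {
	for _, issue := range issues {
		if issue.Status == StatusClosed {
			if issue.ClosedAt == nil {
				now := time.Now()
				issue.ClosedAt = &now // closed without a timestamp: stamp it now
			}
		} else {
			issue.ClosedAt = nil // not closed: any timestamp is stale
		}
	}
}
```

An existing closed_at on a closed issue is left untouched, so imported history keeps its original timestamps.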

Recommended Implementation Order

  1. Implement bd-224 first (P1 bug fix)

    • Add database CHECK constraint
    • Add validation to Issue.Validate()
    • Fix UpdateIssue to auto-manage closed_at
    • Fix import.go to normalize closed_at before creating
  2. Then implement bd-222 (P2 performance enhancement)

    • Add CreateIssues method (inherits bd-224's validation)
    • Update import.go to use CreateIssues
    • Import code is simpler (no per-issue loop, just normalize + batch)
  3. Benefits of this order:

    • bd-224 fixes data integrity bug (higher priority)
    • bd-222 builds on correct foundation
    • No duplicate invariant enforcement code
    • Database constraint + validation = defense in depth
    • CreateIssues is correct by construction

Current State Analysis

How CreateIssue Works (sqlite.go:315-453)

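// Outline only: steps 3-7 are summarized as comments.
func (s *SQLiteStorage) CreateIssue(ctx context.Context, issue *types.Issue, actor string) error {
    // 1. Acquire dedicated connection
    conn, err := s.db.Conn(ctx)
    if err != nil {
        return fmt.Errorf("failed to acquire connection: %w", err)
    }
    defer conn.Close()

    // 2. BEGIN IMMEDIATE transaction (acquires write lock)
    if _, err := conn.ExecContext(ctx, "BEGIN IMMEDIATE"); err != nil {
        return fmt.Errorf("failed to begin immediate transaction: %w", err)
    }

    // 3. Generate ID atomically if needed
    //    - Query issue_counters
    //    - Update counter with MAX(existing, calculated) + 1

    // 4. Insert issue
    // 5. Record creation event
    // 6. Mark dirty for export
    // 7. COMMIT
    return nil
}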

Performance Characteristics

Single Issue Creation:

  • Connection acquisition: ~1ms
  • BEGIN IMMEDIATE: ~1-5ms (lock acquisition)
  • ID generation: ~2-3ms (subquery + update)
  • Insert + event + dirty: ~2-3ms
  • COMMIT: ~1-2ms
  • Total: ~7-14ms per issue

Bulk Creation (100 issues, sequential):

  • 100 connections: ~100ms
  • 100 transactions: ~100-500ms (lock contention!)
  • 100 ID generations: ~200-300ms
  • 100 inserts: ~200-300ms
  • Total: ~600ms-1.2s

With Batching (estimated):

  • 1 connection: ~1ms
  • 1 transaction: ~1-5ms
  • ID generation batch: ~10-20ms (one query for range)
  • Bulk insert: ~50-100ms (prepared stmt, multiple VALUES)
  • Total: ~60-130ms (5-10x faster)

When Does This Matter?

Low Impact (current approach is fine):

  • Interactive CLI use: bd create "Fix bug"
  • Individual agent creating 1-5 issues
  • Typical development workflow

High Impact (batching helps):

  • Bulk import from JSONL (10-1000+ issues)
  • Agent workflows generating issue decompositions (10-50 issues)
  • Migrating from other systems (100-10000+ issues)
  • Template instantiation (creating epic + subtasks)
  • Test data generation

Solution Options

Option A: Simple All-or-Nothing Batch

// CreateIssues creates multiple issues atomically in a single transaction
func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error

Semantics:

  • All issues created, or none created (atomicity)
  • Single transaction, single connection
  • Returns error if ANY issue fails validation or insertion
  • IDs generated atomically as a range

Pros:

  • Simple mental model (atomic batch)
  • Clear error handling (one error = whole batch fails)
  • Matches database transaction semantics
  • Easy to implement (similar to CreateIssue)
  • No partial state in database
  • Safe for concurrent access (IMMEDIATE transaction)
  • 5-10x faster for bulk operations

Cons:

  • ⚠️ If one issue is invalid, whole batch fails
  • ⚠️ Caller must retry entire batch on error
  • ⚠️ Only the first failing issue is reported (in the error message); the rest of the batch goes unchecked

Mitigation: Add validation-only mode to pre-check batch

Verdict: Best for most use cases (import, migrations, agent workflows)
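
The all-or-nothing semantics can be illustrated without a database. The sketch below is a hypothetical in-memory store (memStore, newMemStore, and the one-rule Validate are invented for illustration, and ctx is dropped for brevity); it runs every check before the first write, so a failure leaves the store untouched.

```go
package main

import (
	"errors"
	"fmt"
)

type Issue struct {
	ID    string
	Title string
}

// Validate is a minimal stand-in for the real validation rules.
func (i *Issue) Validate() error {
	if i.Title == "" {
		return errors.New("title is required")
	}
	return nil
}

// memStore is a hypothetical in-memory store, e.g. for tests.
type memStore struct {
	issues map[string]*Issue
}

func newMemStore() *memStore { return &memStore{issues: make(map[string]*Issue)} }

// CreateIssues adds all issues or none: every check runs before the first write.
func (m *memStore) CreateIssues(issues []*Issue, actor string) error {
	batch := make(map[string]bool, len(issues))
	for i, issue := range issues {
		if err := issue.Validate(); err != nil {
			return fmt.Errorf("validation failed for issue %d: %w", i, err)
		}
		if _, exists := m.issues[issue.ID]; exists || batch[issue.ID] {
			return fmt.Errorf("duplicate ID %q at index %d", issue.ID, i)
		}
		batch[issue.ID] = true
	}
	// All checks passed: apply the writes.
	for _, issue := range issues {
		m.issues[issue.ID] = issue
	}
	return nil
}
```

The SQLite version gets the same guarantee from ROLLBACK rather than from check-then-write ordering, but the caller-visible contract is identical.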

Option B: Partial Success with Error Details

type CreateResult struct {
    ID      string
    Error   error
}

func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) ([]CreateResult, error)

Semantics:

  • Best-effort creation
  • Returns results for each issue (ID or error)
  • Transaction commits even if some issues fail
  • Complex rollback semantics

Pros:

  • Caller knows exactly which issues failed
  • Partial progress on errors
  • Good for unreliable input data

Cons:

  • Complex transaction semantics: Which failures abort transaction?
  • Partial state in database: Caller must track what succeeded
  • ID generation complexity: Skip failed issues in counter?
  • Dirty tracking complexity: Which issues to mark dirty?
  • Event recording: Record events for succeeded issues only?
  • More complex API for common case
  • Caller must handle partial state

Verdict: Too complex, doesn't match database atomicity model

Option C: Batch with Configurable Strategy

type BatchOptions struct {
    FailFast        bool  // Stop on first error (default)
    ContinueOnError bool  // Best effort
    ValidateOnly    bool  // Dry run
}

func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string, opts BatchOptions) ([]CreateResult, error)

Pros:

  • Flexible for different use cases
  • Can support both atomic and partial modes

Cons:

  • Too much complexity for the benefit
  • Multiple code paths = more bugs
  • Unclear which mode to use when
  • Doesn't solve the core problem (connection overhead)

Verdict: Over-engineered for current needs

Option D: Internal Optimization Only (No API Change)

Optimize CreateIssue internally to batch operations without changing API.

Approach:

  • Connection pooling improvements
  • Prepared statement caching
  • WAL optimization

Pros:

  • No API changes
  • Benefits all callers automatically

Cons:

  • Can't eliminate transaction overhead (still N transactions)
  • Can't eliminate ID generation overhead (still N counter updates)
  • Limited improvement (maybe 20-30% faster, not 5-10x)
  • Doesn't address root cause

Verdict: Good to do anyway, but doesn't solve the problem


API Design

// CreateIssues creates multiple issues atomically in a single transaction.
// All issues are created or none are created. Returns error if any issue
// fails validation or insertion.
//
// Performance: ~10x faster than calling CreateIssue in a loop for large batches.
// Use this for bulk imports, migrations, or agent workflows creating many issues.
//
// Issues with empty IDs will have IDs generated atomically. Issues with
// explicit IDs are used as-is (caller responsible for avoiding collisions).
func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error

Implementation Strategy

Phase 1: Validation

// Validate all issues first (fail-fast)
for i, issue := range issues {
    if err := issue.Validate(); err != nil {
        return fmt.Errorf("validation failed for issue %d: %w", i, err)
    }
}

Phase 2: Connection & Transaction

// Acquire dedicated connection (same as CreateIssue)
conn, err := s.db.Conn(ctx)
if err != nil {
    return fmt.Errorf("failed to acquire connection: %w", err)
}
defer conn.Close()

// BEGIN IMMEDIATE (same as CreateIssue)
if _, err := conn.ExecContext(ctx, "BEGIN IMMEDIATE"); err != nil {
    return fmt.Errorf("failed to begin immediate transaction: %w", err)
}

committed := false
defer func() {
    if !committed {
        // Best-effort rollback on a fresh context, in case ctx is already cancelled
        _, _ = conn.ExecContext(context.Background(), "ROLLBACK")
    }
}()
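
The committed-flag guard above can be exercised without SQLite. In this sketch, execer and recordingConn are hypothetical stand-ins for the slice of *sql.Conn that the pattern uses, and demoTx exists only to show which statements get issued on each path.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// execer is a hypothetical one-method stand-in for the part of *sql.Conn used here.
type execer interface {
	Exec(ctx context.Context, stmt string) error
}

// recordingConn is a fake connection that records the statements it receives.
type recordingConn struct{ stmts []string }

func (c *recordingConn) Exec(_ context.Context, stmt string) error {
	c.stmts = append(c.stmts, stmt)
	return nil
}

// inImmediateTx runs work between BEGIN IMMEDIATE and COMMIT, and rolls back
// on any early return via the committed-flag guard.
func inImmediateTx(ctx context.Context, conn execer, work func() error) error {
	if err := conn.Exec(ctx, "BEGIN IMMEDIATE"); err != nil {
		return fmt.Errorf("failed to begin immediate transaction: %w", err)
	}
	committed := false
	defer func() {
		if !committed {
			// Fresh context so the rollback still runs if ctx was cancelled.
			_ = conn.Exec(context.Background(), "ROLLBACK")
		}
	}()

	if err := work(); err != nil {
		return err
	}
	if err := conn.Exec(ctx, "COMMIT"); err != nil {
		return fmt.Errorf("failed to commit transaction: %w", err)
	}
	committed = true
	return nil
}

// demoTx returns the statements issued; fail makes the work function error out.
func demoTx(fail bool) []string {
	conn := &recordingConn{}
	_ = inImmediateTx(context.Background(), conn, func() error {
		if fail {
			return errors.New("insert failed")
		}
		return nil
	})
	return conn.stmts
}
```

Every error return in Phases 3-6 relies on this guard, which is why those phases can simply return the error without cleaning up.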

Phase 3: Batch ID Generation

Key Insight: Generate ID range atomically, then assign sequentially

// Count how many issues need IDs
needIDCount := 0
for _, issue := range issues {
    if issue.ID == "" {
        needIDCount++
    }
}

// Generate ID range atomically (if needed)
var nextID int
var prefix string
if needIDCount > 0 {
    // Get prefix from config
    err := conn.QueryRowContext(ctx,
        `SELECT value FROM config WHERE key = ?`,
        "issue_prefix").Scan(&prefix)
    if err != nil && err != sql.ErrNoRows {
        return fmt.Errorf("failed to get config: %w", err)
    }
    if prefix == "" {
        prefix = "bd" // no row or empty value: fall back to the default prefix
    }

    // Atomically reserve needIDCount IDs. RETURNING yields the LAST ID of the
    // reserved range, so the range is [nextID-needIDCount+1, nextID].
    // This is the KEY optimization - one counter update instead of N
    err = conn.QueryRowContext(ctx, `
        INSERT INTO issue_counters (prefix, last_id)
        SELECT ?, COALESCE(MAX(CAST(substr(id, LENGTH(?) + 2) AS INTEGER)), 0) + ?
        FROM issues
        WHERE id LIKE ? || '-%'
          AND substr(id, LENGTH(?) + 2) GLOB '[0-9]*'
        ON CONFLICT(prefix) DO UPDATE SET
            last_id = MAX(
                last_id,
                (SELECT COALESCE(MAX(CAST(substr(id, LENGTH(?) + 2) AS INTEGER)), 0)
                 FROM issues
                 WHERE id LIKE ? || '-%'
                   AND substr(id, LENGTH(?) + 2) GLOB '[0-9]*')
            ) + ?
        RETURNING last_id
    `, prefix, prefix, needIDCount, prefix, prefix, prefix, prefix, prefix, needIDCount).Scan(&nextID)
    if err != nil {
        return fmt.Errorf("failed to generate ID range: %w", err)
    }

    // Assign IDs sequentially
    currentID := nextID - needIDCount + 1
    for i := range issues {
        if issues[i].ID == "" {
            issues[i].ID = fmt.Sprintf("%s-%d", prefix, currentID)
            currentID++
        }
    }
}
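
The assignment step is easy to get off by one, since the counter query returns the last ID of the reserved range rather than the first. Below is a database-free sketch of just that step; assignIDs and the trimmed Issue type are hypothetical, and lastID plays the role of the value scanned from RETURNING last_id.

```go
package main

import "fmt"

type Issue struct {
	ID    string
	Title string
}

// assignIDs fills in empty IDs from a reserved range whose final value is
// lastID. It returns the number of IDs it assigned.
func assignIDs(issues []*Issue, prefix string, lastID int) int {
	need := 0
	for _, issue := range issues {
		if issue.ID == "" {
			need++
		}
	}
	next := lastID - need + 1 // first ID in the reserved range
	for _, issue := range issues {
		if issue.ID == "" {
			issue.ID = fmt.Sprintf("%s-%d", prefix, next)
			next++
		}
	}
	return need
}
```

With a counter that has just been advanced to 42 for two new issues, the empty slots become bd-41 and bd-42, and issues that arrived with explicit IDs are left alone.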

Phase 4: Bulk Insert Issues

Two approaches:

Approach A: Prepared Statement + Loop (simpler, still fast)

stmt, err := conn.PrepareContext(ctx, `
    INSERT INTO issues (
        id, title, description, design, acceptance_criteria, notes,
        status, priority, issue_type, assignee, estimated_minutes,
        created_at, updated_at, closed_at, external_ref
    ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`)
if err != nil {
    return fmt.Errorf("failed to prepare statement: %w", err)
}
defer stmt.Close()

now := time.Now()
for _, issue := range issues {
    issue.CreatedAt = now
    issue.UpdatedAt = now

    _, err = stmt.ExecContext(ctx,
        issue.ID, issue.Title, issue.Description, issue.Design,
        issue.AcceptanceCriteria, issue.Notes, issue.Status,
        issue.Priority, issue.IssueType, issue.Assignee,
        issue.EstimatedMinutes, issue.CreatedAt, issue.UpdatedAt,
        issue.ClosedAt, issue.ExternalRef,
    )
    if err != nil {
        return fmt.Errorf("failed to insert issue %s: %w", issue.ID, err)
    }
}

Approach B: Multi-VALUE INSERT (fastest, more complex)

// Build multi-value INSERT
// INSERT INTO issues VALUES (...), (...), (...)
// More complex string building but ~2x faster for large batches
// Defer to performance testing phase

Decision: Start with Approach A (prepared statement), optimize to Approach B if benchmarks show need
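
If benchmarks do justify Approach B, the string building is small. The sketch below uses hypothetical helpers (buildMultiInsert, maxRowsPerStatement) and assumes trusted table and column names. SQLite caps bound parameters per statement (SQLITE_MAX_VARIABLE_NUMBER, whose default depends on the SQLite version and build), so large batches would need to be split into several statements.

```go
package main

import "strings"

// buildMultiInsert returns an INSERT with one placeholder group per row, e.g.
// INSERT INTO t (a, b) VALUES (?, ?), (?, ?).
// table and columns must be trusted identifiers; they are not escaped.
// rows must be at least 1 for the result to be valid SQL.
func buildMultiInsert(table string, columns []string, rows int) string {
	group := "(" + strings.TrimSuffix(strings.Repeat("?, ", len(columns)), ", ") + ")"
	groups := make([]string, rows)
	for i := range groups {
		groups[i] = group
	}
	return "INSERT INTO " + table + " (" + strings.Join(columns, ", ") + ") VALUES " + strings.Join(groups, ", ")
}

// maxRowsPerStatement caps the row count so that rows*columns stays within
// the bound-parameter limit of the SQLite build in use.
func maxRowsPerStatement(columns, maxParams int) int {
	if columns <= 0 {
		return 0
	}
	return maxParams / columns
}
```

With the 15-column issues insert and a limit of 999 parameters, that works out to 66 rows per statement.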

Phase 5: Bulk Record Events

// Prepare event statement
eventStmt, err := conn.PrepareContext(ctx, `
    INSERT INTO events (issue_id, event_type, actor, new_value)
    VALUES (?, ?, ?, ?)
`)
if err != nil {
    return fmt.Errorf("failed to prepare event statement: %w", err)
}
defer eventStmt.Close()

for _, issue := range issues {
    eventData, err := json.Marshal(issue)
    if err != nil {
        eventData = []byte(fmt.Sprintf(`{"id":"%s","title":"%s"}`, issue.ID, issue.Title))
    }

    _, err = eventStmt.ExecContext(ctx, issue.ID, types.EventCreated, actor, string(eventData))
    if err != nil {
        return fmt.Errorf("failed to record event for %s: %w", issue.ID, err)
    }
}

Phase 6: Bulk Mark Dirty

// Bulk insert dirty markers
dirtyStmt, err := conn.PrepareContext(ctx, `
    INSERT INTO dirty_issues (issue_id, marked_at)
    VALUES (?, ?)
    ON CONFLICT (issue_id) DO UPDATE SET marked_at = excluded.marked_at
`)
if err != nil {
    return fmt.Errorf("failed to prepare dirty statement: %w", err)
}
defer dirtyStmt.Close()

dirtyTime := time.Now()
for _, issue := range issues {
    _, err = dirtyStmt.ExecContext(ctx, issue.ID, dirtyTime)
    if err != nil {
        return fmt.Errorf("failed to mark dirty %s: %w", issue.ID, err)
    }
}

Phase 7: Commit

if _, err := conn.ExecContext(ctx, "COMMIT"); err != nil {
    return fmt.Errorf("failed to commit transaction: %w", err)
}
committed = true
return nil

Design Decisions & Tradeoffs

Decision 1: All-or-Nothing Atomicity

Rationale: Matches database transaction semantics, simpler mental model

Tradeoff: Batch fails if ANY issue is invalid

  • Mitigation: Pre-validate all issues before starting transaction
  • Alternative: Caller can retry with smaller batches or individual issues

Decision 2: Same Transaction Semantics as CreateIssue

Use BEGIN IMMEDIATE, not DEFERRED or EXCLUSIVE

Rationale:

  • Consistency with existing CreateIssue
  • Prevents race conditions in ID generation
  • Serializes batch operations (which is fine - they're rare)

Tradeoff: Batches serialize (only one concurrent batch writer)

  • Impact: Low - batch operations are rare (import, migration)
  • Benefit: Simple, correct, no race conditions

Decision 3: Atomic ID Range Reservation

Generate range [nextID, nextID+N) in single counter update

Rationale: KEY optimization - avoids N counter updates

Implementation:

-- Old approach (CreateIssue): N updates
UPDATE issue_counters SET last_id = last_id + 1 RETURNING last_id;  -- N times

-- New approach (CreateIssues): 1 update
UPDATE issue_counters SET last_id = last_id + N RETURNING last_id;  -- Once

Correctness: Safe because BEGIN IMMEDIATE serializes batches
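
The correctness argument can be modeled in memory. In this sketch a mutex stands in for the write lock that BEGIN IMMEDIATE takes, and counter is a hypothetical stand-in for the issue_counters row; it is a model of the reasoning, not the SQLite code path.

```go
package main

import "sync"

// counter models one issue_counters row. The mutex plays the role of the
// BEGIN IMMEDIATE write lock by serializing writers.
type counter struct {
	mu     sync.Mutex
	lastID int
}

// reserve claims n consecutive IDs and returns the first and last of them.
func (c *counter) reserve(n int) (first, last int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	first = c.lastID + 1
	c.lastID += n
	return first, c.lastID
}

// reserveConcurrently has `writers` goroutines each reserve n IDs and reports
// whether every ID in 1..writers*n was handed out exactly once.
func reserveConcurrently(writers, n int) bool {
	var c counter
	var wg sync.WaitGroup
	ranges := make([][2]int, writers)
	for w := 0; w < writers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			first, last := c.reserve(n)
			ranges[w] = [2]int{first, last} // each goroutine writes its own slot
		}(w)
	}
	wg.Wait()

	seen := make(map[int]bool)
	for _, r := range ranges {
		for id := r[0]; id <= r[1]; id++ {
			if seen[id] {
				return false // overlap between two reserved ranges
			}
			seen[id] = true
		}
	}
	return len(seen) == writers*n
}
```

Because the read of last_id and the write of last_id + N happen under one lock, no two writers can observe the same starting point, which is the property the concurrency tests need to confirm against the real database.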

Decision 4: Support Mixed ID Assignment

Some issues can have explicit IDs, others auto-generated

Use Case: Import with some external IDs, some new issues

issues := []*Issue{
    {ID: "ext-123", Title: "External issue"},  // Keep ID
    {ID: "", Title: "New issue"},               // Generate ID
    {ID: "bd-999", Title: "Explicit ID"},      // Keep ID
}

Rationale: Flexible for import scenarios

Complexity: Low - just count issues needing IDs

Decision 5: Prepared Statements Over Multi-VALUE INSERT

Start with prepared statement loop, optimize later if needed

Rationale:

  • Simpler implementation
  • Still much faster than N transactions (5-10x)
  • Multi-VALUE INSERT only ~2x faster than prepared stmt
  • Can optimize later if profiling shows need

Decision 6: Keep CreateIssue Unchanged

Don't modify existing CreateIssue implementation

Rationale:

  • Backward compatibility
  • No risk to existing callers
  • Additive change only
  • Different use cases (single vs batch)

When to Use Which API

Use CreateIssue (existing)

  • Interactive CLI: bd create "Title"
  • Single issue creation
  • Agent creating 1-3 issues
  • When simplicity matters
  • When you want per-issue error handling

Use CreateIssues (new)

  • Bulk import from JSONL (10-1000+ issues)
  • Migration from other systems (100-10000+ issues)
  • Agent decomposing work into 10-50 issues
  • Template instantiation (epic + subtasks)
  • Test data generation
  • When performance matters

Rule of Thumb: Use CreateIssues for N > 5 issues


Implementation Checklist

Phase 1: Core Implementation

  • Add CreateIssues to Storage interface (storage/storage.go)
  • Implement SQLiteStorage.CreateIssues (storage/sqlite/sqlite.go)
  • Add comprehensive unit tests
  • Add concurrency tests (multiple batch writers)
  • Add performance benchmarks

Phase 2: CLI Integration

  • Add bd create-batch command (or internal use only?)
  • Update import.go to use CreateIssues for bulk imports
  • Test with real JSONL imports

Phase 3: Documentation

  • Document CreateIssues API (godoc)
  • Add batch import example
  • Update EXTENDING.md with batch usage
  • Performance notes in README

Phase 4: Optimization (if needed)

  • Profile CreateIssues with 100, 1000, 10000 issues
  • Optimize to multi-VALUE INSERT if needed
  • Consider batch size limits (split large batches)

Testing Strategy

Unit Tests

func TestCreateIssues_Empty(t *testing.T)
func TestCreateIssues_Single(t *testing.T)
func TestCreateIssues_Multiple(t *testing.T)
func TestCreateIssues_WithExplicitIDs(t *testing.T)
func TestCreateIssues_MixedIDs(t *testing.T)
func TestCreateIssues_ValidationError(t *testing.T)
func TestCreateIssues_DuplicateID(t *testing.T)
func TestCreateIssues_RollbackOnError(t *testing.T)

Concurrency Tests

func TestCreateIssues_Concurrent(t *testing.T) {
    // 10 goroutines each creating 100 issues
    // Verify no ID collisions
    // Verify all issues created
}

func TestCreateIssues_MixedWithCreateIssue(t *testing.T) {
    // Concurrent CreateIssue + CreateIssues
    // Verify no ID collisions
}

Performance Benchmarks

func BenchmarkCreateIssue_Sequential(b *testing.B)
func BenchmarkCreateIssues_Batch(b *testing.B)

// Expected results (100 issues):
// CreateIssue x100:  ~600-1200ms
// CreateIssues:      ~60-130ms
// Speedup:           5-10x

Integration Tests

func TestImport_LargeJSONL(t *testing.T) {
    // Import 1000 issues from JSONL
    // Verify all created correctly
    // Verify performance < 1s
}

Migration Plan

Step 1: Add Interface Method (Non-Breaking)

// storage/storage.go
type Storage interface {
    CreateIssue(ctx context.Context, issue *types.Issue, actor string) error
    CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error  // NEW
    // ... rest unchanged
}

Step 2: Implement SQLiteStorage.CreateIssues

Follow implementation strategy above

Step 3: Add Tests

Comprehensive unit + concurrency + benchmark tests

Step 4: Update Import (Optional)

// cmd/bd/import.go - replace loop with batch
func importIssues(ctx context.Context, store Storage, issues []*Issue) error {
    // Old:
    // for _, issue := range issues {
    //     store.CreateIssue(ctx, issue, "import")
    // }

    // New:
    return store.CreateIssues(ctx, issues, "import")
}

Note: Start with internal use (import), expose CLI later if needed

Step 5: Performance Testing

# Generate test JSONL
bd export > backup.jsonl

# Duplicate 100x for stress test
for i in $(seq 100); do cat backup.jsonl; done > large_test.jsonl

# Test import performance
time bd import large_test.jsonl

Future Enhancements (NOT for bd-222)

Batch Size Limits

If very large batches cause memory issues:

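func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error {
    const maxBatchSize = 1000

    // Each chunk commits in its own transaction, so a failure in a later
    // chunk leaves earlier chunks in place (atomic per chunk, not overall).
    for i := 0; i < len(issues); i += maxBatchSize {
        end := min(i+maxBatchSize, len(issues)) // built-in min requires Go 1.21+
        batch := issues[i:end]

        if err := s.createIssuesBatch(ctx, batch, actor); err != nil {
            return fmt.Errorf("batch %d-%d failed: %w", i, end, err)
        }
    }
    return nil
}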

Decision: Don't implement until we see issues with large batches (>1000)

Validation-Only Mode

Pre-validate batch without creating:

func (s *SQLiteStorage) ValidateIssues(ctx context.Context, issues []*types.Issue) error

Use Case: Dry-run before bulk import

Decision: Add if import workflows request it
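
One way a validation-only pass could add value over the fail-fast check in Phase 1 is to report every bad issue at once. This is a sketch under that assumption: validateIssues and the one-rule Validate are hypothetical, and errors.Join requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

type Issue struct {
	ID    string
	Title string
}

// Validate is a minimal stand-in for the real validation rules.
func (i *Issue) Validate() error {
	if i.Title == "" {
		return errors.New("title is required")
	}
	return nil
}

// validateIssues checks every issue and returns all failures joined into one
// error (nil if the batch is clean) instead of stopping at the first.
func validateIssues(issues []*Issue) error {
	var errs []error
	for i, issue := range issues {
		if err := issue.Validate(); err != nil {
			errs = append(errs, fmt.Errorf("issue %d: %w", i, err))
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}
```

A dry run over an import file could then print one line per offending record, letting the user fix the input in a single pass.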

Progress Callbacks

Report progress for long-running batches:

type BatchProgress func(completed, total int)

func (s *SQLiteStorage) CreateIssuesWithProgress(ctx context.Context, issues []*types.Issue, actor string, progress BatchProgress) error

Decision: Add if agent workflows request it (likely for 1000+ issue batches)
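
A progress callback only has something to report if the work is split into steps, so one plausible shape combines it with chunking. The sketch below is an assumption about how that might look: createInChunks is hypothetical, and its create parameter stands in for a call such as store.CreateIssues.

```go
package main

type Issue struct{ ID string }

// BatchProgress matches the callback type proposed above.
type BatchProgress func(completed, total int)

// createInChunks feeds issues to create in slices of at most chunkSize and
// reports progress after each successful chunk. Each chunk would commit
// independently, so atomicity holds per chunk rather than for the whole input.
func createInChunks(issues []*Issue, chunkSize int, create func([]*Issue) error, progress BatchProgress) error {
	if chunkSize <= 0 {
		chunkSize = len(issues) // treat a non-positive size as "one chunk"
	}
	for start := 0; start < len(issues); start += chunkSize {
		end := start + chunkSize
		if end > len(issues) {
			end = len(issues)
		}
		if err := create(issues[start:end]); err != nil {
			return err
		}
		if progress != nil {
			progress(end, len(issues))
		}
	}
	return nil
}
```

The trade-off is explicit here: progress reporting of this kind gives up the single-transaction guarantee, which may be one reason to keep it out of bd-222.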


Performance Analysis

Baseline (CreateIssue loop)

For 100 issues:

Connection overhead:  100ms   (1ms × 100)
Transaction overhead: 300ms   (3ms × 100, with lock contention)
ID generation:        250ms   (2.5ms × 100)
Insert + event:       250ms   (2.5ms × 100)
Total:                900ms

With CreateIssues

For 100 issues:

Connection overhead:   1ms    (1 connection)
Transaction overhead:  5ms    (1 transaction)
ID range generation:   15ms   (1 query, more complex)
Bulk insert (prep):    50ms   (prepared stmt × 100)
Bulk events (prep):    30ms   (prepared stmt × 100)
Bulk dirty (prep):     20ms   (prepared stmt × 100)
Commit:                5ms
Total:                 126ms  (7x faster)

Scalability

Issues   CreateIssue Loop   CreateIssues   Speedup
10       90ms               30ms           3x
100      900ms              126ms          7x
1000     9s                 800ms          11x
10000    90s                6s             15x

Key Insight: Speedup increases with batch size due to fixed overhead amortization
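
The amortization effect can be seen in a simple linear cost model built from the 100-issue breakdowns above. The constants are the document's estimates, not measurements, and the function names are hypothetical.

```go
package main

// Estimated costs in milliseconds, taken from the 100-issue breakdowns.
const (
	loopPerIssueMs  = 9.0  // 1 + 3 + 2.5 + 2.5 per CreateIssue call
	batchFixedMs    = 26.0 // connection 1 + transaction 5 + ID range 15 + commit 5
	batchPerIssueMs = 1.0  // (50 + 30 + 20) / 100 for insert, event, dirty marker
)

// loopMs estimates N sequential CreateIssue calls.
func loopMs(n int) float64 { return loopPerIssueMs * float64(n) }

// batchMs estimates one CreateIssues call: fixed overhead plus per-issue work.
func batchMs(n int) float64 { return batchFixedMs + batchPerIssueMs*float64(n) }

// speedup is the ratio of the two estimates.
func speedup(n int) float64 { return loopMs(n) / batchMs(n) }
```

This model reproduces the 100-issue row (900ms vs 126ms, about 7x) but levels off near 9x for very large N. The 11x and 15x rows therefore assume that per-issue cost inside a batch also falls as N grows, which benchmarks would need to confirm.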


Why This Solution Wins

For Individual Devs & Small Teams

  • Zero impact on normal workflow: CreateIssue unchanged
  • Fast imports: 1000 issues in under 1s instead of ~9s (estimated)
  • Simple mental model: All-or-nothing batch
  • No new concepts: Same semantics as CreateIssue, just faster

For Agent Swarms

  • Efficient decomposition: Agent creates 50 subtasks in one call
  • Atomic work generation: All issues created or none
  • No connection exhaustion: One connection per batch
  • Safe concurrency: BEGIN IMMEDIATE prevents races

For New Codebase

  • Non-breaking change: Additive API only
  • Performance win: 5-15x faster for bulk operations
  • Simple implementation: ~200 LOC, similar to CreateIssue
  • Battle-tested pattern: Same transaction semantics as CreateIssue

Alternatives Considered and Rejected

Alternative 1: Auto-Batch in CreateIssue

Automatically detect rapid CreateIssue calls and batch them.

Why Rejected:

  • Magical behavior (implicit batching)
  • Complex implementation (goroutine + timer + coordination)
  • Race conditions and edge cases
  • Unpredictable performance (when does batch trigger?)
  • Can't guarantee atomicity across auto-batch boundary

Alternative 2: Separate Import API

Add ImportIssues specifically for JSONL import, not general-purpose.

Why Rejected:

  • Limits use cases (what about agent workflows?)
  • Name doesn't match behavior (it's just batch create)
  • CreateIssues is more discoverable and general

Alternative 3: Streaming API

type IssueStream interface {
    Send(*Issue) error
    CloseAndCommit() error
}
func (s *SQLiteStorage) CreateIssueStream(ctx context.Context, actor string) (IssueStream, error)

Why Rejected:

  • More complex API (stateful stream object)
  • Error handling complexity (partial writes?)
  • Doesn't match Go/SQL idioms
  • Caller must manage stream lifecycle
  • Simple slice is easier to work with

Conclusion

The simple all-or-nothing batch API (CreateIssues) is the best solution because:

  1. Significant performance win: 5-15x faster for bulk operations
  2. Simple API: Just like CreateIssue but with slice
  3. Safe: Atomic transaction, no partial state
  4. Non-breaking: Existing CreateIssue unchanged
  5. Flexible: Supports mixed ID assignment (auto + explicit)
  6. Proven pattern: Same transaction semantics as CreateIssue

The key insight is atomic ID range reservation - updating the counter once for N issues instead of N times. Combined with a single transaction and prepared statements, this provides major performance improvements without complexity.

This aligns perfectly with beads' goals: simple for individual devs, efficient for bulk operations, robust for agent swarms.

Implementation size: ~200 LOC + ~400 LOC tests = manageable, low-risk change
Expected performance: 5-15x faster for bulk operations (N > 10)
Risk: Low (additive API, comprehensive tests)