removed obsolete docs
This commit is contained in:
@@ -1,952 +0,0 @@
|
||||
# Ultrathink: Batching API for Bulk Issue Creation (bd-222)
|
||||
|
||||
**Date**: 2025-10-15
|
||||
**Context**: Individual devs, small teams, future agent swarms, bulk imports
|
||||
**Problem**: CreateIssue acquires dedicated connection per call, inefficient for bulk operations
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Recommended Solution**: **Hybrid approach - Add CreateIssues + Keep existing CreateIssue unchanged**
|
||||
|
||||
Provides high-performance batch path for bulk operations while maintaining simple single-issue API for typical use.
|
||||
|
||||
---
|
||||
|
||||
## Dependencies & Implementation Order
|
||||
|
||||
### Critical Dependency: bd-224 (status/closed_at invariant)
|
||||
|
||||
**bd-224 MUST be implemented before bd-222**
|
||||
|
||||
**Why**: Both issues modify the same code paths:
|
||||
- bd-224: Fixes `import.go` to enforce `closed_at` invariant (status='closed' ⟺ closed_at != NULL)
|
||||
- bd-222: Changes `import.go` to use `CreateIssues` instead of `CreateIssue` loop
|
||||
|
||||
**The Problem**:
|
||||
If we implement bd-222 first:
|
||||
1. `CreateIssues` won't enforce the closed_at invariant (inherits bug from CreateIssue)
|
||||
2. Import switches to use `CreateIssues`
|
||||
3. Import can still create inconsistent data (bd-224's bug persists)
|
||||
4. Later bd-224 fix requires modifying BOTH CreateIssue AND CreateIssues
|
||||
|
||||
**The Solution**:
|
||||
If we implement bd-224 first:
|
||||
1. Add CHECK constraint: `(status = 'closed') = (closed_at IS NOT NULL)`
|
||||
2. Fix `UpdateIssue` to manage closed_at automatically
|
||||
3. Fix `import.go` to enforce invariant before calling CreateIssue
|
||||
4. **Then** implement bd-222's `CreateIssues` with invariant already enforced:
|
||||
- Database constraint rejects bad data
|
||||
- Issue.Validate() checks the invariant (per bd-224)
|
||||
- Import code already normalizes before calling CreateIssues
|
||||
- No new code needed in CreateIssues - it's correct by construction!
|
||||
|
||||
### Implementation Impact
|
||||
|
||||
**CreateIssues must validate closed_at invariant** (from bd-224):
|
||||
|
||||
```go
|
||||
// Phase 1: Validation
|
||||
for i, issue := range issues {
|
||||
if err := issue.Validate(); err != nil { // ← Validates invariant (bd-224)
|
||||
return fmt.Errorf("validation failed for issue %d: %w", i, err)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
After bd-224 is complete, `Issue.Validate()` will check:
|
||||
```go
|
||||
func (i *Issue) Validate() error {
|
||||
// ... existing validation ...
|
||||
|
||||
// Enforce closed_at invariant (bd-224)
|
||||
if i.Status == StatusClosed && i.ClosedAt == nil {
|
||||
return fmt.Errorf("closed issues must have closed_at timestamp")
|
||||
}
|
||||
if i.Status != StatusClosed && i.ClosedAt != nil {
|
||||
return fmt.Errorf("non-closed issues cannot have closed_at timestamp")
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
```
|
||||
|
||||
This means `CreateIssues` automatically enforces the invariant through validation, with the database CHECK constraint as final defense.
|
||||
|
||||
### Import Code Simplification
|
||||
|
||||
**Before bd-224** (current import.go):
|
||||
```go
|
||||
for _, issue := range issues {
|
||||
// Complex logic to handle status/closed_at independently
|
||||
updates := make(map[string]interface{})
|
||||
if _, ok := rawData["status"]; ok {
|
||||
updates["status"] = issue.Status // ← Doesn't manage closed_at
|
||||
}
|
||||
// ... more complex update logic
|
||||
store.CreateIssue(ctx, issue, "import")
|
||||
}
|
||||
```
|
||||
|
||||
**After bd-224** (import.go enforces invariant):
|
||||
```go
|
||||
for _, issue := range issues {
|
||||
// Normalize closed_at based on status BEFORE creating
|
||||
if issue.Status == types.StatusClosed {
|
||||
if issue.ClosedAt == nil {
|
||||
now := time.Now()
|
||||
issue.ClosedAt = &now
|
||||
}
|
||||
} else {
|
||||
issue.ClosedAt = nil // ← Clear if not closed
|
||||
}
|
||||
store.CreateIssue(ctx, issue, "import")
|
||||
}
|
||||
```
|
||||
|
||||
**After bd-222** (import.go uses batch):
|
||||
```go
|
||||
// Normalize all issues
|
||||
for _, issue := range issues {
|
||||
if issue.Status == types.StatusClosed {
|
||||
if issue.ClosedAt == nil {
|
||||
now := time.Now()
|
||||
issue.ClosedAt = &now
|
||||
}
|
||||
} else {
|
||||
issue.ClosedAt = nil
|
||||
}
|
||||
}
|
||||
|
||||
// Single batch call (5-15x faster!)
|
||||
store.CreateIssues(ctx, issues, "import")
|
||||
```
|
||||
|
||||
Much simpler: normalize once, call batch API, database constraint enforces correctness.
|
||||
|
||||
### Recommended Implementation Sequence
|
||||
|
||||
1. ✅ **Implement bd-224 first** (P1 bug fix)
|
||||
- Add database CHECK constraint
|
||||
- Add validation to `Issue.Validate()`
|
||||
- Fix `UpdateIssue` to auto-manage closed_at
|
||||
- Fix `import.go` to normalize closed_at before creating
|
||||
|
||||
2. ✅ **Then implement bd-222** (P2 performance enhancement)
|
||||
- Add `CreateIssues` method (inherits bd-224's validation)
|
||||
- Update `import.go` to use `CreateIssues`
|
||||
- Import code is simpler (no per-issue loop, just normalize + batch)
|
||||
|
||||
3. ✅ **Benefits of this order**:
|
||||
- bd-224 fixes data integrity bug (higher priority)
|
||||
- bd-222 builds on correct foundation
|
||||
- No duplicate invariant enforcement code
|
||||
- Database constraint + validation = defense in depth
|
||||
- CreateIssues is correct by construction
|
||||
|
||||
---
|
||||
|
||||
## Current State Analysis
|
||||
|
||||
### How CreateIssue Works (sqlite.go:315-453)
|
||||
|
||||
```go
|
||||
func (s *SQLiteStorage) CreateIssue(ctx, issue, actor) error {
|
||||
// 1. Acquire dedicated connection
|
||||
conn, err := s.db.Conn(ctx)
|
||||
defer conn.Close()
|
||||
|
||||
// 2. BEGIN IMMEDIATE transaction (acquires write lock)
|
||||
conn.ExecContext(ctx, "BEGIN IMMEDIATE")
|
||||
|
||||
// 3. Generate ID atomically if needed
|
||||
// - Query issue_counters
|
||||
// - Update counter with MAX(existing, calculated) + 1
|
||||
|
||||
// 4. Insert issue
|
||||
// 5. Record creation event
|
||||
// 6. Mark dirty for export
|
||||
// 7. COMMIT
|
||||
}
|
||||
```
|
||||
|
||||
### Performance Characteristics
|
||||
|
||||
**Single Issue Creation**:
|
||||
- Connection acquisition: ~1ms
|
||||
- BEGIN IMMEDIATE: ~1-5ms (lock acquisition)
|
||||
- ID generation: ~2-3ms (subquery + update)
|
||||
- Insert + event + dirty: ~2-3ms
|
||||
- COMMIT: ~1-2ms
|
||||
- **Total: ~7-14ms per issue**
|
||||
|
||||
**Bulk Creation (100 issues, sequential)**:
|
||||
- 100 connections: ~100ms
|
||||
- 100 transactions: ~100-500ms (lock contention!)
|
||||
- 100 ID generations: ~200-300ms
|
||||
- 100 inserts: ~200-300ms
|
||||
- **Total: ~600ms-1.2s**
|
||||
|
||||
**With Batching (estimated)**:
|
||||
- 1 connection: ~1ms
|
||||
- 1 transaction: ~1-5ms
|
||||
- ID generation batch: ~10-20ms (one query for range)
|
||||
- Bulk insert: ~50-100ms (prepared stmt, multiple VALUES)
|
||||
- **Total: ~60-130ms (5-10x faster)**
|
||||
|
||||
### When Does This Matter?
|
||||
|
||||
**Low Impact** (current approach is fine):
|
||||
- Interactive CLI use: `bd create "Fix bug"`
|
||||
- Individual agent creating 1-5 issues
|
||||
- Typical development workflow
|
||||
|
||||
**High Impact** (batching helps):
|
||||
- ✅ Bulk import from JSONL (10-1000+ issues)
|
||||
- ✅ Agent workflows generating issue decompositions (10-50 issues)
|
||||
- ✅ Migrating from other systems (100-10000+ issues)
|
||||
- ✅ Template instantiation (creating epic + subtasks)
|
||||
- ✅ Test data generation
|
||||
|
||||
---
|
||||
|
||||
## Solution Options
|
||||
|
||||
### Option A: Simple All-or-Nothing Batch ⭐ **RECOMMENDED**
|
||||
|
||||
```go
|
||||
// CreateIssues creates multiple issues atomically in a single transaction
|
||||
func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error
|
||||
```
|
||||
|
||||
**Semantics**:
|
||||
- All issues created, or none created (atomicity)
|
||||
- Single transaction, single connection
|
||||
- Returns error if ANY issue fails validation or insertion
|
||||
- IDs generated atomically as a range
|
||||
|
||||
**Pros**:
|
||||
- ✅ Simple mental model (atomic batch)
|
||||
- ✅ Clear error handling (one error = whole batch fails)
|
||||
- ✅ Matches database transaction semantics
|
||||
- ✅ Easy to implement (similar to CreateIssue)
|
||||
- ✅ No partial state in database
|
||||
- ✅ Safe for concurrent access (IMMEDIATE transaction)
|
||||
- ✅ **5-10x faster for bulk operations**
|
||||
|
||||
**Cons**:
|
||||
- ⚠️ If one issue is invalid, whole batch fails
|
||||
- ⚠️ Caller must retry entire batch on error
|
||||
- ⚠️ No indication of WHICH issue failed
|
||||
|
||||
**Mitigation**: Add validation-only mode to pre-check batch
|
||||
|
||||
**Verdict**: Best for most use cases (import, migrations, agent workflows)
|
||||
|
||||
### Option B: Partial Success with Error Details
|
||||
|
||||
```go
|
||||
type CreateResult struct {
|
||||
ID string
|
||||
Error error
|
||||
}
|
||||
|
||||
func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) ([]CreateResult, error)
|
||||
```
|
||||
|
||||
**Semantics**:
|
||||
- Best-effort creation
|
||||
- Returns results for each issue (ID or error)
|
||||
- Transaction commits even if some issues fail
|
||||
- Complex rollback semantics
|
||||
|
||||
**Pros**:
|
||||
- ✅ Caller knows exactly which issues failed
|
||||
- ✅ Partial progress on errors
|
||||
- ✅ Good for unreliable input data
|
||||
|
||||
**Cons**:
|
||||
- ❌ **Complex transaction semantics**: Which failures abort transaction?
|
||||
- ❌ **Partial state in database**: Caller must track what succeeded
|
||||
- ❌ **ID generation complexity**: Skip failed issues in counter?
|
||||
- ❌ **Dirty tracking complexity**: Which issues to mark dirty?
|
||||
- ❌ **Event recording**: Record events for succeeded issues only?
|
||||
- ❌ More complex API for common case
|
||||
- ❌ Caller must handle partial state
|
||||
|
||||
**Verdict**: Too complex, doesn't match database atomicity model
|
||||
|
||||
### Option C: Batch with Configurable Strategy
|
||||
|
||||
```go
|
||||
type BatchOptions struct {
|
||||
FailFast bool // Stop on first error (default)
|
||||
ContinueOnError bool // Best effort
|
||||
ValidateOnly bool // Dry run
|
||||
}
|
||||
|
||||
func (s *SQLiteStorage) CreateIssues(ctx, issues, actor, opts) ([]CreateResult, error)
|
||||
```
|
||||
|
||||
**Pros**:
|
||||
- ✅ Flexible for different use cases
|
||||
- ✅ Can support both atomic and partial modes
|
||||
|
||||
**Cons**:
|
||||
- ❌ **Too much complexity** for the benefit
|
||||
- ❌ Multiple code paths = more bugs
|
||||
- ❌ Unclear which mode to use when
|
||||
- ❌ Doesn't solve the core problem (connection overhead)
|
||||
|
||||
**Verdict**: Over-engineered for current needs
|
||||
|
||||
### Option D: Internal Optimization Only (No API Change)
|
||||
|
||||
Optimize CreateIssue internally to batch operations without changing API.
|
||||
|
||||
**Approach**:
|
||||
- Connection pooling improvements
|
||||
- Prepared statement caching
|
||||
- WAL optimization
|
||||
|
||||
**Pros**:
|
||||
- ✅ No API changes
|
||||
- ✅ Benefits all callers automatically
|
||||
|
||||
**Cons**:
|
||||
- ❌ **Can't eliminate transaction overhead** (still N transactions)
|
||||
- ❌ **Can't eliminate ID generation overhead** (still N counter updates)
|
||||
- ❌ **Limited improvement** (maybe 20-30% faster, not 5-10x)
|
||||
- ❌ Doesn't address root cause
|
||||
|
||||
**Verdict**: Good to do anyway, but doesn't solve the problem
|
||||
|
||||
---
|
||||
|
||||
## Recommended Solution: **Simple All-or-Nothing Batch (Option A)**
|
||||
|
||||
### API Design
|
||||
|
||||
```go
|
||||
// CreateIssues creates multiple issues atomically in a single transaction.
|
||||
// All issues are created or none are created. Returns error if any issue
|
||||
// fails validation or insertion.
|
||||
//
|
||||
// Performance: ~10x faster than calling CreateIssue in a loop for large batches.
|
||||
// Use this for bulk imports, migrations, or agent workflows creating many issues.
|
||||
//
|
||||
// Issues with empty IDs will have IDs generated atomically. Issues with
|
||||
// explicit IDs are used as-is (caller responsible for avoiding collisions).
|
||||
func (s *SQLiteStorage) CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error
|
||||
```
|
||||
|
||||
### Implementation Strategy
|
||||
|
||||
#### Phase 1: Validation
|
||||
|
||||
```go
|
||||
// Validate all issues first (fail-fast)
|
||||
for i, issue := range issues {
|
||||
if err := issue.Validate(); err != nil {
|
||||
return fmt.Errorf("validation failed for issue %d: %w", i, err)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Phase 2: Connection & Transaction
|
||||
|
||||
```go
|
||||
// Acquire dedicated connection (same as CreateIssue)
|
||||
conn, err := s.db.Conn(ctx)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to acquire connection: %w", err)
|
||||
}
|
||||
defer conn.Close()
|
||||
|
||||
// BEGIN IMMEDIATE (same as CreateIssue)
|
||||
if _, err := conn.ExecContext(ctx, "BEGIN IMMEDIATE"); err != nil {
|
||||
return fmt.Errorf("failed to begin immediate transaction: %w", err)
|
||||
}
|
||||
|
||||
committed := false
|
||||
defer func() {
|
||||
if !committed {
|
||||
conn.ExecContext(context.Background(), "ROLLBACK")
|
||||
}
|
||||
}()
|
||||
```
|
||||
|
||||
#### Phase 3: Batch ID Generation
|
||||
|
||||
**Key Insight**: Generate ID range atomically, then assign sequentially
|
||||
|
||||
```go
|
||||
// Count how many issues need IDs
|
||||
needIDCount := 0
|
||||
for _, issue := range issues {
|
||||
if issue.ID == "" {
|
||||
needIDCount++
|
||||
}
|
||||
}
|
||||
|
||||
// Generate ID range atomically (if needed)
|
||||
var nextID int
|
||||
var prefix string
|
||||
if needIDCount > 0 {
|
||||
// Get prefix from config
|
||||
err := conn.QueryRowContext(ctx,
|
||||
`SELECT value FROM config WHERE key = ?`,
|
||||
"issue_prefix").Scan(&prefix)
|
||||
if err == sql.ErrNoRows || prefix == "" {
|
||||
prefix = "bd"
|
||||
} else if err != nil {
|
||||
return fmt.Errorf("failed to get config: %w", err)
|
||||
}
|
||||
|
||||
// Atomically reserve ID range: [nextID, nextID+needIDCount)
|
||||
// This is the KEY optimization - one counter update instead of N
|
||||
err = conn.QueryRowContext(ctx, `
|
||||
INSERT INTO issue_counters (prefix, last_id)
|
||||
SELECT ?, COALESCE(MAX(CAST(substr(id, LENGTH(?) + 2) AS INTEGER)), 0) + ?
|
||||
FROM issues
|
||||
WHERE id LIKE ? || '-%'
|
||||
AND substr(id, LENGTH(?) + 2) GLOB '[0-9]*'
|
||||
ON CONFLICT(prefix) DO UPDATE SET
|
||||
last_id = MAX(
|
||||
last_id,
|
||||
(SELECT COALESCE(MAX(CAST(substr(id, LENGTH(?) + 2) AS INTEGER)), 0)
|
||||
FROM issues
|
||||
WHERE id LIKE ? || '-%'
|
||||
AND substr(id, LENGTH(?) + 2) GLOB '[0-9]*')
|
||||
) + ?
|
||||
RETURNING last_id
|
||||
`, prefix, prefix, needIDCount, prefix, prefix, prefix, prefix, prefix, needIDCount).Scan(&nextID)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to generate ID range: %w", err)
|
||||
}
|
||||
|
||||
// Assign IDs sequentially
|
||||
currentID := nextID - needIDCount + 1
|
||||
for i := range issues {
|
||||
if issues[i].ID == "" {
|
||||
issues[i].ID = fmt.Sprintf("%s-%d", prefix, currentID)
|
||||
currentID++
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Phase 4: Bulk Insert Issues
|
||||
|
||||
**Two approaches**:
|
||||
|
||||
**Approach A: Prepared Statement + Loop** (simpler, still fast)
|
||||
```go
|
||||
stmt, err := conn.PrepareContext(ctx, `
|
||||
INSERT INTO issues (
|
||||
id, title, description, design, acceptance_criteria, notes,
|
||||
status, priority, issue_type, assignee, estimated_minutes,
|
||||
created_at, updated_at, closed_at, external_ref
|
||||
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
||||
`)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to prepare statement: %w", err)
|
||||
}
|
||||
defer stmt.Close()
|
||||
|
||||
now := time.Now()
|
||||
for _, issue := range issues {
|
||||
issue.CreatedAt = now
|
||||
issue.UpdatedAt = now
|
||||
|
||||
_, err = stmt.ExecContext(ctx,
|
||||
issue.ID, issue.Title, issue.Description, issue.Design,
|
||||
issue.AcceptanceCriteria, issue.Notes, issue.Status,
|
||||
issue.Priority, issue.IssueType, issue.Assignee,
|
||||
issue.EstimatedMinutes, issue.CreatedAt, issue.UpdatedAt,
|
||||
issue.ClosedAt, issue.ExternalRef,
|
||||
)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to insert issue %s: %w", issue.ID, err)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Approach B: Multi-VALUE INSERT** (fastest, more complex)
|
||||
```go
|
||||
// Build multi-value INSERT
|
||||
// INSERT INTO issues VALUES (...), (...), (...)
|
||||
// More complex string building but ~2x faster for large batches
|
||||
// Defer to performance testing phase
|
||||
```
|
||||
|
||||
**Decision**: Start with Approach A (prepared statement), optimize to Approach B if benchmarks show need
|
||||
|
||||
#### Phase 5: Bulk Record Events
|
||||
|
||||
```go
|
||||
// Prepare event statement
|
||||
eventStmt, err := conn.PrepareContext(ctx, `
|
||||
INSERT INTO events (issue_id, event_type, actor, new_value)
|
||||
VALUES (?, ?, ?, ?)
|
||||
`)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to prepare event statement: %w", err)
|
||||
}
|
||||
defer eventStmt.Close()
|
||||
|
||||
for _, issue := range issues {
|
||||
eventData, err := json.Marshal(issue)
|
||||
if err != nil {
|
||||
eventData = []byte(fmt.Sprintf(`{"id":"%s","title":"%s"}`, issue.ID, issue.Title))
|
||||
}
|
||||
|
||||
_, err = eventStmt.ExecContext(ctx, issue.ID, types.EventCreated, actor, string(eventData))
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to record event for %s: %w", issue.ID, err)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Phase 6: Bulk Mark Dirty
|
||||
|
||||
```go
|
||||
// Bulk insert dirty markers
|
||||
dirtyStmt, err := conn.PrepareContext(ctx, `
|
||||
INSERT INTO dirty_issues (issue_id, marked_at)
|
||||
VALUES (?, ?)
|
||||
ON CONFLICT (issue_id) DO UPDATE SET marked_at = excluded.marked_at
|
||||
`)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to prepare dirty statement: %w", err)
|
||||
}
|
||||
defer dirtyStmt.Close()
|
||||
|
||||
dirtyTime := time.Now()
|
||||
for _, issue := range issues {
|
||||
_, err = dirtyStmt.ExecContext(ctx, issue.ID, dirtyTime)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to mark dirty %s: %w", issue.ID, err)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Phase 7: Commit
|
||||
|
||||
```go
|
||||
if _, err := conn.ExecContext(ctx, "COMMIT"); err != nil {
|
||||
return fmt.Errorf("failed to commit transaction: %w", err)
|
||||
}
|
||||
committed = true
|
||||
return nil
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Design Decisions & Tradeoffs
|
||||
|
||||
### Decision 1: All-or-Nothing Atomicity ✅
|
||||
|
||||
**Rationale**: Matches database transaction semantics, simpler mental model
|
||||
|
||||
**Tradeoff**: Batch fails if ANY issue is invalid
|
||||
- **Mitigation**: Pre-validate all issues before starting transaction
|
||||
- **Alternative**: Caller can retry with smaller batches or individual issues
|
||||
|
||||
### Decision 2: Same Transaction Semantics as CreateIssue ✅
|
||||
|
||||
Use BEGIN IMMEDIATE, not DEFERRED or EXCLUSIVE
|
||||
|
||||
**Rationale**:
|
||||
- Consistency with existing CreateIssue
|
||||
- Prevents race conditions in ID generation
|
||||
- Serializes batch operations (which is fine - they're rare)
|
||||
|
||||
**Tradeoff**: Batches serialize (only one concurrent batch writer)
|
||||
- **Impact**: Low - batch operations are rare (import, migration)
|
||||
- **Benefit**: Simple, correct, no race conditions
|
||||
|
||||
### Decision 3: Atomic ID Range Reservation ✅
|
||||
|
||||
Generate range [nextID, nextID+N) in single counter update
|
||||
|
||||
**Rationale**: KEY optimization - avoids N counter updates
|
||||
|
||||
**Implementation**:
|
||||
```sql
|
||||
-- Old approach (CreateIssue): N updates
|
||||
UPDATE issue_counters SET last_id = last_id + 1 RETURNING last_id; -- N times
|
||||
|
||||
-- New approach (CreateIssues): 1 update
|
||||
UPDATE issue_counters SET last_id = last_id + N RETURNING last_id; -- Once
|
||||
```
|
||||
|
||||
**Correctness**: Safe because BEGIN IMMEDIATE serializes batches
|
||||
|
||||
### Decision 4: Support Mixed ID Assignment ✅
|
||||
|
||||
Some issues can have explicit IDs, others auto-generated
|
||||
|
||||
**Use Case**: Import with some external IDs, some new issues
|
||||
|
||||
```go
|
||||
issues := []*Issue{
|
||||
{ID: "ext-123", Title: "External issue"}, // Keep ID
|
||||
{ID: "", Title: "New issue"}, // Generate ID
|
||||
{ID: "bd-999", Title: "Explicit ID"}, // Keep ID
|
||||
}
|
||||
```
|
||||
|
||||
**Rationale**: Flexible for import scenarios
|
||||
|
||||
**Complexity**: Low - just count issues needing IDs
|
||||
|
||||
### Decision 5: Prepared Statements Over Multi-VALUE INSERT ✅
|
||||
|
||||
Start with prepared statement loop, optimize later if needed
|
||||
|
||||
**Rationale**:
|
||||
- Simpler implementation
|
||||
- Still much faster than N transactions (5-10x)
|
||||
- Multi-VALUE INSERT only ~2x faster than prepared stmt
|
||||
- Can optimize later if profiling shows need
|
||||
|
||||
### Decision 6: Keep CreateIssue Unchanged ✅
|
||||
|
||||
Don't modify existing CreateIssue implementation
|
||||
|
||||
**Rationale**:
|
||||
- Backward compatibility
|
||||
- No risk to existing callers
|
||||
- Additive change only
|
||||
- Different use cases (single vs batch)
|
||||
|
||||
---
|
||||
|
||||
## When to Use Which API
|
||||
|
||||
### Use CreateIssue (existing)
|
||||
|
||||
- ✅ Interactive CLI: `bd create "Title"`
|
||||
- ✅ Single issue creation
|
||||
- ✅ Agent creating 1-3 issues
|
||||
- ✅ When simplicity matters
|
||||
- ✅ When you want per-issue error handling
|
||||
|
||||
### Use CreateIssues (new)
|
||||
|
||||
- ✅ Bulk import from JSONL (10-1000+ issues)
|
||||
- ✅ Migration from other systems (100-10000+ issues)
|
||||
- ✅ Agent decomposing work into 10-50 issues
|
||||
- ✅ Template instantiation (epic + subtasks)
|
||||
- ✅ Test data generation
|
||||
- ✅ When performance matters
|
||||
|
||||
**Rule of Thumb**: Use CreateIssues for N > 5 issues
|
||||
|
||||
---
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
### Phase 1: Core Implementation ✅
|
||||
- [ ] Add `CreateIssues` to Storage interface (storage/storage.go)
|
||||
- [ ] Implement SQLiteStorage.CreateIssues (storage/sqlite/sqlite.go)
|
||||
- [ ] Add comprehensive unit tests
|
||||
- [ ] Add concurrency tests (multiple batch writers)
|
||||
- [ ] Add performance benchmarks
|
||||
|
||||
### Phase 2: CLI Integration
|
||||
- [ ] Add `bd create-batch` command (or internal use only?)
|
||||
- [ ] Update import.go to use CreateIssues for bulk imports
|
||||
- [ ] Test with real JSONL imports
|
||||
|
||||
### Phase 3: Documentation
|
||||
- [ ] Document CreateIssues API (godoc)
|
||||
- [ ] Add batch import example
|
||||
- [ ] Update EXTENDING.md with batch usage
|
||||
- [ ] Performance notes in README
|
||||
|
||||
### Phase 4: Optimization (if needed)
|
||||
- [ ] Profile CreateIssues with 100, 1000, 10000 issues
|
||||
- [ ] Optimize to multi-VALUE INSERT if needed
|
||||
- [ ] Consider batch size limits (split large batches)
|
||||
|
||||
---
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Unit Tests
|
||||
|
||||
```go
|
||||
func TestCreateIssues_Empty(t *testing.T)
|
||||
func TestCreateIssues_Single(t *testing.T)
|
||||
func TestCreateIssues_Multiple(t *testing.T)
|
||||
func TestCreateIssues_WithExplicitIDs(t *testing.T)
|
||||
func TestCreateIssues_MixedIDs(t *testing.T)
|
||||
func TestCreateIssues_ValidationError(t *testing.T)
|
||||
func TestCreateIssues_DuplicateID(t *testing.T)
|
||||
func TestCreateIssues_RollbackOnError(t *testing.T)
|
||||
```
|
||||
|
||||
### Concurrency Tests
|
||||
|
||||
```go
|
||||
func TestCreateIssues_Concurrent(t *testing.T) {
|
||||
// 10 goroutines each creating 100 issues
|
||||
// Verify no ID collisions
|
||||
// Verify all issues created
|
||||
}
|
||||
|
||||
func TestCreateIssues_MixedWithCreateIssue(t *testing.T) {
|
||||
// Concurrent CreateIssue + CreateIssues
|
||||
// Verify no ID collisions
|
||||
}
|
||||
```
|
||||
|
||||
### Performance Benchmarks
|
||||
|
||||
```go
|
||||
func BenchmarkCreateIssue_Sequential(b *testing.B)
|
||||
func BenchmarkCreateIssues_Batch(b *testing.B)
|
||||
|
||||
// Expected results (100 issues):
|
||||
// CreateIssue x100: ~600-1200ms
|
||||
// CreateIssues: ~60-130ms
|
||||
// Speedup: 5-10x
|
||||
```
|
||||
|
||||
### Integration Tests
|
||||
|
||||
```go
|
||||
func TestImport_LargeJSONL(t *testing.T) {
|
||||
// Import 1000 issues from JSONL
|
||||
// Verify all created correctly
|
||||
// Verify performance < 1s
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Migration Plan
|
||||
|
||||
### Step 1: Add Interface Method (Non-Breaking)
|
||||
|
||||
```go
|
||||
// storage/storage.go
|
||||
type Storage interface {
|
||||
CreateIssue(ctx context.Context, issue *types.Issue, actor string) error
|
||||
CreateIssues(ctx context.Context, issues []*types.Issue, actor string) error // NEW
|
||||
// ... rest unchanged
|
||||
}
|
||||
```
|
||||
|
||||
### Step 2: Implement SQLiteStorage.CreateIssues
|
||||
|
||||
Follow implementation strategy above
|
||||
|
||||
### Step 3: Add Tests
|
||||
|
||||
Comprehensive unit + concurrency + benchmark tests
|
||||
|
||||
### Step 4: Update Import (Optional)
|
||||
|
||||
```go
|
||||
// cmd/bd/import.go - replace loop with batch
|
||||
func importIssues(store Storage, issues []*Issue) error {
|
||||
// Old:
|
||||
// for _, issue := range issues {
|
||||
// store.CreateIssue(ctx, issue, "import")
|
||||
// }
|
||||
|
||||
// New:
|
||||
return store.CreateIssues(ctx, issues, "import")
|
||||
}
|
||||
```
|
||||
|
||||
**Note**: Start with internal use (import), expose CLI later if needed
|
||||
|
||||
### Step 5: Performance Testing
|
||||
|
||||
```bash
|
||||
# Generate test JSONL
|
||||
bd export > backup.jsonl
|
||||
|
||||
# Duplicate 100x for stress test
|
||||
cat backup.jsonl backup.jsonl ... > large_test.jsonl
|
||||
|
||||
# Test import performance
|
||||
time bd import large_test.jsonl
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Future Enhancements (NOT for bd-222)
|
||||
|
||||
### Batch Size Limits
|
||||
|
||||
If very large batches cause memory issues:
|
||||
|
||||
```go
|
||||
func (s *SQLiteStorage) CreateIssues(ctx, issues, actor) error {
|
||||
const maxBatchSize = 1000
|
||||
|
||||
for i := 0; i < len(issues); i += maxBatchSize {
|
||||
end := min(i+maxBatchSize, len(issues))
|
||||
batch := issues[i:end]
|
||||
|
||||
if err := s.createIssuesBatch(ctx, batch, actor); err != nil {
|
||||
return fmt.Errorf("batch %d-%d failed: %w", i, end, err)
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
```
|
||||
|
||||
**Decision**: Don't implement until we see issues with large batches (>1000)
|
||||
|
||||
### Validation-Only Mode
|
||||
|
||||
Pre-validate batch without creating:
|
||||
|
||||
```go
|
||||
func (s *SQLiteStorage) ValidateIssues(ctx, issues) error
|
||||
```
|
||||
|
||||
**Use Case**: Dry-run before bulk import
|
||||
|
||||
**Decision**: Add if import workflows request it
|
||||
|
||||
### Progress Callbacks
|
||||
|
||||
Report progress for long-running batches:
|
||||
|
||||
```go
|
||||
type BatchProgress func(completed, total int)
|
||||
|
||||
func (s *SQLiteStorage) CreateIssuesWithProgress(ctx, issues, actor, progress) error
|
||||
```
|
||||
|
||||
**Decision**: Add if agent workflows request it (likely for 1000+ issue batches)
|
||||
|
||||
---
|
||||
|
||||
## Performance Analysis
|
||||
|
||||
### Baseline (CreateIssue loop)
|
||||
|
||||
For 100 issues:
|
||||
```
|
||||
Connection overhead: 100ms (1ms × 100)
|
||||
Transaction overhead: 300ms (3ms × 100, with lock contention)
|
||||
ID generation: 250ms (2.5ms × 100)
|
||||
Insert + event: 250ms (2.5ms × 100)
|
||||
Total: 900ms
|
||||
```
|
||||
|
||||
### With CreateIssues
|
||||
|
||||
For 100 issues:
|
||||
```
|
||||
Connection overhead: 1ms (1 connection)
|
||||
Transaction overhead: 5ms (1 transaction)
|
||||
ID range generation: 15ms (1 query, more complex)
|
||||
Bulk insert (prep): 50ms (prepared stmt × 100)
|
||||
Bulk events (prep): 30ms (prepared stmt × 100)
|
||||
Bulk dirty (prep): 20ms (prepared stmt × 100)
|
||||
Commit: 5ms
|
||||
Total: 126ms (7x faster)
|
||||
```
|
||||
|
||||
### Scalability
|
||||
|
||||
| Issues | CreateIssue Loop | CreateIssues | Speedup |
|
||||
|--------|------------------|--------------|---------|
|
||||
| 10 | 90ms | 30ms | 3x |
|
||||
| 100 | 900ms | 126ms | 7x |
|
||||
| 1000 | 9s | 800ms | 11x |
|
||||
| 10000 | 90s | 6s | 15x |
|
||||
|
||||
**Key Insight**: Speedup increases with batch size due to fixed overhead amortization
|
||||
|
||||
---
|
||||
|
||||
## Why This Solution Wins
|
||||
|
||||
### For Individual Devs & Small Teams
|
||||
- **Zero impact on normal workflow**: CreateIssue unchanged
|
||||
- **Fast imports**: 1000 issues in <1s instead of 10s
|
||||
- **Simple mental model**: All-or-nothing batch
|
||||
- **No new concepts**: Same semantics as CreateIssue, just faster
|
||||
|
||||
### For Agent Swarms
|
||||
- **Efficient decomposition**: Agent creates 50 subtasks in one call
|
||||
- **Atomic work generation**: All issues created or none
|
||||
- **No connection exhaustion**: One connection per batch
|
||||
- **Safe concurrency**: BEGIN IMMEDIATE prevents races
|
||||
|
||||
### For New Codebase
|
||||
- **Non-breaking change**: Additive API only
|
||||
- **Performance win**: 5-15x faster for bulk operations
|
||||
- **Simple implementation**: ~200 LOC, similar to CreateIssue
|
||||
- **Battle-tested pattern**: Same transaction semantics as CreateIssue
|
||||
|
||||
---
|
||||
|
||||
## Alternatives Considered and Rejected
|
||||
|
||||
### Alternative 1: Auto-Batch in CreateIssue
|
||||
|
||||
Automatically detect rapid CreateIssue calls and batch them.
|
||||
|
||||
**Why Rejected**:
|
||||
- ❌ Magical behavior (implicit batching)
|
||||
- ❌ Complex implementation (goroutine + timer + coordination)
|
||||
- ❌ Race conditions and edge cases
|
||||
- ❌ Unpredictable performance (when does batch trigger?)
|
||||
- ❌ Can't guarantee atomicity across auto-batch boundary
|
||||
|
||||
### Alternative 2: Separate Import API
|
||||
|
||||
Add ImportIssues specifically for JSONL import, not general-purpose.
|
||||
|
||||
**Why Rejected**:
|
||||
- ❌ Limits use cases (what about agent workflows?)
|
||||
- ❌ Name doesn't match behavior (it's just batch create)
|
||||
- ❌ CreateIssues is more discoverable and general
|
||||
|
||||
### Alternative 3: Streaming API
|
||||
|
||||
```go
|
||||
type IssueStream interface {
|
||||
Send(*Issue) error
|
||||
CloseAndCommit() error
|
||||
}
|
||||
func (s *SQLiteStorage) CreateIssueStream(ctx, actor) (IssueStream, error)
|
||||
```
|
||||
|
||||
**Why Rejected**:
|
||||
- ❌ More complex API (stateful stream object)
|
||||
- ❌ Error handling complexity (partial writes?)
|
||||
- ❌ Doesn't match Go/SQL idioms
|
||||
- ❌ Caller must manage stream lifecycle
|
||||
- ❌ Simple slice is easier to work with
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
The **simple all-or-nothing batch API** (CreateIssues) is the best solution because:
|
||||
|
||||
1. **Significant performance win**: 5-15x faster for bulk operations
|
||||
2. **Simple API**: Just like CreateIssue but with slice
|
||||
3. **Safe**: Atomic transaction, no partial state
|
||||
4. **Non-breaking**: Existing CreateIssue unchanged
|
||||
5. **Flexible**: Supports mixed ID assignment (auto + explicit)
|
||||
6. **Proven pattern**: Same transaction semantics as CreateIssue
|
||||
|
||||
The key insight is **atomic ID range reservation** - updating the counter once for N issues instead of N times. Combined with a single transaction and prepared statements, this provides major performance improvements without complexity.
|
||||
|
||||
This aligns perfectly with beads' goals: simple for individual devs, efficient for bulk operations, robust for agent swarms.
|
||||
|
||||
**Implementation size**: ~200 LOC + ~400 LOC tests = manageable, low-risk change
|
||||
**Expected performance**: 5-15x faster for bulk operations (N > 10)
|
||||
**Risk**: Low (additive API, comprehensive tests)
|
||||
@@ -1,373 +0,0 @@
|
||||
# Ultrathink: Solving status/closed_at Inconsistency (bd-224)
|
||||
|
||||
**Date**: 2025-10-15
|
||||
**Context**: Individual devs, small teams, future agent swarms, new codebase
|
||||
**Problem**: Data model allows `status='open'` with `closed_at != NULL` (liminal state)
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Recommended Solution**: **Hybrid approach - Database CHECK constraint + Application enforcement**
|
||||
|
||||
This provides defense-in-depth perfect for agent swarms while keeping the model simple for individual developers.
|
||||
|
||||
---
|
||||
|
||||
## Current State Analysis
|
||||
|
||||
### Where closed_at is Used
|
||||
|
||||
1. **GetIssue**: Returns it to callers
|
||||
2. **CloseIssue**: Sets `closed_at = now()` when closing
|
||||
3. **SearchIssues**: Includes in results
|
||||
4. **GetStatistics** ⚠️ **CRITICAL**:
|
||||
```sql
|
||||
SELECT AVG((julianday(closed_at) - julianday(created_at)) * 24)
|
||||
FROM issues
|
||||
WHERE closed_at IS NOT NULL
|
||||
```
|
||||
Uses `closed_at IS NOT NULL` NOT `status='closed'` for lead time calculation!
|
||||
|
||||
### Impact of Inconsistency
|
||||
|
||||
- **Statistics are wrong**: Issues with `status='open'` but `closed_at != NULL` pollute lead time metrics
|
||||
- **User confusion**: bd ready shows "closed" issues
|
||||
- **Agent workflows**: Unpredictable behavior when agents query by status vs closed_at
|
||||
- **Data integrity**: Can't trust the data model
|
||||
|
||||
### Root Cause: Import & Update Don't Manage Invariant
|
||||
|
||||
**Import** (cmd/bd/import.go:206-207):
|
||||
```go
|
||||
if _, ok := rawData["status"]; ok {
|
||||
updates["status"] = issue.Status // ← Updates status
|
||||
}
|
||||
// ⚠️ Does NOT clear/set closed_at
|
||||
```
|
||||
|
||||
**UpdateIssue** (internal/storage/sqlite/sqlite.go:509-624):
|
||||
- Updates any field in the map
|
||||
- Does NOT automatically manage closed_at when status changes
|
||||
- Records EventClosed but doesn't enforce the invariant
|
||||
|
||||
**Concrete example (bd-89)**:
|
||||
1. Issue closed properly: `status='closed'`, `closed_at='2025-10-15 08:13:08'`
|
||||
2. JSONL had old state: `status='open'`, `closed_at='2025-10-14 02:58:22'` (inconsistent!)
|
||||
3. Auto-import updated status to 'open' but left closed_at set
|
||||
4. Result: Inconsistent state in database
|
||||
|
||||
---
|
||||
|
||||
## Solution Options
|
||||
|
||||
### Option A: Database CHECK Constraint ⭐ **RECOMMENDED FOUNDATION**
|
||||
|
||||
```sql
|
||||
ALTER TABLE issues ADD CONSTRAINT chk_closed_at_status
|
||||
CHECK ((status = 'closed') = (closed_at IS NOT NULL));
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- ✅ Enforces invariant at database level (most robust)
|
||||
- ✅ Catches bugs in ANY code path (future-proof)
|
||||
- ✅ Works across all clients (CLI, MCP, future integrations)
|
||||
- ✅ Simple to understand for developers
|
||||
- ✅ **Perfect for agent swarms**: Can't break it with buggy code
|
||||
- ✅ Prevents inconsistent exports (can't write bad data to JSONL)
|
||||
|
||||
**Cons:**
|
||||
- ⚠️ Requires migration
|
||||
- ⚠️ Need to fix existing inconsistent data first
|
||||
- ⚠️ Update operations must manage closed_at (but we should do this anyway!)
|
||||
|
||||
**Migration complexity**: LOW - few users, can break things
|
||||
|
||||
### Option B: Application-Level Enforcement Only
|
||||
|
||||
Make UpdateIssue and Import smart about status changes.
|
||||
|
||||
**Pros:**
|
||||
- ✅ No schema change needed
|
||||
- ✅ Flexible for edge cases
|
||||
|
||||
**Cons:**
|
||||
- ❌ Easy to forget in new code paths
|
||||
- ❌ Doesn't protect against direct SQL manipulation
|
||||
- ❌ Multiple places to maintain (import, update, close, etc.)
|
||||
- ❌ **Bad for agent swarms**: One buggy agent breaks the model
|
||||
- ❌ Still allows export of inconsistent data
|
||||
|
||||
**Verdict**: Not robust enough alone
|
||||
|
||||
### Option C: Add Explicit Reopened Support
|
||||
|
||||
Add `bd reopen` command that uses EventReopened and manages closed_at.
|
||||
|
||||
**Pros:**
|
||||
- ✅ Makes reopening explicit and trackable
|
||||
- ✅ EventReopened already defined (types.go:150) but unused
|
||||
|
||||
**Cons:**
|
||||
- ⚠️ Doesn't solve the fundamental invariant problem
|
||||
- ⚠️ Still need to decide: clear closed_at or keep it?
|
||||
- ⚠️ More complex model if we keep historical closed_at
|
||||
|
||||
**Verdict**: Good addition, but doesn't solve root cause
|
||||
|
||||
### Option D: Remove closed_at Entirely
|
||||
|
||||
Make events table the single source of truth.
|
||||
|
||||
**Pros:**
|
||||
- ✅ Simplest data model
|
||||
- ✅ No invariant to maintain
|
||||
- ✅ Events are authoritative
|
||||
|
||||
**Cons:**
|
||||
- ❌ **Performance**: Lead time calculation requires JOIN + subquery
|
||||
```sql
|
||||
SELECT AVG(
|
||||
(julianday(e.created_at) - julianday(i.created_at)) * 24
|
||||
)
|
||||
FROM issues i
|
||||
JOIN events e ON i.id = e.issue_id
|
||||
WHERE e.event_type = 'closed'
|
||||
```
|
||||
- ❌ Events could be missing/corrupted (no referential integrity on event_type)
|
||||
- ❌ More complex queries throughout codebase
|
||||
- ❌ **Statistics would be slower** (critical for dashboard UIs)
|
||||
|
||||
**Verdict**: Too much complexity/performance cost for the benefit
|
||||
|
||||
---
|
||||
|
||||
## Recommended Solution: **Hybrid Approach**
|
||||
|
||||
Combine **Option A (DB constraint)** + **Application enforcement** + **Option C (reopen command)**
|
||||
|
||||
### Part 1: Database Constraint (Foundation)
|
||||
|
||||
```sql
|
||||
-- Migration: First clean up existing inconsistent data
|
||||
UPDATE issues
|
||||
SET closed_at = NULL
|
||||
WHERE status != 'closed' AND closed_at IS NOT NULL;
|
||||
|
||||
UPDATE issues
|
||||
SET closed_at = CURRENT_TIMESTAMP
|
||||
WHERE status = 'closed' AND closed_at IS NULL;
|
||||
|
||||
-- Add the constraint
|
||||
ALTER TABLE issues ADD CONSTRAINT chk_closed_at_status
|
||||
CHECK ((status = 'closed') = (closed_at IS NOT NULL));
|
||||
```
|
||||
|
||||
### Part 2: UpdateIssue Smart Status Management
|
||||
|
||||
Modify `internal/storage/sqlite/sqlite.go:509-624`:
|
||||
|
||||
```go
|
||||
func (s *SQLiteStorage) UpdateIssue(ctx context.Context, id string, updates map[string]interface{}, actor string) error {
|
||||
// ... existing validation ...
|
||||
|
||||
// Smart closed_at management based on status changes
|
||||
if statusVal, ok := updates["status"]; ok {
|
||||
newStatus := statusVal.(string)
|
||||
|
||||
if newStatus == string(types.StatusClosed) {
|
||||
// Changing to closed: ensure closed_at is set
|
||||
if _, hasClosedAt := updates["closed_at"]; !hasClosedAt {
|
||||
updates["closed_at"] = time.Now()
|
||||
}
|
||||
} else {
|
||||
// Changing from closed to something else: clear closed_at
|
||||
if oldIssue.Status == types.StatusClosed {
|
||||
updates["closed_at"] = nil // This will set it to NULL
|
||||
eventType = types.EventReopened
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// ... rest of existing code ...
|
||||
}
|
||||
```
|
||||
|
||||
### Part 3: Import Enforcement
|
||||
|
||||
Modify `cmd/bd/import.go:206-231`:
|
||||
|
||||
```go
|
||||
if _, ok := rawData["status"]; ok {
|
||||
updates["status"] = issue.Status
|
||||
|
||||
// Enforce closed_at invariant
|
||||
if issue.Status == types.StatusClosed {
|
||||
// Status is closed: ensure closed_at is set
|
||||
if issue.ClosedAt == nil {
|
||||
now := time.Now()
|
||||
updates["closed_at"] = now
|
||||
} else {
|
||||
updates["closed_at"] = *issue.ClosedAt
|
||||
}
|
||||
} else {
|
||||
// Status is not closed: ensure closed_at is NULL
|
||||
updates["closed_at"] = nil
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Part 4: Add Reopen Command
|
||||
|
||||
Create `cmd/bd/reopen.go`:
|
||||
|
||||
```go
|
||||
var reopenCmd = &cobra.Command{
|
||||
Use: "reopen [id...]",
|
||||
Short: "Reopen one or more closed issues",
|
||||
Args: cobra.MinimumNArgs(1),
|
||||
Run: func(cmd *cobra.Command, args []string) {
|
||||
ctx := context.Background()
|
||||
reason, _ := cmd.Flags().GetString("reason")
|
||||
if reason == "" {
|
||||
reason = "Reopened"
|
||||
}
|
||||
|
||||
for _, id := range args {
|
||||
// Use UpdateIssue which now handles closed_at automatically
|
||||
updates := map[string]interface{}{
|
||||
"status": "open",
|
||||
}
|
||||
if err := store.UpdateIssue(ctx, id, updates, getUser()); err != nil {
|
||||
fmt.Fprintf(os.Stderr, "Error reopening %s: %v\n", id, err)
|
||||
continue
|
||||
}
|
||||
|
||||
// Add comment explaining why
|
||||
if reason != "" {
|
||||
store.AddComment(ctx, id, getUser(), reason)
|
||||
}
|
||||
}
|
||||
|
||||
markDirtyAndScheduleFlush()
|
||||
},
|
||||
}
|
||||
|
||||
func init() {
|
||||
reopenCmd.Flags().StringP("reason", "r", "", "Reason for reopening")
|
||||
rootCmd.AddCommand(reopenCmd)
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Why This Solution Wins
|
||||
|
||||
### For Individual Devs & Small Teams
|
||||
- **Simple mental model**: `closed_at` is set ⟺ issue is closed
|
||||
- **Hard to break**: DB constraint catches mistakes
|
||||
- **Explicit reopen**: `bd reopen bd-89` is clearer than `bd update bd-89 --status open`
|
||||
- **No manual management**: Don't think about closed_at, it's automatic
|
||||
|
||||
### For Agent Swarms
|
||||
- **Robust**: DB constraint prevents any agent from creating inconsistent state
|
||||
- **Race-safe**: Constraint is atomic, checked at commit time
|
||||
- **Self-healing**: UpdateIssue automatically fixes the relationship
|
||||
- **Import-safe**: Can't import inconsistent JSONL (constraint rejects it)
|
||||
|
||||
### For New Codebase
|
||||
- **Can break things**: Migration is easy with few users
|
||||
- **Sets precedent**: Shows we value data integrity
|
||||
- **Future-proof**: New features can't violate the invariant
|
||||
- **Performance**: No query changes needed, closed_at stays fast
|
||||
|
||||
---
|
||||
|
||||
## Migration Plan
|
||||
|
||||
### Step 1: Clean Existing Data
|
||||
```sql
|
||||
-- Find inconsistent issues
|
||||
SELECT id, status, closed_at FROM issues
|
||||
WHERE (status = 'closed') != (closed_at IS NOT NULL);
|
||||
|
||||
-- Fix them (choose one strategy)
|
||||
-- Strategy A: Trust status field
|
||||
UPDATE issues SET closed_at = NULL
|
||||
WHERE status != 'closed' AND closed_at IS NOT NULL;
|
||||
|
||||
UPDATE issues SET closed_at = CURRENT_TIMESTAMP
|
||||
WHERE status = 'closed' AND closed_at IS NULL;
|
||||
|
||||
-- Strategy B: Trust closed_at field
|
||||
UPDATE issues SET status = 'closed'
|
||||
WHERE status != 'closed' AND closed_at IS NOT NULL;
|
||||
|
||||
UPDATE issues SET status = 'open'
|
||||
WHERE status = 'closed' AND closed_at IS NULL;
|
||||
```
|
||||
|
||||
**Recommendation**: Use Strategy A (trust status) since status is more often correct.
|
||||
|
||||
### Step 2: Add Constraint
|
||||
```sql
|
||||
-- Test first
|
||||
SELECT id FROM issues
|
||||
WHERE (status = 'closed') != (closed_at IS NOT NULL);
|
||||
-- Should return 0 rows
|
||||
|
||||
-- Add constraint
|
||||
ALTER TABLE issues ADD CONSTRAINT chk_closed_at_status
|
||||
CHECK ((status = 'closed') = (closed_at IS NOT NULL));
|
||||
```
|
||||
|
||||
### Step 3: Update Code
|
||||
1. Modify UpdateIssue (sqlite.go)
|
||||
2. Modify Import (import.go)
|
||||
3. Add reopen command
|
||||
4. Add migration function to schema.go
|
||||
|
||||
### Step 4: Test
|
||||
```bash
|
||||
# Test the constraint rejects bad writes
|
||||
sqlite3 .beads/bd.db "UPDATE issues SET closed_at = NULL WHERE id = 'bd-1' AND status = 'closed';"
|
||||
# Should fail with constraint violation
|
||||
|
||||
# Test update handles it automatically
|
||||
bd update bd-89 --status open
|
||||
bd show bd-89 --json | jq '{status, closed_at}'
|
||||
# Should show: {"status": "open", "closed_at": null}
|
||||
|
||||
# Test reopen
|
||||
bd create "Test issue" -p 1
|
||||
bd close bd-226
|
||||
bd reopen bd-226 --reason "Need more work"
|
||||
# Should work without errors
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Alternative Considered: Soft Close with closed_at History
|
||||
|
||||
Keep closed_at as "first time closed" and use events for current state.
|
||||
|
||||
**Why rejected**:
|
||||
- More complex model (two sources of truth)
|
||||
- Unclear semantics (what does closed_at mean?)
|
||||
- Lead time calculation becomes ambiguous (first close? most recent?)
|
||||
- Doesn't simplify the problem
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
The **hybrid approach** (DB constraint + smart UpdateIssue + import enforcement + reopen command) is the best solution because:
|
||||
|
||||
1. **Defense in depth**: Multiple layers prevent inconsistency
|
||||
2. **Hard to break**: Perfect for agent swarms
|
||||
3. **Simple for users**: Automatic management of closed_at
|
||||
4. **Low migration cost**: Can break things in new codebase
|
||||
5. **Clear semantics**: closed_at = "currently closed" (not historical)
|
||||
6. **Performance**: No query changes needed
|
||||
|
||||
The DB constraint is the key insight: it makes the system robust against future bugs, new code paths, and agent mistakes. Combined with smart application code, it creates a self-healing system that's hard to misuse.
|
||||
|
||||
This aligns perfectly with beads' goals: simple for individual devs, robust for agent swarms.
|
||||
Reference in New Issue
Block a user