
Steve Yegge 216f640ab4 Fix getNextID bug and design collision resolution (bd-9)

Critical bug fix: getNextID() was using alphabetical MAX instead of
numerical MAX, causing "bd-9" to be treated as max when "bd-10" existed.
This blocked all new issue creation after bd-10.

Fixed by using SQL CAST to extract and compare numeric portions of IDs.
This ensures bd-10 > bd-9 numerically, not alphabetically.

Also completed comprehensive design for bd-9 (collision resolution):
- Algorithm design with 7 phases (detection, scoring, remapping, etc.)
- Created 7 child issues (bd-10, bd-12-17) breaking down implementation
- Added design documents to .beads/ for future reference
- Updated issues JSONL with new issues and dependencies

Issues created:
- bd-10: Export dependencies in JSONL
- bd-12: Collision detection
- bd-13: Reference scoring algorithm
- bd-14: ID remapping with updates
- bd-15: CLI flags and reporting
- bd-16: Comprehensive tests
- bd-17: Documentation updates
- bd-18: Add design/notes fields to update command

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-10-12 14:41:58 -07:00


BD-9: Collision Resolution Design Document

Status: In progress, design complete, ready for implementation
Date: 2025-10-12
Issue: bd-9 - Build collision resolution tooling for distributed branch workflows

Problem Statement

When branches diverge and both create issues, auto-incrementing IDs collide on merge:

Branch A creates bd-10, bd-11, bd-12
Branch B (diverged) creates bd-10, bd-11, bd-12 (different issues!)
On merge: 6 issues, but 3 duplicate IDs
References to "bd-10" in descriptions/dependencies are now ambiguous

Design Goals

Preserve brevity - Keep bd-302 format, not bd-302-branch-a-uuid-mess
Minimize disruption - Renumber issues with fewer references
Update all references - Text fields AND dependency table
Atomic operation - All or nothing
Clear feedback - User must understand what changed

Algorithm Design

Phase 1: Collision Detection

Input: JSONL issues + current DB state
Output: Set of colliding issues

for each issue in JSONL:
  if DB contains issue.ID:
    if DB issue == JSONL issue:
      skip (already imported, idempotent)
    else:
      mark as COLLISION
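
The detection pass above can be sketched in Python as follows. This is a minimal illustration, not the project's implementation: the `Issue` dataclass, the `detect_collisions` name, and the in-memory `db` dict are all assumptions made for the example, and "equal" here means equality of the content fields shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    # Hypothetical, trimmed-down issue record for illustration only.
    id: str
    title: str
    description: str = ""


def detect_collisions(incoming, db):
    """Classify incoming issues against db (a dict of id -> Issue)."""
    new, unchanged, collisions = [], [], []
    for issue in incoming:
        existing = db.get(issue.id)
        if existing is None:
            new.append(issue)          # ID not in DB: plain create
        elif existing == issue:
            unchanged.append(issue)    # identical content: idempotent skip
        else:
            collisions.append(issue)   # same ID, different content
    return new, unchanged, collisions


db = {
    "bd-9": Issue("bd-9", "Collision resolution"),
    "bd-10": Issue("bd-10", "Export dependencies"),
}
incoming = [
    Issue("bd-9", "Collision resolution"),   # already imported
    Issue("bd-10", "Fix login bug"),         # collides with DB's bd-10
    Issue("bd-11", "Add dark mode"),         # new
]
new, unchanged, collisions = detect_collisions(incoming, db)
```

Separating the "unchanged" bucket from the "collision" bucket is what keeps re-importing the same JSONL file idempotent.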

Phase 2: Reference Counting (The Smart Part)

Renumber issues with FEWER references first because:

If bd-10 is referenced 20 times and bd-11 once
Renumbering bd-11→bd-15 updates 1 reference
Renumbering bd-10→bd-15 updates 20 references

for each colliding_issue:
  score = 0

  // Count text references in OTHER issues
  for each other_issue in JSONL:
    score += count_mentions(other_issue.all_text, colliding_issue.ID)

  // Count dependency references
  deps = DB.get_dependents(colliding_issue.ID)  // who depends on me?
  score += len(deps)

  // Store score
  collision_scores[colliding_issue.ID] = score

// Sort ascending: lowest score = fewest references = renumber first
sorted_collisions = sort_by(collision_scores)
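
A runnable sketch of the scoring step, assuming issue text is available as a dict of ID to concatenated text and dependents as a dict of ID to the list of issues that depend on it. The names `count_mentions`, `score_collisions`, `issue_texts`, and `dependents` are illustrative, not part of bd.

```python
import re


def count_mentions(text, issue_id):
    """Count whole-ID mentions, so bd-10 is not counted inside bd-100."""
    pattern = r"\b" + re.escape(issue_id) + r"\b"
    return len(re.findall(pattern, text))


def score_collisions(colliding_ids, issue_texts, dependents):
    """Return (id, score) pairs sorted ascending by reference count."""
    scores = {}
    for cid in colliding_ids:
        # Text references found in OTHER issues only.
        text_refs = sum(
            count_mentions(text, cid)
            for other_id, text in issue_texts.items()
            if other_id != cid
        )
        # Plus one per issue that depends on this one.
        scores[cid] = text_refs + len(dependents.get(cid, []))
    return sorted(scores.items(), key=lambda item: item[1])


issue_texts = {
    "bd-10": "Export deps",
    "bd-11": "Blocked by bd-10; see bd-10 design",
    "bd-12": "Follows bd-10 and bd-11",
    "bd-100": "unrelated to bd-100",
}
dependents = {"bd-10": ["bd-11", "bd-12"]}
ranked = score_collisions(["bd-10", "bd-11"], issue_texts, dependents)
# ranked == [("bd-11", 1), ("bd-10", 5)]
```

Here bd-10 scores 3 text mentions plus 2 dependents, while bd-11 has a single mention, so bd-11 sorts first.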

Phase 3: ID Allocation

id_mapping = {}  // old_id -> new_id
next_num = DB.get_next_id_number()

for collision in sorted_collisions:
  // Find next available ID
  while true:
    candidate = f"{prefix}-{next_num}"
    if not DB.exists(candidate) and candidate not in id_mapping.values():
      id_mapping[collision.ID] = candidate
      next_num++
      break
    next_num++
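
The allocation loop might look like this in Python. One assumption goes slightly beyond the pseudocode: `existing_ids` is taken to contain every ID already in use, which in practice would likely need to include the non-colliding IDs in the incoming JSONL as well as the DB. `allocate_ids` and its parameters are hypothetical names.

```python
def allocate_ids(sorted_collisions, existing_ids, next_num, prefix="bd"):
    """Map each colliding ID to the next free <prefix>-<n> ID."""
    taken = set(existing_ids)
    id_mapping = {}  # old_id -> new_id
    for old_id in sorted_collisions:
        # Skip numbers that are already in use.
        while f"{prefix}-{next_num}" in taken:
            next_num += 1
        new_id = f"{prefix}-{next_num}"
        id_mapping[old_id] = new_id
        taken.add(new_id)  # never hand out the same new ID twice
        next_num += 1
    return id_mapping


# bd-1..bd-12 and bd-14 exist; bd-13 is a gap.
existing_ids = {f"bd-{n}" for n in range(1, 13)} | {"bd-14"}
mapping = allocate_ids(["bd-11", "bd-10"], existing_ids, next_num=13)
# mapping == {"bd-11": "bd-13", "bd-10": "bd-15"}
```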

Phase 4: Reference Updates

This is the trickiest part - must update:

Issue IDs themselves
Text field references (description, design, notes, acceptance_criteria)
Dependency records (when they reference old IDs)

updated_issues = []
reference_update_count = 0

for issue in JSONL:
  new_issue = clone(issue)

  // 1. Update own ID if it collided
  if issue.ID in id_mapping:
    new_issue.ID = id_mapping[issue.ID]

  // 2. Update text field references
  for old_id, new_id in id_mapping.items():
    for field in [title, description, design, notes, acceptance_criteria]:
      text = new_issue[field]
      if text:
        pattern = r'\b' + re.escape(old_id) + r'\b'
        new_text, count = re.subn(pattern, new_id, text)
        new_issue[field] = new_text  // write back to the clone, not the loop variable
        reference_update_count += count

  updated_issues.append(new_issue)
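
A hedged Python sketch of the reference update. It deviates from the pseudocode in one deliberate way: all old IDs are replaced in a single regex pass with a callback, rather than one pass per mapping entry, so a freshly written new ID can never be rewritten again by a later entry. Issues are plain dicts here, and `remap_issue` and `TEXT_FIELDS` are illustrative names.

```python
import re

TEXT_FIELDS = ("title", "description", "design", "notes", "acceptance_criteria")


def remap_issue(issue, id_mapping):
    """Return (updated copy of issue, number of text references rewritten)."""
    updated = dict(issue)
    if not id_mapping:
        return updated, 0
    # Longest IDs first so the alternation never settles for a shorter ID.
    alternatives = "|".join(
        re.escape(old) for old in sorted(id_mapping, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")

    # 1. Update the issue's own ID if it collided.
    updated["id"] = id_mapping.get(issue["id"], issue["id"])

    # 2. Update references in every non-empty text field.
    count = 0
    for field in TEXT_FIELDS:
        text = updated.get(field)
        if text:
            new_text, n = pattern.subn(lambda m: id_mapping[m.group(0)], text)
            updated[field] = new_text
            count += n
    return updated, count


issue = {
    "id": "bd-11",
    "title": "Collision detection",
    "description": "Depends on bd-10. Not bd-100 or bd-1. See bd-11 notes.",
    "notes": None,
}
updated, count = remap_issue(issue, {"bd-10": "bd-15", "bd-11": "bd-16"})
```

After the call, `updated["id"]` is `bd-16`, the description reads "Depends on bd-15. Not bd-100 or bd-1. See bd-16 notes.", and `count` is 2.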

Phase 5: Dependency Handling

Approach A: Export dependencies in JSONL (PREFERRED)

Extend export to include "dependencies": [{...}] per issue
Import dependencies along with issues
Update dependency records during phase 4

Why preferred:

Self-contained JSONL (better for git workflow)
Easier to reason about
Can detect cross-file dependencies
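
To illustrate why self-contained JSONL helps, here is a small sketch that remaps dependency records on a single exported line. It assumes the per-issue format proposed for bd-10 in the Implementation Breakdown (`dependencies` as a list of objects with `depends_on_id` and `type`); `remap_dependencies` is a hypothetical helper.

```python
import json


def remap_dependencies(jsonl_line, id_mapping):
    """Rewrite the issue ID and its dependency targets on one JSONL line."""
    record = json.loads(jsonl_line)
    record["id"] = id_mapping.get(record["id"], record["id"])
    for dep in record.get("dependencies", []):
        target = dep["depends_on_id"]
        dep["depends_on_id"] = id_mapping.get(target, target)
    return json.dumps(record, separators=(",", ":"))


line = (
    '{"id":"bd-12","dependencies":['
    '{"depends_on_id":"bd-10","type":"blocks"},'
    '{"depends_on_id":"bd-5","type":"blocks"}]}'
)
remapped = json.loads(remap_dependencies(line, {"bd-10": "bd-15"}))
```

Because the dependency lives on the issue's own line, the remapping needs no second lookup against the dependency table.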

Phase 6: Atomic Import

transaction:
  for issue in updated_issues:
    if issue.ID was remapped:
      DB.create_issue(issue)
    else:
      DB.upsert_issue(issue)

  // Update dependency table
  for issue in updated_issues:
    for dep in issue.dependencies:
      // dep IDs already updated in phase 4
      DB.create_or_update_dependency(dep)

  commit
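
The all-or-nothing behaviour can be demonstrated with Python's `sqlite3` module. This is only a sketch: the two-table schema is a cut-down assumption, and it uses a plain `INSERT` for every issue instead of the create/upsert split above, so an unresolved collision aborts the whole batch.

```python
import sqlite3


def import_issues(conn, issues):
    """Insert issues and their dependencies in a single transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        for issue in issues:
            conn.execute(
                "INSERT INTO issues (id, title) VALUES (?, ?)",
                (issue["id"], issue["title"]),
            )
            for dep in issue.get("dependencies", []):
                conn.execute(
                    "INSERT INTO dependencies (issue_id, depends_on_id, type) "
                    "VALUES (?, ?, ?)",
                    (issue["id"], dep["depends_on_id"], dep["type"]),
                )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id TEXT PRIMARY KEY, title TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE dependencies (issue_id TEXT, depends_on_id TEXT, type TEXT)"
)
conn.execute("INSERT INTO issues VALUES ('bd-10', 'Existing issue')")
conn.commit()

batch = [
    {"id": "bd-15", "title": "Remapped issue"},
    {"id": "bd-10", "title": "Unresolved collision"},  # violates PRIMARY KEY
]
try:
    import_issues(conn, batch)
except sqlite3.IntegrityError:
    pass  # the whole batch was rolled back, including bd-15

ids = [row[0] for row in conn.execute("SELECT id FROM issues ORDER BY id")]
```

After the failed import, `ids` still contains only `bd-10`: the successfully inserted `bd-15` was rolled back along with the failing row.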

Phase 7: User Reporting

report = {
  collisions_detected: N,
  remappings: [
    "bd-10 → bd-15 (Score: 3 references)",
    "bd-11 → bd-16 (Score: 15 references)",
  ],
  text_updates: M,
  dependency_updates: K,
}
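
One possible way to render such a report as text, matching the sample output shown under API Design. `format_report` and its argument shapes are assumptions for illustration.

```python
def format_report(remappings, imported, text_updates):
    """Render remappings as a list of (old_id, new_id, refs) tuples."""
    lines = [f"Resolved {len(remappings)} collisions:"]
    for old_id, new_id, refs in remappings:
        noun = "ref" if refs == 1 else "refs"
        lines.append(f"  {old_id} → {new_id} ({refs} {noun})")
    lines.append(f"Imported {imported} issues, updated {text_updates} references")
    return "\n".join(lines)


report = format_report(
    [("bd-10", "bd-15", 3), ("bd-11", "bd-16", 1)],
    imported=45,
    text_updates=23,
)
```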

Edge Cases

Chain dependencies: bd-10 depends on bd-11, both collide
- Sorted renumbering handles this naturally
- Lower-referenced one renumbered first
Circular dependencies: Shouldn't happen (DB has cycle detection)
Partial ID matches: "bd-1" shouldn't match "bd-10"
- Use word boundary regex: \bbd-10\b
Case sensitivity: IDs are case-sensitive (bd-10 ≠ BD-10)
IDs in code blocks: Will be replaced
- Could add --preserve-code-blocks flag later
Triple merges: Branch A, B, C all create bd-10
- Algorithm handles N collisions
Dependencies pointing to DB-only issues:
- JSONL issue depends on bd-999 (only in DB)
- No collision, works fine
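
Several of these edge cases can be checked directly against the word-boundary pattern. The sketch below uses a hypothetical `replace_id` helper built on Python's `re` module; it also surfaces one case not listed above: because `-` is not a word character, `\b` still matches before a trailing hyphen, so a string like `bd-10-draft` is rewritten too.

```python
import re


def replace_id(text, old_id, new_id):
    """Replace whole-ID occurrences of old_id using word boundaries."""
    pattern = r"\b" + re.escape(old_id) + r"\b"
    return re.sub(pattern, new_id, text)


# Partial ID matches: bd-1 must not touch bd-10 or bd-100.
partial = replace_id("bd-1 blocks bd-10 and bd-100", "bd-1", "bd-20")

# Case sensitivity: BD-10 is left alone.
cased = replace_id("See BD-10 and bd-10", "bd-10", "bd-15")

# IDs in code spans are replaced like any other text.
in_code = replace_id("Run `bd show bd-10`", "bd-10", "bd-15")

# Caveat: a hyphenated suffix still counts as a boundary.
suffixed = replace_id("bd-10-draft", "bd-10", "bd-15")
```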

Performance Considerations

O(N*M) for reference counting (N issues × M collisions)
For 1000 issues, 10 collisions: 10,000 text scans
Acceptable for typical use (hundreds of issues)
Could optimize with index/trie if needed
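
Short of an index or trie, a simpler option is to count all colliding IDs in one regex pass per text instead of one pass per collision. This is a sketch of that alternative, not the trie approach mentioned above; the regex engine still tries each alternative at a candidate position, so the gain is fewer scans rather than a better asymptotic bound. Unlike Phase 2, this version does not exclude an issue's mentions of its own ID. `count_all_mentions` is a hypothetical name.

```python
import re
from collections import Counter


def count_all_mentions(texts, colliding_ids):
    """Count mentions of every colliding ID with one scan per text."""
    counts = Counter({cid: 0 for cid in colliding_ids})
    if not colliding_ids:
        return counts
    # Longest IDs first so the alternation never settles for a shorter ID.
    alternatives = "|".join(
        re.escape(cid) for cid in sorted(colliding_ids, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    for text in texts:
        counts.update(pattern.findall(text))
    return counts


texts = ["bd-10 blocks bd-11", "see bd-10, not bd-100", "bd-12"]
counts = count_all_mentions(texts, ["bd-10", "bd-11"])
# counts["bd-10"] == 2, counts["bd-11"] == 1
```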

API Design

# Default: fail on collision (safe)
bd import -i issues.jsonl
# Error: Collision detected: bd-10 already exists

# With auto-resolution
bd import -i issues.jsonl --resolve-collisions
# Resolved 3 collisions:
#   bd-10 → bd-15 (3 refs)
#   bd-11 → bd-16 (1 ref)
#   bd-12 → bd-17 (7 refs)
# Imported 45 issues, updated 23 references

# Dry run (preview changes)
bd import -i issues.jsonl --resolve-collisions --dry-run

Implementation Breakdown

Child Issues to Create

bd-10: Extend export to include dependencies in JSONL
- Modify export.go to include dependencies array
- Format: {"id":"bd-10","dependencies":[{"depends_on_id":"bd-5","type":"blocks"}]}
- Priority: 1, Type: task
bd-11: Implement collision detection in import
- Create collision.go with detectCollisions() function
- Compare incoming JSONL against DB state
- Distinguish: exact match (skip), collision (flag), new (create)
- Priority: 1, Type: task
bd-12: Implement reference scoring algorithm
- Count text mentions + dependency references
- Sort collisions by score ascending (fewest refs first)
- Minimize total updates during renumbering
- Priority: 1, Type: task
bd-13: Implement ID remapping with reference updates
- Allocate new IDs for colliding issues
- Update text field references with word-boundary regex
- Update dependency records
- Build id_mapping for reporting
- Priority: 1, Type: task
bd-14: Add --resolve-collisions flag and user reporting
- Add import flags: --resolve-collisions, --dry-run
- Display clear report with remappings and counts
- Default: fail on collision (safe)
- Priority: 1, Type: task
bd-15: Write comprehensive collision resolution tests
- Test cases: simple/multiple collisions, dependencies, text refs
- Edge cases: partial matches, case sensitivity, triple merges
- Add to import_test.go and collision_test.go
- Priority: 1, Type: task
bd-16: Update documentation for collision resolution
- Update README.md with collision resolution section
- Update CLAUDE.md with new workflow
- Document flags and example scenarios
- Priority: 1, Type: task

Additional Issue: Add Design Field Support

NEW ISSUE: Add design field to bd update command

Currently: bd update doesn't support --design flag (discovered during work)
Need: Allow updating design, notes, acceptance_criteria fields
This would make bd-9's design easier to attach to the issue itself
Priority: 2, Type: feature

Current State

bd-9 is in_progress (claimed)
bd-10 was successfully created (first child issue)
bd-11+ creation failed with UNIQUE constraint (collision!)
- This demonstrates the exact problem we're solving
- Need to manually create remaining issues with different IDs
- Or implement collision resolution first! (chicken/egg)

Data Structures Involved

Issues table: id, title, description, design, notes, acceptance_criteria, status, priority, issue_type, assignee, estimated_minutes, created_at, updated_at, closed_at
Dependencies table: issue_id, depends_on_id, type, created_at, created_by
Text fields with ID references: description, design, notes, acceptance_criteria (title too?)

Files to Modify

cmd/bd/export.go - Add dependency export
cmd/bd/import.go - Call collision resolution
cmd/bd/collision.go - NEW FILE - Core algorithm
cmd/bd/collision_test.go - NEW FILE - Tests
internal/types/types.go - May need collision report types
README.md - Documentation
CLAUDE.md - AI agent workflow docs

Next Steps

✅ Design complete
🔄 Create child issues (bd-10 created, bd-11+ need different IDs)
⏳ Implement export enhancement (dependencies in JSONL)
⏳ Implement core algorithm (Phases 1-7)
⏳ Tests
⏳ Documentation
⏳ Export issues to JSONL before committing

Meta: Real Collision Encountered!

While creating child issues, we hit the exact problem:

bd-10 was created successfully
bd-11, bd-12, bd-13, bd-14, bd-15, bd-16 all failed with "UNIQUE constraint failed"
This means the DB already has bd-11+ from a previous session/import
Perfect demonstration of why we need collision resolution!

Resolution: Create remaining child issues manually with explicit IDs after checking what exists.
