beads/.beads/bd-9-design.md
Steve Yegge 216f640ab4 Fix getNextID bug and design collision resolution (bd-9)
Critical bug fix: getNextID() was using alphabetical MAX instead of
numerical MAX, causing "bd-9" to be treated as max when "bd-10" existed.
This blocked all new issue creation after bd-10.

Fixed by using SQL CAST to extract and compare numeric portions of IDs.
This ensures bd-10 > bd-9 numerically, not alphabetically.

Also completed comprehensive design for bd-9 (collision resolution):
- Algorithm design with 7 phases (detection, scoring, remapping, etc.)
- Created 7 child issues (bd-10, bd-12-17) breaking down implementation
- Added design documents to .beads/ for future reference
- Updated issues JSONL with new issues and dependencies

Issues created:
- bd-10: Export dependencies in JSONL
- bd-12: Collision detection
- bd-13: Reference scoring algorithm
- bd-14: ID remapping with updates
- bd-15: CLI flags and reporting
- bd-16: Comprehensive tests
- bd-17: Documentation updates
- bd-18: Add design/notes fields to update command

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-12 14:41:58 -07:00

BD-9: Collision Resolution Design Document

Status: In progress, design complete, ready for implementation
Date: 2025-10-12
Issue: bd-9 - Build collision resolution tooling for distributed branch workflows

Problem Statement

When branches diverge and both create issues, auto-incrementing IDs collide on merge:

  • Branch A creates bd-10, bd-11, bd-12
  • Branch B (diverged) creates bd-10, bd-11, bd-12 (different issues!)
  • On merge: 6 issues, but only 3 distinct IDs (each one duplicated)
  • References to "bd-10" in descriptions/dependencies are now ambiguous

Design Goals

  1. Preserve brevity - Keep bd-302 format, not bd-302-branch-a-uuid-mess
  2. Minimize disruption - Renumber issues with fewer references
  3. Update all references - Text fields AND dependency table
  4. Atomic operation - All or nothing
  5. Clear feedback - User must understand what changed

Algorithm Design

Phase 1: Collision Detection

Input: JSONL issues + current DB state
Output: Set of colliding issues

for each issue in JSONL:
  if DB contains issue.ID:
    if DB issue == JSONL issue:
      skip (already imported, idempotent)
    else:
      mark as COLLISION
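
A minimal Python sketch of this detection pass. It assumes issues are plain dicts with an "id" key and that the DB is an in-memory dict keyed by ID; both are illustrative stand-ins, not the actual bd storage API, and `detect_collisions` is a hypothetical name.

```python
from typing import Dict, List, Tuple

Issue = Dict[str, object]  # hypothetical shape: a dict with at least an "id" key


def detect_collisions(
    incoming: List[Issue], db: Dict[str, Issue]
) -> Tuple[List[Issue], List[Issue], List[Issue]]:
    """Split incoming issues into (new, unchanged, collisions)."""
    new, unchanged, collisions = [], [], []
    for issue in incoming:
        existing = db.get(issue["id"])
        if existing is None:
            new.append(issue)         # ID not in DB: plain create
        elif existing == issue:
            unchanged.append(issue)   # identical content: idempotent re-import
        else:
            collisions.append(issue)  # same ID, different content
    return new, unchanged, collisions


db = {"bd-10": {"id": "bd-10", "title": "Export dependencies"}}
incoming = [
    {"id": "bd-10", "title": "Fix login bug"},  # same ID, different issue
    {"id": "bd-11", "title": "Add dark mode"},  # not in the DB yet
]
new, unchanged, collisions = detect_collisions(incoming, db)
```

Returning the three groups separately keeps the "exact match (skip), collision (flag), new (create)" distinction explicit for the later phases.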

Phase 2: Reference Counting (The Smart Part)

Renumber issues with FEWER references first. For example, if bd-10 is referenced 20 times and bd-11 only once:

  • Renumbering bd-11→bd-15 updates 1 reference
  • Renumbering bd-10→bd-15 updates 20 references

for each colliding_issue:
  score = 0

  // Count text references in OTHER issues
  for each other_issue in JSONL:
    score += count_mentions(other_issue.all_text, colliding_issue.ID)

  // Count dependency references
  deps = DB.get_dependents(colliding_issue.ID)  // who depends on me?
  score += len(deps)

  // Store score
  collision_scores[colliding_issue.ID] = score

// Sort ascending: lowest score = fewest references = renumber first
sorted_collisions = sort_by(collision_scores)
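
The scoring loop above could look roughly like this in Python. The dict-based issue shape, the `dependents` lookup table, and the function names are assumptions made for the sketch; the text fields are the ones this document lists.

```python
import re
from typing import Dict, List, Tuple

TEXT_FIELDS = ("description", "design", "notes", "acceptance_criteria")


def count_mentions(text: str, issue_id: str) -> int:
    """Count whole-token occurrences of issue_id in text."""
    return len(re.findall(r"\b" + re.escape(issue_id) + r"\b", text))


def score_collisions(
    colliding_ids: List[str],
    issues: List[dict],
    dependents: Dict[str, List[str]],  # hypothetical: id -> IDs that depend on it
) -> List[Tuple[str, int]]:
    scores = {}
    for cid in colliding_ids:
        score = 0
        for other in issues:
            if other["id"] == cid:
                continue  # only count references from OTHER issues
            for field in TEXT_FIELDS:
                score += count_mentions(other.get(field) or "", cid)
        score += len(dependents.get(cid, []))
        scores[cid] = score
    # Ascending: fewest references first
    return sorted(scores.items(), key=lambda kv: kv[1])


issues = [
    {"id": "bd-10", "description": "Core work"},
    {"id": "bd-11", "description": "Follow-up to bd-10"},
    {"id": "bd-12", "description": "Blocked by bd-10 and bd-11", "notes": "see bd-10"},
]
ranked = score_collisions(["bd-10", "bd-11"], issues, {"bd-10": ["bd-12"]})
```

Here bd-10 collects three text mentions plus one dependent, while bd-11 has a single mention, so bd-11 sorts first.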

Phase 3: ID Allocation

id_mapping = {}  // old_id -> new_id
next_num = DB.get_next_id_number()
jsonl_ids = {issue.ID for issue in JSONL}  // IDs the import itself will use

for collision in sorted_collisions:
  // Find the next ID not used by the DB, the incoming file, or an earlier remap
  while true:
    candidate = f"{prefix}-{next_num}"
    next_num++
    if DB.exists(candidate) or candidate in jsonl_ids or candidate in id_mapping.values():
      continue
    id_mapping[collision.ID] = candidate
    break
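
A small runnable version of the allocation loop, as a sketch. The `taken` set is an assumption: here it holds every ID already in the DB plus every ID in the incoming file, so a freshly allocated ID cannot land on an issue the same import is about to create. `allocate_ids` is a hypothetical name.

```python
from typing import Dict, Iterable, Set


def allocate_ids(
    colliding_ids: Iterable[str],
    taken: Set[str],
    next_num: int,
    prefix: str = "bd",
) -> Dict[str, str]:
    """Map each colliding ID to the next unused ID, in the order given."""
    mapping: Dict[str, str] = {}
    used = set(taken)  # copy so the caller's set is not mutated
    for old_id in colliding_ids:
        while f"{prefix}-{next_num}" in used:
            next_num += 1
        new_id = f"{prefix}-{next_num}"
        mapping[old_id] = new_id
        used.add(new_id)
        next_num += 1
    return mapping


taken = {"bd-9", "bd-10", "bd-11", "bd-13"}
mapping = allocate_ids(["bd-11", "bd-10"], taken, next_num=12)
```

bd-11 is first in the sorted order and takes bd-12; bd-13 is already taken, so bd-10 skips ahead to bd-14.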

Phase 4: Reference Updates

This is the trickiest part - must update:

  1. Issue IDs themselves
  2. Text field references (description, design, notes, acceptance_criteria)
  3. Dependency records (when they reference old IDs)

updated_issues = []
reference_update_count = 0

for issue in JSONL:
  new_issue = clone(issue)

  // 1. Update own ID if it collided
  if issue.ID in id_mapping:
    new_issue.ID = id_mapping[issue.ID]

  // 2. Update text field references, writing results back to new_issue
  for old_id, new_id in id_mapping.items():
    pattern = r'\b' + re.escape(old_id) + r'\b'
    for name in [title, description, design, notes, acceptance_criteria]:
      text = new_issue[name]
      if text:
        new_issue[name], count = re.subn(pattern, new_id, text)
        reference_update_count += count

  // 3. Update dependency records that point at remapped IDs
  for dep in new_issue.dependencies:
    dep.issue_id = id_mapping.get(dep.issue_id, dep.issue_id)
    dep.depends_on_id = id_mapping.get(dep.depends_on_id, dep.depends_on_id)

  updated_issues.append(new_issue)
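
One way to sketch the text rewrite in Python is a single pass over each field with one combined pattern, rather than one pass per mapping entry. This is a variation on the loop above, not the design's stated method: replaced text is never re-scanned, so an ID written by one mapping entry cannot be rewritten by another. `remap_references` is a hypothetical helper name.

```python
import re
from typing import Dict, Tuple


def remap_references(text: str, id_mapping: Dict[str, str]) -> Tuple[str, int]:
    """Replace every whole-token old ID in text; return (new_text, count)."""
    if not text or not id_mapping:
        return text, 0
    alternatives = "|".join(re.escape(old_id) for old_id in id_mapping)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    # subn returns the rewritten string and the number of substitutions made
    return pattern.subn(lambda m: id_mapping[m.group(0)], text)


id_mapping = {"bd-10": "bd-15", "bd-11": "bd-16"}
new_text, count = remap_references(
    "Depends on bd-10 and bd-11; unrelated: bd-100, bd-1.", id_mapping
)
```

The word boundaries leave bd-100 and bd-1 untouched, and the returned count can feed `reference_update_count` directly.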

Phase 5: Dependency Handling

Approach A: Export dependencies in JSONL (PREFERRED)

  • Extend export to include "dependencies": [{...}] per issue
  • Import dependencies along with issues
  • Update dependency records during phase 4

Why preferred:

  • Self-contained JSONL (better for git workflow)
  • Easier to reason about
  • Can detect cross-file dependencies
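
As an illustration of Approach A, the sketch below serialises one issue with its dependencies embedded, using the record format listed for bd-10 under Implementation Breakdown. The row shape for dependencies follows the Dependencies table columns; `export_issue` is a hypothetical helper, not the export.go implementation.

```python
import json
from typing import List


def export_issue(issue: dict, dependency_rows: List[dict]) -> str:
    """Return one JSONL line for the issue, with its own dependencies embedded."""
    record = dict(issue)
    record["dependencies"] = [
        {"depends_on_id": row["depends_on_id"], "type": row["type"]}
        for row in dependency_rows
        if row["issue_id"] == issue["id"]  # only edges that start at this issue
    ]
    return json.dumps(record, separators=(",", ":"))


rows = [
    {"issue_id": "bd-10", "depends_on_id": "bd-5", "type": "blocks"},
    {"issue_id": "bd-11", "depends_on_id": "bd-10", "type": "blocks"},
]
line = export_issue({"id": "bd-10"}, rows)
parsed = json.loads(line)
```

Because each line carries its own outgoing edges, the import side can remap `depends_on_id` values with the same `id_mapping` used for text fields.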

Phase 6: Atomic Import

transaction:
  for issue in updated_issues:
    if issue.ID was remapped:
      DB.create_issue(issue)
    else:
      DB.upsert_issue(issue)

  // Update dependency table
  for issue in updated_issues:
    for dep in issue.dependencies:
      // dep IDs already updated in phase 4
      DB.create_or_update_dependency(dep)

  commit
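
The all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module. The two-column issues table and the `import_atomically` helper are simplifications assumed for the example; the real schema has more columns.

```python
import sqlite3


def import_atomically(conn, issues, remapped_ids):
    """Write all issues and their dependencies in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        for issue in issues:
            if issue["id"] in remapped_ids:
                # Remapped IDs should be brand new, so a plain INSERT is used.
                sql = "INSERT INTO issues (id, title) VALUES (?, ?)"
            else:
                sql = "INSERT OR REPLACE INTO issues (id, title) VALUES (?, ?)"
            conn.execute(sql, (issue["id"], issue["title"]))
        for issue in issues:
            for dep in issue.get("dependencies", []):
                conn.execute(
                    "INSERT OR REPLACE INTO dependencies "
                    "(issue_id, depends_on_id, type) VALUES (?, ?, ?)",
                    (issue["id"], dep["depends_on_id"], dep["type"]),
                )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id TEXT PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE dependencies (issue_id TEXT, depends_on_id TEXT, type TEXT, "
    "PRIMARY KEY (issue_id, depends_on_id))"
)
conn.execute("INSERT INTO issues VALUES ('bd-15', 'already here')")
conn.commit()

batch = [
    {"id": "bd-12", "title": "New issue"},
    {"id": "bd-15", "title": "Remapped onto a taken ID"},  # forces a failure
]
try:
    import_atomically(conn, batch, remapped_ids={"bd-15"})
except sqlite3.IntegrityError:
    pass  # the whole batch was rolled back, including bd-12

ids_after = [row[0] for row in conn.execute("SELECT id FROM issues ORDER BY id")]
```

The second insert violates the primary key, so the first insert is rolled back as well and the table is left exactly as it was.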

Phase 7: User Reporting

report = {
  collisions_detected: N,
  remappings: [
    "bd-10 → bd-15 (Score: 3 references)",
    "bd-11 → bd-16 (Score: 15 references)",
  ],
  text_updates: M,
  dependency_updates: K,
}
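
A possible way to render that report as text, sketched in Python. The function name, argument shapes, and exact wording of the summary line are assumptions for illustration.

```python
from typing import Dict, List


def format_report(
    remappings: Dict[str, str],  # old_id -> new_id, in renumbering order
    scores: Dict[str, int],      # old_id -> reference count from Phase 2
    text_updates: int,
    dependency_updates: int,
) -> str:
    lines: List[str] = [f"Resolved {len(remappings)} collisions:"]
    for old_id, new_id in remappings.items():
        refs = scores[old_id]
        noun = "ref" if refs == 1 else "refs"
        lines.append(f"  {old_id} → {new_id} ({refs} {noun})")
    lines.append(
        f"Updated {text_updates} text references, "
        f"{dependency_updates} dependency records"
    )
    return "\n".join(lines)


report = format_report(
    {"bd-11": "bd-15", "bd-10": "bd-16"},
    {"bd-11": 1, "bd-10": 3},
    text_updates=4,
    dependency_updates=2,
)
```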

Edge Cases

  1. Chain dependencies: bd-10 depends on bd-11, both collide
    • Sorted renumbering handles this naturally
    • Lower-referenced one renumbered first
  2. Circular dependencies: Shouldn't happen (DB has cycle detection)
  3. Partial ID matches: "bd-1" shouldn't match "bd-10"
    • Use word boundary regex: \bbd-1\b does not match inside "bd-10"
  4. Case sensitivity: IDs are case-sensitive (bd-10 ≠ BD-10)
  5. IDs in code blocks: Will be replaced
    • Could add --preserve-code-blocks flag later
  6. Triple merges: Branch A, B, C all create bd-10
    • Algorithm handles N collisions
  7. Dependencies pointing to DB-only issues:
    • JSONL issue depends on bd-999 (only in DB)
    • No collision, works fine
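
Several of these edge cases come down to how the word-boundary pattern behaves, which a few quick checks in Python make concrete. `mentions` is a hypothetical helper; the last check shows a caveat worth knowing: a hyphen counts as a boundary, so a suffixed token such as bd-10-draft still matches bd-10.

```python
import re


def mentions(text: str, issue_id: str) -> bool:
    """True if issue_id appears in text as a whole token (case-sensitive)."""
    return re.search(r"\b" + re.escape(issue_id) + r"\b", text) is not None


partial = mentions("blocked by bd-10", "bd-1")      # edge case 3: partial match
exact = mentions("blocked by bd-10.", "bd-10")      # trailing punctuation is fine
upper = mentions("blocked by BD-10", "bd-10")       # edge case 4: case-sensitive
in_code = mentions("run `bd show bd-10`", "bd-10")  # edge case 5: code is not skipped
suffixed = mentions("branch bd-10-draft", "bd-10")  # caveat: "-" is a boundary
```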

Performance Considerations

  • O(N*M) for reference counting (N issues × M collisions)
  • For 1000 issues, 10 collisions: 10,000 text scans
  • Acceptable for typical use (hundreds of issues)
  • Could optimize with index/trie if needed
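
Short of an index or trie, one cheap optimisation is to count all colliding IDs in a single scan per text by combining them into one alternation pattern. The sketch below is an assumption about how that might look, not part of the design; `count_all_mentions` is a hypothetical name, and it does not exclude an issue's mentions of its own ID.

```python
import re
from collections import Counter
from typing import Iterable, List


def count_all_mentions(texts: Iterable[str], colliding_ids: List[str]) -> Counter:
    """Count whole-token mentions of every colliding ID, one scan per text."""
    if not colliding_ids:
        return Counter()
    pattern = re.compile(
        r"\b(?:" + "|".join(map(re.escape, colliding_ids)) + r")\b"
    )
    counts = Counter({cid: 0 for cid in colliding_ids})
    for text in texts:
        counts.update(pattern.findall(text))  # findall yields the matched IDs
    return counts


counts = count_all_mentions(
    ["bd-10 blocks bd-11", "bd-10 again, not bd-100"],
    ["bd-10", "bd-11"],
)
```

This reduces the number of text scans from N × M to N, leaving the per-scan cost to the regex engine.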

API Design

# Default: fail on collision (safe)
bd import -i issues.jsonl
# Error: Collision detected: bd-10 already exists

# With auto-resolution
bd import -i issues.jsonl --resolve-collisions
# Resolved 3 collisions:
#   bd-11 → bd-15 (1 ref)
#   bd-10 → bd-16 (3 refs)
#   bd-12 → bd-17 (7 refs)
# Imported 45 issues, updated 23 references

# Dry run (preview changes)
bd import -i issues.jsonl --resolve-collisions --dry-run

Implementation Breakdown

Child Issues to Create

  1. bd-10: Extend export to include dependencies in JSONL
    • Modify export.go to include dependencies array
    • Format: {"id":"bd-10","dependencies":[{"depends_on_id":"bd-5","type":"blocks"}]}
    • Priority: 1, Type: task
  2. bd-11: Implement collision detection in import
    • Create collision.go with detectCollisions() function
    • Compare incoming JSONL against DB state
    • Distinguish: exact match (skip), collision (flag), new (create)
    • Priority: 1, Type: task
  3. bd-12: Implement reference scoring algorithm
    • Count text mentions + dependency references
    • Sort collisions by score ascending (fewest refs first)
    • Minimize total updates during renumbering
    • Priority: 1, Type: task
  4. bd-13: Implement ID remapping with reference updates
    • Allocate new IDs for colliding issues
    • Update text field references with word-boundary regex
    • Update dependency records
    • Build id_mapping for reporting
    • Priority: 1, Type: task
  5. bd-14: Add --resolve-collisions flag and user reporting
    • Add import flags: --resolve-collisions, --dry-run
    • Display clear report with remappings and counts
    • Default: fail on collision (safe)
    • Priority: 1, Type: task
  6. bd-15: Write comprehensive collision resolution tests
    • Test cases: simple/multiple collisions, dependencies, text refs
    • Edge cases: partial matches, case sensitivity, triple merges
    • Add to import_test.go and collision_test.go
    • Priority: 1, Type: task
  7. bd-16: Update documentation for collision resolution
    • Update README.md with collision resolution section
    • Update CLAUDE.md with new workflow
    • Document flags and example scenarios
    • Priority: 1, Type: task

Additional Issue: Add Design Field Support

NEW ISSUE: Add design field to bd update command

  • Currently: bd update doesn't support --design flag (discovered during work)
  • Need: Allow updating design, notes, acceptance_criteria fields
  • This would make bd-9's design easier to attach to the issue itself
  • Priority: 2, Type: feature

Current State

  • bd-9 is in_progress (claimed)
  • bd-10 was successfully created (first child issue)
  • bd-11+ creation failed with UNIQUE constraint (collision!)
    • This demonstrates the exact problem we're solving
    • Need to manually create remaining issues with different IDs
    • Or implement collision resolution first! (chicken/egg)

Data Structures Involved

  • Issues table: id, title, description, design, notes, acceptance_criteria, status, priority, issue_type, assignee, estimated_minutes, created_at, updated_at, closed_at
  • Dependencies table: issue_id, depends_on_id, type, created_at, created_by
  • Text fields with ID references: description, design, notes, acceptance_criteria (title too?)

Files to Modify

  1. cmd/bd/export.go - Add dependency export
  2. cmd/bd/import.go - Call collision resolution
  3. cmd/bd/collision.go - NEW FILE - Core algorithm
  4. cmd/bd/collision_test.go - NEW FILE - Tests
  5. internal/types/types.go - May need collision report types
  6. README.md - Documentation
  7. CLAUDE.md - AI agent workflow docs

Next Steps

  1. Design complete
  2. 🔄 Create child issues (bd-10 created, bd-11+ need different IDs)
  3. Implement the export enhancement (dependencies in JSONL)
  4. Implement the core algorithm (detection, scoring, remapping, CLI flags)
  5. Tests
  6. Documentation
  7. Export issues to JSONL before committing

Meta: Real Collision Encountered!

While creating child issues, we hit the exact problem:

  • bd-10 was created successfully
  • bd-11, bd-12, bd-13, bd-14, bd-15, bd-16 all failed with "UNIQUE constraint failed"
  • This means the DB already has bd-11+ from a previous session/import
  • Perfect demonstration of why we need collision resolution!

Resolution: Create remaining child issues manually with explicit IDs after checking what exists.