N-Way Collision Convergence Problem

Summary

The current collision resolution implementation (--resolve-collisions) works correctly for 2-way collisions but does not converge for 3-way (and by extension N-way) collisions. This is a critical limitation for parallel worker scenarios where multiple agents file issues simultaneously.

Test Evidence

TestThreeCloneCollision in beads_twoclone_test.go demonstrates the problem with 3 clones creating the same issue ID (test-1) with different content.

Observed Behavior

Sync Order A→B→C:

  • Clone A: 0 issues (empty database after final pull)
  • Clone B: 2 issues (missing "Issue from clone C")
  • Clone C: 3 issues (has all issues)

Sync Order C→A→B:

  • Clone A: 2 issues (missing "Issue from clone B")
  • Clone B: 3 issues (has all issues)
  • Clone C: 0 issues (empty database after final pull)

Pattern: The last clone in the sync order ends up with all issues, the middle clone is missing the issue from the last clone, and the first clone ends up with an empty database. This behavior is 100% reproducible across all test runs.

Root Cause Analysis

When the third clone pulls and resolves collisions:

  1. It correctly remaps its conflicting issue to a new ID (e.g., test-1 → test-3)
  2. It imports the issues from the other two clones
  3. It pushes the merged state

However, when the first clone pulls this merged state:

  1. The import sees new issues that collide with its local database
  2. The resolution logic doesn't properly handle issues that were already remapped upstream
  3. The database ends up in an inconsistent state (often empty or partially populated)

Why This Matters

This prevents reliable N-way parallel worker scenarios:

  • Multiple AI agents filing issues simultaneously
  • Distributed teams working on different clones
  • CI/CD systems creating issues in parallel builds

Current workaround: limit usage to 2 workers, or create issues sequentially.

What Needs To Be Fixed

1. Import Logic Enhancement

The --resolve-collisions import needs to:

  • Detect when incoming issues were already remapped upstream
  • Preserve the remapping chain (track test-1 → test-2 → test-3)
  • Not re-remap already-remapped issues
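One possible way to track a remapping chain is a table from old ID to new ID that is followed to its end. The sketch below is illustrative only: RemapTable, Resolve, and AlreadyRemapped are hypothetical names, not part of bd, and how the table would be stored or transmitted between clones is left open.

```go
package main

import "fmt"

// RemapTable records ID remappings as old ID -> new ID.
// Hypothetical type for illustration; not part of bd.
type RemapTable map[string]string

// Resolve follows the remapping chain from id to its final ID.
// It stops if it would revisit an ID, so a malformed cyclic table cannot loop forever.
func (r RemapTable) Resolve(id string) string {
	seen := map[string]bool{id: true}
	for {
		next, ok := r[id]
		if !ok || seen[next] {
			return id
		}
		seen[next] = true
		id = next
	}
}

// AlreadyRemapped reports whether id has an outgoing remap entry, meaning
// the issue was already moved and should not be remapped a second time.
func (r RemapTable) AlreadyRemapped(id string) bool {
	_, ok := r[id]
	return ok
}

func main() {
	remaps := RemapTable{"test-1": "test-2", "test-2": "test-3"}
	fmt.Println(remaps.Resolve("test-1"))         // follows test-1 → test-2 → test-3
	fmt.Println(remaps.AlreadyRemapped("test-3")) // end of the chain has no outgoing entry
}
```

An import could consult such a table before assigning a new ID, so that an issue remapped upstream is recognized instead of being treated as a fresh collision.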

2. Convergence Algorithm

Implement a proper convergence algorithm that ensures:

  • All clones eventually have the same complete set of issues
  • Idempotent imports (importing the same JSONL multiple times is safe)
  • Transitive collision resolution (if A remaps to B, and B exists, handle gracefully)
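A minimal sketch of an idempotent import, assuming issues can be identified by a hash of their content rather than by ID. Issue, Store, and contentKey are hypothetical simplifications, not bd's actual storage layer, and the toy model does not address how clones would agree on shared IDs afterwards.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Issue is a hypothetical, simplified record; the real bd issue has more fields.
type Issue struct {
	ID          string
	Title       string
	Description string
}

// contentKey hashes the fields that define an issue's content and ignores its ID,
// so the same issue is recognized even after it was remapped elsewhere.
func contentKey(i Issue) string {
	sum := sha256.Sum256([]byte(i.Title + "\x00" + i.Description))
	return hex.EncodeToString(sum[:])
}

// Store is a toy in-memory stand-in for a clone's database.
type Store struct {
	prefix    string
	byID      map[string]Issue
	byContent map[string]string // content key -> local ID
}

func NewStore(prefix string) *Store {
	return &Store{prefix: prefix, byID: map[string]Issue{}, byContent: map[string]string{}}
}

// freshID returns the lowest-numbered ID that is not yet in use locally.
func (s *Store) freshID() string {
	for n := 1; ; n++ {
		id := fmt.Sprintf("%s-%d", s.prefix, n)
		if _, taken := s.byID[id]; !taken {
			return id
		}
	}
}

// Import merges incoming issues and returns how many were newly added.
// Issues whose content is already present are skipped, which makes repeated
// imports of the same data a no-op.
func (s *Store) Import(incoming []Issue) int {
	added := 0
	for _, in := range incoming {
		key := contentKey(in)
		if _, known := s.byContent[key]; known {
			continue
		}
		if _, taken := s.byID[in.ID]; taken {
			in.ID = s.freshID() // ID collision with different content: remap
		}
		s.byID[in.ID] = in
		s.byContent[key] = in.ID
		added++
	}
	return added
}

func main() {
	a := NewStore("test")
	a.Import([]Issue{{ID: "test-1", Title: "Issue from clone A"}})
	incoming := []Issue{
		{ID: "test-1", Title: "Issue from clone B"},
		{ID: "test-1", Title: "Issue from clone C"},
	}
	fmt.Println(a.Import(incoming)) // first import adds both issues under fresh IDs
	fmt.Println(a.Import(incoming)) // repeated import adds nothing
}
```

Because membership is decided by content, the merged result is a set union and does not depend on the order in which clones are imported, which is the property the convergence requirement asks for.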

3. Test Requirements

The fix should make TestThreeCloneCollision pass without skipping:

  • All three clones must have all three issues (by title)
  • Content must match across all clones (ignoring timestamps and specific ID assignments)
  • Must work for both sync orders (A→B→C and C→A→B)
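The comparison these requirements describe could look roughly like the helpers below, which compare clones by title and description and deliberately ignore IDs. The Issue type and both function names are assumptions for illustration; the actual assertions in TestThreeCloneCollision may be structured differently.

```go
package main

import (
	"fmt"
	"sort"
)

// Issue is a hypothetical, simplified record; the real bd issue has more fields.
type Issue struct {
	ID          string
	Title       string
	Description string
}

// contentKeys returns the sorted title+description pairs of a clone,
// leaving out IDs so that differing ID assignments still compare equal.
func contentKeys(issues []Issue) []string {
	keys := make([]string, 0, len(issues))
	for _, i := range issues {
		keys = append(keys, i.Title+"\x00"+i.Description)
	}
	sort.Strings(keys)
	return keys
}

// SameContent reports whether two clones hold the same issues by content.
func SameContent(a, b []Issue) bool {
	ka, kb := contentKeys(a), contentKeys(b)
	if len(ka) != len(kb) {
		return false
	}
	for i := range ka {
		if ka[i] != kb[i] {
			return false
		}
	}
	return true
}

// MissingTitles lists the expected titles that a clone does not contain.
func MissingTitles(want []string, got []Issue) []string {
	have := make(map[string]bool, len(got))
	for _, i := range got {
		have[i.Title] = true
	}
	var missing []string
	for _, t := range want {
		if !have[t] {
			missing = append(missing, t)
		}
	}
	return missing
}

func main() {
	want := []string{"Issue from clone A", "Issue from clone B", "Issue from clone C"}
	cloneB := []Issue{
		{ID: "test-1", Title: "Issue from clone A"},
		{ID: "test-2", Title: "Issue from clone B"},
	}
	fmt.Println(MissingTitles(want, cloneB)) // reports the title clone B lacks
}
```

Reporting the missing titles, rather than only a count, makes a failure point directly at which clone's issue was lost.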

4. Extend to N-Way

Once 3-way works, verify it generalizes to N workers:

  • Test with 5+ clones
  • Test with different sync order permutations
  • Ensure convergence time is bounded

Files To Examine

  • beads_twoclone_test.go: Contains TestThreeCloneCollision that reproduces the issue
  • cmd/bd/import.go: Import logic with --resolve-collisions flag
  • internal/storage/sqlite/sqlite.go: Database operations for collision detection
  • cmd/bd/sync.go: Sync workflow that calls import/export

Success Criteria

  1. TestThreeCloneCollision passes without skipping
  2. All clones converge to identical content after final pull
  3. No data loss (all issues present in all clones)
  4. ID assignments can be non-deterministic, but content must match
  5. Works for N workers (extend test to 5+ clones)

Current Test Status

go test -v -run TestThreeCloneCollision
# Both subtests SKIP with message:
# "KNOWN LIMITATION: 3-way collisions may require additional resolution logic"

The test is designed to skip when convergence fails, so it won't break CI, but it documents the limitation clearly.