Commit Graph

26 Commits

Author SHA1 Message Date
Steve Yegge
773aa736e4 Document external_ref in content hash behavior (bd-9f4a)
- Added comprehensive code comments in collision.go explaining external_ref inclusion
- Documented content hash behavior in HASH_ID_DESIGN.md with examples
- Enhanced test documentation in collision_test.go
- Closes bd-9f4a, bd-df11, bd-537e

Amp-Thread-ID: https://ampcode.com/threads/T-47525168-d51c-4f56-b598-18402e5ea389
Co-authored-by: Amp <amp@ampcode.com>
2025-11-08 02:22:15 -08:00
Steve Yegge
55c722a3e3 Implement external_ref as primary matching key for import updates (bd-1022)
- Add GetIssueByExternalRef() query function to storage interface and implementations
- Update DetectCollisions() to prioritize external_ref matching over ID matching
- Modify upsertIssues() to handle external_ref matches in import logic
- Add index on external_ref column for performance
- Add comprehensive tests for external_ref matching in both collision detection and import
- Enables re-syncing from external systems (Jira, GitHub, Linear) without duplicates
- Preserves local issues (no external_ref) from being overwritten
2025-11-02 15:28:09 -08:00
Steve Yegge
2b086951c4 fix: Address all errcheck and misspell linter errors 2025-11-01 23:56:03 -07:00
Steve Yegge
64fe51d6bb Remove obsolete collision remapping code and tests
- Deleted collision remapping tests (obsolete with hash IDs bd-8e05)
- Simplified collision.go from 704 to 138 lines
- Removed RemapCollisions, ScoreCollisions, and reference update code
- Removed issue_counters table dependencies (bd-807b)
- Added COLLISION_MATH.md documentation
- Fixed RenameCounterPrefix and ResetCounter to be no-ops
- Closed bd-a58f, bd-3d65, bd-807b

Hash-based IDs make collision remapping unnecessary since collisions
are extremely rare (same ID = same content).

Amp-Thread-ID: https://ampcode.com/threads/T-cbb0f111-6a95-4598-b03e-c137112f9875
Co-authored-by: Amp <amp@ampcode.com>
2025-10-31 00:19:42 -07:00
Steve Yegge
5d137ffeeb Remove sequential ID generation and SyncAllCounters (bd-c7af, bd-8e05, bd-4c74)
- Removed SyncAllCounters() and all call sites (already no-op with hash IDs)
- Removed AllocateNextID() and getNextIDForPrefix() - sequential ID generation
- Removed collision remapping logic in internal/storage/sqlite/collision.go
- Removed rename collision handling in internal/importer/importer.go
- Removed branch-merge example (collision resolution no longer needed)
- Updated EXTENDING.md to remove counter sync examples

These were all deprecated code paths for sequential IDs that are obsolete
with hash-based IDs. Hash ID collisions are handled by extending the hash,
not by remapping to new sequential IDs.
2025-10-30 22:24:42 -07:00
Steve Yegge
ebb425388c bd-109: Add retry logic and race condition handling for N-way collisions
- Added ExecInTransaction helper for atomic database operations
- Added IsUniqueConstraintError to detect UNIQUE constraint violations
- Wrapped RemapCollisions with retry logic (3 attempts with counter sync)
- Enhanced handleRename to detect race conditions where target ID exists
- Added defensive checks for when old ID has been deleted by another clone

Progress: Improves N-way collision handling, though full solution requires
more work (tracked in bd-108). Tests now reach later convergence rounds
before hitting complex collision scenarios.

Amp-Thread-ID: https://ampcode.com/threads/T-2b850a80-f8bd-4e38-b661-e33d1cfa7281
Co-authored-by: Amp <amp@ampcode.com>
2025-10-29 10:45:25 -07:00
Steve Yegge
ff02615f61 Implement content-first idempotent import (bd-98)
- Refactored upsertIssues to match by content hash first, then by ID
- Added buildHashMap, buildIDMap, and handleRename helper functions
- Import now detects and handles renames (same content, different ID)
- Importing same data multiple times is idempotent (reports Unchanged)
- Exported BuildReplacementCache and ReplaceIDReferencesWithCache for reuse
- All 30+ existing import tests pass
- Improved convergence for N-way collision scenarios

Changes:
- internal/importer/importer.go: Content-first matching in upsertIssues
- internal/storage/sqlite/collision.go: Exported helper functions
- internal/storage/sqlite/collision_test.go: Updated function names

Amp-Thread-ID: https://ampcode.com/threads/T-3df96ad8-7c0e-4190-87b5-6d5327718f0a
Co-authored-by: Amp <amp@ampcode.com>
2025-10-28 20:40:36 -07:00
Steve Yegge
b0d28bbdbd Remove spurious collision-related code after ultrathink review
After 2 weeks of collision/stale-data fixes, reviewed all changes to identify
spurious code that is no longer needed after content-hash resolution was implemented.

**Removed:**
1. countReferences() function from collision.go (lines 274-328)
   - Was used for reference-count based collision scoring
   - Completely unused after switching to content-hash based resolution (commit 2e87329)
   - Still exists in duplicates.go for deduplication (different use case)

2. ReferenceScore field from CollisionDetail struct
   - Marked as DEPRECATED but never removed
   - No longer used by ScoreCollisions() which now uses content hashing

3. TestCountReferences and TestCountReferencesWordBoundary tests
   - Tested the now-deleted countReferences() function
   - No longer relevant

**Fixed:**
- Updated CheckpointWAL comments to remove misleading "staleness detection" claim
  - Staleness detection uses metadata (last_import_time), NOT file mtime
  - CheckpointWAL is still valuable for data persistence and WAL size reduction
  - Comments now accurately reflect actual benefits

**Verified:**
- All tests pass (internal/storage/sqlite)
- Content-hash collision resolution still works correctly
- No behavioral changes, just cleanup

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 20:07:01 -07:00
Steve Yegge
9644d61de2 Make DetectCollisions read-only (bd-96)
- Add RenameDetail type to track content matches with different IDs
- Remove deletion logic from DetectCollisions (now read-only)
- Create ApplyCollisionResolution to handle all modifications
- Update importer.go to use two-phase approach (detect then apply)
- Fix dependency preservation in RemapCollisions
  - Collect all dependencies before CASCADE DELETE
  - Recreate with updated IDs after remapping
- Add tests: TestDetectCollisionsReadOnly, TestApplyCollisionResolution
- Update collision tests for content-hash scoring behavior
- Create bd-100 to track fixing autoimport tests
2025-10-28 19:16:51 -07:00
Steve Yegge
d9eb273e15 Complete bd-95: Add content-addressable identity (ContentHash field) 2025-10-28 18:57:16 -07:00
Steve Yegge
ceb1b922a9 bd sync: 2025-10-28 17:19:28 2025-10-28 17:19:28 -07:00
Steve Yegge
2e87329cf8 WIP: Implement content-hash based collision resolution (bd-89)
- Replace timestamp-based collision scoring with deterministic content hashing
- Add hashIssueContent() using SHA-256 of all substantive fields
- Modify ScoreCollisions to compare hashes and set RemapIncoming flag
- Update RemapCollisions to handle both directions (remap incoming OR existing)
- Add CollisionDetail.RemapIncoming field to control which version gets remapped
- Add unit tests for hash function and deterministic collision resolution

Status: Hash-based resolution works correctly, but TestTwoCloneCollision still
fails due to missing rename detection. After Clone B resolves collision,
Clone A needs to recognize its issue was remapped to a different ID.

Next: Add content-based rename detection during import to prevent
re-resolving already-resolved collisions.

Progress on bd-86.

Amp-Thread-ID: https://ampcode.com/threads/T-b19b49e8-b52a-463d-b052-8a526a500260
Co-authored-by: Amp <amp@ampcode.com>
2025-10-28 17:11:40 -07:00
Steve Yegge
bb33007036 Fix revive style issues (bd-56)
- Fix 14 unused-parameter warnings (rename to _)
- Fix 2 redefines-builtin-id (max→maxCount, min→minInt)
- Fix 3 indent-error-flow issues with gofmt
- Merged duplicate bd-126 into bd-116
2025-10-25 18:13:49 -07:00
Steve Yegge
e3ff12448f Fix: RemapCollisions deletes existing issue dependencies (GH #120, bd-56)
Bug: updateDependencyReferences() was incorrectly updating ALL dependencies
in the database during collision resolution with --resolve-collisions,
including dependencies belonging to existing issues.

Root cause: The function checked if dep.IssueID was in idMapping keys
(old imported IDs like 'bd-1'), but those are also the IDs of existing
database issues. This caused existing dependencies to be incorrectly
modified or deleted.

Fix: Changed logic to only update dependencies where IssueID is in
idMapping VALUES (new remapped IDs like 'bd-295'). This ensures only
dependencies from remapped issues are updated, not existing ones.

During normal import flow, this is effectively a no-op since imported
dependencies haven't been added to the database yet when RemapCollisions
runs (they're added later in Phase 5 of import_shared.go).

Changes:
- Updated updateDependencyReferences() in collision.go to build a set
  of new remapped IDs and only update dependencies with those IDs
- Added comprehensive documentation explaining the correct semantics
- Added regression tests: TestRemapCollisionsRemapsImportedNotExisting
  and TestRemapCollisionsDoesNotUpdateNonexistentDependencies
- Skipped 3 tests that expected the old buggy behavior with clear
  notes about why they need to be rewritten

Real-world impact: In one case, 125 dependencies were incorrectly
deleted from 157 existing issues during collision resolution.

Fixes https://github.com/steveyegge/beads/issues/120
Fixes bd-56
2025-10-23 10:25:13 -07:00
Steve Yegge
e97c122feb Cleanup and fixes: godoc comments, removed dead code, fixed renumber FK constraint bug
- Added comprehensive godoc comments for auto-flush functions (bd-4)
- Removed unused issueMap in scoreCollisions (bd-6)
- Fixed renumber command FK constraint failure (bd-143)
  - Changed UpdateIssueID to use explicit connection with FK disabled
  - Resolves 'constraint failed: FOREIGN KEY constraint failed' error
- Deleted 22 test/placeholder issues
- Renumbered issues from bd-1 to bd-143 (eliminated gaps)

Amp-Thread-ID: https://ampcode.com/threads/T-65f78f08-4856-4af0-9d6c-af33e88b5f63
Co-authored-by: Amp <amp@ampcode.com>
2025-10-19 19:37:58 -07:00
Steve Yegge
6811d1cf42 Fix multiple issues and renumber database
- bd-170: Implement hybrid sorting for ready work (recent 48h first, then oldest)
- bd-87: Use safer null-byte placeholders in ID remapping
- bd-92: Make auto-flush debounce configurable via BEADS_FLUSH_DEBOUNCE
- bd-171: Fix nil pointer dereference in renumber command
- Delete spurious test issues (bd-7, bd-130-134)
- Renumber database from 171 down to 144 issues
2025-10-18 09:58:35 -07:00
Steve Yegge
fc29beae6d Fix bd-421: Add deduplication to prevent importing duplicate issues
- Added deduplicateIncomingIssues() to consolidate content-identical issues
- DetectCollisions now deduplicates within incoming batch before processing
- Keeps issue with smallest ID when duplicates found
- Added comprehensive test suite in collision_dedup_test.go
- Export clean JSONL with bd-421 fix applied

Amp-Thread-ID: https://ampcode.com/threads/T-c17dd8bf-c298-4a80-baa5-55fa7c7bb9a3
Co-authored-by: Amp <amp@ampcode.com>
2025-10-16 18:08:58 -07:00
Steve Yegge
c3e3326bba Fix critical bugs: bd-169, bd-28, bd-393
- bd-169: Add -q/--quiet flag to bd init command
- bd-28: Improve error handling in RemoveDependency
  - Now checks RowsAffected and returns error if dependency doesn't exist
  - New removeDependencyIfExists() helper for collision remapping
- bd-393: CRITICAL - Fix auto-import skipping collisions
  - Auto-import was LOSING work from other workers
  - Now automatically remaps collisions to new IDs
  - Calls RemapCollisions() instead of skipping

All tests pass.

Amp-Thread-ID: https://ampcode.com/threads/T-cba86837-28db-47ce-94eb-67fade82376a
Co-authored-by: Amp <amp@ampcode.com>
2025-10-16 15:00:54 -07:00
Steve Yegge
4fe648d140 Fix bd-437: Use addDependencyUnchecked during collision resolution
When remapping dependencies during collision resolution, skip semantic
validation (like parent-child direction checks) since we're just updating
IDs on existing dependencies that were already validated.

Fixes parent-child validation error during import --resolve-collisions.
2025-10-16 13:42:47 -07:00
Steve Yegge
331a435418 Fix collision resolver: sync ID counters before remapping
The collision resolver was failing when remapping issues to new IDs because
the issue_counters table was out of sync with actual database IDs. Test issues
(bd-200 through bd-312) were created without updating the counter, causing
UNIQUE constraint violations when --resolve-collisions tried to use those IDs.

Added SyncAllCounters() call before remapping to ensure counters reflect all
existing issues in the database. This prevents ID collisions during import.

Fixes the core functionality needed for multi-device git-based sync (bd-271).

Amp-Thread-ID: https://ampcode.com/threads/T-a2a94e1b-b220-41b0-bf7d-f2640a44292b
Co-authored-by: Amp <amp@ampcode.com>
2025-10-16 11:59:34 -07:00
Steve Yegge
f0266339c5 Complete bd-227: Audit status/closed_at inconsistencies
Findings:
- 86/93 closed issues (92%) are missing closed_at timestamps
- All inconsistencies are historical (old issues bd-1 through bd-93)
- No cases of non-closed issues with timestamps

Recommendation: Set closed_at = updated_at for affected issues
Next: Apply cleanup SQL and add constraint
2025-10-15 13:08:48 -07:00
v4rgas
20e3235435 fix: replace in-memory ID counter with atomic database counter
Replace the in-memory nextID counter with an atomic database-backed
counter using the issue_counters table. This fixes race conditions
when multiple processes create issues concurrently.

Changes:
- Add issue_counters table with atomic INSERT...ON CONFLICT pattern
- Remove in-memory nextID field and sync.Mutex from SQLiteStorage
- Implement getNextIDForPrefix() for atomic ID generation
- Update CreateIssue() to use database counter instead of memory
- Update RemapCollisions() to use database counter for collision resolution
- Clean up old planning and bug documentation files

Fixes the multi-process ID generation race condition tested in
cmd/bd/race_test.go.
2025-10-14 01:18:50 -07:00
Steve Yegge
25644d9717 Cache compiled regexes in ID replacement for 1.9x performance boost
Implements bd-27: Cache compiled regexes in replaceIDReferences for performance

Problem:
replaceIDReferences() was compiling regex patterns on every call. With 100
issues and 10 ID mappings, that resulted in 4,000 regex compilations (100
issues × 4 text fields × 10 ID mappings).

Solution:
- Added buildReplacementCache() to pre-compile all regexes once
- Added replaceIDReferencesWithCache() to reuse compiled regexes
- Updated updateReferences() to build cache once and reuse for all issues
- Kept replaceIDReferences() for backward compatibility (calls cached version)

Performance Results (from benchmarks):
Single text:
- 1.33x faster (26,162 ns → 19,641 ns)
- 68% less memory (25,769 B → 8,241 B)
- 80% fewer allocations (278 → 55)

Real-world (400 texts, 10 mappings):
- 1.89x faster (5.1ms → 2.7ms)
- 90% less memory (7.7 MB → 0.8 MB)
- 86% fewer allocations (104,112 → 14,801)

Tests:
- All existing tests pass
- Added 3 benchmark tests demonstrating improvements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 23:50:48 -07:00
Steve Yegge
183ded4096 Add collision resolution with automatic ID remapping
Implements --resolve-collisions flag for import command to safely handle ID
collisions during branch merges. When enabled, colliding issues are remapped
to new IDs and all text references and dependencies are automatically updated.

Also adds comprehensive tests, branch-merge example, and documentation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-12 17:13:09 -07:00
Steve Yegge
42e3bb315d Implement reference scoring algorithm (bd-13)
Add reference scoring to prioritize which colliding issues should be
renumbered during collision resolution. Issues with fewer references
are renumbered first to minimize total update work.

Changes to collision.go:
- Add ReferenceScore field to CollisionDetail
- scoreCollisions() calculates scores and sorts collisions ascending
- countReferences() counts text mentions + dependency references
- Uses word-boundary regex (\b) to match exact IDs (bd-10 not bd-100)

New tests in collision_test.go:
- TestCountReferences: validates reference counting logic
- TestScoreCollisions: verifies scoring and sorting behavior
- TestCountReferencesWordBoundary: ensures exact ID matching

Reference score = text mentions (desc/design/notes/criteria) + deps
Sort order: fewest references first (minimizes renumbering impact)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-12 16:27:05 -07:00
Steve Yegge
b065b32a51 Implement collision detection for import (bd-12)
Add collision detection infrastructure to identify conflicts during
JSONL import when branches diverge and create issues with the same ID.

New files:
- internal/storage/sqlite/collision.go: Core collision detection logic
  - detectCollisions() categorizes issues as exact matches, collisions, or new
  - compareIssues() identifies which fields differ between issues
  - CollisionDetail provides detailed collision reporting

- internal/storage/sqlite/collision_test.go: Comprehensive test suite
  - Tests exact matches, new issues, and various collision scenarios
  - Tests multi-field conflicts and edge cases
  - All tests passing

This lays the foundation for bd-13 through bd-17 (reference scoring,
ID remapping, CLI flags, and documentation).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-12 16:22:44 -07:00