Commit Graph

7 Commits

Author SHA1 Message Date
Steve Yegge
ebb425388c bd-109: Add retry logic and race condition handling for N-way collisions
- Added ExecInTransaction helper for atomic database operations
- Added IsUniqueConstraintError to detect UNIQUE constraint violations
- Wrapped RemapCollisions with retry logic (3 attempts with counter sync)
- Enhanced handleRename to detect race conditions where target ID exists
- Added defensive checks for when old ID has been deleted by another clone

Progress: Improves N-way collision handling, though full solution requires
more work (tracked in bd-108). Tests now reach later convergence rounds
before hitting complex collision scenarios.

Amp-Thread-ID: https://ampcode.com/threads/T-2b850a80-f8bd-4e38-b661-e33d1cfa7281
Co-authored-by: Amp <amp@ampcode.com>
2025-10-29 10:45:25 -07:00
Steve Yegge
ff02615f61 Implement content-first idempotent import (bd-98)
- Refactored upsertIssues to match by content hash first, then by ID
- Added buildHashMap, buildIDMap, and handleRename helper functions
- Import now detects and handles renames (same content, different ID)
- Importing same data multiple times is idempotent (reports Unchanged)
- Exported BuildReplacementCache and ReplaceIDReferencesWithCache for reuse
- All 30+ existing import tests pass
- Improved convergence for N-way collision scenarios

Changes:
- internal/importer/importer.go: Content-first matching in upsertIssues
- internal/storage/sqlite/collision.go: Exported helper functions
- internal/storage/sqlite/collision_test.go: Updated function names

Amp-Thread-ID: https://ampcode.com/threads/T-3df96ad8-7c0e-4190-87b5-6d5327718f0a
Co-authored-by: Amp <amp@ampcode.com>
2025-10-28 20:40:36 -07:00
Steve Yegge
b0d28bbdbd Remove spurious collision-related code after ultrathink review
After 2 weeks of collision/stale-data fixes, reviewed all changes to identify
spurious code that is no longer needed after content-hash resolution was implemented.

**Removed:**
1. countReferences() function from collision.go (lines 274-328)
   - Was used for reference-count based collision scoring
   - Completely unused after switching to content-hash based resolution (commit 2e87329)
   - Still exists in duplicates.go for deduplication (different use case)

2. ReferenceScore field from CollisionDetail struct
   - Marked as DEPRECATED but never removed
   - No longer used by ScoreCollisions() which now uses content hashing

3. TestCountReferences and TestCountReferencesWordBoundary tests
   - Tested the now-deleted countReferences() function
   - No longer relevant

**Fixed:**
- Updated CheckpointWAL comments to remove misleading "staleness detection" claim
  - Staleness detection uses metadata (last_import_time), NOT file mtime
  - CheckpointWAL is still valuable for data persistence and WAL size reduction
  - Comments now accurately reflect actual benefits

**Verified:**
- All tests pass (internal/storage/sqlite)
- Content-hash collision resolution still works correctly
- No behavioral changes, just cleanup

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 20:07:01 -07:00
Steve Yegge
9644d61de2 Make DetectCollisions read-only (bd-96)
- Add RenameDetail type to track content matches with different IDs
- Remove deletion logic from DetectCollisions (now read-only)
- Create ApplyCollisionResolution to handle all modifications
- Update importer.go to use two-phase approach (detect then apply)
- Fix dependency preservation in RemapCollisions
  - Collect all dependencies before CASCADE DELETE
  - Recreate with updated IDs after remapping
- Add tests: TestDetectCollisionsReadOnly, TestApplyCollisionResolution
- Update collision tests for content-hash scoring behavior
- Create bd-100 to track fixing autoimport tests
2025-10-28 19:16:51 -07:00
Steve Yegge
d9eb273e15 Complete bd-95: Add content-addressable identity (ContentHash field) 2025-10-28 18:57:16 -07:00
Steve Yegge
a46c2f79a9 Resolve merge conflicts: use importer package 2025-10-27 22:44:40 -07:00
Steve Yegge
adfe177dba Fix bd-132: Implement daemon auto-import after git pull
- Created internal/importer package with all import logic
- Moved import phases from cmd/bd to internal/importer
- Implemented real importFunc in daemon's checkAndAutoImportIfStale()
- Added single-flight concurrency guard to prevent parallel imports
- Added fast mtime check to avoid unnecessary file reads (99% of requests <0.1ms)
- Fixed import options: RenameOnImport=true instead of SkipPrefixValidation
- Added export trigger after ID remapping to prevent collision loops
- Fixed memory storage interface: added GetDirtyIssueHash, GetExportHash, SetExportHash
- Updated GetDependencyTree signature for reverse parameter

Performance:
- Mtime check: ~0.01ms per request
- Import when needed: ~10-100ms (rare, only after git pull)
- Throughput maintained: 4300+ issues/sec
- No duplicate work with single-flight guard

Fixes critical data corruption bug where daemon served stale data after
git pull, causing fresh JSONL changes to be overwritten.

Amp-Thread-ID: https://ampcode.com/threads/T-71224a2d-b2d7-4173-b21e-449b64f9dd71
Co-authored-by: Amp <amp@ampcode.com>
2025-10-27 16:29:12 -07:00