Commit Graph

28 Commits

Author SHA1 Message Date
Steve Yegge
ff8f6ecadf feat(import): add import.orphan_handling config with 4 modes
- Add GetOrphanHandling() helper to SQLiteStorage (reads from config table)
- Add --orphan-handling flag to 'bd import' command
- Wire OrphanHandling through ImportOptions -> importer.Options
- Auto-read config if flag not provided (default: 'allow')
- Document in CONFIG.md with detailed mode explanations

Modes:
- strict: Fail on missing parent (safest)
- resurrect: Auto-create parent tombstones from JSONL
- skip: Skip orphans with warning
- allow: Import without validation (default, most permissive)

Closes bd-8072, bd-b92a

Amp-Thread-ID: https://ampcode.com/threads/T-fd18d4a5-06b3-4400-9073-194d570846d8
Co-authored-by: Amp <amp@ampcode.com>
2025-11-04 23:59:50 -08:00
Steve Yegge
0bf5c91cb3 Wire OrphanHandling through import pipeline (bd-8072)
- Added OrphanHandling type to sqlite package with 4 modes: strict/resurrect/skip/allow
- Updated EnsureIDs() to accept orphanHandling parameter and implement mode logic
- Added CreateIssuesWithOptions() that passes orphan handling through batch creation
- Made importer.OrphanHandling an alias to sqlite.OrphanHandling
- Importer now respects opts.OrphanHandling during batch issue creation

Next: Add import.orphan_handling config and wire through CLI commands
Amp-Thread-ID: https://ampcode.com/threads/T-bb7ffdd9-f444-4975-b5f7-bfff97cb92ff
Co-authored-by: Amp <amp@ampcode.com>
2025-11-04 23:53:44 -08:00
Steve Yegge
09f2edafba Add OrphanHandling type with 'allow' as default (bd-8072)
- Add OrphanHandling enum: strict/resurrect/skip/allow
- Add OrphanHandling field to importer.Options
- Default to 'allow' mode to work around existing system bugs
- Strict mode can be enabled via config for safer imports

Related: bd-8072, bd-b92a
2025-11-04 23:40:22 -08:00
Steve Yegge
d38a312583 Fix bd-3xq: Import gracefully handles missing parents
Implemented hybrid approach (topological sort + resurrection):

Phase 1: Import ordering (fixes latent bug)
- Sort issues by hierarchy depth before batch creation
- Create in depth-ordered batches (0→1→2→3)
- Ensures parents always created before children

Phase 2: Parent resurrection
- Attempt to resurrect missing parents from import batch
- Only fail if parent truly doesn't exist anywhere
- Enables deleted parent scenarios to work correctly

Benefits:
- Fixes import failure when parents deleted via bd-delete
- Handles parent-child pairs in same import batch
- Maintains referential integrity
- Enables multi-repo workflows with divergent deletion states

Amp-Thread-ID: https://ampcode.com/threads/T-14d3a206-aeac-4499-8ae9-47f3715e18fa
Co-authored-by: Amp <amp@ampcode.com>
2025-11-04 23:12:39 -08:00
Steve Yegge
d6e2ff6151 Phase 1: Add topological sorting to fix import ordering
- Add sort.go with depth-based utilities (GetHierarchyDepth, SortByDepth, GroupByDepth)
- Sort issues by hierarchy depth before batch creation
- Create in depth-order batches (0→1→2→3)
- Fixes latent bug: parent-child pairs in same batch could fail if wrong order
- Comprehensive tests for all sorting functions
- Closes bd-37dd, bd-3433, bd-8b65

Part of bd-d19a (Fix import failure on missing parent issues)

Amp-Thread-ID: https://ampcode.com/threads/T-44a36985-b59c-426f-834c-60a0faa0f9fb
Co-authored-by: Amp <amp@ampcode.com>
2025-11-04 22:25:26 -08:00
Steve Yegge
178a43dd5d Add external_ref UNIQUE constraint and validation
- Add migration for UNIQUE index on external_ref column (bd-897a)
- Add validation for duplicate external_ref in batch imports (bd-7315)
- Add query planner test to verify index usage (bd-f9a1)
- Add concurrent import tests for external_ref (bd-3f6a)

The migration detects existing duplicates and fails gracefully.
Batch imports now reject duplicates with clear error messages.
Tests verify the index is actually used by SQLite query planner.

Amp-Thread-ID: https://ampcode.com/threads/T-45ca66ed-3912-46c4-963c-caa7724a9a2f
Co-authored-by: Amp <amp@ampcode.com>
2025-11-02 16:27:42 -08:00
Steve Yegge
55c722a3e3 Implement external_ref as primary matching key for import updates (bd-1022)
- Add GetIssueByExternalRef() query function to storage interface and implementations
- Update DetectCollisions() to prioritize external_ref matching over ID matching
- Modify upsertIssues() to handle external_ref matches in import logic
- Add index on external_ref column for performance
- Add comprehensive tests for external_ref matching in both collision detection and import
- Enables re-syncing from external systems (Jira, GitHub, Linear) without duplicates
- Preserves local issues (no external_ref) from being overwritten
2025-11-02 15:28:09 -08:00
Steve Yegge
09bd4d3462 Fix CI test failures (bd-1231)
- Always recompute content_hash in importer to avoid stale hashes from JSONL
- Add .gitignore to test repos to prevent database files from being tracked
- Fix daemon auto-import test to use correct RPC operation ('show' not 'get_issue')
- Set last_import_time metadata in test helper to enable staleness check
- Add filesystem settle delay after git pull in tests

Root cause: CloseIssue updates status but not content_hash, so importer
thought issues were unchanged. Always recomputing content_hash fixes this.

Amp-Thread-ID: https://ampcode.com/threads/T-63ef3a7d-8efe-472d-97ed-6ac95bd8318b
Co-authored-by: Amp <amp@ampcode.com>
2025-11-02 08:55:34 -08:00
Steve Yegge
d240439868 fix: Resolve Windows test failures and lint warnings
- Fix Windows binary path issues (bd.exe vs bd)
- Skip scripttest on Windows (requires Unix shell)
- Skip file lock tests on Windows (platform locking differences)
- Fix registry tests to use USERPROFILE on Windows
- Fix 8 unparam lint warnings by marking unused params with _

All changes are platform-aware and maintain functionality.

Amp-Thread-ID: https://ampcode.com/threads/T-bc27021a-65db-4b64-a3f3-4e8d7bc8aa0d
Co-authored-by: Amp <amp@ampcode.com>
2025-11-02 08:30:31 -08:00
Steve Yegge
a4b4778ced test: improve coverage for importer and sqlite utils
- Fix TestImportIssues_Update by adding timestamps to test issue
- Add comprehensive tests for sqlite utility functions (IsUniqueConstraintError, QueryContext, BeginTx, ExecInTransaction)
- Coverage improvements: sqlite util.go 0% -> 95%, sqlite package 56.4% -> 57.1%

Amp-Thread-ID: https://ampcode.com/threads/T-17e6a3e4-f881-4f53-b670-bdd796d58f68
Co-authored-by: Amp <amp@ampcode.com>
2025-11-01 11:35:31 -07:00
Steve Yegge
253caef761 Fix bd-e55c: Import respects updated_at timestamps
- Import now checks timestamps before updating issues
- Only applies updates if incoming version is newer than local
- Prevents older remote versions from overwriting newer local changes
- Added comprehensive tests for timestamp precedence
- Fixes issue where git pull would revert local changes to open status
2025-10-31 18:06:05 -07:00
Steve Yegge
f1dd1e4446 test: improve internal/importer coverage from 33.5% to 70.5%
- Add comprehensive tests for ImportIssues main function
- Test basic import, updates, dry-run, dependencies, labels
- Test helper functions: getOrCreateStore, GetPrefixList
- All tests passing
2025-10-31 18:01:43 -07:00
Steve Yegge
54391798a0 Add tests for internal/importer field comparison functions
- Added tests for equalPtrStr (0% -> 100% coverage)
- Added tests for equalIssueType (50% -> 100% coverage)
- Improved internal/importer coverage from 31.2% to 33.5%
- Overall project coverage increased to 46.0%
2025-10-31 17:54:14 -07:00
Steve Yegge
8bf6b1eb63 Add unit tests for autoimport, importer, and main CLI
Amp-Thread-ID: https://ampcode.com/threads/T-b89cad6b-636f-477f-925d-4c3e3f769215
Co-authored-by: Amp <amp@ampcode.com>
2025-10-31 17:17:33 -07:00
Steve Yegge
d5488cb97f Remove collision-era language from docs and code
- Updated FAQ.md, ADVANCED.md, TROUBLESHOOTING.md to explain hash IDs eliminate collisions
- Removed --resolve-collisions references from all documentation and examples
- Renamed handleCollisions() to detectUpdates() to reflect update semantics
- Updated test names: TestAutoImportWithCollision → TestAutoImportWithUpdate
- Clarified: with hash IDs, same-ID = update operation, not collision

Closes: bd-50a7, bd-b84f, bd-bda8, bd-650c, bd-3ef2, bd-c083, bd-85a6
2025-10-31 14:24:50 -07:00
Steve Yegge
eb18d342f4 Fix: Treat same-ID updates as normal updates, not collisions (bd-0134cc5a) 2025-10-31 13:39:38 -07:00
Steve Yegge
0725c33fcc Remove vestigial --resolve-collisions flag
Hash-based IDs make collision resolution unnecessary. The flag was
already non-functional (handleCollisions returns error on collision
regardless of flag value).

Removed:
- --resolve-collisions flag from bd import
- ResolveCollisions field from ImportOptions and importer.Options
- All references in daemon, auto-import, and tests
- Updated error messages to reflect hash IDs don't collide

All import tests pass.

Amp-Thread-ID: https://ampcode.com/threads/T-47dfa0cc-bb71-4467-ac86-f0966a7c5d58
Co-authored-by: Amp <amp@ampcode.com>
2025-10-31 01:07:42 -07:00
Steve Yegge
64fe51d6bb Remove obsolete collision remapping code and tests
- Deleted collision remapping tests (obsolete with hash IDs bd-8e05)
- Simplified collision.go from 704 to 138 lines
- Removed RemapCollisions, ScoreCollisions, and reference update code
- Removed issue_counters table dependencies (bd-807b)
- Added COLLISION_MATH.md documentation
- Fixed RenameCounterPrefix and ResetCounter to be no-ops
- Closed bd-a58f, bd-3d65, bd-807b

Hash-based IDs make collision remapping unnecessary since collisions
are extremely rare (same ID = same content).

Amp-Thread-ID: https://ampcode.com/threads/T-cbb0f111-6a95-4598-b03e-c137112f9875
Co-authored-by: Amp <amp@ampcode.com>
2025-10-31 00:19:42 -07:00
Steve Yegge
5d137ffeeb Remove sequential ID generation and SyncAllCounters (bd-c7af, bd-8e05, bd-4c74)
- Removed SyncAllCounters() and all call sites (already no-op with hash IDs)
- Removed AllocateNextID() and getNextIDForPrefix() - sequential ID generation
- Removed collision remapping logic in internal/storage/sqlite/collision.go
- Removed rename collision handling in internal/importer/importer.go
- Removed branch-merge example (collision resolution no longer needed)
- Updated EXTENDING.md to remove counter sync examples

These were all deprecated code paths for sequential IDs that are obsolete
with hash-based IDs. Hash ID collisions are handled by extending the hash,
not by remapping to new sequential IDs.
2025-10-30 22:24:42 -07:00
Steve Yegge
c34b93fa1a Fix bd-160: Implement JSONL integrity validation and prevent export deduplication data loss
## Problem
Export deduplication feature broke when JSONL and export_hashes diverged
(e.g., after git pull/reset). This caused exports to skip issues that
weren't actually in the file, leading to silent data loss.

## Solution
1. JSONL integrity validation before every export
   - Store JSONL file hash after export
   - Validate hash before export, clear export_hashes if mismatch
   - Automatically recovers from git operations changing JSONL

2. Clear export_hashes on all imports
   - Prevents stale hashes from causing future export failures
   - Import operations invalidate export_hashes state

3. Add Storage interface methods:
   - GetJSONLFileHash/SetJSONLFileHash for integrity tracking
   - ClearAllExportHashes for recovery

## Tests Added
- TestJSONLIntegrityValidation: Unit tests for validation logic
- TestImportClearsExportHashes: Verifies imports clear hashes
- TestExportIntegrityAfterJSONLTruncation: Simulates git reset (would have caught bd-160)
- TestExportIntegrityAfterJSONLDeletion: Tests recovery from file deletion
- TestMultipleExportsStayConsistent: Tests repeated export integrity

## Follow-up
Created bd-179 epic for remaining integration test gaps (multi-repo sync,
daemon auto-sync, corruption recovery tests).

Closes bd-160
2025-10-29 21:57:15 -07:00
Steve Yegge
55f803a7c9 Fix multi-round convergence for N-way collisions (bd-108)
- Add AllocateNextID() public method to SQLiteStorage for cross-package ID allocation
- Enhance handleRename() to handle collision during rename with retry logic
- Fix stale ID map issue by removing deleted IDs from dbByID after rename
- Update edge case tests to use convergence rounds consistently
- All N-way collision tests now pass (TestFiveCloneCollision, TestEdgeCases)
2025-10-29 11:08:28 -07:00
Steve Yegge
ebb425388c bd-109: Add retry logic and race condition handling for N-way collisions
- Added ExecInTransaction helper for atomic database operations
- Added IsUniqueConstraintError to detect UNIQUE constraint violations
- Wrapped RemapCollisions with retry logic (3 attempts with counter sync)
- Enhanced handleRename to detect race conditions where target ID exists
- Added defensive checks for when old ID has been deleted by another clone

Progress: Improves N-way collision handling, though full solution requires
more work (tracked in bd-108). Tests now reach later convergence rounds
before hitting complex collision scenarios.

Amp-Thread-ID: https://ampcode.com/threads/T-2b850a80-f8bd-4e38-b661-e33d1cfa7281
Co-authored-by: Amp <amp@ampcode.com>
2025-10-29 10:45:25 -07:00
Steve Yegge
ff02615f61 Implement content-first idempotent import (bd-98)
- Refactored upsertIssues to match by content hash first, then by ID
- Added buildHashMap, buildIDMap, and handleRename helper functions
- Import now detects and handles renames (same content, different ID)
- Importing same data multiple times is idempotent (reports Unchanged)
- Exported BuildReplacementCache and ReplaceIDReferencesWithCache for reuse
- All 30+ existing import tests pass
- Improved convergence for N-way collision scenarios

Changes:
- internal/importer/importer.go: Content-first matching in upsertIssues
- internal/storage/sqlite/collision.go: Exported helper functions
- internal/storage/sqlite/collision_test.go: Updated function names

Amp-Thread-ID: https://ampcode.com/threads/T-3df96ad8-7c0e-4190-87b5-6d5327718f0a
Co-authored-by: Amp <amp@ampcode.com>
2025-10-28 20:40:36 -07:00
Steve Yegge
b0d28bbdbd Remove spurious collision-related code after ultrathink review
After 2 weeks of collision/stale-data fixes, reviewed all changes to identify
spurious code that is no longer needed after content-hash resolution was implemented.

**Removed:**
1. countReferences() function from collision.go (lines 274-328)
   - Was used for reference-count based collision scoring
   - Completely unused after switching to content-hash based resolution (commit 2e87329)
   - Still exists in duplicates.go for deduplication (different use case)

2. ReferenceScore field from CollisionDetail struct
   - Marked as DEPRECATED but never removed
   - No longer used by ScoreCollisions() which now uses content hashing

3. TestCountReferences and TestCountReferencesWordBoundary tests
   - Tested the now-deleted countReferences() function
   - No longer relevant

**Fixed:**
- Updated CheckpointWAL comments to remove misleading "staleness detection" claim
  - Staleness detection uses metadata (last_import_time), NOT file mtime
  - CheckpointWAL is still valuable for data persistence and WAL size reduction
  - Comments now accurately reflect actual benefits

**Verified:**
- All tests pass (internal/storage/sqlite)
- Content-hash collision resolution still works correctly
- No behavioral changes, just cleanup

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 20:07:01 -07:00
Steve Yegge
9644d61de2 Make DetectCollisions read-only (bd-96)
- Add RenameDetail type to track content matches with different IDs
- Remove deletion logic from DetectCollisions (now read-only)
- Create ApplyCollisionResolution to handle all modifications
- Update importer.go to use two-phase approach (detect then apply)
- Fix dependency preservation in RemapCollisions
  - Collect all dependencies before CASCADE DELETE
  - Recreate with updated IDs after remapping
- Add tests: TestDetectCollisionsReadOnly, TestApplyCollisionResolution
- Update collision tests for content-hash scoring behavior
- Create bd-100 to track fixing autoimport tests
2025-10-28 19:16:51 -07:00
Steve Yegge
d9eb273e15 Complete bd-95: Add content-addressable identity (ContentHash field) 2025-10-28 18:57:16 -07:00
Steve Yegge
a46c2f79a9 Resolve merge conflicts: use importer package 2025-10-27 22:44:40 -07:00
Steve Yegge
adfe177dba Fix bd-132: Implement daemon auto-import after git pull
- Created internal/importer package with all import logic
- Moved import phases from cmd/bd to internal/importer
- Implemented real importFunc in daemon's checkAndAutoImportIfStale()
- Added single-flight concurrency guard to prevent parallel imports
- Added fast mtime check to avoid unnecessary file reads (99% of requests <0.1ms)
- Fixed import options: RenameOnImport=true instead of SkipPrefixValidation
- Added export trigger after ID remapping to prevent collision loops
- Fixed memory storage interface: added GetDirtyIssueHash, GetExportHash, SetExportHash
- Updated GetDependencyTree signature for reverse parameter

Performance:
- Mtime check: ~0.01ms per request
- Import when needed: ~10-100ms (rare, only after git pull)
- Throughput maintained: 4300+ issues/sec
- No duplicate work with single-flight guard

Fixes critical data corruption bug where daemon served stale data after
git pull, causing fresh JSONL changes to be overwritten.

Amp-Thread-ID: https://ampcode.com/threads/T-71224a2d-b2d7-4173-b21e-449b64f9dd71
Co-authored-by: Amp <amp@ampcode.com>
2025-10-27 16:29:12 -07:00