Amp-Thread-ID: https://ampcode.com/threads/T-14d3a206-aeac-4499-8ae9-47f3715e18fa Co-authored-by: Amp <amp@ampcode.com>
bd-3xq: Import Failure on Missing Parent Issues - Deep Analysis
Issue ID: bd-3xq Analysis Date: 2025-11-04 Status: P0 Bug
Executive Summary
The beads import process fails as a whole when the JSONL file references deleted parent issues: a single orphaned reference blocks all imports. This is caused by overly strict parent validation in two critical code paths. The root issue is a design tension between referential integrity and operational flexibility.
Key Finding: The current implementation prioritizes database integrity over forward-compatibility, making normal operations like bd-delete potentially destructive to future imports.
Problem Deep Dive
The Failure Scenario
- User deletes old/obsolete issues via `bd-delete` for hygiene ✓ (valid operation)
- Issues remain in git history but are removed from the database ✓ (expected)
- The JSONL file in git contains child issues whose parents were deleted ✗ (orphaned references)
- Auto-import fails completely: `parent issue bd-1f4086c5 does not exist` ✗
- The database becomes stuck: 296 issues in the DB, while newer data in the JSONL cannot sync ✗
Technical Root Cause
Parent validation occurs in two critical locations:
1. `internal/storage/sqlite/ids.go:189-202` - in `EnsureIDs()`:

```go
// For hierarchical IDs (bd-a3f8e9.1), validate parent exists
if strings.Contains(issues[i].ID, ".") {
	// Extract parent ID (everything before the last dot)
	lastDot := strings.LastIndex(issues[i].ID, ".")
	parentID := issues[i].ID[:lastDot]
	var parentCount int
	err := conn.QueryRowContext(ctx, `SELECT COUNT(*) FROM issues WHERE id = ?`, parentID).Scan(&parentCount)
	if err != nil {
		return fmt.Errorf("failed to check parent existence: %w", err)
	}
	if parentCount == 0 {
		return fmt.Errorf("parent issue %s does not exist", parentID) // ⚠️ BLOCKS ENTIRE IMPORT
	}
}
```
2. `internal/storage/sqlite/sqlite.go:182-196` - in `CreateIssue()`:

```go
// For hierarchical IDs (bd-a3f8e9.1), validate parent exists
if strings.Contains(issue.ID, ".") {
	// Extract parent ID (everything before the last dot)
	lastDot := strings.LastIndex(issue.ID, ".")
	parentID := issue.ID[:lastDot]
	var parentCount int
	err = conn.QueryRowContext(ctx, `SELECT COUNT(*) FROM issues WHERE id = ?`, parentID).Scan(&parentCount)
	if err != nil {
		return fmt.Errorf("failed to check parent existence: %w", err)
	}
	if parentCount == 0 {
		return fmt.Errorf("parent issue %s does not exist", parentID) // ⚠️ BLOCKS CREATION
	}
}
```
Analysis: Both functions perform identical validation, creating a redundant but mutually reinforcing barrier. This is defensive programming taken too far: it blocks valid evolution scenarios, such as re-importing after a parent has been deliberately deleted.
Critical Insight: The Import Ordering Bug
Hidden Problem in `importer.go:534-546`
The `upsertIssues()` function has a latent bug that compounds the parent-validation issue:
```go
// Batch create all new issues
if len(newIssues) > 0 {
	if err := sqliteStore.CreateIssues(ctx, newIssues, "import"); err != nil {
		return fmt.Errorf("error creating issues: %w", err)
	}
	result.Created += len(newIssues)
}
```
The Problem: `newIssues` is not sorted by hierarchy depth before batch creation!

If the import includes:

- `bd-abc123` (parent)
- `bd-abc123.1` (child)

and they happen to be ordered `[child, parent]` in the slice, the import fails even though both parent and child are present in the batch.
Why This Matters: Even if we relax parent validation to allow missing parents, we still need proper topological sorting to handle parent-child pairs in the same batch.
Design Analysis: Three Competing Forces
1. Referential Integrity (Current Priority)
- Goal: Prevent orphaned children in the database
- Benefit: Clean, consistent data structure
- Cost: Blocks valid operations, makes deletion risky
2. Operational Flexibility (Sacrificed)
- Goal: Allow normal database maintenance (deletions, pruning)
- Benefit: Database hygiene, reduced clutter
- Cost: Currently incompatible with strict integrity
3. Multi-Repo Sync (Broken)
- Goal: Share issues across clones with different histories
- Benefit: Collaboration, distributed workflow
- Cost: Different deletion states across clones break imports
Current State: Force 1 wins at the expense of Forces 2 and 3.
Solution Space Analysis
Option 1: Strict Validation with Import-Time Parent Creation ⭐
Approach: Keep strict validation but auto-resurrect deleted parents during import.
How It Works:
- When importing a child with a missing parent, check git history
- If the parent is found in JSONL history, resurrect it as a tombstone
- Tombstone: `status=deleted`, minimal metadata, preserved for structure
- Child import succeeds with a valid parent reference
Pros:
- ✅ Maintains referential integrity
- ✅ Allows forward-rolling imports
- ✅ Preserves dependency tree structure
- ✅ Minimal code changes
Cons:
- ⚠️ Database accumulates tombstones (but they're marked deleted)
- ⚠️ Requires git history access (already available)
- ⚠️ Slight complexity increase
Code Changes Required:
- Modify `EnsureIDs()` and `CreateIssue()` to accept a "resurrect" mode
- Add a `TryResurrectParent(ctx, parentID)` function
- Parse JSONL history to find the deleted parent
- Create the parent with `status="deleted"` and the flag `is_tombstone=true`
Risk Level: Low - Backwards compatible, preserves semantics
Option 2: Relaxed Validation - Skip Orphans
Approach: Log warning and skip orphaned children during import.
How It Works:
- Remove the `if parentCount == 0` error return
- Replace it with: `log.Warnf("Skipping orphaned issue %s (parent %s not found)", childID, parentID)`
- Continue the import with the remaining issues
- Report skipped issues at the end
Pros:
- ✅ Simplest implementation
- ✅ Unblocks imports immediately
- ✅ No data corruption
Cons:
- ❌ Silently loses data (orphaned issues)
- ❌ Hard to notice what was skipped
- ❌ Breaks user expectations (import should import everything)
Risk Level: Medium - Data loss risk
Option 3: Relaxed Validation - Allow Orphans
Approach: Import orphaned children without parent validation.
How It Works:
- Remove the parent existence check entirely
- Allow `bd-abc123.1` to exist without `bd-abc123`
- UI/CLI queries handle missing parents gracefully
Pros:
- ✅ Maximum flexibility
- ✅ Simple code change
- ✅ Unblocks all scenarios
Cons:
- ❌ Breaks dependency tree integrity
- ❌ UI/CLI must handle orphans everywhere
- ❌ Hierarchical ID semantics become meaningless
- ❌ Risk of cascading failures in tree operations
Risk Level: High - Semantic corruption
Option 4: Convert Hierarchical to Top-Level
Approach: When parent missing, flatten child ID to top-level.
How It Works:
- Detect the orphaned child: `bd-abc123.1`
- Convert it to a top-level ID: `bd-abc123-1` (dot → dash)
- Import it as an independent issue
- Log the transformation
Pros:
- ✅ Preserves all issues
- ✅ Maintains uniqueness
- ✅ No data loss
Cons:
- ❌ Changes issue IDs (breaks references)
- ❌ Loses hierarchical relationship
- ❌ Confusing for users
Risk Level: Medium - ID stability risk
Option 5: Two-Pass Import with Topological Sort ⭐⭐
Approach: Sort issues by hierarchy depth before batch creation.
How It Works:
- Pre-process phase: separate issues into depth buckets
  - Depth 0: `bd-abc123` (no dots)
  - Depth 1: `bd-abc123.1` (one dot)
  - Depth 2: `bd-abc123.1.2` (two dots)
- Import phase: create in depth order (0 → 1 → 2)
- Parent resolution: for missing parents, try one of:
  - Option A: resurrect from JSONL (Option 1)
  - Option B: skip with a warning (Option 2)
  - Option C: create a placeholder parent
Pros:
- ✅ Fixes latent import ordering bug
- ✅ Handles parent-child pairs in same batch
- ✅ Can combine with other options (1, 2, or 3)
- ✅ More robust import pipeline
Cons:
- ⚠️ Requires refactoring `upsertIssues()`
- ⚠️ Slight performance overhead (sorting)
Code Changes Required:

```go
// In upsertIssues(), before batch creation:
// Sort newIssues by hierarchy depth so parents are created first.
sort.Slice(newIssues, func(i, j int) bool {
	depthI := strings.Count(newIssues[i].ID, ".")
	depthJ := strings.Count(newIssues[j].ID, ".")
	if depthI != depthJ {
		return depthI < depthJ // Shallower first
	}
	return newIssues[i].ID < newIssues[j].ID // Deterministic order within a depth
})

// Then batch create by depth level.
for depth := 0; depth <= 3; depth++ { // Max supported depth: 3
	var batchForDepth []*types.Issue
	for _, issue := range newIssues {
		if strings.Count(issue.ID, ".") == depth {
			batchForDepth = append(batchForDepth, issue)
		}
	}
	if len(batchForDepth) > 0 {
		if err := sqliteStore.CreateIssues(ctx, batchForDepth, "import"); err != nil {
			return fmt.Errorf("error creating depth-%d issues: %w", depth, err)
		}
		result.Created += len(batchForDepth)
	}
}
```
Risk Level: Low - Fixes existing bug, improves robustness
Recommended Solution: Hybrid Approach 🎯
Combine Options 1 + 5: Two-pass import with parent resurrection.
Implementation Plan
Phase 1: Fix Import Ordering (Option 5)
- Refactor `upsertIssues()` to sort by depth
- Add depth-based batch creation
- Add tests for parent-child pairs in the same batch
Phase 2: Add Parent Resurrection (Option 1)
- Create a `TryResurrectParent(ctx, parentID)` function
- Modify `EnsureIDs()` to call resurrection before validation
- Add an `is_tombstone` flag to the schema (optional)
- Log resurrected parents for transparency
Phase 3: Make Configurable
- Add a config option: `import.orphan_handling`
  - `strict`: current behavior (fail on a missing parent)
  - `resurrect`: auto-resurrect from JSONL (default)
  - `skip`: skip orphaned issues with a warning
  - `allow`: allow orphans (relaxed mode)
Benefits of Hybrid Approach
- ✅ Fixes latent ordering bug (prevents future issues)
- ✅ Handles deleted parents gracefully
- ✅ Maintains referential integrity
- ✅ Provides user control via config
- ✅ Backwards compatible (strict mode available)
- ✅ Enables multi-repo workflows
Edge Cases to Consider
1. Parent Deleted in Multiple Levels
Scenario: `bd-abc.1.2` exists but both `bd-abc` and `bd-abc.1` are deleted.
Resolution: Recursive resurrection - resurrect entire chain.
2. Parent Never Existed in JSONL
Scenario: JSONL corruption or manual ID manipulation.
Resolution:
- In `resurrect` mode: skip with an error (can't resurrect what never existed)
- In `skip` mode: skip the orphan
- In `allow` mode: import anyway (dangerous)
3. Concurrent Import from Different Clones
Scenario: Two clones import same JSONL with missing parents simultaneously.
Resolution: Resurrection is idempotent - second clone sees parent already exists (created by first clone). No conflict.
4. Parent Deleted After Child Import
Scenario: an import creates `bd-abc.1`, then the user deletes `bd-abc`.
Resolution: the foreign key constraint prevents the deletion, but only if enforcement is on (SQLite enforces foreign keys only when `PRAGMA foreign_keys = ON` is set on each connection). With enforcement off, the deletion leaves an orphan in the DB.
Recommendation: Add `ON DELETE CASCADE` or `ON DELETE RESTRICT` to the `child_counters` table.
Schema Considerations
Current Schema (`schema.go`)

```sql
CREATE TABLE IF NOT EXISTS child_counters (
    parent_id TEXT PRIMARY KEY,
    next_counter INTEGER NOT NULL DEFAULT 1,
    FOREIGN KEY(parent_id) REFERENCES issues(id)
);
```
Issue: no `ON DELETE` clause. With foreign keys enforced, the default `NO ACTION` blocks the parent's deletion; with enforcement off (SQLite's default), the counter row is silently orphaned.
Recommended Schema Change

```sql
CREATE TABLE IF NOT EXISTS child_counters (
    parent_id TEXT PRIMARY KEY,
    next_counter INTEGER NOT NULL DEFAULT 1,
    FOREIGN KEY(parent_id) REFERENCES issues(id) ON DELETE CASCADE
);
```
Reason: When parent deleted, child counter should also be deleted. If parent is resurrected, counter gets recreated from scratch.
Performance Impact Analysis
Current Import (Broken)
- Time: O(n) where n = number of issues
- Fails on first orphan
Two-Pass Import (Option 5)
- Sorting: O(n log n)
- Depth-based batching: O(n × d) where d = max depth (3)
- Total: O(n log n) - negligible for typical datasets (<10k issues)
Parent Resurrection (Option 1)
- JSONL parse: Already done
- Parent lookup: O(1) hash map lookup
- Resurrection: O(1) single insert
- Total: O(1) per orphan - minimal overhead
Conclusion: Performance impact is negligible (<5% overhead for typical imports).
Testing Strategy
Unit Tests Required
- Test Import Ordering
  - Import `[child, parent]` - should succeed
  - Import `[parent.1.2, parent, parent.1]` - should sort correctly
- Test Parent Resurrection
  - Import a child with a deleted parent - should resurrect
  - Import a child with a never-existed parent - should fail gracefully
- Test Config Modes
  - Test `strict`, `resurrect`, `skip`, and `allow` modes
  - Verify error messages and logging
- Test Edge Cases
  - Multi-level deletion (`bd-abc.1.2` with `bd-abc` and `bd-abc.1` deleted)
  - Concurrent imports with the same orphans
  - JSONL corruption scenarios
Integration Tests Required
- Multi-Repo Sync
  - Clone A deletes an issue
  - Clone B imports Clone A's JSONL
  - Verify: Clone B handles the deletion gracefully
- Round-Trip Fidelity
  - Export → delete parent → import → verify structure
Code Files Affected
Must Modify
- `internal/importer/importer.go:534-546` - add topological sort
- `internal/storage/sqlite/ids.go:189-202` - add resurrection option
- `internal/storage/sqlite/sqlite.go:182-196` - add resurrection option

Should Modify
- `internal/storage/sqlite/schema.go:35-49` - add `ON DELETE CASCADE`
- `internal/types/types.go` - add an `IsTombstone bool` field (optional)

New Files Needed
- `internal/storage/sqlite/resurrection.go` - parent resurrection logic
- `internal/importer/sort.go` - topological sort utilities
Migration Path
For Existing Databases
Problem: Databases might already have orphaned children (if foreign keys were disabled during development).
Solution: Add migration to detect and fix orphans:
```sql
-- Find orphaned children.
-- Note: instr() finds the FIRST dot, so this checks only the top-level
-- root (bd-abc for bd-abc.1.2); a deleted intermediate parent whose
-- root survives must be detected in application code.
SELECT id FROM issues
WHERE id LIKE '%.%'
  AND substr(id, 1, instr(id || '.', '.') - 1) NOT IN (SELECT id FROM issues);

-- Option A: Delete orphans
DELETE FROM issues WHERE id IN (...);

-- Option B: Convert to top-level
UPDATE issues SET id = replace(id, '.', '-') WHERE id IN (...);
```
Recommendation: Run detection query, log results, let user decide action.
Conclusion
bd-3xq reveals a fundamental design flaw: The system prioritizes database integrity over operational flexibility, making normal operations (deletion) risky for future imports.
The hybrid solution (Options 1 + 5) is strongly recommended because it:
- Fixes the latent import ordering bug that affects everyone
- Enables graceful handling of deleted parents
- Maintains referential integrity through resurrection
- Provides configuration options for different use cases
- Enables multi-repo workflows (bd-4ms)
- Has minimal performance impact
- Is backwards compatible
Estimated Implementation Time:
- Phase 1 (sorting): 4-6 hours
- Phase 2 (resurrection): 6-8 hours
- Phase 3 (config): 2-3 hours
- Testing: 8-10 hours
- Total: 2-3 days for complete solution
Priority: P0 - Blocks multi-repo work (bd-4ms) and makes bd-delete risky
References
- bd-3xq: This issue
- bd-4ms: Multi-repo support (blocked by this issue)
- bd-a101: Separate branch workflow (blocked by this issue)
- bd-8e05: Hash-based ID migration (related context)
- bd-95: Content hash computation (resurrection uses this)
Analysis completed: 2025-11-04 Analyzed by: Claude (Sonnet 4.5)