Files
beads/cmd/bd/doctor/fix/docs.md
matt wilkie a22d949cbd Fix bd-in7q: prevent migrate-tombstones from corrupting deletions manifest (#554)
Root cause: bd doctor hydrate was re-adding migrated tombstones to the
deletions manifest because getCurrentJSONLIDs() included all issues,
including tombstones. When compared against git history, tombstones
appeared as 'deleted' and were incorrectly added to the manifest as new
deletions, corrupting the database on next sync.

Fix: Skip tombstone-status issues in getCurrentJSONLIDs() so they don't
participate in deletion detection. Tombstones represent already-recorded
deletions/migrations and shouldn't be treated as active issues.

Changes:
- cmd/bd/doctor/fix/deletions.go: Skip tombstones in getCurrentJSONLIDs()
- cmd/bd/doctor/fix/deletions_test.go: New tests for tombstone skipping
- cmd/bd/migrate_tombstones_test.go: Test that tombstones are valid

This fixes the bug where 'bd migrate-tombstones' followed by 'bd sync'
would add thousands of deletion records with author 'bd-doctor-hydrate'
2025-12-14 14:11:27 -08:00

7.5 KiB

Noridoc: cmd/bd/doctor/fix

Path: @/cmd/bd/doctor/fix

Overview

The cmd/bd/doctor/fix directory contains automated remediation functions for issues detected by the bd doctor command. Each module handles a specific category of issues (deletions manifest, database config, sync branch, etc.) and provides functions to automatically fix problems found in beads workspaces.

How it fits into the larger codebase

  • Integration with Doctor Detection: The @/cmd/bd/doctor.go command runs checks to identify workspace problems, then calls functions from this package when --fix flag is used. The doctor command is the orchestrator that determines which issues exist and which fixes to apply.

  • Dependency on Core Libraries: The fix functions use core libraries like @/internal/deletions (for reading/writing deletion manifests), @/internal/types (for issue data structures), and git operations via exec.Command.

  • Data Persistence Points: Each fix module directly modifies persistent workspace state: deletions manifest, database files, JSONL files, and git branch configuration. Changes are written to disk and persisted in the git repository.

  • Deletion Tracking Architecture: The deletions manifest (@/internal/deletions/deletions.go) is an append-only log tracking issue deletions. The fix in deletions.go is critical to maintaining the integrity of this log by preventing tombstones from being incorrectly re-added to it after bd migrate-tombstones runs.

  • Tombstone System: The fix works in concert with the tombstone system (@/internal/types/types.go - Status == StatusTombstone). Tombstones represent soft-deleted issues that contain deletion metadata. The fix prevents tombstones from being confused with actively deleted issues during deletion hydration.

Core Implementation

Deletions Manifest Hydration (deletions.go):

  1. HydrateDeletionsManifest() (lines 16-96):

    • Entry point called by bd doctor --fix when "Deletions Manifest" issue is detected
    • Compares current JSONL IDs (read from issues.jsonl) against historical IDs from git history
    • Finds IDs that existed in history but are missing from current JSONL (legitimate deletions)
    • Adds these missing IDs to the deletions manifest with author "bd-doctor-hydrate"
    • Skips IDs already present in the existing deletions manifest to avoid duplicates
  2. getCurrentJSONLIDs() (lines 98-135):

    • Reads current issues.jsonl file line-by-line as JSON
    • Parses each line to extract ID and Status fields
    • CRITICAL FIX (bd-in7q): Skips issues with Status == "tombstone" (lines 127-131)
    • Returns a set of "currently active" issue IDs
    • Gracefully handles missing files (returns empty set) and malformed JSON lines (skips them)
    • This is where the bd-in7q fix is implemented - tombstones are not considered "currently active" and won't be flagged as deleted
  3. getHistoricalJSONLIDs() (lines 137-148):

    • Delegates to getHistoricalIDsViaDiff() to extract all IDs ever present in JSONL from git history
    • Uses git log to find all commits that modified the JSONL file
  4. getHistoricalIDsViaDiff() (lines 178-232):

    • Walks git history commit-by-commit (memory efficient)
    • For each commit touching the JSONL file, parses JSON to extract IDs
    • Uses looksLikeIssueID() validation to avoid false positives from JSON containing ID-like strings
    • Returns complete set of all IDs ever present in the repo history
  5. looksLikeIssueID() (lines 150-176):

    • Validates that a string matches the issue ID format: prefix-suffix
    • Prefix must be alphanumeric with underscores, suffix must be base36 hash or number with optional dots for child issues
    • Used to filter out false positives when parsing JSON

Test Coverage (deletions_test.go):

The test file covers edge cases and validates the bd-in7q fix:

  • TestGetCurrentJSONLIDs_SkipsTombstones: Core fix validation - verifies tombstones are excluded from current IDs
  • TestGetCurrentJSONLIDs_HandlesEmptyFile: Graceful handling of empty JSONL files
  • TestGetCurrentJSONLIDs_HandlesMissingFile: Graceful handling when JSONL doesn't exist
  • TestGetCurrentJSONLIDs_SkipsInvalidJSON: Malformed JSON lines are skipped without failing

Things to Know

The bd-in7q Bug and Fix:

The bug occurred because bd migrate-tombstones converts deletion records from the legacy deletions.jsonl file into inline tombstone entries in issues.jsonl. Without the fix, the sequence would be:

  1. User runs bd migrate-tombstones → creates tombstones in JSONL with status: "tombstone"
  2. User runs bd sync → triggers bd doctor hydrate
  3. getCurrentJSONLIDs() was reading ALL issues including tombstones
  4. Comparison logic sees tombstones are no longer in git history commit 0 (before migration)
  5. They're flagged as "deleted" and re-added to deletions manifest with author "bd-doctor-hydrate"
  6. Next sync applies these deletion records, marking issues as deleted in the database
  7. Result: thousands of false deletion records corrupt the manifest and database state

The fix simply filters out Status == "tombstone" issues in getCurrentJSONLIDs() (line 129). This ensures tombstones (which represent already-recorded deletions) never participate in deletion detection. They're semantically invisible to the deletion tracking system.

Why Tombstones Exist:

@/internal/types/types.go defines StatusTombstone as part of the system (bd-vw8). Tombstones are soft-deleted issues that retain all metadata (ID, DeletedBy, DeletedAt, DeleteReason) for audit trails and conflict resolution. They differ from entries in the deletions manifest, which are just an ID + deletion metadata without the original issue content.

Append-Only Nature of Deletions Manifest:

The deletions manifest (@/internal/deletions/deletions.go) is append-only. When a duplicate deletion is added, the last write wins (line 81 in deletions.go). This design assumes deletions are only recorded once, which the fix preserves by skipping tombstones.

Missing File Handling:

The getCurrentJSONLIDs() function returns an empty set when the JSONL file doesn't exist (lines 104-105). This is intentional - it allows hydration to work on repos that have never had issues.json yet. Only getHistoricalIDsViaDiff() will find historical IDs from git.

ID Format Validation:

The looksLikeIssueID() function validates format strictly (lines 150-176). This prevents parsing errors from embedded JSON with accidental ID-like strings. Example: if issue description contains "id":"some-text", it won't be treated as an issue ID.

Integration with Migrate Tombstones:

The @/cmd/bd/migrate_tombstones.go command creates tombstones using convertDeletionRecordToTombstone() (lines 268-284). These tombstones have Status == types.StatusTombstone. The fix works because migrate-tombstones sets this status correctly (verified by TestMigrateTombstones_TombstonesAreValid() in migrate_tombstones_test.go).

State Machine for Deleted Issues:

There are now two ways an issue can be marked as deleted:

  1. Database state: Issue has status = "tombstone" in the database (from @/internal/storage/sqlite)
  2. Manifest state: Issue ID appears in deletions.jsonl (from @/internal/deletions)

The deletion hydration logic treats deletions manifest as the source of truth for what SHOULD be deleted, then applies those deletions to the database. The fix ensures the manifest only contains legitimate deletions, not tombstones that were migrated from the manifest.

Created and maintained by Nori.