Critical P0 bug analysis: silent data loss when .beads/ removed - Root cause: autoimport.go hardcoded to issues.jsonl, git has beads.jsonl - Oracle-reviewed fixes with implementation refinements - Epic structure ready: 5 child issues, 5-7 hours estimated - Comprehensive test cases for all scenarios Amp-Thread-ID: https://ampcode.com/threads/T-57e73277-9112-42fd-a3c1-a1d1f5a22c8b Co-authored-by: Amp <amp@ampcode.com>
# Database Re-initialization Bug Investigation

**Date**: 2025-10-24
**Severity**: P0 Critical
**Status**: Under Investigation

## Problem Statement

When the `.beads/` directory is removed and the daemon auto-starts, the daemon creates an **empty database** instead of importing from the git-tracked JSONL file. This causes silent data loss.

## What Happened

1. **Initial State**: `~/src/fred/beads` had a polluted database with 202 issues
2. **Action Taken**: Removed the `.beads/` directory to clean up the pollution: `rm -rf .beads/`
3. **Session Restart**: Amp session restarted, working directory: `/Users/stevey/src/fred/beads`
4. **Auto-Init Triggered**: Daemon auto-started and created a fresh database
5. **Result**: Empty database (0 issues) despite `.beads/beads.jsonl` being tracked in git with 111 issues

## Root Cause Analysis

### Key Observations

1. **File Naming Confusion**
   - Git history shows rename: `issues.jsonl → beads.jsonl` (commit d1d3fcd)
   - Daemon created new `issues.jsonl` (empty)
   - Auto-import may be looking for wrong filename

2. **Auto-Import Failed**
   - `bd init` ran successfully
   - Auto-import from git did NOT trigger
   - Expected behavior: should import from `.beads/beads.jsonl` in git

3. **Daemon Startup Sequence**
   ```
   [2025-10-24 13:19:42] Daemon started
   [2025-10-24 13:19:42] Using database: /Users/stevey/src/fred/beads/.beads/bd.db
   [2025-10-24 13:19:42] Database opened
   [2025-10-24 13:19:42] Exported to JSONL (exported 0 issues to empty file)
   ```

4. **Multiple Database Problem**
   - Three separate beads databases detected:
     - `~/src/beads/.beads/bd.db` (4.2MB, 112 issues) ✅ CORRECT
     - `~/src/fred/beads/.beads/bd.db` (155KB, 0 issues) ❌ EMPTY
     - `~/src/original/beads/.beads/bd.db` ❌ UNKNOWN STATE

## Expected Behavior

When the `.beads/` directory is missing but git tracks a JSONL file:

1. `bd init` should detect the git-tracked JSONL file
2. Auto-import should trigger immediately
3. Database should be populated from git history
4. User should see: "Imported N issues from git"

## Actual Behavior

1. `bd init` creates empty database
2. Auto-import does NOT trigger
3. Database remains empty (0 issues)
4. Silent data loss - user unaware issues are missing

## Impact

- **Silent Data Loss**: Users lose entire issue database without warning
- **Multi-Workspace Confusion**: Per-project daemons don't handle missing DB correctly
- **Git Sync Broken**: Auto-import from git not working as expected
- **User Trust**: Critical failure mode that breaks core workflow

## Recovery Steps Taken

1. Restored from git: `git restore .beads/beads.jsonl` ❌ File already in git, not in working tree
2. Extracted from git history: `git show HEAD:.beads/beads.jsonl > /tmp/backup.jsonl`
3. Manual import with collision resolution: `bd import -i /tmp/backup.jsonl --resolve-collisions`
4. Final state: 194 issues recovered (the backup was stale)

## Correct Recovery (Final)

1. Removed bad database: `rm -f .beads/beads.db`
2. Git pull to get latest: `git pull origin main` (got 111 issues from ~/src/beads)
3. Re-init with correct prefix: `bd init --prefix bd`
4. Import from git-tracked JSONL: `bd import -i .beads/beads.jsonl`
5. ✅ Result: 112 issues (111 + external_ref epic from main database)

## Technical Investigation Needed

### 1. Auto-Import Logic
- Where is auto-import triggered? (`bd init` command? daemon startup?)
- What file does it look for? (`issues.jsonl` vs `beads.jsonl`)
- Why didn't it run when `.beads/` was missing?

### 2. Daemon Initialization
- Should daemon auto-import on first startup?
- Should daemon detect missing database and import from git?
- How should a per-project daemon behave when its DB is missing?

### 3. File Naming
- When did `issues.jsonl → beads.jsonl` rename happen?
- Are all code paths updated to use correct filename?
- Is auto-import looking for old filename?

### 4. Git Integration
- Should `bd init` check for tracked JSONL in git?
- Should init fail if git has JSONL but DB is empty after init?
- Add warning: "JSONL found in git but not imported"?

## Proposed Fixes (Oracle-Reviewed)

### Fix A: checkGitForIssues() Filename Detection (P0, Simple, <1h)

**Current Code** (autoimport.go:70-76):
```go
relPath, err := filepath.Rel(gitRoot, filepath.Join(beadsDir, "issues.jsonl"))
```

**Fixed Code**:
```go
// Try canonical JSONL filenames in precedence order
relBeads, err := filepath.Rel(gitRoot, beadsDir)
if err != nil {
	return 0, ""
}

candidates := []string{
	filepath.Join(relBeads, "beads.jsonl"),
	filepath.Join(relBeads, "issues.jsonl"),
}

for _, relPath := range candidates {
	cmd := exec.Command("git", "show", fmt.Sprintf("HEAD:%s", relPath))
	output, err := cmd.Output()
	if err == nil && len(output) > 0 {
		lines := bytes.Count(output, []byte("\n"))
		return lines, relPath
	}
}

return 0, ""
```

**Impact**: Auto-import will now detect beads.jsonl in git

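Fix A counts issues with `bytes.Count(output, []byte("\n"))`, which undercounts by one when the file lacks a trailing newline and overcounts when it contains blank lines. A minimal sketch of a stricter count follows. `countJSONLRecords` is a hypothetical helper, not an existing beads function, and the 16 MiB line limit is an arbitrary assumption.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// countJSONLRecords counts the non-blank lines in JSONL data.
// Hypothetical helper for illustration; not part of the beads codebase.
func countJSONLRecords(data []byte) int {
	scanner := bufio.NewScanner(bytes.NewReader(data))
	// Raise the per-line limit above bufio's 64 KiB default (assumed value).
	// A line longer than this limit stops the scan early.
	scanner.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)

	count := 0
	for scanner.Scan() {
		if len(bytes.TrimSpace(scanner.Bytes())) > 0 {
			count++
		}
	}
	return count
}

func main() {
	// Two records, no trailing newline.
	data := []byte("{\"id\":\"bd-1\"}\n{\"id\":\"bd-2\"}")
	fmt.Println("newline count:", bytes.Count(data, []byte("\n")))
	fmt.Println("record count: ", countJSONLRecords(data))
}
```

For the purposes of Fix A the difference only matters for the number reported to the user; any count above zero is enough to trigger the import.
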
---

### Fix B: findJSONLPath() Consults Git HEAD (P0, Simple-Medium, 1-2h)

**Current Code** (main.go:898-912):
```go
func findJSONLPath() string {
	jsonlPath := beads.FindJSONLPath(dbPath)
	// Creates directory but doesn't check git
	return jsonlPath
}
```

**Fixed Code**:
```go
func findJSONLPath() string {
	// First check for existing local JSONL files
	jsonlPath := beads.FindJSONLPath(dbPath)

	dbDir := filepath.Dir(dbPath)

	// If local file exists, use it
	if _, err := os.Stat(jsonlPath); err == nil {
		return jsonlPath
	}

	// No local JSONL - check git HEAD for tracked filename
	if gitJSONL := checkGitForJSONLFilename(); gitJSONL != "" {
		jsonlPath = filepath.Join(dbDir, filepath.Base(gitJSONL))
	}

	// Ensure directory exists
	if err := os.MkdirAll(dbDir, 0755); err == nil {
		// Verify we didn't pick the wrong file
		// ...error checking...
	}

	return jsonlPath
}
```

**Impact**: Daemon/CLI will export to beads.jsonl (not issues.jsonl) when git tracks beads.jsonl

---

### Fix C: Init Safety Check (P0, Simple, <1h)

**Location**: cmd/bd/init.go after line 150

**Add After Import Attempt**:
```go
// Safety check: verify import succeeded
stats, err := store.GetStatistics(ctx)
if err == nil && stats.TotalIssues == 0 {
	// DB empty after init - check if git has issues we failed to import
	recheck, _ := checkGitForIssues()
	if recheck > 0 {
		fmt.Fprintf(os.Stderr, "\n❌ ERROR: Database empty but git has %d issues!\n", recheck)
		fmt.Fprintf(os.Stderr, "Auto-import failed. Manual recovery:\n")
		fmt.Fprintf(os.Stderr, "  git show HEAD:%s | bd import -i /dev/stdin\n", jsonlPath)
		fmt.Fprintf(os.Stderr, "Or:\n")
		fmt.Fprintf(os.Stderr, "  bd import -i %s\n", jsonlPath)
		os.Exit(1)
	}
}
```

**Impact**: Prevents silent data loss by failing loudly with recovery instructions

---

### Fix D: Daemon Startup Import (P1, Simple, <1h)

**Location**: cmd/bd/daemon.go after DB open (around line 914)

**Add After Database Open**:
```go
// Check for empty DB with issues in git
ctx := context.Background()
stats, err := store.GetStatistics(ctx)
if err == nil && stats.TotalIssues == 0 {
	issueCount, jsonlPath := checkGitForIssues()
	if issueCount > 0 {
		log(fmt.Sprintf("Empty database but git has %d issues, importing...", issueCount))
		if err := importFromGit(ctx, dbPath, store, jsonlPath); err != nil {
			log(fmt.Sprintf("Warning: startup import failed: %v", err))
		} else {
			log(fmt.Sprintf("Successfully imported %d issues from git", issueCount))
		}
	}
}
```

**Impact**: Daemon auto-recovers from empty DB on startup

### Medium Term (P1)
1. **Multiple database warning** (bd-112)
   - Detect multiple `.beads/` in workspace hierarchy
   - Warn user on startup
   - Prevent accidental database pollution

2. **Better error messages**
   - `bd init`: "Warning: found beads.jsonl in git with N issues"
   - `bd stats`: "Warning: database empty but git has tracked JSONL"
   - Guide user to recovery path

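One way the multiple-database warning could locate candidate directories is sketched below. It assumes that "workspace hierarchy" means the current directory plus all of its ancestors, which this document does not specify. `findBeadsDirs` is a hypothetical name, not an existing beads function.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findBeadsDirs walks from start up to the filesystem root and returns every
// .beads directory found along the way, nearest first.
// Hypothetical helper; the ancestor-walk strategy is an assumption.
func findBeadsDirs(start string) ([]string, error) {
	dir, err := filepath.Abs(start)
	if err != nil {
		return nil, err
	}

	var found []string
	for {
		candidate := filepath.Join(dir, ".beads")
		if info, statErr := os.Stat(candidate); statErr == nil && info.IsDir() {
			found = append(found, candidate)
		}

		parent := filepath.Dir(dir)
		if parent == dir {
			break // reached the filesystem root
		}
		dir = parent
	}
	return found, nil
}

func main() {
	dirs, err := findBeadsDirs(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	if len(dirs) > 1 {
		fmt.Fprintf(os.Stderr, "Warning: %d .beads directories found in the hierarchy:\n", len(dirs))
		for _, d := range dirs {
			fmt.Fprintln(os.Stderr, "  "+d)
		}
	}
}
```

Sibling clones such as `~/src/beads` and `~/src/fred/beads` are not ancestors of each other, so an ancestor walk alone would not have flagged the three databases listed under Key Observations; catching those would need a different search scope.
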
### Implementation Refinements (Critical)

**Fix B Missing Helper Function**:
The oracle's Fix B pseudocode calls `checkGitForJSONLFilename()`, which doesn't exist. It needs to be added:
```go
// checkGitForJSONLFilename returns just the filename from git HEAD check
func checkGitForJSONLFilename() string {
	_, relPath := checkGitForIssues()
	if relPath == "" {
		return ""
	}
	return filepath.Base(relPath)
}
```

**Alternative Simpler Approach for Fix B**:
Instead of making `findJSONLPath()` git-aware, ensure that import immediately exports to a local file:
```go
// In cmd/bd/init.go after successful importFromGit (line 148):
if err := importFromGit(ctx, initDBPath, store, jsonlPath); err != nil {
	// ...error handling...
} else {
	// CRITICAL: Immediately export to local to prevent daemon race
	localPath := filepath.Join(".beads", filepath.Base(jsonlPath))
	if err := exportToJSONL(ctx, store, localPath); err != nil {
		fmt.Fprintf(os.Stderr, "Warning: failed to export after import: %v\n", err)
	}
	fmt.Fprintf(os.Stderr, "✓ Successfully imported %d issues from git.\n\n", issueCount)
}
```

**Race Condition Warning**:
After `rm -rf .beads/`, there's a timing window:
1. `bd init` runs, imports from git's `beads.jsonl`
2. Import schedules auto-flush (5-second debounce)
3. Daemon auto-starts before flush completes
4. Daemon calls `findJSONLPath()` → no local file yet → creates wrong `issues.jsonl`

**Solution**: Import must **immediately create local JSONL** (no debounce) to win the race.

**Revised Priority**:
- Fix A: P0 - Blocks everything, enables git detection
- Fix C: P0 - Prevents silent failures, critical safety net
- Fix B: P0 - Prevents wrong file creation (OR immediate export)
- Fix D: P1 - Nice recovery but redundant if A+B+C work

### Precedence Rules (All Fixes)

**When checking git HEAD**:
1. First try `.beads/beads.jsonl`
2. Then try `.beads/issues.jsonl`
3. Ignore non-canonical names (archive.jsonl, backup.jsonl, etc.)

**When multiple local JSONL files exist**:
- Use existing `beads.FindJSONLPath()` glob behavior (first match)
- This preserves backward compatibility

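The git HEAD precedence rules can be expressed as a small pure function, which makes them easy to unit-test without invoking git. This is a sketch only: `pickCanonicalJSONL` and `canonicalJSONLNames` are hypothetical names, and the input is assumed to be the list of base filenames tracked under `.beads/`.

```go
package main

import "fmt"

// canonicalJSONLNames lists the accepted filenames in precedence order.
var canonicalJSONLNames = []string{"beads.jsonl", "issues.jsonl"}

// pickCanonicalJSONL returns the highest-precedence canonical filename that
// appears in tracked, or "" when only non-canonical names are present.
// Hypothetical helper for illustration; not part of the beads codebase.
func pickCanonicalJSONL(tracked []string) string {
	present := make(map[string]bool, len(tracked))
	for _, name := range tracked {
		present[name] = true
	}
	for _, name := range canonicalJSONLNames {
		if present[name] {
			return name
		}
	}
	return ""
}

func main() {
	// Both canonical names tracked: beads.jsonl takes precedence.
	fmt.Println(pickCanonicalJSONL([]string{"archive.jsonl", "issues.jsonl", "beads.jsonl"}))
	// Only a non-canonical name tracked: nothing is selected.
	fmt.Println(pickCanonicalJSONL([]string{"backup.jsonl"}) == "")
}
```
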
### Long Term (P2)
1. **Unified JSONL naming**
   - Standardize on one filename (recommend `beads.jsonl`)
   - Migration path for old `issues.jsonl`
   - Update all code paths consistently
   - Optional: Store chosen JSONL filename in DB metadata

2. **Git-aware init** ✅ PARTIALLY DONE
   - `bd init` should be git-aware ✅ EXISTS (commit 7f82708)
   - Detect tracked JSONL and import automatically ❌ BROKEN (wrong filename)
   - Make this the default happy path ✅ WILL BE FIXED by Fix A

## Implementation Plan (Epic Structure)

**Epic**: Fix database reinitialization data loss bug

**Child Issues** (in dependency order):
1. **Fix A**: checkGitForIssues() filename detection (P0, <1h)
   - Update autoimport.go:70-96 to try beads.jsonl then issues.jsonl
   - Test: verify detects both filenames in git
   - Blocks: Fix C (needs working detection)

2. **Fix B-Alt**: Immediate export after import (P0, <1h)
   - In init.go after importFromGit(), immediately call exportToJSONL()
   - Prevents daemon race condition
   - Simpler than making findJSONLPath() git-aware
   - Test: verify local JSONL created with correct filename

3. **Fix C**: Init safety check (P0, <1h)
   - Add post-init verification in init.go
   - Error and exit if DB empty but git has issues
   - Depends: Fix A (uses checkGitForIssues)
   - Test: verify fails loudly when import fails

4. **Fix D**: Daemon startup import (P1, <1h)
   - Add empty-DB check on daemon startup
   - Auto-import if git has issues
   - Depends: Fix A (uses checkGitForIssues)
   - Test: verify daemon recovers from empty DB

5. **Integration tests** (P0, 1-2h)
   - Test fresh clone scenario
   - Test `rm -rf .beads/` scenario
   - Test daemon race condition (start daemon immediately after init)
   - Test both beads.jsonl and issues.jsonl in git

**Estimated Total**: 5-7 hours

## Related Issues

- **bd-112**: Warn when multiple beads databases detected (filed in ~/src/beads)
- **GH #142**: External_ref import feature (not directly related but shows import complexity)
- Commit d1d3fcd: Renamed `issues.jsonl → beads.jsonl`
- Commit 7f82708: "Fix bd init to auto-import issues from git on fresh clone"

## Test Cases Needed

1. **Fresh Clone Scenario**
   ```bash
   git clone repo
   cd repo
   bd init
   # Should auto-import from .beads/beads.jsonl
   # Should create local .beads/beads.jsonl immediately
   bd stats --json | jq '.total_issues'  # Should match git count
   ```

2. **Database Removal Scenario (Primary Bug)**
   ```bash
   rm -rf .beads/
   bd init
   # Should detect git-tracked JSONL and import
   bd stats --json | jq '.total_issues'  # Should be >0, not 0
   ls .beads/*.jsonl  # Should be beads.jsonl, NOT issues.jsonl
   ```

3. **Race Condition Scenario (Daemon Startup)**
   ```bash
   rm -rf .beads/
   bd init &  # Start init in background
   sleep 0.1
   bd ready   # Triggers daemon auto-start
   wait
   # Daemon should NOT create issues.jsonl
   # Should use beads.jsonl from git
   ls .beads/*.jsonl
   ```

4. **Legacy Filename Support (issues.jsonl)**
   ```bash
   # Git has .beads/issues.jsonl (not beads.jsonl)
   rm -rf .beads/
   bd init
   # Should still import correctly
   ls .beads/*.jsonl  # Should be issues.jsonl (matches git)
   ```

5. **Multiple Workspace Scenario**
   ```bash
   # Two separate clones:
   #   ~/src/beads/       (database 1)
   #   ~/src/fred/beads/  (database 2)
   # Each should maintain separate state correctly
   # Each should use correct JSONL filename from its own git
   ```

6. **Daemon Restart Scenario**
   ```bash
   bd daemon --stop
   rm .beads/bd.db
   bd daemon  # auto-start
   # Should import from git on startup
   bd stats --json | jq '.total_issues'  # Should be >0
   ```

7. **Init Safety Check Scenario**
   ```bash
   # Simulate import failure
   rm -rf .beads/
   mkdir .beads       # Recreate the directory so chmod has a target
   chmod 000 .beads   # Prevent file creation inside .beads/
   bd init 2>&1 | grep ERROR
   # Should fail with clear error, not silent success
   ```

## Root Cause Analysis - CONFIRMED

### Primary Bug: Hardcoded Filename in checkGitForIssues()

**File**: `cmd/bd/autoimport.go:76`
**Problem**: Hardcoded to `"issues.jsonl"` but git tracks `"beads.jsonl"`

```go
// Line 76 - HARDCODED FILENAME
relPath, err := filepath.Rel(gitRoot, filepath.Join(beadsDir, "issues.jsonl"))
```

### Secondary Bug: Daemon Creates Wrong JSONL File

**File**: `cmd/bd/main.go:findJSONLPath()`, `beads.go:FindJSONLPath()`
**Problem**: When no local JSONL exists, defaults to `"issues.jsonl"` without checking git HEAD

**Code Flow**:
1. `FindJSONLPath()` globs for `*.jsonl` in `.beads/` (line 137)
2. If none found, defaults to `"issues.jsonl"` (line 144)
3. Daemon exports to empty `issues.jsonl`, ignoring `beads.jsonl` in git

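The glob-then-default behavior described in the code flow can be approximated as follows. This is a reconstruction from the description above, not the actual `beads.FindJSONLPath()` source; `findJSONLPathSketch` is a hypothetical name.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findJSONLPathSketch returns the first *.jsonl file in beadsDir, or a path to
// the default "issues.jsonl" when none exists. Note that it never consults git.
// Approximation of the described behavior; not the real beads implementation.
func findJSONLPathSketch(beadsDir string) string {
	matches, err := filepath.Glob(filepath.Join(beadsDir, "*.jsonl"))
	if err == nil && len(matches) > 0 {
		return matches[0]
	}
	return filepath.Join(beadsDir, "issues.jsonl")
}

func main() {
	dir, err := os.MkdirTemp("", "beads-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Empty directory: falls back to the default name, even if git tracks
	// beads.jsonl. This is the state right after `rm -rf .beads/`.
	fmt.Println(filepath.Base(findJSONLPathSketch(dir)))

	// Once a local beads.jsonl exists, the glob finds it.
	if err := os.WriteFile(filepath.Join(dir, "beads.jsonl"), []byte("{}\n"), 0o644); err != nil {
		panic(err)
	}
	fmt.Println(filepath.Base(findJSONLPathSketch(dir)))
}
```

The sketch shows why creating the local JSONL file immediately after import matters: as soon as any `*.jsonl` file exists locally, the default is never reached.
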
### Why Auto-Import Failed

1. **bd init** called `checkGitForIssues()` → looked for `HEAD:.beads/issues.jsonl`
2. Git only has `HEAD:.beads/beads.jsonl` → check returned 0 issues
3. No import triggered, DB stayed empty
4. Daemon started, called `findJSONLPath()` → found no local JSONL
5. Defaulted to `issues.jsonl`, exported 0 issues to empty file
6. **Silent data loss complete**

## Questions for Investigation

1. ✅ Why did auto-import not trigger after `bd init`?
   - **ANSWERED**: checkGitForIssues() hardcoded to issues.jsonl, git has beads.jsonl
2. ✅ Is there auto-import code that's not being called?
   - **ANSWERED**: Auto-import code ran but found 0 issues due to wrong filename
3. ✅ When should daemon vs CLI handle import?
   - **ANSWERED**: Both should handle; daemon on startup if DB empty + git has JSONL
4. ✅ Should we enforce single JSONL filename across codebase?
   - **ANSWERED**: Support both with precedence: beads.jsonl > issues.jsonl
5. ✅ How do we prevent this silent data loss in future?
   - **ANSWERED**: See proposed fixes above

## Severity Justification: P0

This is a **critical data loss bug**:
- ✅ Silent failure (no error, no warning)
- ✅ Complete data loss (0 issues after 202)
- ✅ Core workflow broken (init + auto-import)
- ✅ Multi-workspace scenarios broken
- ✅ User cannot recover without manual intervention
- ✅ Breaks trust in beads reliability

**Recommendation**: Investigate and fix immediately before 1.0 release.