Add DATABASE_REINIT_BUG investigation and fixes
Critical P0 bug analysis: silent data loss when .beads/ removed - Root cause: autoimport.go hardcoded to issues.jsonl, git has beads.jsonl - Oracle-reviewed fixes with implementation refinements - Epic structure ready: 5 child issues, 5-7 hours estimated - Comprehensive test cases for all scenarios Amp-Thread-ID: https://ampcode.com/threads/T-57e73277-9112-42fd-a3c1-a1d1f5a22c8b Co-authored-by: Amp <amp@ampcode.com>
This commit is contained in:
484
DATABASE_REINIT_BUG.md
Normal file
484
DATABASE_REINIT_BUG.md
Normal file
@@ -0,0 +1,484 @@
|
||||
# Database Re-initialization Bug Investigation
|
||||
|
||||
**Date**: 2024-10-24
|
||||
**Severity**: P0 Critical
|
||||
**Status**: Under Investigation
|
||||
|
||||
## Problem Statement
|
||||
|
||||
When `.beads/` directory is removed and daemon auto-starts, it creates an **empty database** instead of importing from git-tracked JSONL file. This causes silent data loss.
|
||||
|
||||
## What Happened
|
||||
|
||||
1. **Initial State**: ~/src/fred/beads had polluted database with 202 issues
|
||||
2. **Action Taken**: Removed `.beads/` directory to clean pollution: `rm -rf .beads/`
|
||||
3. **Session Restart**: Amp session restarted, working directory: `/Users/stevey/src/fred/beads`
|
||||
4. **Auto-Init Triggered**: Daemon auto-started and created fresh database
|
||||
5. **Result**: Empty database (0 issues) despite `.beads/beads.jsonl` in git with 111 issues
|
||||
|
||||
## Root Cause Analysis
|
||||
|
||||
### Key Observations
|
||||
|
||||
1. **File Naming Confusion**
|
||||
- Git history shows rename: `issues.jsonl → beads.jsonl` (commit d1d3fcd)
|
||||
- Daemon created new `issues.jsonl` (empty)
|
||||
- Auto-import may be looking for wrong filename
|
||||
|
||||
2. **Auto-Import Failed**
|
||||
- `bd init` ran successfully
|
||||
- Auto-import from git did NOT trigger
|
||||
- Expected behavior: should import from `.beads/beads.jsonl` in git
|
||||
|
||||
3. **Daemon Startup Sequence**
|
||||
```
|
||||
[2025-10-24 13:19:42] Daemon started
|
||||
[2025-10-24 13:19:42] Using database: /Users/stevey/src/fred/beads/.beads/bd.db
|
||||
[2025-10-24 13:19:42] Database opened
|
||||
[2025-10-24 13:19:42] Exported to JSONL (exported 0 issues to empty file)
|
||||
```
|
||||
|
||||
4. **Multiple Database Problem**
|
||||
- Three separate beads databases detected:
|
||||
- `~/src/beads/.beads/bd.db` (4.2MB, 112 issues) ✅ CORRECT
|
||||
- `~/src/fred/beads/.beads/bd.db` (155KB, 0 issues) ❌ EMPTY
|
||||
- `~/src/original/beads/.beads/bd.db` ❌ UNKNOWN STATE
|
||||
|
||||
## Expected Behavior
|
||||
|
||||
When `.beads/` directory is missing but git has tracked JSONL:
|
||||
|
||||
1. `bd init` should detect git-tracked JSONL file
|
||||
2. Auto-import should trigger immediately
|
||||
3. Database should be populated from git history
|
||||
4. User should see: "Imported N issues from git"
|
||||
|
||||
## Actual Behavior
|
||||
|
||||
1. `bd init` creates empty database
|
||||
2. Auto-import does NOT trigger
|
||||
3. Database remains empty (0 issues)
|
||||
4. Silent data loss - user unaware issues are missing
|
||||
|
||||
## Impact
|
||||
|
||||
- **Silent Data Loss**: Users lose entire issue database without warning
|
||||
- **Multi-Workspace Confusion**: Per-project daemons don't handle missing DB correctly
|
||||
- **Git Sync Broken**: Auto-import from git not working as expected
|
||||
- **User Trust**: Critical failure mode that breaks core workflow
|
||||
|
||||
## Recovery Steps Taken
|
||||
|
||||
1. Restored from git: `git restore .beads/beads.jsonl` ❌ File already in git, not in working tree
|
||||
2. Extracted from git history: `git show HEAD:.beads/beads.jsonl > /tmp/backup.jsonl`
|
||||
3. Manual import with collision resolution: `bd import -i /tmp/backup.jsonl --resolve-collisions`
|
||||
4. Final state: 194 issues recovered (had stale backup)
|
||||
|
||||
## Correct Recovery (Final)
|
||||
|
||||
1. Removed bad database: `rm -f .beads/beads.db`
|
||||
2. Git pull to get latest: `git pull origin main` (got 111 issues from ~/src/beads)
|
||||
3. Re-init with correct prefix: `bd init --prefix bd`
|
||||
4. Import from git-tracked JSONL: `bd import -i .beads/beads.jsonl`
|
||||
5. ✅ Result: 112 issues (111 + external_ref epic from main database)
|
||||
|
||||
## Technical Investigation Needed
|
||||
|
||||
### 1. Auto-Import Logic
|
||||
- Where is auto-import triggered? (`bd init` command? daemon startup?)
|
||||
- What file does it look for? (`issues.jsonl` vs `beads.jsonl`)
|
||||
- Why didn't it run when `.beads/` was missing?
|
||||
|
||||
### 2. Daemon Initialization
|
||||
- Should daemon auto-import on first startup?
|
||||
- Should daemon detect missing database and import from git?
|
||||
- Per-project daemon handling when DB missing
|
||||
|
||||
### 3. File Naming
|
||||
- When did `issues.jsonl → beads.jsonl` rename happen?
|
||||
- Are all code paths updated to use correct filename?
|
||||
- Is auto-import looking for old filename?
|
||||
|
||||
### 4. Git Integration
|
||||
- Should `bd init` check for tracked JSONL in git?
|
||||
- Should init fail if git has JSONL but DB is empty after init?
|
||||
- Add warning: "JSONL found in git but not imported"?
|
||||
|
||||
## Proposed Fixes (Oracle-Reviewed)
|
||||
|
||||
### Fix A: checkGitForIssues() Filename Detection (P0, Simple, <1h)
|
||||
|
||||
**Current Code** (autoimport.go:70-76):
|
||||
```go
|
||||
relPath, err := filepath.Rel(gitRoot, filepath.Join(beadsDir, "issues.jsonl"))
|
||||
```
|
||||
|
||||
**Fixed Code**:
|
||||
```go
|
||||
// Try canonical JSONL filenames in precedence order
|
||||
relBeads, err := filepath.Rel(gitRoot, beadsDir)
|
||||
if err != nil {
|
||||
return 0, ""
|
||||
}
|
||||
|
||||
candidates := []string{
|
||||
filepath.Join(relBeads, "beads.jsonl"),
|
||||
filepath.Join(relBeads, "issues.jsonl"),
|
||||
}
|
||||
|
||||
for _, relPath := range candidates {
|
||||
cmd := exec.Command("git", "show", fmt.Sprintf("HEAD:%s", relPath))
|
||||
output, err := cmd.Output()
|
||||
if err == nil && len(output) > 0 {
|
||||
lines := bytes.Count(output, []byte("\n"))
|
||||
return lines, relPath
|
||||
}
|
||||
}
|
||||
|
||||
return 0, ""
|
||||
```
|
||||
|
||||
**Impact**: Auto-import will now detect beads.jsonl in git
|
||||
|
||||
---
|
||||
|
||||
### Fix B: findJSONLPath() Consults Git HEAD (P0, Simple-Medium, 1-2h)
|
||||
|
||||
**Current Code** (main.go:898-912):
|
||||
```go
|
||||
func findJSONLPath() string {
|
||||
jsonlPath := beads.FindJSONLPath(dbPath)
|
||||
// Creates directory but doesn't check git
|
||||
return jsonlPath
|
||||
}
|
||||
```
|
||||
|
||||
**Fixed Code**:
|
||||
```go
|
||||
func findJSONLPath() string {
|
||||
// First check for existing local JSONL files
|
||||
jsonlPath := beads.FindJSONLPath(dbPath)
|
||||
|
||||
dbDir := filepath.Dir(dbPath)
|
||||
|
||||
// If local file exists, use it
|
||||
if _, err := os.Stat(jsonlPath); err == nil {
|
||||
return jsonlPath
|
||||
}
|
||||
|
||||
// No local JSONL - check git HEAD for tracked filename
|
||||
if gitJSONL := checkGitForJSONLFilename(); gitJSONL != "" {
|
||||
jsonlPath = filepath.Join(dbDir, filepath.Base(gitJSONL))
|
||||
}
|
||||
|
||||
// Ensure directory exists
|
||||
if err := os.MkdirAll(dbDir, 0755); err == nil {
|
||||
// Verify we didn't pick the wrong file
|
||||
// ...error checking...
|
||||
}
|
||||
|
||||
return jsonlPath
|
||||
}
|
||||
```
|
||||
|
||||
**Impact**: Daemon/CLI will export to beads.jsonl (not issues.jsonl) when git tracks beads.jsonl
|
||||
|
||||
---
|
||||
|
||||
### Fix C: Init Safety Check (P0, Simple, <1h)
|
||||
|
||||
**Location**: cmd/bd/init.go after line 150
|
||||
|
||||
**Add After Import Attempt**:
|
||||
```go
|
||||
// Safety check: verify import succeeded
|
||||
stats, err := store.GetStatistics(ctx)
|
||||
if err == nil && stats.TotalIssues == 0 {
|
||||
// DB empty after init - check if git has issues we failed to import
|
||||
recheck, _ := checkGitForIssues()
|
||||
if recheck > 0 {
|
||||
fmt.Fprintf(os.Stderr, "\n❌ ERROR: Database empty but git has %d issues!\n", recheck)
|
||||
fmt.Fprintf(os.Stderr, "Auto-import failed. Manual recovery:\n")
|
||||
fmt.Fprintf(os.Stderr, " git show HEAD:%s | bd import -i /dev/stdin\n", jsonlPath)
|
||||
fmt.Fprintf(os.Stderr, "Or:\n")
|
||||
fmt.Fprintf(os.Stderr, " bd import -i %s\n", jsonlPath)
|
||||
os.Exit(1)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Impact**: Prevents silent data loss by failing loudly with recovery instructions
|
||||
|
||||
---
|
||||
|
||||
### Fix D: Daemon Startup Import (P1, Simple, <1h)
|
||||
|
||||
**Location**: cmd/bd/daemon.go after DB open (around line 914)
|
||||
|
||||
**Add After Database Open**:
|
||||
```go
|
||||
// Check for empty DB with issues in git
|
||||
ctx := context.Background()
|
||||
stats, err := store.GetStatistics(ctx)
|
||||
if err == nil && stats.TotalIssues == 0 {
|
||||
issueCount, jsonlPath := checkGitForIssues()
|
||||
if issueCount > 0 {
|
||||
log(fmt.Sprintf("Empty database but git has %d issues, importing...", issueCount))
|
||||
if err := importFromGit(ctx, dbPath, store, jsonlPath); err != nil {
|
||||
log(fmt.Sprintf("Warning: startup import failed: %v", err))
|
||||
} else {
|
||||
log(fmt.Sprintf("Successfully imported %d issues from git", issueCount))
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Impact**: Daemon auto-recovers from empty DB on startup
|
||||
|
||||
### Medium Term (P1)
|
||||
1. **Multiple database warning** (bd-112)
|
||||
- Detect multiple `.beads/` in workspace hierarchy
|
||||
- Warn user on startup
|
||||
- Prevent accidental database pollution
|
||||
|
||||
2. **Better error messages**
|
||||
- `bd init`: "Warning: found beads.jsonl in git with N issues"
|
||||
- `bd stats`: "Warning: database empty but git has tracked JSONL"
|
||||
- Guide user to recovery path
|
||||
|
||||
### Implementation Refinements (Critical)
|
||||
|
||||
**Fix B Missing Helper Function**:
|
||||
The oracle's Fix B pseudocode calls `checkGitForJSONLFilename()` which doesn't exist. Need to add:
|
||||
```go
|
||||
// checkGitForJSONLFilename returns just the filename from git HEAD check
|
||||
func checkGitForJSONLFilename() string {
|
||||
_, relPath := checkGitForIssues()
|
||||
if relPath == "" {
|
||||
return ""
|
||||
}
|
||||
return filepath.Base(relPath)
|
||||
}
|
||||
```
|
||||
|
||||
**Alternative Simpler Approach for Fix B**:
|
||||
Instead of making `findJSONLPath()` git-aware, ensure import immediately exports to local file:
|
||||
```go
|
||||
// In cmd/bd/init.go after successful importFromGit (line 148):
|
||||
if err := importFromGit(ctx, initDBPath, store, jsonlPath); err != nil {
|
||||
// ...error handling...
|
||||
} else {
|
||||
// CRITICAL: Immediately export to local to prevent daemon race
|
||||
localPath := filepath.Join(".beads", filepath.Base(jsonlPath))
|
||||
if err := exportToJSONL(ctx, store, localPath); err != nil {
|
||||
fmt.Fprintf(os.Stderr, "Warning: failed to export after import: %v\n", err)
|
||||
}
|
||||
fmt.Fprintf(os.Stderr, "✓ Successfully imported %d issues from git.\n\n", issueCount)
|
||||
}
|
||||
```
|
||||
|
||||
**Race Condition Warning**:
|
||||
After `rm -rf .beads/`, there's a timing window:
|
||||
1. `bd init` runs, imports from git's `beads.jsonl`
|
||||
2. Import schedules auto-flush (5-second debounce)
|
||||
3. Daemon auto-starts before flush completes
|
||||
4. Daemon calls `findJSONLPath()` → no local file yet → creates wrong `issues.jsonl`
|
||||
|
||||
**Solution**: Import must **immediately create local JSONL** (no debounce) to win the race.
|
||||
|
||||
**Revised Priority**:
|
||||
- Fix A: P0 - Blocks everything, enables git detection
|
||||
- Fix C: P0 - Prevents silent failures, critical safety net
|
||||
- Fix B: P0 - Prevents wrong file creation (OR immediate export)
|
||||
- Fix D: P1 - Nice recovery but redundant if A+B+C work
|
||||
|
||||
### Precedence Rules (All Fixes)
|
||||
|
||||
**When checking git HEAD**:
|
||||
1. First try `.beads/beads.jsonl`
|
||||
2. Then try `.beads/issues.jsonl`
|
||||
3. Ignore non-canonical names (archive.jsonl, backup.jsonl, etc.)
|
||||
|
||||
**When multiple local JSONL files exist**:
|
||||
- Use existing `beads.FindJSONLPath()` glob behavior (first match)
|
||||
- This preserves backward compatibility
|
||||
|
||||
### Long Term (P2)
|
||||
1. **Unified JSONL naming**
|
||||
- Standardize on one filename (recommend `beads.jsonl`)
|
||||
- Migration path for old `issues.jsonl`
|
||||
- Update all code paths consistently
|
||||
- Optional: Store chosen JSONL filename in DB metadata
|
||||
|
||||
2. **Git-aware init** ✅ PARTIALLY DONE
|
||||
- `bd init` should be git-aware ✅ EXISTS (commit 7f82708)
|
||||
- Detect tracked JSONL and import automatically ❌ BROKEN (wrong filename)
|
||||
- Make this the default happy path ✅ WILL BE FIXED by Fix A
|
||||
|
||||
## Implementation Plan (Epic Structure)
|
||||
|
||||
**Epic**: Fix database reinitialization data loss bug
|
||||
|
||||
**Child Issues** (in dependency order):
|
||||
1. **Fix A**: checkGitForIssues() filename detection (P0, <1h)
|
||||
- Update autoimport.go:70-96 to try beads.jsonl then issues.jsonl
|
||||
- Test: verify detects both filenames in git
|
||||
- Blocks: Fix C (needs working detection)
|
||||
|
||||
2. **Fix B-Alt**: Immediate export after import (P0, <1h)
|
||||
- In init.go after importFromGit(), immediately call exportToJSONL()
|
||||
- Prevents daemon race condition
|
||||
- Simpler than making findJSONLPath() git-aware
|
||||
- Test: verify local JSONL created with correct filename
|
||||
|
||||
3. **Fix C**: Init safety check (P0, <1h)
|
||||
- Add post-init verification in init.go
|
||||
- Error and exit if DB empty but git has issues
|
||||
- Depends: Fix A (uses checkGitForIssues)
|
||||
- Test: verify fails loudly when import fails
|
||||
|
||||
4. **Fix D**: Daemon startup import (P1, <1h)
|
||||
- Add empty-DB check on daemon startup
|
||||
- Auto-import if git has issues
|
||||
- Depends: Fix A (uses checkGitForIssues)
|
||||
- Test: verify daemon recovers from empty DB
|
||||
|
||||
5. **Integration tests** (P0, 1-2h)
|
||||
- Test fresh clone scenario
|
||||
- Test `rm -rf .beads/` scenario
|
||||
- Test daemon race condition (start daemon immediately after init)
|
||||
- Test both beads.jsonl and issues.jsonl in git
|
||||
|
||||
**Estimated Total**: 5-7 hours
|
||||
|
||||
## Related Issues
|
||||
|
||||
- **bd-112**: Warn when multiple beads databases detected (filed in ~/src/beads)
|
||||
- **GH #142**: External_ref import feature (not directly related but shows import complexity)
|
||||
- Commit d1d3fcd: Renamed `issues.jsonl → beads.jsonl`
|
||||
- Commit 7f82708: "Fix bd init to auto-import issues from git on fresh clone"
|
||||
|
||||
## Test Cases Needed
|
||||
|
||||
1. **Fresh Clone Scenario**
|
||||
```bash
|
||||
git clone repo
|
||||
cd repo
|
||||
bd init
|
||||
# Should auto-import from .beads/beads.jsonl
|
||||
# Should create local .beads/beads.jsonl immediately
|
||||
bd stats --json | jq '.total_issues' # Should match git count
|
||||
```
|
||||
|
||||
2. **Database Removal Scenario (Primary Bug)**
|
||||
```bash
|
||||
rm -rf .beads/
|
||||
bd init
|
||||
# Should detect git-tracked JSONL and import
|
||||
bd stats --json | jq '.total_issues' # Should be >0, not 0
|
||||
ls .beads/*.jsonl # Should be beads.jsonl, NOT issues.jsonl
|
||||
```
|
||||
|
||||
3. **Race Condition Scenario (Daemon Startup)**
|
||||
```bash
|
||||
rm -rf .beads/
|
||||
bd init & # Start init in background
|
||||
sleep 0.1
|
||||
bd ready # Triggers daemon auto-start
|
||||
wait
|
||||
# Daemon should NOT create issues.jsonl
|
||||
# Should use beads.jsonl from git
|
||||
ls .beads/*.jsonl
|
||||
```
|
||||
|
||||
4. **Legacy Filename Support (issues.jsonl)**
|
||||
```bash
|
||||
# Git has .beads/issues.jsonl (not beads.jsonl)
|
||||
rm -rf .beads/
|
||||
bd init
|
||||
# Should still import correctly
|
||||
ls .beads/*.jsonl # Should be issues.jsonl (matches git)
|
||||
```
|
||||
|
||||
5. **Multiple Workspace Scenario**
|
||||
```bash
|
||||
# Two separate clones
|
||||
~/src/beads/ # database 1
|
||||
~/src/fred/beads/ # database 2
|
||||
# Each should maintain separate state correctly
|
||||
# Each should use correct JSONL filename from its own git
|
||||
```
|
||||
|
||||
6. **Daemon Restart Scenario**
|
||||
```bash
|
||||
bd daemon --stop
|
||||
rm .beads/bd.db
|
||||
bd daemon # auto-start
|
||||
# Should import from git on startup
|
||||
bd stats --json | jq '.total_issues' # Should be >0
|
||||
```
|
||||
|
||||
7. **Init Safety Check Scenario**
|
||||
```bash
|
||||
# Simulate import failure
|
||||
rm -rf .beads/
|
||||
chmod 000 .beads # Prevent creation
|
||||
bd init 2>&1 | grep ERROR
|
||||
# Should fail with clear error, not silent success
|
||||
```
|
||||
|
||||
## Root Cause Analysis - CONFIRMED
|
||||
|
||||
### Primary Bug: Hardcoded Filename in checkGitForIssues()
|
||||
|
||||
**File**: `cmd/bd/autoimport.go:76`
|
||||
**Problem**: Hardcoded to `"issues.jsonl"` but git tracks `"beads.jsonl"`
|
||||
|
||||
```go
|
||||
// Line 76 - HARDCODED FILENAME
|
||||
relPath, err := filepath.Rel(gitRoot, filepath.Join(beadsDir, "issues.jsonl"))
|
||||
```
|
||||
|
||||
### Secondary Bug: Daemon Creates Wrong JSONL File
|
||||
|
||||
**File**: `cmd/bd/main.go:findJSONLPath()`, `beads.go:FindJSONLPath()`
|
||||
**Problem**: When no local JSONL exists, defaults to `"issues.jsonl"` without checking git HEAD
|
||||
|
||||
**Code Flow**:
|
||||
1. `FindJSONLPath()` globs for `*.jsonl` in `.beads/` (line 137)
|
||||
2. If none found, defaults to `"issues.jsonl"` (line 144)
|
||||
3. Daemon exports to empty `issues.jsonl`, ignoring `beads.jsonl` in git
|
||||
|
||||
### Why Auto-Import Failed
|
||||
|
||||
1. **bd init** called `checkGitForIssues()` → looked for `HEAD:.beads/issues.jsonl`
|
||||
2. Git only has `HEAD:.beads/beads.jsonl` → check returned 0 issues
|
||||
3. No import triggered, DB stayed empty
|
||||
4. Daemon started, called `findJSONLPath()` → found no local JSONL
|
||||
5. Defaulted to `issues.jsonl`, exported 0 issues to empty file
|
||||
6. **Silent data loss complete**
|
||||
|
||||
## Questions for Investigation
|
||||
|
||||
1. ✅ Why did auto-import not trigger after `bd init`?
|
||||
- **ANSWERED**: checkGitForIssues() hardcoded to issues.jsonl, git has beads.jsonl
|
||||
2. ✅ Is there auto-import code that's not being called?
|
||||
- **ANSWERED**: Auto-import code ran but found 0 issues due to wrong filename
|
||||
3. ✅ When should daemon vs CLI handle import?
|
||||
- **ANSWERED**: Both should handle; daemon on startup if DB empty + git has JSONL
|
||||
4. ✅ Should we enforce single JSONL filename across codebase?
|
||||
- **ANSWERED**: Support both with precedence: beads.jsonl > issues.jsonl
|
||||
5. ✅ How do we prevent this silent data loss in future?
|
||||
- **ANSWERED**: See proposed fixes below
|
||||
|
||||
## Severity Justification: P0
|
||||
|
||||
This is a **critical data loss bug**:
|
||||
- ✅ Silent failure (no error, no warning)
|
||||
- ✅ Complete data loss (0 issues after 202)
|
||||
- ✅ Core workflow broken (init + auto-import)
|
||||
- ✅ Multi-workspace scenarios broken
|
||||
- ✅ User cannot recover without manual intervention
|
||||
- ✅ Breaks trust in beads reliability
|
||||
|
||||
**Recommendation**: Investigate and fix immediately before 1.0 release.
|
||||
Reference in New Issue
Block a user