beads/GIT_WORKFLOW.md
Steve Yegge 15afb5ad17 Implement JSONL export/import and shift to text-first architecture
This is a fundamental architectural shift from binary SQLite to JSONL as
the source of truth for git workflows.

## New Features

- `bd export --format=jsonl` - Export issues to JSON Lines format
- `bd import` - Import issues from JSONL (create new, update existing)
- `--skip-existing` flag for import to only create new issues

## Architecture Change

**Before:** Binary SQLite database committed to git
**After:** JSONL text files as source of truth, SQLite as ephemeral cache

Benefits:
- Git-friendly text format with clean diffs
- AI-resolvable merge conflicts (append-only is 95% conflict-free)
- Human-readable issue tracking in git
- No binary merge conflicts

## Documentation

- Updated README with JSONL-first workflow and git hooks
- Added TEXT_FORMATS.md analyzing JSONL vs CSV vs binary
- Updated GIT_WORKFLOW.md with historical context
- .gitignore now excludes *.db, includes .beads/*.jsonl

## Implementation Details

- Export sorts issues by ID for consistent diffs
- Import handles both creates and updates atomically
- Proper handling of pointer fields (EstimatedMinutes)
- All tests passing

## Breaking Changes

- Database files (*.db) should now be gitignored
- Use export/import workflow for git collaboration
- Git hooks recommended for automation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-12 01:17:50 -07:00


Git Workflow for bd Databases

Note: This document contains historical analysis of binary SQLite workflows. The current recommended approach is JSONL-first (see README.md). This document is kept for reference and to explain the design decisions.

TL;DR

Current Recommendation (2025): Use JSONL text format as source of truth. See README.md for the current workflow.

Historical Analysis Below: This documents the binary SQLite approach and why we moved to JSONL.


The Problem

SQLite databases are binary files. Git cannot automatically merge them like text files.

$ git merge feature-branch
warning: Cannot merge binary files: .beads/myapp.db (HEAD vs. feature-branch)
CONFLICT (content): Merge conflict in .beads/myapp.db

When two developers create issues concurrently and try to merge:

  • Git detects a conflict
  • You must choose "ours" or "theirs" (lose one side's changes)
  • OR manually export/import data (tedious)

Solution 1: Binary in Git with Protocol

Works for: 2-10 developers, <500 issues, low-medium velocity

The Protocol

  1. One person owns the database per branch
  2. Pull before creating issues
  3. Push immediately after creating issues
  4. Use short-lived feature branches

Workflow

# Developer A
git pull origin main
bd create "Fix navbar bug" -p 1
git add .beads/myapp.db
git commit -m "Add issue: Fix navbar bug"
git push origin main

# Developer B (working at the same time, pulls after A has pushed)
git pull origin main  # Gets A's changes first
bd create "Add dark mode" -p 2
git add .beads/myapp.db
git commit -m "Add issue: Add dark mode"
git push origin main  # No conflict!

Handling Conflicts

If you DO get a conflict:

# Option 1: Take remote (lose your local changes)
git checkout --theirs .beads/myapp.db
bd list  # Verify what you got
git add .beads/myapp.db  # Mark the conflict as resolved
git commit

# Option 2: Export your changes, take theirs, reimport
bd list --json > my-issues.json
git checkout --theirs .beads/myapp.db
# Manually recreate your issues
bd create "My issue that got lost"
git add .beads/myapp.db && git commit

# Option 3: Union merge with custom script (see below)

Pros

  • Simple: No infrastructure needed
  • Fast: SQLite is incredibly fast
  • Offline-first: Works without network
  • Atomic: Database transactions guarantee consistency
  • Rich queries: Full SQL power

Cons

  • Binary conflicts require manual resolution
  • Diffs are opaque (can't see changes in git diff)
  • Database size grows over time (but SQLite VACUUM helps)
  • Git LFS might be needed for large projects (>100MB)

Size Analysis

  • Empty database: 80KB
  • 100 issues: ~120KB (adds ~400 bytes per issue)
  • 1000 issues: ~500KB
  • 10,000 issues: ~5MB

Recommendation: Use binary in git up to ~500 issues or 5MB.
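
The figures above fit a rough linear model. The sketch below assumes exactly that: an 80KB empty database plus about 400 bytes per issue, with decimal kilobytes. The constants and the function name `estimate_db_bytes` are illustrative and are not part of bd.

```python
# Rough linear size model read off the figures above (assumed, not measured).
EMPTY_DB_BYTES = 80_000   # "Empty database: 80KB"
BYTES_PER_ISSUE = 400     # "adds ~400 bytes per issue"


def estimate_db_bytes(issue_count: int) -> int:
    """Estimate the database size in bytes for a given number of issues."""
    if issue_count < 0:
        raise ValueError("issue_count must be non-negative")
    return EMPTY_DB_BYTES + BYTES_PER_ISSUE * issue_count


for n in (0, 100, 1_000, 10_000):
    print(f"{n:>6} issues: ~{estimate_db_bytes(n) / 1000:.0f} KB")
```

Under these assumptions the model gives 480 KB for 1,000 issues and about 4.1 MB for 10,000, in the same ballpark as the rounded ~500KB and ~5MB listed above.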


Solution 2: Text Export Format

Works for: 5-50 developers, any number of issues

Implementation

Create bd export and bd import commands:

# Export to text format (JSON Lines or SQL)
bd export > .beads/myapp.jsonl

# Import from text
bd import < .beads/myapp.jsonl
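
To make the export/import idea concrete, here is a minimal Python sketch of what such commands might do internally. It is not bd's implementation. The three-column `issues` table and the function names `export_jsonl` and `import_jsonl` are assumptions made for illustration. The sketch sorts by ID on export so diffs stay stable, and on import it creates new issues and updates existing ones inside one transaction, with an optional skip-existing mode.

```python
import json
import sqlite3
from typing import Tuple

# Hypothetical minimal schema; the real bd schema is not shown in this document.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS issues "
    "(id TEXT PRIMARY KEY, title TEXT, priority INTEGER)"
)


def export_jsonl(conn: sqlite3.Connection) -> str:
    """Serialize every issue as one JSON object per line, ordered by ID."""
    rows = conn.execute("SELECT id, title, priority FROM issues ORDER BY id")
    return "".join(
        json.dumps({"id": i, "title": t, "priority": p}, sort_keys=True) + "\n"
        for i, t, p in rows
    )


def import_jsonl(
    conn: sqlite3.Connection, text: str, skip_existing: bool = False
) -> Tuple[int, int]:
    """Create missing issues and update existing ones; return (created, updated)."""
    created = updated = 0
    with conn:  # commits on success, rolls back if any line fails
        for line in text.splitlines():
            if not line.strip():
                continue
            issue = json.loads(line)
            exists = conn.execute(
                "SELECT 1 FROM issues WHERE id = ?", (issue["id"],)
            ).fetchone()
            if exists:
                if skip_existing:
                    continue
                conn.execute(
                    "UPDATE issues SET title = ?, priority = ? WHERE id = ?",
                    (issue["title"], issue["priority"], issue["id"]),
                )
                updated += 1
            else:
                conn.execute(
                    "INSERT INTO issues (id, title, priority) VALUES (?, ?, ?)",
                    (issue["id"], issue["title"], issue["priority"]),
                )
                created += 1
    return created, updated


db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
import_jsonl(db, '{"id": "bd-2", "title": "Add dark mode", "priority": 2}\n'
                 '{"id": "bd-1", "title": "Fix navbar bug", "priority": 1}\n')
print(export_jsonl(db), end="")
```

Note that `ORDER BY id` on a text column sorts lexicographically, so `bd-10` comes before `bd-2`. That is still deterministic, which is what matters for clean diffs.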

Workflow

# Before committing
bd export > .beads/myapp.jsonl
git add .beads/myapp.jsonl
git commit -m "Add issues"

# After pulling
bd import < .beads/myapp.jsonl

Advanced: Keep Both

.beads/
├── myapp.db          # Binary database (in .gitignore)
├── myapp.jsonl       # Text export (in git)
└── sync.sh           # Script to sync between formats
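
The document does not show what sync.sh contains. One plausible approach, sketched here in Python purely as an assumption, is to compare modification times and decide which direction to sync. The function name `sync_direction` and its return values are hypothetical.

```python
import os


def sync_direction(db_path: str, jsonl_path: str) -> str:
    """Return 'export', 'import', or 'none' depending on which file is newer."""
    db_exists = os.path.exists(db_path)
    jsonl_exists = os.path.exists(jsonl_path)
    if not db_exists and not jsonl_exists:
        return "none"
    if not jsonl_exists:
        return "export"   # only the database exists: write the text file
    if not db_exists:
        return "import"   # fresh clone: rebuild the database from text
    db_mtime = os.path.getmtime(db_path)
    jsonl_mtime = os.path.getmtime(jsonl_path)
    if db_mtime > jsonl_mtime:
        return "export"   # local edits (e.g. bd create) not yet exported
    if jsonl_mtime > db_mtime:
        return "import"   # text file changed (e.g. after git pull)
    return "none"
```

Modification times are only a heuristic, since git rewrites files on checkout, so a real script would probably want a stronger check such as a content hash.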

Pros

  • Git can merge text files
  • Diffs are readable
  • Conflicts are easier to resolve
  • Scales to any team size

Cons

  • Requires discipline (must export before commit)
  • Slower (export/import overhead)
  • Two sources of truth (can get out of sync)
  • Merge conflicts still happen (but mergeable)

Solution 3: Shared Database Server (Enterprise)

Works for: 10+ developers, high velocity, need real-time sync

Options

  1. PostgreSQL Backend (future bd feature)

    export BD_DATABASE=postgresql://host/db
    bd create "Issue"  # Goes to shared Postgres
    
  2. SQLite on Shared Filesystem

    export BD_DATABASE=/mnt/shared/myapp.db
    bd create "Issue"  # Caution: SQLite's WAL mode requires all writers on one host; avoid network mounts (NFS/SMB)
    
  3. bd Server Mode (future feature)

    bd serve --port 8080  # Run bd as HTTP API
    bd --remote=http://localhost:8080 create "Issue"
    

Pros

  • True concurrent access
  • No merge conflicts
  • Real-time updates
  • Centralized audit trail

Cons

  • Requires infrastructure
  • Not offline-first
  • More complex
  • Needs authentication/authorization

Solution 4: Hybrid - Short-Lived Branches

Works for: Any team size, best of both worlds

Strategy

  1. main branch: Contains source of truth database
  2. Feature branches: Don't commit database changes
  3. Issue creation: Only on main branch

# Working on feature
git checkout -b feature-dark-mode
# ... make code changes ...
git commit -m "Implement dark mode"

# Need to create issue? Switch to main first
git checkout main
git pull
bd create "Bug found in dark mode"
git add .beads/myapp.db
git commit -m "Add issue"
git push

git checkout feature-dark-mode
# Continue working

Pros

  • No database merge conflicts (database only on main)
  • Simple mental model
  • Works with existing git workflows

Cons

  • Issues not tied to feature branches
  • Requires discipline

Recommendations by Team Size

Solo Developer

Binary in git - Just commit it. No conflicts possible.

2-5 Developers (Startup)

Binary in git with protocol - Pull before creating issues, push immediately.

5-20 Developers (Growing Team)

Text export format - Export to JSON Lines, commit that. Binary in .gitignore.

20+ Developers (Enterprise)

Shared database - PostgreSQL backend or bd server mode.


Scaling Analysis

How far can binary-in-git scale?

Experiment: Simulate concurrent developers

# 10 developers each creating 10 issues
# If they all pull at same time, create issues, push sequentially:
# - Developer 1: pushes successfully
# - Developer 2: pulls, gets conflict, resolves, pushes
# - Developer 3: pulls, gets conflict, resolves, pushes
# ...
# Result: 9/10 developers hit conflicts

# If they coordinate (pull, create, push immediately):
# - Success rate: ~80-90% (depends on timing)
# - Failed pushes just retry after pull

# Conclusion: Works up to ~10 concurrent developers with retry logic
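
The thought experiment above can be mimicked with a toy model. This Python sketch is an idealization, not a measurement: it assumes pushes happen strictly one after another and that a push conflicts whenever the remote has moved past the revision the developer started from. The function name `simulate_pushes` is illustrative.

```python
def simulate_pushes(num_devs: int, pull_before_push: bool) -> int:
    """Return how many developers hit a conflict when pushing in turn."""
    remote_head = 0
    conflicts = 0
    for _ in range(num_devs):
        # Uncoordinated: everyone pulled revision 0 at the start.
        # Coordinated: each developer pulls right before creating and pushing.
        base = remote_head if pull_before_push else 0
        if base != remote_head:
            conflicts += 1   # a binary database cannot be merged automatically
        remote_head += 1     # the push lands (after manual resolution if needed)
    return conflicts


print(simulate_pushes(10, pull_before_push=False))
print(simulate_pushes(10, pull_before_push=True))
```

With perfectly serialized coordination the model shows no conflicts at all. The ~80-90% success rate quoted above reflects timing races between pull and push that this model deliberately leaves out.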

Rule of Thumb:

  • 1-5 devs: 95% conflict-free with protocol
  • 5-10 devs: 80% conflict-free, need retry automation
  • 10+ devs: <50% conflict-free, text export recommended

Git LFS

For very large projects (>10,000 issues, >5MB database, per the size analysis above):

# .gitattributes
*.db filter=lfs diff=lfs merge=lfs -text

git lfs track "*.db"
git add .gitattributes
git commit -m "Track SQLite with LFS"

Pros

  • Keeps git repo small
  • Handles large binaries efficiently

Cons

  • Requires Git LFS setup
  • Still can't merge binaries
  • LFS storage costs money (GitHub/GitLab)

Custom Merge Driver

For advanced users, create a custom git merge driver:

# .gitattributes
*.db merge=bd-merge

# .git/config
[merge "bd-merge"]
    name = bd database merger
    driver = bd-merge-tool %O %A %B %P

Where bd-merge-tool is a script that:

  1. Exports both databases to JSON
  2. Merges JSON (using git's text merge)
  3. Imports merged JSON to database
  4. Handles conflicts intelligently (e.g., keep both issues if IDs differ)
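
The merge step can be sketched as a three-way merge keyed by issue ID. The Python below is a hedged illustration, not the bd-merge-tool itself. It assumes each database has already been exported to a dict mapping issue ID to its fields, and the name `merge_issues` is hypothetical.

```python
from typing import Dict, List, Tuple

Issue = Dict[str, object]
IssueMap = Dict[str, Issue]  # issue ID -> issue fields


def merge_issues(
    base: IssueMap, ours: IssueMap, theirs: IssueMap
) -> Tuple[IssueMap, List[str]]:
    """Three-way merge keyed by issue ID; return (merged, conflicting_ids)."""
    merged: IssueMap = {}
    conflicts: List[str] = []
    for issue_id in sorted(set(base) | set(ours) | set(theirs)):
        b = base.get(issue_id)
        o = ours.get(issue_id)
        t = theirs.get(issue_id)
        if o == t:
            winner = o   # both sides agree (including both deleting it)
        elif o == b:
            winner = t   # only their side changed, added, or deleted it
        elif t == b:
            winner = o   # only our side changed, added, or deleted it
        else:
            conflicts.append(issue_id)  # both sides changed it differently
            winner = o   # fall back to our side (may be a deletion)
        if winner is not None:
            merged[issue_id] = winner
    return merged, conflicts
```

Issues created on only one side have different IDs and are simply kept. If both sides create the same ID with different content, this sketch reports it as a conflict and leaves resolution to the caller.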

This could be a future bd feature:

bd merge-databases base.db ours.db theirs.db > merged.db

For the beads Project Itself

Recommendation: Binary in git with protocol

Rationale:

  • Small team (1-2 primary developers)
  • Low-medium velocity (~10-50 issues total)
  • Want dogfooding (eat our own dog food)
  • Want simplicity (no export/import overhead)
  • Database will stay small (<1MB)

Protocol for beads Contributors

  1. Pull before creating issues

    git pull origin main
    
  2. Create issue

    bd create "Add PostgreSQL backend" -p 2 -t feature
    
  3. Commit and push immediately

    git add .beads/bd.db
    git commit -m "Add issue: PostgreSQL backend"
    git push origin main
    
  4. If push fails (someone beat you)

    git pull --rebase origin main
    # Resolve conflict by taking the upstream version
    # (during a rebase, --ours refers to the branch being rebased onto)
    git checkout --ours .beads/bd.db
    # Recreate your issue
    bd create "Add PostgreSQL backend" -p 2 -t feature
    git add .beads/bd.db
    git rebase --continue
    git push origin main
    
  5. For feature branches

    • Don't commit database changes
    • Create issues on main branch only
    • Reference issue IDs in commits: git commit -m "Implement bd-42"

Future Enhancements

bd export/import (Priority: Medium)

# JSON Lines format (one issue per line)
bd export --format=jsonl > issues.jsonl
bd import < issues.jsonl

# SQL format (full dump)
bd export --format=sql > issues.sql
bd import < issues.sql

# Delta export (only changes since last)
bd export --since=2025-10-01 > delta.jsonl

bd sync (Priority: High)

Automatic export before git commit:

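#!/bin/bash
# .git/hooks/pre-commit
# Export only if at least one database file exists.
# (A bare `[ -f .beads/*.db ]` test breaks when the glob matches several files.)
if compgen -G ".beads/*.db" > /dev/null; then
    bd export > .beads/issues.jsonl
    git add .beads/issues.jsonl
fi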

bd merge-databases (Priority: Low)

bd merge-databases --ours=.beads/bd.db --theirs=/tmp/bd.db --output=merged.db
# Intelligently merges:
# - Same issue ID, different fields: prompt user
# - Different issue IDs: keep both
# - Conflicting dependencies: resolve automatically

Conclusion

For beads itself: Binary in git works great. Just commit .beads/bd.db.

For bd users:

  • Small teams: Binary in git with simple protocol
  • Medium teams: Text export format
  • Large teams: Shared database server

The key insight: SQLite is amazing for local storage, but git wasn't designed for binary merges. Accept this tradeoff and use the right solution for your team size.

Document in README: Add a "Git Workflow" section explaining binary vs text approaches and when to use each.