# Architecture

This document describes bd's overall architecture - the data model, sync mechanism, and how components fit together. For internal implementation details (FlushManager, Blocked Cache), see [INTERNALS.md](INTERNALS.md).

## The Three-Layer Data Model

bd's core design enables a distributed, git-backed issue tracker that feels like a centralized database. The "magic" comes from three synchronized layers:

```
┌─────────────────────────────────────────────────────────────────┐
│                            CLI Layer                            │
│                                                                 │
│  bd create, list, update, close, ready, show, dep, sync, ...    │
│  - Cobra commands in cmd/bd/                                    │
│  - All commands support --json for programmatic use             │
│  - Tries daemon RPC first, falls back to direct DB access       │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                               v
┌─────────────────────────────────────────────────────────────────┐
│                         SQLite Database                         │
│                        (.beads/beads.db)                        │
│                                                                 │
│  - Local working copy (gitignored)                              │
│  - Fast queries, indexes, foreign keys                          │
│  - Issues, dependencies, labels, comments, events               │
│  - Each machine has its own copy                                │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                           auto-sync
                         (5s debounce)
                               │
                               v
┌─────────────────────────────────────────────────────────────────┐
│                           JSONL File                            │
│                      (.beads/beads.jsonl)                       │
│                                                                 │
│  - Git-tracked source of truth                                  │
│  - One JSON line per entity (issue, dep, label, comment)        │
│  - Merge-friendly: additions rarely conflict                    │
│  - Shared across machines via git push/pull                     │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                         git push/pull
                               │
                               v
┌─────────────────────────────────────────────────────────────────┐
│                        Remote Repository                        │
│                     (GitHub, GitLab, etc.)                      │
│                                                                 │
│  - Stores JSONL as part of normal repo history                  │
│  - All collaborators share the same issue database              │
│  - Protected branch support via separate sync branch            │
└─────────────────────────────────────────────────────────────────┘
```

### Why This Design?

**SQLite for speed:** Local queries complete in milliseconds. Complex dependency graphs, full-text search, and joins are fast.

**JSONL for git:** One entity per line means git diffs are readable and merges usually succeed automatically. No binary database files in version control.

**Git for distribution:** No special sync server needed. Issues travel with your code. Offline work just works.

## Write Path

When you create or modify an issue:

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   CLI Command   │───▶│  SQLite Write   │───▶│   Mark Dirty    │
│   (bd create)   │    │   (immediate)   │    │ (trigger sync)  │
└─────────────────┘    └─────────────────┘    └────────┬────────┘
                                                       │
                                               5-second debounce
                                                       │
                                                       v
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Git Commit    │◀───│  JSONL Export   │◀───│  FlushManager   │
│   (git hooks)   │    │  (incremental)  │    │  (background)   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```

1. **Command executes:** `bd create "New feature"` writes to SQLite immediately
2. **Mark dirty:** The operation marks the database as needing export
3. **Debounce window:** Wait 5 seconds for batch operations (configurable)
4. **Export to JSONL:** Only changed entities are appended/updated
5. **Git commit:** If git hooks are installed, changes auto-commit

Key implementation:
- Export: `cmd/bd/export.go`, `cmd/bd/autoflush.go`
- FlushManager: `internal/flush/` (see [INTERNALS.md](INTERNALS.md))
- Dirty tracking: `internal/storage/sqlite/dirty_issues.go`

## Read Path

When you query issues after a `git pull`:

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    git pull     │───▶│   Auto-Import   │───▶│  SQLite Update  │
│   (new JSONL)   │    │  (on next cmd)  │    │  (merge logic)  │
└─────────────────┘    └─────────────────┘    └────────┬────────┘
                                                       │
                                                       v
                                              ┌─────────────────┐
                                              │    CLI Query    │
                                              │   (bd ready)    │
                                              └─────────────────┘
```

1. **Git pull:** Fetches updated JSONL from remote
2. **Auto-import detection:** First bd command checks if JSONL is newer than DB
3. **Import to SQLite:** Parse JSONL, merge with local state using content hashes
4. **Query:** Commands read from fast local SQLite

Key implementation:
- Import: `cmd/bd/import.go`, `cmd/bd/autoimport.go`
- Auto-import logic: `internal/autoimport/autoimport.go`
- Collision detection: `internal/importer/importer.go`

## Hash-Based Collision Prevention

The key insight that enables distributed operation: **content-based hashing for deduplication**.

### The Problem

Sequential IDs (bd-1, bd-2, bd-3) cause collisions when multiple agents create issues concurrently:

```
Branch A: bd create "Add OAuth"   → bd-10
Branch B: bd create "Add Stripe"  → bd-10 (collision!)
```

### The Solution

Hash-based IDs derived from random UUIDs ensure uniqueness:

```
Branch A: bd create "Add OAuth"   → bd-a1b2
Branch B: bd create "Add Stripe"  → bd-f14c (no collision)
```

### How It Works

1. **Issue creation:** Generate random UUID, derive short hash as ID
2. **Progressive scaling:** IDs start at 4 chars, grow to 5-6 chars as database grows
3. **Content hashing:** Each issue has a content hash for change detection
4. **Import merge:** Same ID + different content = update, same ID + same content = skip

```
┌─────────────────────────────────────────────────────────────────┐
│                          Import Logic                           │
│                                                                 │
│  For each issue in JSONL:                                       │
│    1. Compute content hash                                      │
│    2. Look up existing issue by ID                              │
│    3. Compare hashes:                                           │
│       - Same hash → skip (already imported)                     │
│       - Different hash → update (newer version)                 │
│       - No match → create (new issue)                           │
└─────────────────────────────────────────────────────────────────┘
```

This eliminates the need for central coordination while ensuring all machines converge to the same state.

See [COLLISION_MATH.md](COLLISION_MATH.md) for birthday paradox calculations on hash length vs collision probability.
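
The skip/update/create rule from the Import Logic box can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not bd's actual importer: the `Issue` struct is trimmed down, and `contentHash`, `decide`, and the `Action` values are hypothetical names invented for this example. The real logic lives in `internal/importer/importer.go` and may hash a different set of fields.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Issue is a simplified stand-in for the type in internal/types/types.go.
type Issue struct {
	ID          string
	Title       string
	Description string
	Status      string
}

// contentHash is a hypothetical helper that hashes the fields treated as
// "content" in this sketch. The field choice is an assumption.
func contentHash(i Issue) string {
	sum := sha256.Sum256([]byte(i.Title + "\x00" + i.Description + "\x00" + i.Status))
	return hex.EncodeToString(sum[:])
}

// Action names the three outcomes listed in the Import Logic box.
type Action string

const (
	Skip   Action = "skip"
	Update Action = "update"
	Create Action = "create"
)

// decide looks up the incoming issue by ID and compares content hashes.
func decide(existing map[string]Issue, incoming Issue) Action {
	cur, ok := existing[incoming.ID]
	if !ok {
		return Create // no match: new issue
	}
	if contentHash(cur) == contentHash(incoming) {
		return Skip // same hash: already imported
	}
	return Update // same ID, different content
}

func main() {
	// The map stands in for the local SQLite database.
	db := map[string]Issue{
		"bd-a1b2": {ID: "bd-a1b2", Title: "Add OAuth", Status: "open"},
	}
	incoming := []Issue{
		{ID: "bd-a1b2", Title: "Add OAuth", Status: "open"},
		{ID: "bd-a1b2", Title: "Add OAuth", Status: "closed"},
		{ID: "bd-f14c", Title: "Add Stripe", Status: "open"},
	}
	for _, in := range incoming {
		fmt.Printf("%s (%s): %s\n", in.ID, in.Status, decide(db, in))
	}
}
```

Because the decision depends only on the ID and the content hash, re-importing the same JSONL is a no-op in this sketch, which is the property that lets every machine run the import independently.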

## Daemon Architecture

Each workspace runs its own background daemon for auto-sync:

```
┌─────────────────────────────────────────────────────────────────┐
│                      Per-Workspace Daemon                       │
│                                                                 │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐          │
│  │ RPC Server  │    │  Auto-Sync  │    │ Background  │          │
│  │  (bd.sock)  │    │   Manager   │    │    Tasks    │          │
│  └─────────────┘    └─────────────┘    └─────────────┘          │
│         │                  │                  │                 │
│         └──────────────────┴──────────────────┘                 │
│                            │                                    │
│                            v                                    │
│                     ┌─────────────┐                             │
│                     │   SQLite    │                             │
│                     │  Database   │                             │
│                     └─────────────┘                             │
└─────────────────────────────────────────────────────────────────┘

CLI commands ───RPC───▶ Daemon ───SQL───▶ Database
              or
CLI commands ───SQL───▶ Database (if daemon unavailable)
```

**Why daemons?**
- Batches multiple operations before export
- Holds database connection open (faster queries)
- Coordinates auto-sync timing
- One daemon per workspace (LSP-like model)

**Communication:**
- Unix domain socket at `.beads/bd.sock` (Windows: named pipes)
- Protocol defined in `internal/rpc/protocol.go`
- CLI tries daemon first, falls back to direct DB access

**Lifecycle:**
- Auto-starts on first bd command (unless `BEADS_NO_DAEMON=1`)
- Auto-restarts after version upgrades
- Managed via `bd daemons` command

See [DAEMON.md](DAEMON.md) for operational details.

## Data Types

Core types in `internal/types/types.go`:

| Type | Description | Key Fields |
|------|-------------|------------|
| **Issue** | Work item | ID, Title, Description, Status, Priority, Type |
| **Dependency** | Relationship | FromID, ToID, Type (blocks/related/parent-child/discovered-from) |
| **Label** | Tag | Name, Color, Description |
| **Comment** | Discussion | IssueID, Author, Content, Timestamp |
| **Event** | Audit trail | IssueID, Type, Data, Timestamp |

### Dependency Types

| Type | Semantic | Affects `bd ready`? |
|------|----------|---------------------|
| `blocks` | Issue X must close before Y starts | Yes |
| `parent-child` | Hierarchical (epic/subtask) | Yes (children blocked if parent blocked) |
| `related` | Soft link for reference | No |
| `discovered-from` | Found during work on parent | No |

### Status Flow

```
open ──▶ in_progress ──▶ closed
  │                        │
  └────────────────────────┘
           (reopen)
```

## Directory Structure

```
.beads/
├── beads.db           # SQLite database (gitignored)
├── beads.jsonl        # JSONL source of truth (git-tracked)
├── bd.sock            # Daemon socket (gitignored)
├── daemon.log         # Daemon logs (gitignored)
├── config.yaml        # Project config (optional)
└── export_hashes.db   # Export tracking (gitignored)
```

## Key Code Paths

| Area | Files |
|------|-------|
| CLI entry | `cmd/bd/main.go` |
| Storage interface | `internal/storage/storage.go` |
| SQLite implementation | `internal/storage/sqlite/` |
| RPC protocol | `internal/rpc/protocol.go`, `server_*.go` |
| Export logic | `cmd/bd/export.go`, `autoflush.go` |
| Import logic | `cmd/bd/import.go`, `internal/importer/` |
| Auto-sync | `internal/autoimport/`, `internal/flush/` |

## Related Documentation

- [INTERNALS.md](INTERNALS.md) - FlushManager, Blocked Cache implementation details
- [DAEMON.md](DAEMON.md) - Daemon management and configuration
- [EXTENDING.md](EXTENDING.md) - Adding custom tables to SQLite
- [TROUBLESHOOTING.md](TROUBLESHOOTING.md) - Recovery procedures and common issues
- [FAQ.md](FAQ.md) - Common questions about the architecture
- [COLLISION_MATH.md](COLLISION_MATH.md) - Hash collision probability analysis