# Architecture This document describes bd's overall architecture - the data model, sync mechanism, and how components fit together. For internal implementation details (FlushManager, Blocked Cache), see [INTERNALS.md](INTERNALS.md). ## The Three-Layer Data Model bd's core design enables a distributed, git-backed issue tracker that feels like a centralized database. The "magic" comes from three synchronized layers: ``` ┌─────────────────────────────────────────────────────────────────┐ │ CLI Layer │ │ │ │ bd create, list, update, close, ready, show, dep, sync, ... │ │ - Cobra commands in cmd/bd/ │ │ - All commands support --json for programmatic use │ │ - Tries daemon RPC first, falls back to direct DB access │ └──────────────────────────────┬──────────────────────────────────┘ │ v ┌─────────────────────────────────────────────────────────────────┐ │ SQLite Database │ │ (.beads/beads.db) │ │ │ │ - Local working copy (gitignored) │ │ - Fast queries, indexes, foreign keys │ │ - Issues, dependencies, labels, comments, events │ │ - Each machine has its own copy │ └──────────────────────────────┬──────────────────────────────────┘ │ auto-sync (5s debounce) │ v ┌─────────────────────────────────────────────────────────────────┐ │ JSONL File │ │ (.beads/issues.jsonl) │ │ │ │ - Git-tracked source of truth │ │ - One JSON line per entity (issue, dep, label, comment) │ │ - Merge-friendly: additions rarely conflict │ │ - Shared across machines via git push/pull │ └──────────────────────────────┬──────────────────────────────────┘ │ git push/pull │ v ┌─────────────────────────────────────────────────────────────────┐ │ Remote Repository │ │ (GitHub, GitLab, etc.) │ │ │ │ - Stores JSONL as part of normal repo history │ │ - All collaborators share the same issue database │ │ - Protected branch support via separate sync branch │ └─────────────────────────────────────────────────────────────────┘ ``` ### Why This Design? **SQLite for speed:** Local queries complete in milliseconds. Complex dependency graphs, full-text search, and joins are fast. **JSONL for git:** One entity per line means git diffs are readable and merges usually succeed automatically. No binary database files in version control. **Git for distribution:** No special sync server needed. Issues travel with your code. Offline work just works. ## Write Path When you create or modify an issue: ``` ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ CLI Command │───▶│ SQLite Write │───▶│ Mark Dirty │ │ (bd create) │ │ (immediate) │ │ (trigger sync) │ └─────────────────┘ └─────────────────┘ └────────┬────────┘ │ 5-second debounce │ v ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ Git Commit │◀───│ JSONL Export │◀───│ FlushManager │ │ (git hooks) │ │ (incremental) │ │ (background) │ └─────────────────┘ └─────────────────┘ └─────────────────┘ ``` 1. **Command executes:** `bd create "New feature"` writes to SQLite immediately 2. **Mark dirty:** The operation marks the database as needing export 3. **Debounce window:** Wait 5 seconds for batch operations (configurable) 4. **Export to JSONL:** Only changed entities are appended/updated 5. **Git commit:** If git hooks are installed, changes auto-commit Key implementation: - Export: `cmd/bd/export.go`, `cmd/bd/autoflush.go` - FlushManager: `internal/flush/` (see [INTERNALS.md](INTERNALS.md)) - Dirty tracking: `internal/storage/sqlite/dirty_issues.go` ## Read Path When you query issues after a `git pull`: ``` ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ git pull │───▶│ Auto-Import │───▶│ SQLite Update │ │ (new JSONL) │ │ (on next cmd) │ │ (merge logic) │ └─────────────────┘ └─────────────────┘ └────────┬────────┘ │ v ┌─────────────────┐ │ CLI Query │ │ (bd ready) │ └─────────────────┘ ``` 1. **Git pull:** Fetches updated JSONL from remote 2. **Auto-import detection:** First bd command checks if JSONL is newer than DB 3. **Import to SQLite:** Parse JSONL, merge with local state using content hashes 4. **Query:** Commands read from fast local SQLite Key implementation: - Import: `cmd/bd/import.go`, `cmd/bd/autoimport.go` - Auto-import logic: `internal/autoimport/autoimport.go` - Collision detection: `internal/importer/importer.go` ## Hash-Based Collision Prevention The key insight that enables distributed operation: **content-based hashing for deduplication**. ### The Problem Sequential IDs (bd-1, bd-2, bd-3) cause collisions when multiple agents create issues concurrently: ``` Branch A: bd create "Add OAuth" → bd-10 Branch B: bd create "Add Stripe" → bd-10 (collision!) ``` ### The Solution Hash-based IDs derived from random UUIDs ensure uniqueness: ``` Branch A: bd create "Add OAuth" → bd-a1b2 Branch B: bd create "Add Stripe" → bd-f14c (no collision) ``` ### How It Works 1. **Issue creation:** Generate random UUID, derive short hash as ID 2. **Progressive scaling:** IDs start at 4 chars, grow to 5-6 chars as database grows 3. **Content hashing:** Each issue has a content hash for change detection 4. **Import merge:** Same ID + different content = update, same ID + same content = skip ``` ┌─────────────────────────────────────────────────────────────────┐ │ Import Logic │ │ │ │ For each issue in JSONL: │ │ 1. Compute content hash │ │ 2. Look up existing issue by ID │ │ 3. Compare hashes: │ │ - Same hash → skip (already imported) │ │ - Different hash → update (newer version) │ │ - No match → create (new issue) │ └─────────────────────────────────────────────────────────────────┘ ``` This eliminates the need for central coordination while ensuring all machines converge to the same state. See [COLLISION_MATH.md](COLLISION_MATH.md) for birthday paradox calculations on hash length vs collision probability. ## Daemon Architecture Each workspace runs its own background daemon for auto-sync: ``` ┌─────────────────────────────────────────────────────────────────┐ │ Per-Workspace Daemon │ │ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ RPC Server │ │ Auto-Sync │ │ Background │ │ │ │ (bd.sock) │ │ Manager │ │ Tasks │ │ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │ │ │ │ │ │ └──────────────────┴──────────────────┘ │ │ │ │ │ v │ │ ┌─────────────┐ │ │ │ SQLite │ │ │ │ Database │ │ │ └─────────────┘ │ └─────────────────────────────────────────────────────────────────┘ CLI commands ───RPC───▶ Daemon ───SQL───▶ Database or CLI commands ───SQL───▶ Database (if daemon unavailable) ``` **Why daemons?** - Batches multiple operations before export - Holds database connection open (faster queries) - Coordinates auto-sync timing - One daemon per workspace (LSP-like model) **Communication:** - Unix domain socket at `.beads/bd.sock` (Windows: named pipes) - Protocol defined in `internal/rpc/protocol.go` - CLI tries daemon first, falls back to direct DB access **Lifecycle:** - Auto-starts on first bd command (unless `BEADS_NO_DAEMON=1`) - Auto-restarts after version upgrades - Managed via `bd daemons` command See [DAEMON.md](DAEMON.md) for operational details. ## Data Types Core types in `internal/types/types.go`: | Type | Description | Key Fields | |------|-------------|------------| | **Issue** | Work item | ID, Title, Description, Status, Priority, Type | | **Dependency** | Relationship | FromID, ToID, Type (blocks/related/parent-child/discovered-from) | | **Label** | Tag | Name, Color, Description | | **Comment** | Discussion | IssueID, Author, Content, Timestamp | | **Event** | Audit trail | IssueID, Type, Data, Timestamp | ### Dependency Types | Type | Semantic | Affects `bd ready`? | |------|----------|---------------------| | `blocks` | Issue X must close before Y starts | Yes | | `parent-child` | Hierarchical (epic/subtask) | Yes (children blocked if parent blocked) | | `related` | Soft link for reference | No | | `discovered-from` | Found during work on parent | No | ### Status Flow ``` open ──▶ in_progress ──▶ closed │ │ └────────────────────────┘ (reopen) ``` ### JSONL Issue Schema Each issue in `.beads/issues.jsonl` is a JSON object with the following fields. Fields marked with `(optional)` use `omitempty` and are excluded when empty/zero. **Core Identification:** | Field | Type | Description | |-------|------|-------------| | `id` | string | Unique identifier (e.g., `bd-a1b2`) | **Issue Content:** | Field | Type | Description | |-------|------|-------------| | `title` | string | Issue title (required) | | `description` | string | Detailed description (optional) | | `design` | string | Design notes (optional) | | `acceptance_criteria` | string | Acceptance criteria (optional) | | `notes` | string | Additional notes (optional) | **Status & Workflow:** | Field | Type | Description | |-------|------|-------------| | `status` | string | Current status: `open`, `in_progress`, `blocked`, `deferred`, `closed`, `tombstone`, `pinned`, `hooked` (optional, defaults to `open`) | | `priority` | int | Priority 0-4 where 0=critical, 4=backlog | | `issue_type` | string | Type: `bug`, `feature`, `task`, `epic`, `chore`, `message`, `merge-request`, `molecule`, `gate`, `agent`, `role`, `convoy` (optional, defaults to `task`) | **Assignment:** | Field | Type | Description | |-------|------|-------------| | `assignee` | string | Assigned user/agent (optional) | | `estimated_minutes` | int | Time estimate in minutes (optional) | **Timestamps:** | Field | Type | Description | |-------|------|-------------| | `created_at` | RFC3339 | When issue was created | | `created_by` | string | Who created the issue (optional) | | `updated_at` | RFC3339 | Last modification time | | `closed_at` | RFC3339 | When issue was closed (optional, set when status=closed) | | `close_reason` | string | Reason provided when closing (optional) | **External Integration:** | Field | Type | Description | |-------|------|-------------| | `external_ref` | string | External reference (e.g., `gh-9`, `jira-ABC`) (optional) | **Relational Data:** | Field | Type | Description | |-------|------|-------------| | `labels` | []string | Tags attached to the issue (optional) | | `dependencies` | []Dependency | Relationships to other issues (optional) | | `comments` | []Comment | Discussion comments (optional) | **Tombstone Fields (soft-delete):** | Field | Type | Description | |-------|------|-------------| | `deleted_at` | RFC3339 | When deleted (optional, set when status=tombstone) | | `deleted_by` | string | Who deleted (optional) | | `delete_reason` | string | Why deleted (optional) | | `original_type` | string | Issue type before deletion (optional) | **Note:** Fields with `json:"-"` tags (like `content_hash`, `source_repo`, `id_prefix`) are internal and never exported to JSONL. ## Directory Structure ``` .beads/ ├── beads.db # SQLite database (gitignored) ├── issues.jsonl # JSONL source of truth (git-tracked) ├── bd.sock # Daemon socket (gitignored) ├── daemon.log # Daemon logs (gitignored) ├── config.yaml # Project config (optional) └── export_hashes.db # Export tracking (gitignored) ``` ## Key Code Paths | Area | Files | |------|-------| | CLI entry | `cmd/bd/main.go` | | Storage interface | `internal/storage/storage.go` | | SQLite implementation | `internal/storage/sqlite/` | | RPC protocol | `internal/rpc/protocol.go`, `server_*.go` | | Export logic | `cmd/bd/export.go`, `autoflush.go` | | Import logic | `cmd/bd/import.go`, `internal/importer/` | | Auto-sync | `internal/autoimport/`, `internal/flush/` | ## Wisps and Molecules **Molecules** are template work items that define structured workflows. When spawned, they create **wisps** - ephemeral child issues that track execution steps. > **For full documentation** on the molecular chemistry metaphor (protos, pour, bond, squash, burn), see [MOLECULES.md](MOLECULES.md). ### Wisp Lifecycle ``` ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ bd mol wisp │───▶│ Wisp Issues │───▶│ bd mol squash │ │ (from template) │ │ (local-only) │ │ (→ digest) │ └─────────────────┘ └─────────────────┘ └─────────────────┘ ``` 1. **Create:** Create wisps from a molecule template 2. **Execute:** Agent works through wisp steps (local SQLite only) 3. **Squash:** Compress wisps into a permanent digest issue ### Why Wisps Don't Sync Wisps are intentionally **local-only**: - They exist only in the spawning agent's SQLite database - They are **never exported to JSONL** - They cannot resurrect from other clones (they were never there) - They are **hard-deleted** when squashed (no tombstones needed) This design enables: - **Fast local iteration:** No sync overhead during execution - **Clean history:** Only the digest (outcome) enters git - **Agent isolation:** Each agent's execution trace is private - **Bounded storage:** Wisps don't accumulate across clones ### Wisp vs Regular Issue Deletion | Aspect | Regular Issues | Wisps | |--------|---------------|-------| | Exported to JSONL | Yes | No | | Tombstone on delete | Yes | No | | Can resurrect | Yes (without tombstone) | No (never synced) | | Deletion method | `CreateTombstone()` | `DeleteIssue()` (hard delete) | The `bd mol squash` command uses hard delete intentionally - tombstones would be wasted overhead for data that never leaves the local database. ### Future Directions - **Separate wisp repo:** Keep wisps in a dedicated ephemeral git repo - **Digest migration:** Explicit step to promote digests to main repo - **Wisp retention:** Option to preserve wisps in local git history ## Related Documentation - [MOLECULES.md](MOLECULES.md) - Molecular chemistry metaphor (protos, pour, bond, squash, burn) - [INTERNALS.md](INTERNALS.md) - FlushManager, Blocked Cache implementation details - [DAEMON.md](DAEMON.md) - Daemon management and configuration - [EXTENDING.md](EXTENDING.md) - Adding custom tables to SQLite - [TROUBLESHOOTING.md](TROUBLESHOOTING.md) - Recovery procedures and common issues - [FAQ.md](FAQ.md) - Common questions about the architecture - [COLLISION_MATH.md](COLLISION_MATH.md) - Hash collision probability analysis