Architecture

This document describes bd's overall architecture: the data model, the sync mechanism, and how the components fit together. For internal implementation details (FlushManager, Blocked Cache), see INTERNALS.md.

The Three-Layer Data Model

bd's core design enables a distributed, git-backed issue tracker that feels like a centralized database. It rests on three synchronized local layers (CLI, SQLite, JSONL), with a remote git repository distributing the JSONL file between machines:

┌─────────────────────────────────────────────────────────────────┐
│                        CLI Layer                                 │
│                                                                  │
│  bd create, list, update, close, ready, show, dep, sync, ...    │
│  - Cobra commands in cmd/bd/                                     │
│  - All commands support --json for programmatic use              │
│  - Tries daemon RPC first, falls back to direct DB access        │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                               v
┌─────────────────────────────────────────────────────────────────┐
│                     SQLite Database                              │
│                     (.beads/beads.db)                            │
│                                                                  │
│  - Local working copy (gitignored)                               │
│  - Fast queries, indexes, foreign keys                           │
│  - Issues, dependencies, labels, comments, events                │
│  - Each machine has its own copy                                 │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                         auto-sync
                        (5s debounce)
                               │
                               v
┌─────────────────────────────────────────────────────────────────┐
│                       JSONL File                                 │
│                   (.beads/beads.jsonl)                           │
│                                                                  │
│  - Git-tracked source of truth                                   │
│  - One JSON line per entity (issue, dep, label, comment)         │
│  - Merge-friendly: additions rarely conflict                     │
│  - Shared across machines via git push/pull                      │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                          git push/pull
                               │
                               v
┌─────────────────────────────────────────────────────────────────┐
│                     Remote Repository                            │
│                    (GitHub, GitLab, etc.)                        │
│                                                                  │
│  - Stores JSONL as part of normal repo history                   │
│  - All collaborators share the same issue database               │
│  - Protected branch support via separate sync branch             │
└─────────────────────────────────────────────────────────────────┘

Why This Design?

SQLite for speed: Local queries complete in milliseconds. Complex dependency graphs, full-text search, and joins are fast.

JSONL for git: One entity per line means git diffs are readable and merges usually succeed automatically. No binary database files in version control.

Git for distribution: No special sync server needed. Issues travel with your code. Offline work just works.
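
To make the JSONL property concrete, here is a minimal Go sketch of a one-entity-per-line export. The Issue struct, its JSON field names, and the sort-by-ID step are assumptions for illustration; the real types live in internal/types/types.go, and the real export code may shape or order records differently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
)

// Issue is a hypothetical, trimmed-down record used only for this sketch.
type Issue struct {
	ID     string `json:"id"`
	Title  string `json:"title"`
	Status string `json:"status"`
}

// toJSONL renders one JSON object per line. Sorting by ID (an assumption)
// makes the output deterministic for a given set of issues.
func toJSONL(issues []Issue) (string, error) {
	sorted := append([]Issue(nil), issues...) // copy so the caller's slice is untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].ID < sorted[j].ID })

	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode writes each value followed by a newline
	for _, is := range sorted {
		if err := enc.Encode(is); err != nil {
			return "", err
		}
	}
	return buf.String(), nil
}

func main() {
	out, err := toJSONL([]Issue{
		{ID: "bd-f14c", Title: "Add Stripe", Status: "open"},
		{ID: "bd-a1b2", Title: "Add OAuth", Status: "open"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Because each issue occupies exactly one line, a change to one issue shows up as a one-line change in the git diff.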

Write Path

When you create or modify an issue:

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   CLI Command   │───▶│  SQLite Write   │───▶│  Mark Dirty     │
│   (bd create)   │    │  (immediate)    │    │  (trigger sync) │
└─────────────────┘    └─────────────────┘    └────────┬────────┘
                                                       │
                                              5-second debounce
                                                       │
                                                       v
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Git Commit    │◀───│  JSONL Export   │◀───│  FlushManager   │
│   (git hooks)   │    │  (incremental)  │    │  (background)   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
  1. Command executes: bd create "New feature" writes to SQLite immediately
  2. Mark dirty: The operation marks the database as needing export
  3. Debounce window: Wait 5 seconds (configurable) so that a burst of operations is batched into one export
  4. Export to JSONL: Only changed entities are appended/updated
  5. Git commit: If git hooks are installed, changes are committed automatically

Key implementation:

  • Export: cmd/bd/export.go, cmd/bd/autoflush.go
  • FlushManager: internal/flush/ (see INTERNALS.md)
  • Dirty tracking: internal/storage/sqlite/dirty_issues.go

Read Path

When you query issues after a git pull:

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   git pull      │───▶│  Auto-Import    │───▶│  SQLite Update  │
│   (new JSONL)   │    │  (on next cmd)  │    │  (merge logic)  │
└─────────────────┘    └─────────────────┘    └────────┬────────┘
                                                       │
                                                       v
                                               ┌─────────────────┐
                                               │  CLI Query      │
                                               │  (bd ready)     │
                                               └─────────────────┘
  1. Git pull: Fetches updated JSONL from remote
  2. Auto-import detection: First bd command checks if JSONL is newer than DB
  3. Import to SQLite: Parse JSONL, merge with local state using content hashes
  4. Query: Commands read from fast local SQLite

Key implementation:

  • Import: cmd/bd/import.go, cmd/bd/autoimport.go
  • Auto-import logic: internal/autoimport/autoimport.go
  • Collision detection: internal/importer/importer.go

Hash-Based Collision Prevention

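Two kinds of hash make distributed operation possible: random hash-based IDs, which avoid ID collisions when issues are created, and content hashes, which detect changes and deduplicate issues on import.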

The Problem

Sequential IDs (bd-1, bd-2, bd-3) cause collisions when multiple agents create issues concurrently:

Branch A: bd create "Add OAuth"   → bd-10
Branch B: bd create "Add Stripe"  → bd-10 (collision!)

The Solution

Hash-based IDs derived from random UUIDs make collisions unlikely without any coordination between branches:

Branch A: bd create "Add OAuth"   → bd-a1b2
Branch B: bd create "Add Stripe"  → bd-f14c (no collision)
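
A minimal sketch of how such an ID could be derived, assuming 16 random bytes as the seed, SHA-256 as the hash, and a hex alphabet. The length thresholds in idLength are invented for illustration, and all function names are hypothetical rather than bd's own.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// idLength picks how many hex characters to keep. The thresholds are
// made up for this sketch; bd's real scaling rule may differ.
func idLength(issueCount int) int {
	switch {
	case issueCount < 500:
		return 4
	case issueCount < 5000:
		return 5
	default:
		return 6
	}
}

// shortID hashes the seed and keeps the first n hex characters.
func shortID(prefix string, seed []byte, n int) string {
	sum := sha256.Sum256(seed)
	return prefix + "-" + hex.EncodeToString(sum[:])[:n]
}

// newID draws 16 random bytes (the size of a UUID) and derives a short ID.
func newID(prefix string, issueCount int) (string, error) {
	seed := make([]byte, 16)
	if _, err := rand.Read(seed); err != nil {
		return "", err
	}
	return shortID(prefix, seed, idLength(issueCount)), nil
}

func main() {
	id, err := newID("bd", 42)
	if err != nil {
		panic(err)
	}
	fmt.Println(id) // random on every run, shaped like bd-a1b2
}
```

Since the seed is random rather than a counter, two branches need no shared state to pick IDs that are very likely to differ.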

How It Works

  1. Issue creation: Generate random UUID, derive short hash as ID
  2. Progressive scaling: IDs start at 4 chars, grow to 5-6 chars as database grows
  3. Content hashing: Each issue has a content hash for change detection
  4. Import merge: Same ID + different content = update, same ID + same content = skip
┌─────────────────────────────────────────────────────────────────┐
│                        Import Logic                              │
│                                                                  │
│  For each issue in JSONL:                                       │
│    1. Compute content hash                                       │
│    2. Look up existing issue by ID                               │
│    3. Compare hashes:                                            │
│       - Same hash → skip (already imported)                      │
│       - Different hash → update (newer version)                  │
│       - No match → create (new issue)                            │
└─────────────────────────────────────────────────────────────────┘
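
The import decision in the box above can be sketched in a few lines of Go. The Issue struct, the choice of hashed fields, and the in-memory map standing in for SQLite are all assumptions; the real logic lives in internal/importer/importer.go.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Issue is a hypothetical, trimmed-down record used only for this sketch.
type Issue struct {
	ID     string
	Title  string
	Status string
}

// contentHash hashes the mutable fields; NUL separators keep field
// boundaries unambiguous.
func contentHash(is Issue) string {
	sum := sha256.Sum256([]byte(is.Title + "\x00" + is.Status))
	return hex.EncodeToString(sum[:])
}

type Action string

const (
	Skip   Action = "skip"
	Update Action = "update"
	Create Action = "create"
)

// decide compares the incoming issue against the local copy, if any.
func decide(db map[string]Issue, incoming Issue) Action {
	existing, ok := db[incoming.ID]
	switch {
	case !ok:
		return Create
	case contentHash(existing) == contentHash(incoming):
		return Skip
	default:
		return Update
	}
}

// importAll applies every incoming issue to db and counts the actions taken.
func importAll(db map[string]Issue, incoming []Issue) map[Action]int {
	counts := map[Action]int{}
	for _, is := range incoming {
		action := decide(db, is)
		if action != Skip {
			db[is.ID] = is
		}
		counts[action]++
	}
	return counts
}

func main() {
	db := map[string]Issue{"bd-a1b2": {ID: "bd-a1b2", Title: "Add OAuth", Status: "open"}}
	counts := importAll(db, []Issue{
		{ID: "bd-a1b2", Title: "Add OAuth", Status: "closed"}, // changed content
		{ID: "bd-f14c", Title: "Add Stripe", Status: "open"},  // unknown ID
	})
	fmt.Println(counts[Update], counts[Create], counts[Skip])
}
```

Importing the same JSONL a second time yields only skips, which is what makes the import safe to repeat.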

This eliminates the need for central coordination while ensuring all machines converge to the same state.

See COLLISION_MATH.md for birthday paradox calculations on hash length vs collision probability.
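
For a rough feel of the trade-off, the standard birthday approximation p ≈ 1 − exp(−n(n−1)/(2N)) can be evaluated directly, where n is the number of issues and N the number of possible IDs. This sketch assumes IDs are drawn uniformly from a hex alphabet; it is not a reproduction of COLLISION_MATH.md.

```go
package main

import (
	"fmt"
	"math"
)

// collisionProbability approximates the chance that at least two of n
// uniformly random IDs collide, when each ID has `chars` characters drawn
// from an alphabet of the given size.
func collisionProbability(n, chars, alphabet int) float64 {
	if n < 2 {
		return 0
	}
	space := math.Pow(float64(alphabet), float64(chars))
	x := float64(n) * float64(n-1) / (2 * space)
	return -math.Expm1(-x) // 1 - e^(-x), accurate even for tiny x
}

func main() {
	for _, chars := range []int{4, 5, 6} {
		p := collisionProbability(100, chars, 16)
		fmt.Printf("%d hex chars, 100 issues: p = %.5f\n", chars, p)
	}
}
```

Each extra hex character multiplies the ID space by 16, which is why growing the ID length as the database grows keeps the collision probability low.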

Daemon Architecture

Each workspace runs its own background daemon for auto-sync:

┌─────────────────────────────────────────────────────────────────┐
│                     Per-Workspace Daemon                         │
│                                                                  │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐         │
│  │ RPC Server  │    │  Auto-Sync  │    │  Background │         │
│  │ (bd.sock)   │    │  Manager    │    │  Tasks      │         │
│  └─────────────┘    └─────────────┘    └─────────────┘         │
│         │                  │                  │                  │
│         └──────────────────┴──────────────────┘                  │
│                            │                                     │
│                            v                                     │
│                   ┌─────────────┐                                │
│                   │   SQLite    │                                │
│                   │   Database  │                                │
│                   └─────────────┘                                │
└─────────────────────────────────────────────────────────────────┘

     CLI commands ───RPC───▶ Daemon ───SQL───▶ Database
                              or
     CLI commands ───SQL───▶ Database (if daemon unavailable)

Why daemons?

  • Batches multiple operations before export
  • Holds database connection open (faster queries)
  • Coordinates auto-sync timing
  • One daemon per workspace (LSP-like model)

Communication:

  • Unix domain socket at .beads/bd.sock (Windows: named pipes)
  • Protocol defined in internal/rpc/protocol.go
  • CLI tries daemon first, falls back to direct DB access

Lifecycle:

  • Auto-starts on first bd command (unless BEADS_NO_DAEMON=1)
  • Auto-restarts after version upgrades
  • Managed via bd daemons command

See DAEMON.md for operational details.

Data Types

Core types in internal/types/types.go:

| Type | Description | Key Fields |
|------|-------------|------------|
| Issue | Work item | ID, Title, Description, Status, Priority, Type |
| Dependency | Relationship | FromID, ToID, Type (blocks/related/parent-child/discovered-from) |
| Label | Tag | Name, Color, Description |
| Comment | Discussion | IssueID, Author, Content, Timestamp |
| Event | Audit trail | IssueID, Type, Data, Timestamp |

Dependency Types

| Type | Semantic | Affects bd ready? |
|------|----------|-------------------|
| blocks | Issue X must close before Y starts | Yes |
| parent-child | Hierarchical (epic/subtask) | Yes (children blocked if parent blocked) |
| related | Soft link for reference | No |
| discovered-from | Found during work on parent | No |

Status Flow

open ──▶ in_progress ──▶ closed
  │                        │
  └────────────────────────┘
         (reopen)

Directory Structure

.beads/
├── beads.db          # SQLite database (gitignored)
├── beads.jsonl       # JSONL source of truth (git-tracked)
├── bd.sock           # Daemon socket (gitignored)
├── daemon.log        # Daemon logs (gitignored)
├── config.yaml       # Project config (optional)
└── export_hashes.db  # Export tracking (gitignored)

Key Code Paths

| Area | Files |
|------|-------|
| CLI entry | cmd/bd/main.go |
| Storage interface | internal/storage/storage.go |
| SQLite implementation | internal/storage/sqlite/ |
| RPC protocol | internal/rpc/protocol.go, server_*.go |
| Export logic | cmd/bd/export.go, autoflush.go |
| Import logic | cmd/bd/import.go, internal/importer/ |
| Auto-sync | internal/autoimport/, internal/flush/ |