Architecture
This document describes bd's overall architecture - the data model, sync mechanism, and how components fit together. For internal implementation details (FlushManager, Blocked Cache), see INTERNALS.md.
The Three-Layer Data Model
bd's core design enables a distributed, git-backed issue tracker that feels like a centralized database. The "magic" comes from three synchronized layers:
┌─────────────────────────────────────────────────────────────────┐
│ CLI Layer │
│ │
│ bd create, list, update, close, ready, show, dep, sync, ... │
│ - Cobra commands in cmd/bd/ │
│ - All commands support --json for programmatic use │
│ - Tries daemon RPC first, falls back to direct DB access │
└──────────────────────────────┬──────────────────────────────────┘
│
v
┌─────────────────────────────────────────────────────────────────┐
│ SQLite Database │
│ (.beads/beads.db) │
│ │
│ - Local working copy (gitignored) │
│ - Fast queries, indexes, foreign keys │
│ - Issues, dependencies, labels, comments, events │
│ - Each machine has its own copy │
└──────────────────────────────┬──────────────────────────────────┘
│
auto-sync
(5s debounce)
│
v
┌─────────────────────────────────────────────────────────────────┐
│ JSONL File │
│ (.beads/issues.jsonl) │
│ │
│ - Git-tracked source of truth │
│ - One JSON line per entity (issue, dep, label, comment) │
│ - Merge-friendly: additions rarely conflict │
│ - Shared across machines via git push/pull │
└──────────────────────────────┬──────────────────────────────────┘
│
git push/pull
│
v
┌─────────────────────────────────────────────────────────────────┐
│ Remote Repository │
│ (GitHub, GitLab, etc.) │
│ │
│ - Stores JSONL as part of normal repo history │
│ - All collaborators share the same issue database │
│ - Protected branch support via separate sync branch │
└─────────────────────────────────────────────────────────────────┘
Why This Design?
SQLite for speed: Local queries complete in milliseconds. Complex dependency graphs, full-text search, and joins are fast.
JSONL for git: One entity per line means git diffs are readable and merges usually succeed automatically. No binary database files in version control.
Git for distribution: No special sync server needed. Issues travel with your code. Offline work just works.
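To make the "one entity per line" point concrete, here is a minimal sketch in Python (not bd's Go implementation). The lowercase field names and the sort-by-ID ordering are assumptions chosen for illustration; the actual JSONL schema may differ.

```python
import json

def to_jsonl(issues):
    """Serialize issues one per line, sorted by ID so the output is stable."""
    ordered = sorted(issues, key=lambda issue: issue["id"])
    return "".join(json.dumps(issue, sort_keys=True) + "\n" for issue in ordered)

def from_jsonl(text):
    """Parse JSONL text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Hypothetical issues; only the IDs come from the examples in this document.
before = [{"id": "bd-a1b2", "title": "Add OAuth", "status": "open"}]
after = before + [{"id": "bd-f14c", "title": "Add Stripe", "status": "open"}]

old_lines = to_jsonl(before).splitlines()
new_lines = to_jsonl(after).splitlines()

# Adding an issue adds exactly one line; existing lines are untouched,
# which is what keeps git diffs small and merges mostly automatic.
added = [line for line in new_lines if line not in old_lines]
```

Because each entity occupies its own line, two branches that add different issues touch different lines of the file.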
Write Path
When you create or modify an issue:
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ CLI Command │───▶│ SQLite Write │───▶│ Mark Dirty │
│ (bd create) │ │ (immediate) │ │ (trigger sync) │
└─────────────────┘ └─────────────────┘ └────────┬────────┘
│
5-second debounce
│
v
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Git Commit │◀───│ JSONL Export │◀───│ FlushManager │
│ (git hooks) │ │ (incremental) │ │ (background) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
- Command executes: bd create "New feature" writes to SQLite immediately
- Mark dirty: The operation marks the database as needing export
- Debounce window: Wait 5 seconds for batch operations (configurable)
- Export to JSONL: Only changed entities are appended/updated
- Git commit: If git hooks are installed, changes auto-commit
Key implementation:
- Export: cmd/bd/export.go, cmd/bd/autoflush.go
- FlushManager: internal/flush/ (see INTERNALS.md)
- Dirty tracking: internal/storage/sqlite/dirty_issues.go
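The mark-dirty and debounce steps of the write path can be sketched as follows. This is an illustrative Python model, not the FlushManager in internal/flush/. The class name, the method names, and the choice to restart the window on every write are assumptions. The clock is passed in explicitly so the behavior is easy to follow.

```python
class DebouncedFlusher:
    """Collects dirty issue IDs and releases them once writes go quiet."""

    def __init__(self, debounce_seconds=5.0):
        self.debounce_seconds = debounce_seconds
        self.dirty = set()
        self.last_write = None

    def mark_dirty(self, issue_id, now):
        """Record a changed issue and restart the debounce window."""
        self.dirty.add(issue_id)
        self.last_write = now

    def flush_if_due(self, now):
        """Return the batch to export, or [] if the window is still open."""
        if not self.dirty or now - self.last_write < self.debounce_seconds:
            return []
        batch = sorted(self.dirty)
        self.dirty.clear()
        return batch

flusher = DebouncedFlusher()
flusher.mark_dirty("bd-a1b2", now=0.0)
flusher.mark_dirty("bd-f14c", now=3.0)      # second write restarts the window
too_early = flusher.flush_if_due(now=6.0)   # only 3s since the last write
batch = flusher.flush_if_due(now=8.0)       # 5s of quiet: export both at once
```

Batching this way means a burst of commands produces one export rather than one per command.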
Read Path
When you query issues after a git pull:
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ git pull │───▶│ Auto-Import │───▶│ SQLite Update │
│ (new JSONL) │ │ (on next cmd) │ │ (merge logic) │
└─────────────────┘ └─────────────────┘ └────────┬────────┘
│
v
┌─────────────────┐
│ CLI Query │
│ (bd ready) │
└─────────────────┘
- Git pull: Fetches updated JSONL from remote
- Auto-import detection: First bd command checks if JSONL is newer than DB
- Import to SQLite: Parse JSONL, merge with local state using content hashes
- Query: Commands read from fast local SQLite
Key implementation:
- Import: cmd/bd/import.go, cmd/bd/autoimport.go
- Auto-import logic: internal/autoimport/autoimport.go
- Collision detection: internal/importer/importer.go
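A rough sketch of the auto-import detection step, in Python. It assumes "newer" is decided by comparing file modification times; the real logic in internal/autoimport/ may use a different signal, and the function name needs_import is hypothetical.

```python
import os
import tempfile

def needs_import(jsonl_path, db_path):
    """True when the JSONL file should be imported before answering a query."""
    if not os.path.exists(jsonl_path):
        return False                      # nothing to import
    if not os.path.exists(db_path):
        return True                       # fresh clone: build the DB from JSONL
    return os.path.getmtime(jsonl_path) > os.path.getmtime(db_path)

with tempfile.TemporaryDirectory() as workdir:
    jsonl = os.path.join(workdir, "issues.jsonl")
    db = os.path.join(workdir, "beads.db")
    open(jsonl, "w").close()
    missing_db = needs_import(jsonl, db)  # DB does not exist yet

    open(db, "w").close()
    os.utime(db, (1000, 1000))
    os.utime(jsonl, (2000, 2000))         # simulate a git pull touching JSONL
    stale_db = needs_import(jsonl, db)

    os.utime(db, (3000, 3000))            # simulate a completed import
    fresh_db = needs_import(jsonl, db)
```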
Hash-Based Collision Prevention
The key insight that enables distributed operation: content-based hashing for deduplication.
The Problem
Sequential IDs (bd-1, bd-2, bd-3) cause collisions when multiple agents create issues concurrently:
Branch A: bd create "Add OAuth" → bd-10
Branch B: bd create "Add Stripe" → bd-10 (collision!)
The Solution
Hash-based IDs derived from random UUIDs make collisions between concurrent creators very unlikely:
Branch A: bd create "Add OAuth" → bd-a1b2
Branch B: bd create "Add Stripe" → bd-f14c (no collision)
How It Works
- Issue creation: Generate random UUID, derive short hash as ID
- Progressive scaling: IDs start at 4 chars, grow to 5-6 chars as database grows
- Content hashing: Each issue has a content hash for change detection
- Import merge: Same ID + different content = update, same ID + same content = skip
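The first two steps, ID generation and progressive scaling, might look like the sketch below. It is Python for illustration only. The SHA-256 digest, the hex alphabet, and the 500/5000 thresholds are all assumptions; see COLLISION_MATH.md for the reasoning behind the real lengths.

```python
import hashlib
import uuid

def id_length(issue_count):
    """Hypothetical thresholds: use longer IDs as the database grows."""
    if issue_count < 500:
        return 4
    if issue_count < 5000:
        return 5
    return 6

def new_issue_id(issue_count, prefix="bd", seed=None):
    """Derive a short ID from a random UUID (or a fixed seed, for testing)."""
    if seed is None:
        seed = uuid.uuid4()
    digest = hashlib.sha256(seed.bytes).hexdigest()
    return f"{prefix}-{digest[:id_length(issue_count)]}"

fixed = uuid.UUID(int=1)
small_db_id = new_issue_id(10, seed=fixed)      # 4 hash characters
large_db_id = new_issue_id(10_000, seed=fixed)  # 6 hash characters
```

Since no counter is shared, two branches can mint IDs independently without coordinating.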
┌─────────────────────────────────────────────────────────────────┐
│ Import Logic │
│ │
│ For each issue in JSONL: │
│ 1. Compute content hash │
│ 2. Look up existing issue by ID │
│ 3. Compare hashes: │
│ - Same hash → skip (already imported) │
│ - Different hash → update (newer version) │
│ - No match → create (new issue) │
└─────────────────────────────────────────────────────────────────┘
This eliminates the need for central coordination while ensuring all machines converge to the same state.
See COLLISION_MATH.md for birthday paradox calculations on hash length vs collision probability.
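The import logic described above can be sketched in a few lines of Python. This is a simplified model, not the code in internal/importer/. The database is a plain dict keyed by ID, and hashing every field except the ID is an assumption about what the content hash covers.

```python
import hashlib
import json

def content_hash(issue):
    """Hash everything except the ID, using a canonical JSON encoding."""
    body = {key: value for key, value in issue.items() if key != "id"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def import_issues(db, incoming):
    """Merge incoming issues into db (a dict keyed by ID); return action counts."""
    counts = {"created": 0, "updated": 0, "skipped": 0}
    for issue in incoming:
        existing = db.get(issue["id"])
        if existing is None:
            action = "created"            # no match: new issue
        elif content_hash(existing) == content_hash(issue):
            action = "skipped"            # same hash: already imported
        else:
            action = "updated"            # different hash: take incoming version
        if action != "skipped":
            db[issue["id"]] = dict(issue)
        counts[action] += 1
    return counts

db = {
    "bd-a1b2": {"id": "bd-a1b2", "title": "Add OAuth", "status": "open"},
    "bd-f14c": {"id": "bd-f14c", "title": "Add Stripe", "status": "open"},
}
incoming = [
    {"id": "bd-a1b2", "title": "Add OAuth", "status": "open"},      # unchanged
    {"id": "bd-f14c", "title": "Add Stripe", "status": "closed"},   # changed
    {"id": "bd-9c3e", "title": "Write docs", "status": "open"},     # new
]
first_pass = import_issues(db, incoming)
second_pass = import_issues(db, incoming)   # re-importing is a no-op
```

Running the same import twice changes nothing the second time, which is the property that lets every clone replay the same JSONL safely.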
Daemon Architecture
Each workspace runs its own background daemon for auto-sync:
┌─────────────────────────────────────────────────────────────────┐
│ Per-Workspace Daemon │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ RPC Server │ │ Auto-Sync │ │ Background │ │
│ │ (bd.sock) │ │ Manager │ │ Tasks │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │ │ │ │
│ └──────────────────┴──────────────────┘ │
│ │ │
│ v │
│ ┌─────────────┐ │
│ │ SQLite │ │
│ │ Database │ │
│ └─────────────┘ │
└─────────────────────────────────────────────────────────────────┘
CLI commands ───RPC───▶ Daemon ───SQL───▶ Database
or
CLI commands ───SQL───▶ Database (if daemon unavailable)
Why daemons?
- Batches multiple operations before export
- Holds database connection open (faster queries)
- Coordinates auto-sync timing
- One daemon per workspace (LSP-like model)
Communication:
- Unix domain socket at .beads/bd.sock (Windows: named pipes)
- Protocol defined in internal/rpc/protocol.go
- CLI tries daemon first, falls back to direct DB access
Lifecycle:
- Auto-starts on first bd command (unless BEADS_NO_DAEMON=1)
- Auto-restarts after version upgrades
- Managed via bd daemons command
See DAEMON.md for operational details.
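The "daemon first, direct access as fallback" behavior can be sketched as below. This is a Python illustration for Unix-like systems only (it does not cover the Windows named-pipe path), and it does not speak the real protocol from internal/rpc/protocol.go. The function names and the via_rpc / via_db callbacks are hypothetical, and the sketch only checks whether a socket accepts connections.

```python
import os
import socket

def connect_daemon(sock_path, timeout=0.2):
    """Return a connected socket, or None if no daemon should or can be used."""
    if os.environ.get("BEADS_NO_DAEMON") == "1":
        return None
    conn = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    conn.settimeout(timeout)
    try:
        conn.connect(sock_path)
        return conn
    except OSError:                       # missing socket, refused, timed out
        conn.close()
        return None

def run_command(sock_path, via_rpc, via_db):
    """Prefer the daemon; fall back to direct database access."""
    conn = connect_daemon(sock_path)
    if conn is None:
        return via_db()
    try:
        return via_rpc(conn)
    finally:
        conn.close()

# With no daemon listening, the command takes the direct path.
path_taken = run_command(
    "/nonexistent/.beads/bd.sock",
    via_rpc=lambda conn: "rpc",
    via_db=lambda: "direct",
)
```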
Data Types
Core types in internal/types/types.go:
| Type | Description | Key Fields |
|---|---|---|
| Issue | Work item | ID, Title, Description, Status, Priority, Type |
| Dependency | Relationship | FromID, ToID, Type (blocks/related/parent-child/discovered-from) |
| Label | Tag | Name, Color, Description |
| Comment | Discussion | IssueID, Author, Content, Timestamp |
| Event | Audit trail | IssueID, Type, Data, Timestamp |
Dependency Types
| Type | Semantic | Affects bd ready? |
|---|---|---|
| blocks | Issue X must close before Y starts | Yes |
| parent-child | Hierarchical (epic/subtask) | Yes (children blocked if parent blocked) |
| related | Soft link for reference | No |
| discovered-from | Found during work on parent | No |
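The table's effect on bd ready can be modeled with a short Python sketch. It is an illustration of the rules as stated here, not the Blocked Cache described in INTERNALS.md. The tuple layout (blocker or parent, dependent, type) is an assumption made for readability and does not claim to match the FromID/ToID direction in the real Dependency type.

```python
def ready_issues(statuses, deps):
    """statuses: {issue_id: status}; deps: [(blocker_or_parent, dependent, dep_type)]."""
    # A "blocks" edge holds the dependent back until the blocker is closed.
    # Blockers missing from statuses are ignored.
    blocked = {
        dependent
        for blocker, dependent, dep_type in deps
        if dep_type == "blocks" and statuses.get(blocker, "closed") != "closed"
    }
    # Children inherit blocked state from their parent, transitively.
    changed = True
    while changed:
        changed = False
        for parent, child, dep_type in deps:
            if dep_type == "parent-child" and parent in blocked and child not in blocked:
                blocked.add(child)
                changed = True
    return sorted(
        issue_id for issue_id, status in statuses.items()
        if status == "open" and issue_id not in blocked
    )

statuses = {"setup": "open", "epic": "open", "task": "open", "docs": "open", "note": "closed"}
deps = [
    ("setup", "epic", "blocks"),        # epic waits for setup
    ("epic", "task", "parent-child"),   # task is blocked while epic is
    ("note", "docs", "blocks"),         # note is closed, so docs is free
    ("setup", "docs", "related"),       # soft link, no effect
]
ready_before = ready_issues(statuses, deps)
statuses["setup"] = "closed"
ready_after = ready_issues(statuses, deps)
```

Closing the blocker frees both the epic and its child in one step, while related and discovered-from edges never enter the calculation.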
Status Flow
open ──▶ in_progress ──▶ closed
│ │
└────────────────────────┘
(reopen)
Directory Structure
.beads/
├── beads.db # SQLite database (gitignored)
├── issues.jsonl # JSONL source of truth (git-tracked)
├── bd.sock # Daemon socket (gitignored)
├── daemon.log # Daemon logs (gitignored)
├── config.yaml # Project config (optional)
└── export_hashes.db # Export tracking (gitignored)
Key Code Paths
| Area | Files |
|---|---|
| CLI entry | cmd/bd/main.go |
| Storage interface | internal/storage/storage.go |
| SQLite implementation | internal/storage/sqlite/ |
| RPC protocol | internal/rpc/protocol.go, server_*.go |
| Export logic | cmd/bd/export.go, autoflush.go |
| Import logic | cmd/bd/import.go, internal/importer/ |
| Auto-sync | internal/autoimport/, internal/flush/ |
Wisps and Molecules
Molecules are template work items that define structured workflows. When spawned, they create wisps - ephemeral child issues that track execution steps.
For full documentation on the molecular chemistry metaphor (protos, pour, bond, squash, burn), see MOLECULES.md.
Wisp Lifecycle
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ bd wisp create │───▶│ Wisp Issues │───▶│ bd mol squash │
│ (from template) │ │ (local-only) │ │ (→ digest) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
- Create: Create wisps from a molecule template
- Execute: Agent works through wisp steps (local SQLite only)
- Squash: Compress wisps into a permanent digest issue
Why Wisps Don't Sync
Wisps are intentionally local-only:
- They exist only in the spawning agent's SQLite database
- They are never exported to JSONL
- They cannot resurrect from other clones (they were never there)
- They are hard-deleted when squashed (no tombstones needed)
This design enables:
- Fast local iteration: No sync overhead during execution
- Clean history: Only the digest (outcome) enters git
- Agent isolation: Each agent's execution trace is private
- Bounded storage: Wisps don't accumulate across clones
Wisp vs Regular Issue Deletion
| Aspect | Regular Issues | Wisps |
|---|---|---|
| Exported to JSONL | Yes | No |
| Tombstone on delete | Yes | No |
| Can resurrect | Yes (without tombstone) | No (never synced) |
| Deletion method | CreateTombstone() | DeleteIssue() (hard delete) |
The bd mol squash command uses hard delete intentionally - tombstones would be wasted overhead for data that never leaves the local database.
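A toy model of the two behaviors described in this section, export filtering and squash with hard delete, is sketched below in Python. The wisp and parent fields, the digest layout, and both function names are hypothetical; they stand in for whatever the real storage layer uses.

```python
def export_rows(db):
    """Rows eligible for JSONL export: everything except wisps."""
    return [row for row in db.values() if not row.get("wisp")]

def squash(db, molecule_id, digest_id):
    """Replace a molecule's wisps with one digest issue and return the digest."""
    wisps = sorted(
        (row for row in db.values() if row.get("wisp") and row.get("parent") == molecule_id),
        key=lambda row: row["id"],
    )
    summary = "; ".join(f"{w['title']} ({w['status']})" for w in wisps)
    for w in wisps:
        del db[w["id"]]                   # hard delete: no tombstone is written
    digest = {
        "id": digest_id,
        "title": f"Digest of {molecule_id}",
        "description": summary,
        "status": "closed",
    }
    db[digest_id] = digest
    return digest

db = {
    "bd-m0l1": {"id": "bd-m0l1", "title": "Release checklist", "status": "open"},
    "bd-w001": {"id": "bd-w001", "title": "Run tests", "status": "closed",
                "wisp": True, "parent": "bd-m0l1"},
    "bd-w002": {"id": "bd-w002", "title": "Tag build", "status": "closed",
                "wisp": True, "parent": "bd-m0l1"},
}
exported_before = [row["id"] for row in export_rows(db)]
digest = squash(db, "bd-m0l1", "bd-d1g3")
exported_after = sorted(row["id"] for row in export_rows(db))
```

Only the digest ever becomes visible to the export step, so other clones never see the individual wisps and have nothing to resurrect.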
Future Directions
- Separate wisp repo: Keep wisps in a dedicated ephemeral git repo
- Digest migration: Explicit step to promote digests to main repo
- Wisp retention: Option to preserve wisps in local git history
Related Documentation
- MOLECULES.md - Molecular chemistry metaphor (protos, pour, bond, squash, burn)
- INTERNALS.md - FlushManager, Blocked Cache implementation details
- DAEMON.md - Daemon management and configuration
- EXTENDING.md - Adding custom tables to SQLite
- TROUBLESHOOTING.md - Recovery procedures and common issues
- FAQ.md - Common questions about the architecture
- COLLISION_MATH.md - Hash collision probability analysis