Condense COMPACTION.md into README and make README more succinct

This commit is contained in:
Steve Yegge
2025-10-16 15:22:44 -07:00
parent 1eb59fa120
commit a7a4600b31
3 changed files with 22 additions and 647 deletions


@@ -81,8 +81,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Community
- Merged PR #31: Windows Defender mitigation for export
- Merged PR #37: Fix NULL handling in statistics
- Merged PR #38: Nix flake for declarative builds
- Merged PR #40: MCP integration test fixes
- Merged PR #45: Label and title filtering for bd list
- Merged PR #46: Add --format flag to bd list
- Merged PR #47: Error handling consistency
- Merged PR #48: Cyclomatic complexity reduction


@@ -1,451 +0,0 @@
# Database Compaction Guide
## Overview
Beads compaction is **agentic memory decay** - your database naturally forgets fine-grained details of old work while preserving the essential context agents need. This keeps your database lightweight and fast, even after thousands of issues.
### Key Concepts
- **Semantic compression**: Claude Haiku summarizes issues intelligently, preserving decisions and outcomes
- **Two-tier system**: Gradual decay from full detail → summary → ultra-brief
- **Permanent decay**: Original content is discarded to save space (not reversible)
- **Safe by design**: Dry-run preview, eligibility checks, git history preserves old versions
## How It Works
### Tier 1: Semantic Compression (30+ days)
**Target**: Closed issues 30+ days old with no open dependents
**Process**:
1. Check eligibility (closed, 30+ days, no blockers)
2. Send to Claude Haiku for summarization
3. Replace verbose fields with concise summary
4. Store original size for statistics
**Result**: 70-80% space reduction
**Example**:
*Before (856 bytes):*
```
Title: Fix authentication race condition in login flow
Description: Users report intermittent 401 errors during concurrent
login attempts. The issue occurs when multiple requests hit the auth
middleware simultaneously...
Design: [15 lines of implementation details]
Acceptance Criteria: [8 test scenarios]
Notes: [debugging session notes]
```
*After (171 bytes):*
```
Title: Fix authentication race condition in login flow
Description: Fixed race condition in auth middleware causing 401s
during concurrent logins. Added mutex locks and updated tests.
Resolution: Deployed in v1.2.3.
```
### Tier 2: Ultra Compression (90+ days)
**Target**: Tier 1 issues 90+ days old, rarely referenced
**Process**:
1. Verify existing Tier 1 compaction
2. Check reference frequency (git commits, other issues)
3. Ultra-compress to single paragraph
4. Optionally prune events (keep created/closed only)
**Result**: 90-95% space reduction
**Example**:
*After Tier 2 (43 bytes):*
```
Description: Auth race condition fixed, deployed v1.2.3.
```
## CLI Reference
### Preview Candidates
```bash
# See what would be compacted
bd compact --dry-run --all
# Check Tier 2 candidates
bd compact --dry-run --all --tier 2
# Preview specific issue
bd compact --dry-run --id bd-42
```
### Compact Issues
```bash
# Compact all eligible issues (Tier 1)
bd compact --all
# Compact specific issue
bd compact --id bd-42
# Force compact (bypass checks - use with caution)
bd compact --id bd-42 --force
# Tier 2 ultra-compression
bd compact --all --tier 2
# Control parallelism
bd compact --all --workers 10 --batch-size 20
```
### Statistics & Monitoring
```bash
# Show compaction stats
bd compact --stats
# Output:
# Total issues: 2,438
# Compacted: 847 (34.7%)
# Tier 1: 812 issues
# Tier 2: 35 issues
# Space saved: 1.2 MB (68% reduction)
# Estimated cost: $0.85
```
## Eligibility Rules
### Tier 1 Eligibility
- ✅ Status: `closed`
- ✅ Age: 30+ days since `closed_at`
- ✅ Dependents: No open issues depending on this one
- ✅ Not already compacted
### Tier 2 Eligibility
- ✅ Already Tier 1 compacted
- ✅ Age: 90+ days since `closed_at`
- ✅ Low reference frequency:
- Mentioned in <5 git commits in last 90 days, OR
- Referenced by <3 issues created in last 90 days
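The Tier 1 age rule above can be sketched in plain shell. The `closed_at` value, the field handling, and GNU `date -d` are illustrative assumptions, not bd's actual implementation:

```shell
#!/bin/sh
# Hypothetical sketch of the Tier 1 age check (30+ days since closed_at).
closed_at="2024-01-01"                 # example closed_at date
now=$(date +%s)
closed=$(date -d "$closed_at" +%s)     # GNU date; BSD/macOS syntax differs
age_days=$(( (now - closed) / 86400 ))
if [ "$age_days" -ge 30 ]; then
  echo "tier1-eligible: closed $age_days days ago"
else
  echo "not yet eligible"
fi
```

The dependents check requires querying the issue graph, so `bd compact --dry-run` remains the authoritative way to test eligibility.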
## Configuration
### API Key Setup
**Option 1: Environment variable (recommended)**
```bash
export ANTHROPIC_API_KEY="sk-ant-..."
```
Add to your shell profile (`~/.zshrc`, `~/.bashrc`, etc.) for persistence.
**Option 2: CI/CD environments**
```yaml
# GitHub Actions
env:
  ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

# GitLab CI
variables:
  ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
```
### Parallel Processing
Control performance vs. API rate limits:
```bash
# Default: 5 workers, 10 issues per batch
bd compact --all
# High throughput (watch rate limits!)
bd compact --all --workers 20 --batch-size 50
# Conservative (avoid rate limits)
bd compact --all --workers 2 --batch-size 5
```
## Cost Analysis
### Pricing Basics
Compaction uses Claude Haiku (~$1 per 1M input tokens, ~$5 per 1M output tokens).
Typical issue:
- Input: ~500 tokens (issue content)
- Output: ~100 tokens (summary)
- Cost per issue: ~$0.001 (0.1¢)
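Those numbers multiply out as follows, a back-of-envelope check using the approximate prices quoted above:

```shell
# ~500 input tokens at ~$1/1M plus ~100 output tokens at ~$5/1M
awk 'BEGIN {
  cost = 500 / 1000000 * 1.00 + 100 / 1000000 * 5.00
  printf "cost per issue: $%.4f\n", cost   # prints: cost per issue: $0.0010
}'
```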
### Cost Examples
| Issues | Est. Cost | Time (5 workers) |
|--------|-----------|------------------|
| 100 | $0.10 | ~2 minutes |
| 1,000 | $1.00 | ~20 minutes |
| 10,000 | $10.00 | ~3 hours |
### Monthly Cost Estimate
If you close 50 issues/month and compact monthly:
- **Monthly cost**: $0.05
- **Annual cost**: $0.60
Even large teams (500 issues/month) pay ~$6/year.
### Space Savings
| Database Size | Issues | After Tier 1 | After Tier 2 |
|---------------|--------|--------------|--------------|
| 10 MB | 2,000 | 3 MB (-70%) | 1 MB (-90%) |
| 100 MB | 20,000 | 30 MB (-70%) | 10 MB (-90%) |
| 1 GB | 200,000| 300 MB (-70%)| 100 MB (-90%)|
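The table rows follow directly from the stated reduction rates (30% of the original size remains after Tier 1, 10% after Tier 2, approximating 1 GB as 1000 MB); a quick sanity check:

```shell
# Remaining size = original * (1 - reduction), for 70% and 90% reductions.
awk 'BEGIN {
  split("10 100 1000", mb, " ")   # database sizes in MB
  for (i = 1; i <= 3; i++)
    printf "%4d MB -> Tier 1: %5.1f MB, Tier 2: %5.1f MB\n",
           mb[i], mb[i] * 0.30, mb[i] * 0.10
}'
```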
## Automation
### Monthly Cron Job
```bash
#!/bin/bash
# /etc/cron.monthly/bd-compact.sh
export ANTHROPIC_API_KEY="sk-ant-..."
cd /path/to/your/repo
# Compact Tier 1
bd compact --all 2>&1 | tee -a ~/.bd-compact.log
# Commit results
git add .beads/issues.jsonl issues.db
git commit -m "Monthly compaction: $(date +%Y-%m)"
git push
```
Make executable:
```bash
chmod +x /etc/cron.monthly/bd-compact.sh
```
### Automated Workflow Script
```bash
#!/bin/bash
# examples/compaction/workflow.sh
# Exit on error
set -e
echo "=== BD Compaction Workflow ==="
echo "Date: $(date)"
echo
# Check API key
if [ -z "$ANTHROPIC_API_KEY" ]; then
  echo "Error: ANTHROPIC_API_KEY not set"
  exit 1
fi
# Preview candidates
echo "--- Preview Tier 1 Candidates ---"
bd compact --dry-run --all
read -p "Proceed with Tier 1 compaction? (y/N) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "--- Running Tier 1 Compaction ---"
  bd compact --all
fi
# Preview Tier 2
echo
echo "--- Preview Tier 2 Candidates ---"
bd compact --dry-run --all --tier 2
read -p "Proceed with Tier 2 compaction? (y/N) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "--- Running Tier 2 Compaction ---"
  bd compact --all --tier 2
fi
# Show stats
echo
echo "--- Final Statistics ---"
bd compact --stats
echo
echo "=== Compaction Complete ==="
```
### Pre-commit Hook (Automatic)
```bash
#!/bin/bash
# .git/hooks/pre-commit
# Auto-compact before each commit (optional, experimental)
if command -v bd &> /dev/null && [ -n "$ANTHROPIC_API_KEY" ]; then
  # Only compact if >10 eligible issues
  ELIGIBLE=$(bd compact --dry-run --all --json 2>/dev/null | jq 'length')
  if [ "${ELIGIBLE:-0}" -gt 10 ]; then
    echo "Auto-compacting $ELIGIBLE eligible issues..."
    bd compact --all
  fi
fi
```
## Safety & Recovery
### Git History
Compaction is permanent - the original content is discarded to save space. However, you can recover old versions from git history:
```bash
# View issue before compaction
git log -p -- .beads/issues.jsonl | grep -A 50 "bd-42"
# Checkout old version
git checkout <commit-hash> -- .beads/issues.jsonl
# Or use git show
git show <commit-hash>:.beads/issues.jsonl | grep -A 50 "bd-42"
```
### Verification
After compaction, verify with:
```bash
# Check compaction stats
bd compact --stats
# Spot-check compacted issues
bd show bd-42
```
## Troubleshooting
### "ANTHROPIC_API_KEY not set"
```bash
export ANTHROPIC_API_KEY="sk-ant-..."
# Add to ~/.zshrc or ~/.bashrc for persistence
```
### Rate Limit Errors
Reduce parallelism:
```bash
bd compact --all --workers 2 --batch-size 5
```
Or add delays between batches (future enhancement).
### Issue Not Eligible
Check eligibility:
```bash
bd compact --dry-run --id bd-42
```
Force compact (if you know what you're doing):
```bash
bd compact --id bd-42 --force
```
## FAQ
### When should I compact?
- **Small projects (<500 issues)**: Rarely needed, maybe annually
- **Medium projects (500-5000 issues)**: Every 3-6 months
- **Large projects (5000+ issues)**: Monthly or quarterly
- **High-velocity teams**: Set up automated monthly compaction
### Can I recover compacted issues?
Compaction is permanent, but you can recover from git history:
```bash
git log -p -- .beads/issues.jsonl | grep -A 50 "bd-42"
```
### What happens to dependencies?
Dependencies are preserved. Compaction only affects the issue's text fields (description, design, notes, acceptance criteria).
### Does compaction affect git history?
No. Old versions of issues remain in git history. Compaction only affects the current state in `.beads/issues.jsonl` and `issues.db`.
### Should I commit compacted issues?
**Yes.** Compaction modifies both the database and JSONL. Commit and push:
```bash
git add .beads/issues.jsonl issues.db
git commit -m "Compact old closed issues"
git push
```
### What if my team disagrees on compaction frequency?
Use `bd compact --dry-run` to preview. Discuss the candidates before running. Since compaction is permanent, get team consensus first.
### Can I compact open issues?
No. Compaction only works on closed issues to ensure active work retains full detail.
### How does Tier 2 decide "rarely referenced"?
It checks:
1. Git commits mentioning the issue ID in last 90 days
2. Other issues referencing it in descriptions/notes
If references are low (< 5 commits or < 3 issues), it's eligible for Tier 2.
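The commit-side half of that heuristic can be approximated with plain git. This demo builds a throwaway repository so the count is deterministic; the exact query bd runs is not shown here:

```shell
#!/bin/sh
set -e
# Demo in a throwaway repo: count recent commits mentioning an issue ID.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "fix auth race (bd-42)"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "unrelated change"
# Tier 2 heuristic: fewer than 5 mentions in the last 90 days
mentions=$(git log --oneline --since="90 days ago" | grep -c "bd-42" || true)
echo "bd-42 mentioned in $mentions recent commit(s)"
```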
### Does compaction slow down queries?
No. Compaction reduces database size, making queries faster. Agents benefit from smaller context when reading issues.
### Can I customize the summarization prompt?
Not yet, but it's planned (bd-264). The current prompt is optimized for preserving key decisions and outcomes.
## Best Practices
1. **Start with dry-run**: Always preview before compacting
2. **Compact regularly**: Monthly or quarterly depending on project size
3. **Monitor costs**: Use `bd compact --stats` to track savings
4. **Automate it**: Set up cron jobs for hands-off maintenance
5. **Commit results**: Always commit and push after compaction
6. **Team communication**: Let team know before large compaction runs (it's permanent!)
## Examples
See [examples/compaction/](examples/compaction/) for:
- `workflow.sh` - Interactive compaction workflow
- `cron-compact.sh` - Automated monthly compaction
- `auto-compact.sh` - Smart auto-compaction with thresholds
## Related Documentation
- [README.md](README.md) - Quick start and overview
- [EXTENDING.md](EXTENDING.md) - Database schema and extensions
- [GIT_WORKFLOW.md](GIT_WORKFLOW.md) - Multi-machine collaboration
## Contributing
Found a bug or have ideas for improving compaction? Open an issue or PR!
Common enhancement requests:
- Custom summarization prompts (bd-264)
- Alternative LLM backends (local models)
- Configurable eligibility rules
- Compaction analytics dashboard
- Optional snapshot retention for restore (if requested)

README.md

@@ -281,87 +281,7 @@ Options:
#### Creating Issues from Markdown
You can draft multiple issues in a markdown file and create them all at once. This is useful for planning features or converting written notes into tracked work.
Draft multiple issues in a markdown file with `bd create -f file.md`. Format: `## Issue Title` creates a new issue; optional sections: `### Priority`, `### Type`, `### Description`, `### Assignee`, `### Labels`, `### Dependencies`. Defaults: Priority=2, Type=task.
Markdown format:
```markdown
## Issue Title
Optional description text here.
### Priority
1
### Type
feature
### Description
More detailed description (overrides text after title).
### Design
Design notes and implementation details.
### Acceptance Criteria
- Must do this
- Must do that
### Assignee
username
### Labels
label1, label2, label3
### Dependencies
bd-10, bd-20
```
Example markdown file (`auth-improvements.md`):
```markdown
## Add OAuth2 support
We need to support OAuth2 authentication.
### Priority
1
### Type
feature
### Assignee
alice
### Labels
auth, high-priority
## Add rate limiting
### Priority
0
### Type
bug
### Description
Auth endpoints are vulnerable to brute force attacks.
### Labels
security, urgent
```
Create all issues:
```bash
bd create -f auth-improvements.md
# ✓ Created 2 issues from auth-improvements.md:
# bd-42: Add OAuth2 support [P1, feature]
# bd-43: Add rate limiting [P0, bug]
```
**Notes:**
- Each `## Heading` creates a new issue
- Sections (`### Priority`, `### Type`, etc.) are optional
- Defaults: Priority=2, Type=task
- Text immediately after the title becomes the description (unless overridden by `### Description`)
- All standard issue fields are supported
### Viewing Issues
@@ -419,26 +339,7 @@ bd dep cycles
#### Cycle Prevention
beads maintains a directed acyclic graph (DAG) of dependencies and prevents cycles across **all** dependency types. This ensures:
Beads maintains a DAG and prevents cycles across all dependency types. Cycles break ready work detection and tree traversals. Attempting to add a cycle-creating dependency returns an error.
- **Ready work is accurate**: Cycles can hide issues from `bd ready` by making them appear blocked when they're actually part of a circular dependency
- **Dependencies are clear**: Circular dependencies are semantically confusing (if A depends on B and B depends on A, which should be done first?)
- **Traversals work correctly**: Commands like `bd dep tree` rely on DAG structure
**Example - Prevented Cycle:**
```bash
bd dep add bd-1 bd-2 # bd-1 blocks on bd-2 ✓
bd dep add bd-2 bd-3 # bd-2 blocks on bd-3 ✓
bd dep add bd-3 bd-1 # ERROR: would create cycle bd-3 → bd-1 → bd-2 → bd-3 ✗
```
Cross-type cycles are also prevented:
```bash
bd dep add bd-1 bd-2 --type blocks # bd-1 blocks on bd-2 ✓
bd dep add bd-2 bd-1 --type parent-child # ERROR: would create cycle ✗
```
If you try to add a dependency that creates a cycle, you'll get a clear error message. After successfully adding dependencies, beads will warn you if any cycles are detected elsewhere in the graph.
### Finding Work
@@ -461,44 +362,26 @@ bd ready --json
### Compaction (Memory Decay)
Beads can semantically compress old closed issues to keep the database lightweight. This is agentic memory decay - the database naturally forgets details over time while preserving essential context.
Beads uses AI to compress old closed issues, keeping databases lightweight as they age. This is agentic memory decay - your database naturally forgets fine-grained details while preserving essential context agents need.
```bash
# Preview what would be compacted
bd compact --dry-run --all

# Show compaction statistics
bd compact --stats

# Compact all eligible issues (30+ days closed, no open dependents)
bd compact --all

# Compact specific issue
bd compact --id bd-42

# Force compact (bypass eligibility checks)
bd compact --id bd-42 --force

# Tier 2 ultra-compression (90+ days, 95% reduction)
bd compact --tier 2 --all
```
```bash
bd compact --dry-run --all  # Preview candidates
bd compact --stats          # Show statistics
bd compact --all            # Compact eligible issues (30+ days closed)
bd compact --tier 2 --all   # Ultra-compress (90+ days, rarely referenced)
```
Compaction uses Claude Haiku to semantically summarize issues:
- **Tier 1**: 70-80% space reduction (30+ days closed)
- **Tier 2**: 90-95% space reduction (90+ days closed, rarely referenced)
**Requirements:**
- Set `ANTHROPIC_API_KEY` environment variable
- Cost: ~$1 per 1,000 issues compacted (Haiku pricing)
**Eligibility:**
- Status: closed
- Tier 1: 30+ days since closed, no open dependents
- Tier 2: 90+ days since closed, rarely referenced in commits/issues
**Note:** Compaction is permanent graceful decay - original content is discarded to save space. Use git history to recover old versions if needed.
See [COMPACTION.md](COMPACTION.md) for detailed documentation, cost analysis, and automation examples.
Uses Claude Haiku for semantic summarization. **Tier 1** (30+ days): 70-80% reduction. **Tier 2** (90+ days, low references): 90-95% reduction. Requires `ANTHROPIC_API_KEY`. Cost: ~$1 per 1,000 issues.
Eligibility: Must be closed with no open dependents. Tier 2 requires low reference frequency (<5 commits or <3 issues in last 90 days).
**Permanent:** Original content is discarded. Recover old versions from git history if needed.
**Automation:**
```bash
# Monthly cron
0 0 1 * * bd compact --all && git add .beads && git commit -m "Monthly compaction"
```
## Database Discovery
@@ -602,49 +485,15 @@ The `discovered-from` type is particularly useful for AI-supervised workflows, w
## AI Agent Integration
bd is designed to work seamlessly with AI coding agents:
All commands support `--json` for programmatic use. Typical agent workflow: `bd ready --json` → `bd update --status in_progress` → `bd create` (discovered work) → `bd close`.
```bash
# Agent discovers ready work
WORK=$(bd ready --limit 1 --json)
ISSUE_ID=$(echo $WORK | jq -r '.[0].id')
# Agent claims and starts work
bd update $ISSUE_ID --status in_progress --json
# Agent discovers new work while executing
bd create "Fix bug found in testing" -t bug -p 0 --json > new_issue.json
NEW_ID=$(cat new_issue.json | jq -r '.id')
bd dep add $NEW_ID $ISSUE_ID --type discovered-from
# Agent completes work
bd close $ISSUE_ID --reason "Implemented and tested" --json
```
The `--json` flag on every command makes bd perfect for programmatic workflows.
## Ready Work Algorithm
An issue is "ready" if:
Issue is "ready" if status is `open` and it has no open `blocks` dependencies.
- Status is `open`
- It has NO open `blocks` dependencies
- All blockers are either closed or non-existent
Example:
```
bd-1 [open] ← blocks ← bd-2 [open] ← blocks ← bd-3 [open]
```
Ready work: `[bd-1]`
Blocked: `[bd-2, bd-3]`
## Issue Lifecycle
```
open → in_progress → closed
blocked (manually set, or has open blockers)
```
`open → in_progress → closed` (or `blocked` if has open blockers)
## Architecture
@@ -706,36 +555,11 @@ This pattern enables powerful integrations while keeping bd simple and focused.
## Why bd?
**bd is designed for AI coding agents, not humans.**
**bd is designed for AI agents**, not humans. Traditional trackers (Jira, GitHub) require web UIs. bd provides `--json` on all commands, explicit dependency types, and `bd ready` for unblocked work detection. In agent workflows, issues are **memory** - preventing agents from forgetting tasks during long sessions.
Traditional issue trackers (Jira, GitHub Issues, Linear) assume humans are the primary users. Humans click through web UIs, drag cards on boards, and manually update status.
bd assumes **AI agents are the primary users**, with humans supervising:
- **Agents discover work** - `bd ready --json` gives agents unblocked tasks to execute
- **Dependencies prevent wasted work** - Agents don't duplicate effort or work on blocked tasks
- **Discovery during execution** - Agents create issues for work they discover while executing, linked with `discovered-from`
- **Agents lose focus** - Long-running conversations can forget tasks; bd remembers everything
- **Humans supervise** - Check on progress with `bd list` and `bd dep tree`, but don't micromanage
In human-managed workflows, issues are planning artifacts. In agent-managed workflows, **issues are memory** - preventing agents from forgetting tasks during long coding sessions.
Traditional issue trackers were built for human project managers. bd is built for autonomous agents.
## Architecture: JSONL + SQLite
bd uses a dual-storage approach:
**JSONL** (`.beads/issues.jsonl`) is source of truth, committed to git. **SQLite** (`.beads/*.db`) is an ephemeral cache for fast queries, gitignored. Auto-export after CRUD (5s debounce), auto-import after `git pull`. No manual sync needed.
- **JSONL files** (`.beads/issues.jsonl`) - Source of truth, committed to git
- **SQLite database** (`.beads/*.db`) - Ephemeral cache for fast queries, gitignored
This gives you:
- ✅ **Git-friendly storage** - Text diffs, AI-resolvable conflicts
- ✅ **Fast queries** - SQLite indexes for dependency graphs
- ✅ **Automatic sync** - Auto-export after CRUD ops, auto-import after pulls
- ✅ **No daemon required** - In-process SQLite, ~10-100ms per command
When you run `bd create`, it writes to SQLite. After 5 seconds of inactivity, changes automatically export to JSONL. After `git pull`, the next bd command automatically imports if JSONL is newer. No manual steps needed!
## Export/Import (JSONL Format)