Condense COMPACTION.md into README and make README more succinct

Steve Yegge
2025-10-16 15:22:44 -07:00
parent 1eb59fa120
commit a7a4600b31
3 changed files with 22 additions and 647 deletions


@@ -81,8 +81,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Community
- Merged PR #31: Windows Defender mitigation for export
- Merged PR #37: Fix NULL handling in statistics
- Merged PR #38: Nix flake for declarative builds
- Merged PR #40: MCP integration test fixes
- Merged PR #45: Label and title filtering for bd list
- Merged PR #46: Add --format flag to bd list
- Merged PR #47: Error handling consistency
- Merged PR #48: Cyclomatic complexity reduction

COMPACTION.md

@@ -1,451 +0,0 @@
# Database Compaction Guide
## Overview
Beads compaction is **agentic memory decay** - your database naturally forgets fine-grained details of old work while preserving the essential context agents need. This keeps your database lightweight and fast, even after thousands of issues.
### Key Concepts
- **Semantic compression**: Claude Haiku summarizes issues intelligently, preserving decisions and outcomes
- **Two-tier system**: Gradual decay from full detail → summary → ultra-brief
- **Permanent decay**: Original content is discarded to save space (not reversible)
- **Safe by design**: Dry-run preview, eligibility checks, git history preserves old versions
## How It Works
### Tier 1: Semantic Compression (30+ days)
**Target**: Closed issues 30+ days old with no open dependents
**Process**:
1. Check eligibility (closed, 30+ days, no blockers)
2. Send to Claude Haiku for summarization
3. Replace verbose fields with concise summary
4. Store original size for statistics
**Result**: 70-80% space reduction
**Example**:
*Before (856 bytes):*
```
Title: Fix authentication race condition in login flow
Description: Users report intermittent 401 errors during concurrent
login attempts. The issue occurs when multiple requests hit the auth
middleware simultaneously...
Design: [15 lines of implementation details]
Acceptance Criteria: [8 test scenarios]
Notes: [debugging session notes]
```
*After (171 bytes):*
```
Title: Fix authentication race condition in login flow
Description: Fixed race condition in auth middleware causing 401s
during concurrent logins. Added mutex locks and updated tests.
Resolution: Deployed in v1.2.3.
```
### Tier 2: Ultra Compression (90+ days)
**Target**: Tier 1 issues 90+ days old, rarely referenced
**Process**:
1. Verify existing Tier 1 compaction
2. Check reference frequency (git commits, other issues)
3. Ultra-compress to single paragraph
4. Optionally prune events (keep created/closed only)
**Result**: 90-95% space reduction
**Example**:
*After Tier 2 (43 bytes):*
```
Description: Auth race condition fixed, deployed v1.2.3.
```
## CLI Reference
### Preview Candidates
```bash
# See what would be compacted
bd compact --dry-run --all
# Check Tier 2 candidates
bd compact --dry-run --all --tier 2
# Preview specific issue
bd compact --dry-run --id bd-42
```
### Compact Issues
```bash
# Compact all eligible issues (Tier 1)
bd compact --all
# Compact specific issue
bd compact --id bd-42
# Force compact (bypass checks - use with caution)
bd compact --id bd-42 --force
# Tier 2 ultra-compression
bd compact --all --tier 2
# Control parallelism
bd compact --all --workers 10 --batch-size 20
```
### Statistics & Monitoring
```bash
# Show compaction stats
bd compact --stats
# Output:
# Total issues: 2,438
# Compacted: 847 (34.7%)
# Tier 1: 812 issues
# Tier 2: 35 issues
# Space saved: 1.2 MB (68% reduction)
# Estimated cost: $0.85
```
## Eligibility Rules
### Tier 1 Eligibility
- ✅ Status: `closed`
- ✅ Age: 30+ days since `closed_at`
- ✅ Dependents: No open issues depending on this one
- ✅ Not already compacted
### Tier 2 Eligibility
- ✅ Already Tier 1 compacted
- ✅ Age: 90+ days since `closed_at`
- ✅ Low reference frequency:
  - Mentioned in <5 git commits in last 90 days, OR
  - Referenced by <3 issues created in last 90 days
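The age thresholds above can be sketched as plain shell date arithmetic. This is a hypothetical illustration, not bd's actual implementation, and it assumes GNU `date`:

```shell
# Days elapsed between two ISO dates (GNU date).
days_between() {
  local a b
  a=$(date -ud "$1" +%s)
  b=$(date -ud "$2" +%s)
  echo $(( (b - a) / 86400 ))
}

# Tier 1 requires 30+ days since closed_at; Tier 2 requires 90+.
tier_age() {
  local age
  age=$(days_between "$1" "$2")
  if [ "$age" -ge 90 ]; then echo "tier2"
  elif [ "$age" -ge 30 ]; then echo "tier1"
  else echo "none"
  fi
}

tier_age 2025-01-01 2025-03-01   # 59 days since close → tier1
```

The real eligibility check also consults dependents and compaction state; this sketch covers only the date math.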
## Configuration
### API Key Setup
**Option 1: Environment variable (recommended)**
```bash
export ANTHROPIC_API_KEY="sk-ant-..."
```
Add to your shell profile (`~/.zshrc`, `~/.bashrc`, etc.) for persistence.
**Option 2: CI/CD environments**
```yaml
# GitHub Actions
env:
  ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

# GitLab CI
variables:
  ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
```
### Parallel Processing
Control performance vs. API rate limits:
```bash
# Default: 5 workers, 10 issues per batch
bd compact --all
# High throughput (watch rate limits!)
bd compact --all --workers 20 --batch-size 50
# Conservative (avoid rate limits)
bd compact --all --workers 2 --batch-size 5
```
## Cost Analysis
### Pricing Basics
Compaction uses Claude Haiku (~$1 per 1M input tokens, ~$5 per 1M output tokens).
Typical issue:
- Input: ~500 tokens (issue content)
- Output: ~100 tokens (summary)
- Cost per issue: ~$0.001 (0.1¢)
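Those per-issue numbers multiply out as follows; this is just a sanity-check calculation using the token counts and pricing quoted above:

```shell
# ~500 input tokens at $1/1M plus ~100 output tokens at $5/1M
per_issue=$(awk 'BEGIN { printf "%.4f", 500/1e6*1.0 + 100/1e6*5.0 }')
echo "per issue: \$$per_issue"    # → per issue: $0.0010
# Scale to a batch of 1,000 compacted issues
echo "per 1000 issues: \$$(awk -v c="$per_issue" 'BEGIN { printf "%.2f", c*1000 }')"
```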
### Cost Examples
| Issues | Est. Cost | Time (5 workers) |
|--------|-----------|------------------|
| 100 | $0.10 | ~2 minutes |
| 1,000 | $1.00 | ~20 minutes |
| 10,000 | $10.00 | ~3 hours |
### Monthly Cost Estimate
If you close 50 issues/month and compact monthly:
- **Monthly cost**: $0.05
- **Annual cost**: $0.60
Even large teams (500 issues/month) pay ~$6/year.
### Space Savings
| Database Size | Issues  | After Tier 1  | After Tier 2  |
|---------------|---------|---------------|---------------|
| 10 MB         | 2,000   | 3 MB (-70%)   | 1 MB (-90%)   |
| 100 MB        | 20,000  | 30 MB (-70%)  | 10 MB (-90%)  |
| 1 GB          | 200,000 | 300 MB (-70%) | 100 MB (-90%) |
## Automation
### Monthly Cron Job
```bash
#!/bin/bash
# /etc/cron.monthly/bd-compact.sh
export ANTHROPIC_API_KEY="sk-ant-..."
cd /path/to/your/repo
# Compact Tier 1
bd compact --all 2>&1 | tee -a ~/.bd-compact.log
# Commit results
git add .beads/issues.jsonl issues.db
git commit -m "Monthly compaction: $(date +%Y-%m)"
git push
```
Make executable:
```bash
chmod +x /etc/cron.monthly/bd-compact.sh
```
### Automated Workflow Script
```bash
#!/bin/bash
# examples/compaction/workflow.sh
# Exit on error
set -e
echo "=== BD Compaction Workflow ==="
echo "Date: $(date)"
echo
# Check API key
if [ -z "$ANTHROPIC_API_KEY" ]; then
  echo "Error: ANTHROPIC_API_KEY not set"
  exit 1
fi

# Preview candidates
echo "--- Preview Tier 1 Candidates ---"
bd compact --dry-run --all

read -p "Proceed with Tier 1 compaction? (y/N) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "--- Running Tier 1 Compaction ---"
  bd compact --all
fi

# Preview Tier 2
echo
echo "--- Preview Tier 2 Candidates ---"
bd compact --dry-run --all --tier 2

read -p "Proceed with Tier 2 compaction? (y/N) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "--- Running Tier 2 Compaction ---"
  bd compact --all --tier 2
fi
# Show stats
echo
echo "--- Final Statistics ---"
bd compact --stats
echo
echo "=== Compaction Complete ==="
```
### Pre-commit Hook (Automatic)
```bash
#!/bin/bash
# .git/hooks/pre-commit
# Auto-compact before each commit (optional, experimental)
if command -v bd &> /dev/null && [ -n "$ANTHROPIC_API_KEY" ]; then
  # Only compact if >10 eligible issues
  ELIGIBLE=$(bd compact --dry-run --all --json 2>/dev/null | jq 'length')
  if [ "$ELIGIBLE" -gt 10 ]; then
    echo "Auto-compacting $ELIGIBLE eligible issues..."
    bd compact --all
  fi
fi
```
## Safety & Recovery
### Git History
Compaction is permanent - the original content is discarded to save space. However, you can recover old versions from git history:
```bash
# View issue before compaction
git log -p -- .beads/issues.jsonl | grep -A 50 "bd-42"
# Checkout old version
git checkout <commit-hash> -- .beads/issues.jsonl
# Or use git show
git show <commit-hash>:.beads/issues.jsonl | grep -A 50 "bd-42"
```
### Verification
After compaction, verify with:
```bash
# Check compaction stats
bd compact --stats
# Spot-check compacted issues
bd show bd-42
```
## Troubleshooting
### "ANTHROPIC_API_KEY not set"
```bash
export ANTHROPIC_API_KEY="sk-ant-..."
# Add to ~/.zshrc or ~/.bashrc for persistence
```
### Rate Limit Errors
Reduce parallelism:
```bash
bd compact --all --workers 2 --batch-size 5
```
Or add delays between batches (future enhancement).
### Issue Not Eligible
Check eligibility:
```bash
bd compact --dry-run --id bd-42
```
Force compact (if you know what you're doing):
```bash
bd compact --id bd-42 --force
```
## FAQ
### When should I compact?
- **Small projects (<500 issues)**: Rarely needed, maybe annually
- **Medium projects (500-5000 issues)**: Every 3-6 months
- **Large projects (5000+ issues)**: Monthly or quarterly
- **High-velocity teams**: Set up automated monthly compaction
### Can I recover compacted issues?
Compaction is permanent, but you can recover from git history:
```bash
git log -p -- .beads/issues.jsonl | grep -A 50 "bd-42"
```
### What happens to dependencies?
Dependencies are preserved. Compaction only affects the issue's text fields (description, design, notes, acceptance criteria).
### Does compaction affect git history?
No. Old versions of issues remain in git history. Compaction only affects the current state in `.beads/issues.jsonl` and `issues.db`.
### Should I commit compacted issues?
**Yes.** Compaction modifies both the database and JSONL. Commit and push:
```bash
git add .beads/issues.jsonl issues.db
git commit -m "Compact old closed issues"
git push
```
### What if my team disagrees on compaction frequency?
Use `bd compact --dry-run` to preview. Discuss the candidates before running. Since compaction is permanent, get team consensus first.
### Can I compact open issues?
No. Compaction only works on closed issues to ensure active work retains full detail.
### How does Tier 2 decide "rarely referenced"?
It checks:
1. Git commits mentioning the issue ID in last 90 days
2. Other issues referencing it in descriptions/notes
If references are low (< 5 commits or < 3 issues), it's eligible for Tier 2.
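That threshold test is easy to express on its own. In this hypothetical sketch the commit and issue counts are assumed to come from `git log` and the issue database; only the thresholds are taken from the rule above:

```shell
# Tier 2 "rarely referenced" test: eligible if mentioned in fewer
# than 5 commits OR referenced by fewer than 3 issues (last 90 days).
is_rarely_referenced() {
  local commits=$1 issues=$2
  if [ "$commits" -lt 5 ] || [ "$issues" -lt 3 ]; then
    echo "eligible"
  else
    echo "ineligible"
  fi
}

is_rarely_referenced 2 1   # → eligible
is_rarely_referenced 6 4   # → ineligible
```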
### Does compaction slow down queries?
No. Compaction reduces database size, making queries faster. Agents benefit from smaller context when reading issues.
### Can I customize the summarization prompt?
Not yet, but it's planned (bd-264). The current prompt is optimized for preserving key decisions and outcomes.
## Best Practices
1. **Start with dry-run**: Always preview before compacting
2. **Compact regularly**: Monthly or quarterly depending on project size
3. **Monitor costs**: Use `bd compact --stats` to track savings
4. **Automate it**: Set up cron jobs for hands-off maintenance
5. **Commit results**: Always commit and push after compaction
6. **Team communication**: Let team know before large compaction runs (it's permanent!)
## Examples
See [examples/compaction/](examples/compaction/) for:
- `workflow.sh` - Interactive compaction workflow
- `cron-compact.sh` - Automated monthly compaction
- `auto-compact.sh` - Smart auto-compaction with thresholds
## Related Documentation
- [README.md](README.md) - Quick start and overview
- [EXTENDING.md](EXTENDING.md) - Database schema and extensions
- [GIT_WORKFLOW.md](GIT_WORKFLOW.md) - Multi-machine collaboration
## Contributing
Found a bug or have ideas for improving compaction? Open an issue or PR!
Common enhancement requests:
- Custom summarization prompts (bd-264)
- Alternative LLM backends (local models)
- Configurable eligibility rules
- Compaction analytics dashboard
- Optional snapshot retention for restore (if requested)

README.md

@@ -281,87 +281,7 @@ Options:
#### Creating Issues from Markdown
You can draft multiple issues in a markdown file and create them all at once. This is useful for planning features or converting written notes into tracked work.
Markdown format:
```markdown
## Issue Title
Optional description text here.
### Priority
1
### Type
feature
### Description
More detailed description (overrides text after title).
### Design
Design notes and implementation details.
### Acceptance Criteria
- Must do this
- Must do that
### Assignee
username
### Labels
label1, label2, label3
### Dependencies
bd-10, bd-20
```
Example markdown file (`auth-improvements.md`):
```markdown
## Add OAuth2 support
We need to support OAuth2 authentication.
### Priority
1
### Type
feature
### Assignee
alice
### Labels
auth, high-priority
## Add rate limiting
### Priority
0
### Type
bug
### Description
Auth endpoints are vulnerable to brute force attacks.
### Labels
security, urgent
```
Create all issues:
```bash
bd create -f auth-improvements.md
# ✓ Created 2 issues from auth-improvements.md:
# bd-42: Add OAuth2 support [P1, feature]
# bd-43: Add rate limiting [P0, bug]
```
**Notes:**
- Each `## Heading` creates a new issue
- Sections (`### Priority`, `### Type`, etc.) are optional
- Defaults: Priority=2, Type=task
- Text immediately after the title becomes the description (unless overridden by `### Description`)
- All standard issue fields are supported
Draft multiple issues in a markdown file with `bd create -f file.md`. Format: `## Issue Title` creates a new issue, with optional sections `### Priority`, `### Type`, `### Description`, `### Assignee`, `### Labels`, `### Dependencies`. Defaults: Priority=2, Type=task.
### Viewing Issues
@@ -419,26 +339,7 @@ bd dep cycles
#### Cycle Prevention
beads maintains a directed acyclic graph (DAG) of dependencies and prevents cycles across **all** dependency types. This ensures:
- **Ready work is accurate**: Cycles can hide issues from `bd ready` by making them appear blocked when they're actually part of a circular dependency
- **Dependencies are clear**: Circular dependencies are semantically confusing (if A depends on B and B depends on A, which should be done first?)
- **Traversals work correctly**: Commands like `bd dep tree` rely on DAG structure
**Example - Prevented Cycle:**
```bash
bd dep add bd-1 bd-2 # bd-1 blocks on bd-2 ✓
bd dep add bd-2 bd-3 # bd-2 blocks on bd-3 ✓
bd dep add bd-3 bd-1 # ERROR: would create cycle bd-3 → bd-1 → bd-2 → bd-3 ✗
```
Cross-type cycles are also prevented:
```bash
bd dep add bd-1 bd-2 --type blocks # bd-1 blocks on bd-2 ✓
bd dep add bd-2 bd-1 --type parent-child # ERROR: would create cycle ✗
```
If you try to add a dependency that creates a cycle, you'll get a clear error message. After successfully adding dependencies, beads will warn you if any cycles are detected elsewhere in the graph.
Beads maintains a DAG and prevents cycles across all dependency types. Cycles break ready-work detection and tree traversals. Attempting to add a cycle-creating dependency returns an error.
### Finding Work
@@ -461,44 +362,26 @@ bd ready --json
### Compaction (Memory Decay)
Beads can semantically compress old closed issues to keep the database lightweight. This is agentic memory decay - the database naturally forgets details over time while preserving essential context.
Beads uses AI to compress old closed issues, keeping databases lightweight as they age. This is agentic memory decay - your database naturally forgets fine-grained details while preserving essential context agents need.
```bash
# Preview what would be compacted
bd compact --dry-run --all
# Show compaction statistics
bd compact --stats
# Compact all eligible issues (30+ days closed, no open dependents)
bd compact --all
# Compact specific issue
bd compact --id bd-42
# Force compact (bypass eligibility checks)
bd compact --id bd-42 --force
# Tier 2 ultra-compression (90+ days, 95% reduction)
bd compact --tier 2 --all
bd compact --dry-run --all # Preview candidates
bd compact --stats # Show statistics
bd compact --all # Compact eligible issues (30+ days closed)
bd compact --tier 2 --all # Ultra-compress (90+ days, rarely referenced)
```
Compaction uses Claude Haiku to semantically summarize issues:
- **Tier 1**: 70-80% space reduction (30+ days closed)
- **Tier 2**: 90-95% space reduction (90+ days closed, rarely referenced)
Uses Claude Haiku for semantic summarization. **Tier 1** (30+ days): 70-80% reduction. **Tier 2** (90+ days, low references): 90-95% reduction. Requires `ANTHROPIC_API_KEY`. Cost: ~$1 per 1,000 issues.
**Requirements:**
- Set `ANTHROPIC_API_KEY` environment variable
- Cost: ~$1 per 1,000 issues compacted (Haiku pricing)
Eligibility: Must be closed with no open dependents. Tier 2 requires low reference frequency (<5 commits or <3 issues in last 90 days).
**Eligibility:**
- Status: closed
- Tier 1: 30+ days since closed, no open dependents
- Tier 2: 90+ days since closed, rarely referenced in commits/issues
**Permanent:** Original content is discarded. Recover old versions from git history if needed.
**Note:** Compaction is permanent, graceful decay - original content is discarded to save space. Use git history to recover old versions if needed.
See [COMPACTION.md](COMPACTION.md) for detailed documentation, cost analysis, and automation examples.
**Automation:**
```bash
# Monthly cron
0 0 1 * * bd compact --all && git add .beads && git commit -m "Monthly compaction"
```
## Database Discovery
@@ -602,49 +485,15 @@ The `discovered-from` type is particularly useful for AI-supervised workflows, w
## AI Agent Integration
bd is designed to work seamlessly with AI coding agents:
```bash
# Agent discovers ready work
WORK=$(bd ready --limit 1 --json)
ISSUE_ID=$(echo $WORK | jq -r '.[0].id')
# Agent claims and starts work
bd update $ISSUE_ID --status in_progress --json
# Agent discovers new work while executing
bd create "Fix bug found in testing" -t bug -p 0 --json > new_issue.json
NEW_ID=$(cat new_issue.json | jq -r '.id')
bd dep add $NEW_ID $ISSUE_ID --type discovered-from
# Agent completes work
bd close $ISSUE_ID --reason "Implemented and tested" --json
```
The `--json` flag on every command makes bd perfect for programmatic workflows.
All commands support `--json` for programmatic use. Typical agent workflow: `bd ready --json` → `bd update --status in_progress` → `bd create` (discovered work) → `bd close`
## Ready Work Algorithm
An issue is "ready" if:
- Status is `open`
- It has NO open `blocks` dependencies
- All blockers are either closed or non-existent
Example:
```
bd-1 [open] ← blocks ← bd-2 [open] ← blocks ← bd-3 [open]
```
Ready work: `[bd-1]`
Blocked: `[bd-2, bd-3]`
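The rule reduces to a simple predicate. This sketch is illustrative only; the parameter names (`status`, count of open `blocks` dependencies) mirror the bullets above, not bd's internal schema:

```shell
# An issue is ready iff it is open and has zero open "blocks" dependencies.
is_ready() {
  local status=$1 open_blockers=$2
  if [ "$status" = "open" ] && [ "$open_blockers" -eq 0 ]; then
    echo "ready"
  else
    echo "not-ready"
  fi
}

is_ready open 0   # bd-1: no blockers     → ready
is_ready open 1   # bd-2: blocked by bd-1 → not-ready
```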
Issue is "ready" if status is `open` and it has no open `blocks` dependencies.
## Issue Lifecycle
```
open → in_progress → closed
blocked (manually set, or has open blockers)
```
`open → in_progress → closed` (or `blocked` if it has open blockers)
## Architecture
@@ -706,36 +555,11 @@ This pattern enables powerful integrations while keeping bd simple and focused.
## Why bd?
**bd is designed for AI coding agents, not humans.**
Traditional issue trackers (Jira, GitHub Issues, Linear) assume humans are the primary users. Humans click through web UIs, drag cards on boards, and manually update status.
bd assumes **AI agents are the primary users**, with humans supervising:
- **Agents discover work** - `bd ready --json` gives agents unblocked tasks to execute
- **Dependencies prevent wasted work** - Agents don't duplicate effort or work on blocked tasks
- **Discovery during execution** - Agents create issues for work they discover while executing, linked with `discovered-from`
- **Agents lose focus** - Long-running conversations can forget tasks; bd remembers everything
- **Humans supervise** - Check on progress with `bd list` and `bd dep tree`, but don't micromanage
In human-managed workflows, issues are planning artifacts. In agent-managed workflows, **issues are memory** - preventing agents from forgetting tasks during long coding sessions.
Traditional issue trackers were built for human project managers. bd is built for autonomous agents.
**bd is designed for AI agents**, not humans. Traditional trackers (Jira, GitHub) require web UIs. bd provides `--json` on all commands, explicit dependency types, and `bd ready` for unblocked-work detection. In agent workflows, issues are **memory** - preventing agents from forgetting tasks during long sessions.
## Architecture: JSONL + SQLite
bd uses a dual-storage approach:
- **JSONL files** (`.beads/issues.jsonl`) - Source of truth, committed to git
- **SQLite database** (`.beads/*.db`) - Ephemeral cache for fast queries, gitignored
This gives you:
- ✅ **Git-friendly storage** - Text diffs, AI-resolvable conflicts
- ✅ **Fast queries** - SQLite indexes for dependency graphs
- ✅ **Automatic sync** - Auto-export after CRUD ops, auto-import after pulls
- ✅ **No daemon required** - In-process SQLite, ~10-100ms per command
When you run `bd create`, it writes to SQLite. After 5 seconds of inactivity, changes automatically export to JSONL. After `git pull`, the next bd command automatically imports if JSONL is newer. No manual steps needed!
**JSONL** (`.beads/issues.jsonl`) is the source of truth, committed to git. **SQLite** (`.beads/*.db`) is an ephemeral cache for fast queries, gitignored. Auto-export after CRUD (5s debounce), auto-import after `git pull`. No manual sync needed.
## Export/Import (JSONL Format)