Database Compaction Guide
Overview
Beads compaction is agentic memory decay - your database naturally forgets fine-grained details of old work while preserving the essential context agents need. This keeps your database lightweight and fast, even after thousands of issues.
Key Concepts
- Semantic compression: Claude Haiku summarizes issues intelligently, preserving decisions and outcomes
- Two-tier system: Gradual decay from full detail → summary → ultra-brief
- Permanent decay: Original content is discarded to save space (not reversible)
- Safe by design: Dry-run preview, eligibility checks, git history preserves old versions
How It Works
Tier 1: Semantic Compression (30+ days)
Target: Closed issues 30+ days old with no open dependents
Process:
- Check eligibility (closed, 30+ days, no open dependents)
- Send to Claude Haiku for summarization
- Replace verbose fields with concise summary
- Store original size for statistics
Result: 70-80% space reduction
Example:
Before (856 bytes):
Title: Fix authentication race condition in login flow
Description: Users report intermittent 401 errors during concurrent
login attempts. The issue occurs when multiple requests hit the auth
middleware simultaneously...
Design: [15 lines of implementation details]
Acceptance Criteria: [8 test scenarios]
Notes: [debugging session notes]
After (171 bytes):
Title: Fix authentication race condition in login flow
Description: Fixed race condition in auth middleware causing 401s
during concurrent logins. Added mutex locks and updated tests.
Resolution: Deployed in v1.2.3.
Tier 2: Ultra Compression (90+ days)
Target: Tier 1 issues 90+ days old, rarely referenced
Process:
- Verify existing Tier 1 compaction
- Check reference frequency (git commits, other issues)
- Ultra-compress to single paragraph
- Optionally prune events (keep created/closed only)
Result: 90-95% space reduction
Example:
After Tier 2 (43 bytes):
Description: Auth race condition fixed, deployed v1.2.3.
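The reduction percentages quoted for each tier can be checked against the example sizes above. This is a minimal sketch; `reduction_pct` is a hypothetical helper for illustration, not a `bd` command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: percentage reduction from an original size to a
# compacted size, both in bytes.
reduction_pct() {
  local before=$1 after=$2
  awk -v b="$before" -v a="$after" 'BEGIN { printf "%.1f\n", (b - a) * 100 / b }'
}

reduction_pct 856 171   # Tier 1 example: prints 80.0
reduction_pct 856 43    # Tier 2 example: prints 95.0
```

Both examples land at the upper end of the quoted ranges (70-80% for Tier 1, 90-95% for Tier 2).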
CLI Reference
Preview Candidates
# See what would be compacted
bd compact --dry-run --all
# Check Tier 2 candidates
bd compact --dry-run --all --tier 2
# Preview specific issue
bd compact --dry-run --id bd-42
Compact Issues
# Compact all eligible issues (Tier 1)
bd compact --all
# Compact specific issue
bd compact --id bd-42
# Force compact (bypass checks - use with caution)
bd compact --id bd-42 --force
# Tier 2 ultra-compression
bd compact --all --tier 2
# Control parallelism
bd compact --all --workers 10 --batch-size 20
Statistics & Monitoring
# Show compaction stats
bd compact --stats
# Output:
# Total issues: 2,438
# Compacted: 847 (34.7%)
# Tier 1: 812 issues
# Tier 2: 35 issues
# Space saved: 1.2 MB (68% reduction)
# Estimated cost: $0.85
Eligibility Rules
Tier 1 Eligibility
- ✅ Status: closed
- ✅ Age: 30+ days since closed_at
- ✅ Dependents: No open issues depending on this one
- ✅ Not already compacted
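The four Tier 1 rules can be read as a single predicate. The sketch below is illustrative only: `is_tier1_eligible` and its argument order are assumptions, and it is not how `bd` implements the check internally.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 (true) when an issue meets the Tier 1 rules.
# Arguments: status, closed_at (epoch seconds), open dependent count,
#            already-compacted flag (0 or 1), now (epoch seconds).
is_tier1_eligible() {
  local status=$1 closed_at=$2 open_dependents=$3 compacted=$4 now=$5
  local min_age=$((30 * 24 * 60 * 60))   # 30 days in seconds

  [ "$status" = "closed" ] || return 1                 # must be closed
  [ $((now - closed_at)) -ge "$min_age" ] || return 1  # closed 30+ days ago
  [ "$open_dependents" -eq 0 ] || return 1             # no open dependents
  [ "$compacted" -eq 0 ] || return 1                   # not already compacted
  return 0
}

now=$(date +%s)
closed_45_days_ago=$((now - 45 * 86400))
if is_tier1_eligible closed "$closed_45_days_ago" 0 0 "$now"; then
  echo "eligible"
else
  echo "not eligible"
fi
```

In practice `bd compact --dry-run --id bd-42` remains the authoritative way to check a specific issue.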
Tier 2 Eligibility
- ✅ Already Tier 1 compacted
- ✅ Age: 90+ days since closed_at
- ✅ Low reference frequency:
  - Mentioned in <5 git commits in last 90 days, OR
  - Referenced by <3 issues created in last 90 days
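The Tier 2 rules combine two required conditions with an either/or reference check. A minimal sketch, assuming the counts have already been gathered; `is_tier2_eligible` is a hypothetical name, not part of `bd`.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the Tier 2 rules listed above.
# Arguments: Tier 1 done flag (0 or 1), days since closed_at, then commit
# mentions and issue references counted over the last 90 days.
is_tier2_eligible() {
  local tier1_done=$1 age_days=$2 commit_mentions=$3 issue_refs=$4

  [ "$tier1_done" -eq 1 ] || return 1
  [ "$age_days" -ge 90 ] || return 1
  # Low reference frequency: either condition is enough (the rules say OR).
  [ "$commit_mentions" -lt 5 ] || [ "$issue_refs" -lt 3 ]
}

# One way to count commit mentions by hand (run inside the repository):
#   git log --since="90 days ago" --grep="bd-42" --oneline | wc -l

is_tier2_eligible 1 120 2 7 && echo "bd-42: Tier 2 candidate"
```

Because the two reference checks are joined with OR, an issue mentioned in many commits still qualifies if few recent issues reference it.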
Configuration
API Key Setup
Option 1: Environment variable (recommended)
export ANTHROPIC_API_KEY="sk-ant-..."
Add to your shell profile (~/.zshrc, ~/.bashrc, etc.) for persistence.
Option 2: CI/CD environments
# GitHub Actions
env:
  ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
# GitLab CI
variables:
  ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
Parallel Processing
Control performance vs. API rate limits:
# Default: 5 workers, 10 issues per batch
bd compact --all
# High throughput (watch rate limits!)
bd compact --all --workers 20 --batch-size 50
# Conservative (avoid rate limits)
bd compact --all --workers 2 --batch-size 5
Cost Analysis
Pricing Basics
Compaction uses Claude Haiku (~$1 per 1M input tokens, ~$5 per 1M output tokens).
Typical issue:
- Input: ~500 tokens (issue content)
- Output: ~100 tokens (summary)
- Cost per issue: ~$0.001 (0.1¢)
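The per-issue figure follows directly from the token counts and prices above. As a rough sketch, `estimate_cost` (a hypothetical helper, not a `bd` command) scales that arithmetic to any number of issues, using the approximate prices quoted here as defaults.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimated cost in USD for compacting N issues.
# Arguments: issue count, input tokens per issue (default 500),
#            output tokens per issue (default 100).
estimate_cost() {
  local issues=$1 in_tokens=${2:-500} out_tokens=${3:-100}
  awk -v n="$issues" -v i="$in_tokens" -v o="$out_tokens" 'BEGIN {
    input_per_m = 1    # ~$1 per 1M input tokens (approximate)
    output_per_m = 5   # ~$5 per 1M output tokens (approximate)
    printf "%.2f\n", n * (i * input_per_m + o * output_per_m) / 1000000
  }'
}

estimate_cost 100     # prints 0.10
estimate_cost 1000    # prints 1.00
```

Issues with unusually long descriptions or notes will cost more; pass larger token counts as the second and third arguments to model that.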
Cost Examples
| Issues | Est. Cost | Time (5 workers) |
|---|---|---|
| 100 | $0.10 | ~2 minutes |
| 1,000 | $1.00 | ~20 minutes |
| 10,000 | $10.00 | ~3 hours |
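The time column is consistent with roughly 6 seconds per issue per worker (100 issues in about 2 minutes across 5 workers). That rate is inferred from the table rather than documented, and the sketch below also assumes throughput scales linearly with workers, which rate limits may prevent. `estimate_minutes` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rough wall-clock estimate in minutes.
# Assumption: ~6 seconds per issue per worker, inferred from the table above;
# real throughput depends on API latency and rate limits.
# Arguments: issue count, workers (default 5), seconds per issue (default 6).
estimate_minutes() {
  local issues=$1 workers=${2:-5} secs_per_issue=${3:-6}
  awk -v n="$issues" -v w="$workers" -v s="$secs_per_issue" \
    'BEGIN { printf "%.0f\n", n * s / w / 60 }'
}

estimate_minutes 100      # prints 2
estimate_minutes 1000     # prints 20
estimate_minutes 10000    # prints 200 (about 3.3 hours)
```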
Monthly Cost Estimate
If you close 50 issues/month and compact monthly:
- Monthly cost: $0.05
- Annual cost: $0.60
Even large teams (500 issues/month) pay ~$6/year.
Space Savings
| Database Size | Issues | After Tier 1 | After Tier 2 |
|---|---|---|---|
| 10 MB | 2,000 | 3 MB (-70%) | 1 MB (-90%) |
| 100 MB | 20,000 | 30 MB (-70%) | 10 MB (-90%) |
| 1 GB | 200,000 | 300 MB (-70%) | 100 MB (-90%) |
Automation
Monthly Cron Job
#!/bin/bash
# /etc/cron.monthly/bd-compact.sh
export ANTHROPIC_API_KEY="sk-ant-..."
cd /path/to/your/repo
# Compact Tier 1
bd compact --all 2>&1 | tee -a ~/.bd-compact.log
# Commit results
git add .beads/issues.jsonl issues.db
git commit -m "Monthly compaction: $(date +%Y-%m)"
git push
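Make executable (on Debian and Ubuntu, run-parts skips file names that contain a dot, so install the script without the .sh suffix on those systems):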
chmod +x /etc/cron.monthly/bd-compact.sh
Automated Workflow Script
#!/bin/bash
# examples/compaction/workflow.sh
# Exit on error
set -e
echo "=== BD Compaction Workflow ==="
echo "Date: $(date)"
echo
# Check API key
if [ -z "$ANTHROPIC_API_KEY" ]; then
  echo "Error: ANTHROPIC_API_KEY not set"
  exit 1
fi
# Preview candidates
echo "--- Preview Tier 1 Candidates ---"
bd compact --dry-run --all
read -p "Proceed with Tier 1 compaction? (y/N) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "--- Running Tier 1 Compaction ---"
  bd compact --all
fi
# Preview Tier 2
echo
echo "--- Preview Tier 2 Candidates ---"
bd compact --dry-run --all --tier 2
read -p "Proceed with Tier 2 compaction? (y/N) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "--- Running Tier 2 Compaction ---"
  bd compact --all --tier 2
fi
# Show stats
echo
echo "--- Final Statistics ---"
bd compact --stats
echo
echo "=== Compaction Complete ==="
Pre-commit Hook (Automatic)
#!/bin/bash
# .git/hooks/pre-commit
# Auto-compact before each commit (optional, experimental)
if command -v bd &> /dev/null && [ -n "$ANTHROPIC_API_KEY" ]; then
  # Count eligible issues; fall back to 0 if the dry run or jq fails
  ELIGIBLE=$(bd compact --dry-run --all --json 2>/dev/null | jq '. | length' 2>/dev/null)
  ELIGIBLE=${ELIGIBLE:-0}
  # Only compact if >10 eligible issues
  if [ "$ELIGIBLE" -gt 10 ]; then
    echo "Auto-compacting $ELIGIBLE eligible issues..."
    bd compact --all
  fi
fi
Safety & Recovery
Git History
Compaction is permanent - the original content is discarded to save space. However, you can recover old versions from git history:
# View issue before compaction
git log -p -- .beads/issues.jsonl | grep -A 50 "bd-42"
# Restore the whole file from an old commit (overwrites the current issues.jsonl)
git checkout <commit-hash> -- .beads/issues.jsonl
# Or use git show
git show <commit-hash>:.beads/issues.jsonl | grep -A 50 "bd-42"
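Since JSONL stores one record per line, a narrower filter than `grep -A 50` can pull out a single issue. The sketch below assumes each line is a JSON object with a top-level "id" key, which this guide does not confirm; `extract_issue` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical filter: print only the JSONL records whose "id" field equals $1.
# Assumes one JSON object per line with an "id" key. A nested "id" with the
# same value elsewhere on a line would also match.
extract_issue() {
  grep -E "\"id\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Usage (inside the repository, with a real commit hash):
#   git show <commit-hash>:.beads/issues.jsonl | extract_issue bd-42

# Demonstration on sample data:
printf '%s\n' \
  '{"id":"bd-41","description":"other issue"}' \
  '{"id":"bd-42","description":"original text"}' | extract_issue bd-42
```

The closing quote in the pattern keeps bd-42 from matching bd-420.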
Verification
After compaction, verify with:
# Check compaction stats
bd compact --stats
# Spot-check compacted issues
bd show bd-42
Troubleshooting
"ANTHROPIC_API_KEY not set"
export ANTHROPIC_API_KEY="sk-ant-..."
# Add to ~/.zshrc or ~/.bashrc for persistence
Rate Limit Errors
Reduce parallelism:
bd compact --all --workers 2 --batch-size 5
Adding delays between batches is not supported yet; it is a planned future enhancement.
Issue Not Eligible
Check eligibility:
bd compact --dry-run --id bd-42
Force compact (if you know what you're doing):
bd compact --id bd-42 --force
FAQ
When should I compact?
- Small projects (<500 issues): Rarely needed, maybe annually
- Medium projects (500-5000 issues): Every 3-6 months
- Large projects (5000+ issues): Monthly or quarterly
- High-velocity teams: Set up automated monthly compaction
Can I recover compacted issues?
Compaction is permanent, but you can recover from git history:
git log -p -- .beads/issues.jsonl | grep -A 50 "bd-42"
What happens to dependencies?
Dependencies are preserved. Compaction only affects the issue's text fields (description, design, notes, acceptance criteria).
Does compaction affect git history?
No. Old versions of issues remain in git history. Compaction only affects the current state in .beads/issues.jsonl and issues.db.
Should I commit compacted issues?
Yes. Compaction modifies both the database and JSONL. Commit and push:
git add .beads/issues.jsonl issues.db
git commit -m "Compact old closed issues"
git push
What if my team disagrees on compaction frequency?
Use bd compact --dry-run to preview. Discuss the candidates before running. Since compaction is permanent, get team consensus first.
Can I compact open issues?
No. Compaction only works on closed issues to ensure active work retains full detail.
How does Tier 2 decide "rarely referenced"?
It checks:
- Git commits mentioning the issue ID in last 90 days
- Other issues referencing it in descriptions/notes
If references are low (< 5 commits or < 3 issues), it's eligible for Tier 2.
Does compaction slow down queries?
No. Compaction reduces database size, making queries faster. Agents benefit from smaller context when reading issues.
Can I customize the summarization prompt?
Not yet, but it's planned (bd-264). The current prompt is optimized for preserving key decisions and outcomes.
Best Practices
- Start with dry-run: Always preview before compacting
- Compact regularly: Monthly or quarterly depending on project size
- Monitor costs: Use bd compact --stats to track savings
- Automate it: Set up cron jobs for hands-off maintenance
- Commit results: Always commit and push after compaction
- Team communication: Let team know before large compaction runs (it's permanent!)
Examples
See examples/compaction/ for:
- workflow.sh - Interactive compaction workflow
- cron-compact.sh - Automated monthly compaction
- auto-compact.sh - Smart auto-compaction with thresholds
Related Documentation
- README.md - Quick start and overview
- EXTENDING.md - Database schema and extensions
- GIT_WORKFLOW.md - Multi-machine collaboration
Contributing
Found a bug or have ideas for improving compaction? Open an issue or PR!
Common enhancement requests:
- Custom summarization prompts (bd-264)
- Alternative LLM backends (local models)
- Configurable eligibility rules
- Batch restore operations
- Compaction analytics dashboard