# Database Compaction Guide

## Overview

Beads compaction is **agentic memory decay** - your database naturally forgets fine-grained details of old work while preserving the essential context agents need. This keeps your database lightweight and fast, even after thousands of issues.

### Key Concepts

- **Semantic compression**: Claude Haiku summarizes issues intelligently, preserving decisions and outcomes
- **Two-tier system**: Gradual decay from full detail → summary → ultra-brief
- **Permanent decay**: Original content is discarded to save space (not reversible)
- **Safe by design**: Dry-run preview, eligibility checks, git history preserves old versions

## How It Works

### Tier 1: Semantic Compression (30+ days)

**Target**: Closed issues 30+ days old with no open dependents

**Process**:
1. Check eligibility (closed, 30+ days since closing, no open dependents)
2. Send to Claude Haiku for summarization
3. Replace verbose fields with concise summary
4. Store original size for statistics

**Result**: 70-80% space reduction

**Example**:

*Before (856 bytes):*
```
Title: Fix authentication race condition in login flow
Description: Users report intermittent 401 errors during concurrent
login attempts. The issue occurs when multiple requests hit the auth
middleware simultaneously...

Design: [15 lines of implementation details]
Acceptance Criteria: [8 test scenarios]
Notes: [debugging session notes]
```

*After (171 bytes):*
```
Title: Fix authentication race condition in login flow
Description: Fixed race condition in auth middleware causing 401s
during concurrent logins. Added mutex locks and updated tests.
Resolution: Deployed in v1.2.3.
```

### Tier 2: Ultra Compression (90+ days)

**Target**: Tier 1 issues 90+ days old, rarely referenced

**Process**:
1. Verify existing Tier 1 compaction
2. Check reference frequency (git commits, other issues)
3. Ultra-compress to single paragraph
4. Optionally prune events (keep created/closed only)

**Result**: 90-95% space reduction

**Example**:

*After Tier 2 (43 bytes):*
```
Description: Auth race condition fixed, deployed v1.2.3.
```

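
The reduction percentages can be checked against the byte counts quoted in the two examples above. The following is a minimal sketch; `reduction_pct` is a hypothetical helper written for this guide, not a `bd` command.

```bash
# reduction_pct BEFORE_BYTES AFTER_BYTES
# Prints the share of bytes saved, as a percentage with one decimal place.
reduction_pct() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

reduction_pct 856 171   # Tier 1 example: prints 80.0
reduction_pct 856 43    # Tier 2 example: prints 95.0
```

Both examples land at the upper end of the ranges stated for their tier (70-80% and 90-95%).
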

## CLI Reference

### Preview Candidates

```bash
# See what would be compacted
bd compact --dry-run --all

# Check Tier 2 candidates
bd compact --dry-run --all --tier 2

# Preview specific issue
bd compact --dry-run --id bd-42
```

### Compact Issues

```bash
# Compact all eligible issues (Tier 1)
bd compact --all

# Compact specific issue
bd compact --id bd-42

# Force compact (bypass checks - use with caution)
bd compact --id bd-42 --force

# Tier 2 ultra-compression
bd compact --all --tier 2

# Control parallelism
bd compact --all --workers 10 --batch-size 20
```

### Statistics & Monitoring

```bash
# Show compaction stats
bd compact --stats

# Output:
# Total issues: 2,438
# Compacted: 847 (34.7%)
# Tier 1: 812 issues
# Tier 2: 35 issues
# Space saved: 1.2 MB (68% reduction)
# Estimated cost: $0.85
```

## Eligibility Rules

### Tier 1 Eligibility

- ✅ Status: `closed`
- ✅ Age: 30+ days since `closed_at`
- ✅ Dependents: No open issues depending on this one
- ✅ Not already compacted

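
The four rules above can be read as a single predicate. The sketch below is an illustration only, not `bd`'s actual implementation: the function name, its argument order, the use of epoch seconds for `closed_at`, and the numeric tier value (0 meaning "not yet compacted") are all assumptions made for the example.

```bash
# is_tier1_eligible STATUS CLOSED_AT_EPOCH NOW_EPOCH OPEN_DEPENDENTS CURRENT_TIER
# Returns 0 (success) only when every Tier 1 rule holds; returns 1 otherwise.
is_tier1_eligible() {
  local status=$1 closed_at=$2 now=$3 open_dependents=$4 tier=$5
  local min_age=$((30 * 24 * 60 * 60))          # 30 days, in seconds

  [ "$status" = "closed" ] || return 1           # must be closed
  [ $((now - closed_at)) -ge "$min_age" ] || return 1   # closed 30+ days ago
  [ "$open_dependents" -eq 0 ] || return 1       # no open issues depend on it
  [ "$tier" -eq 0 ] || return 1                  # hypothetical: 0 = not compacted
  return 0
}
```

In practice `bd compact --dry-run --id bd-42` remains the authoritative check; the sketch only shows how the rules combine (all must hold).
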

### Tier 2 Eligibility

- ✅ Already Tier 1 compacted
- ✅ Age: 90+ days since `closed_at`
- ✅ Low reference frequency:
  - Mentioned in <5 git commits in last 90 days, OR
  - Referenced by <3 issues created in last 90 days

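
As with Tier 1, the Tier 2 rules can be sketched as a predicate. This is a hypothetical illustration under stated assumptions (function name, argument order, epoch-second timestamps, and a numeric tier where 1 means "Tier 1 compacted"); it is not taken from the `bd` source. Note that the two reference-frequency conditions are joined by OR, so satisfying either one is enough.

```bash
# is_tier2_eligible CURRENT_TIER CLOSED_AT_EPOCH NOW_EPOCH COMMIT_MENTIONS ISSUE_REFS
# COMMIT_MENTIONS: git commits mentioning the issue in the last 90 days
# ISSUE_REFS:      issues created in the last 90 days that reference it
is_tier2_eligible() {
  local tier=$1 closed_at=$2 now=$3 commit_mentions=$4 issue_refs=$5
  local min_age=$((90 * 24 * 60 * 60))          # 90 days, in seconds

  [ "$tier" -eq 1 ] || return 1                  # hypothetical: 1 = Tier 1 done
  [ $((now - closed_at)) -ge "$min_age" ] || return 1   # closed 90+ days ago
  # Low reference frequency: either condition is sufficient (OR)
  [ "$commit_mentions" -lt 5 ] || [ "$issue_refs" -lt 3 ] || return 1
  return 0
}
```

One way to obtain the commit count, assuming issue IDs appear in commit messages, is `git log --since="90 days ago" --grep="bd-42" --oneline | wc -l`.
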

## Configuration

### API Key Setup

**Option 1: Environment variable (recommended)**

```bash
export ANTHROPIC_API_KEY="sk-ant-..."
```

Add to your shell profile (`~/.zshrc`, `~/.bashrc`, etc.) for persistence.

**Option 2: CI/CD environments**

```yaml
# GitHub Actions
env:
  ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

# GitLab CI
variables:
  ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
```

### Parallel Processing

Control performance vs. API rate limits:

```bash
# Default: 5 workers, 10 issues per batch
bd compact --all

# High throughput (watch rate limits!)
bd compact --all --workers 20 --batch-size 50

# Conservative (avoid rate limits)
bd compact --all --workers 2 --batch-size 5
```

## Cost Analysis

### Pricing Basics

Compaction uses Claude Haiku (~$1 per 1M input tokens, ~$5 per 1M output tokens).

Typical issue:
- Input: ~500 tokens (issue content)
- Output: ~100 tokens (summary)
- Cost per issue: ~$0.001 (0.1¢)

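
The per-issue figure follows directly from the approximate rates and token counts above. The sketch below works through the arithmetic; `issue_cost` and `batch_cost` are hypothetical helper names, and the rates are the rounded values quoted in this section rather than live pricing.

```bash
# issue_cost INPUT_TOKENS OUTPUT_TOKENS
# Prints the estimated cost in dollars for summarizing one issue.
issue_cost() {
  awk -v in_tok="$1" -v out_tok="$2" 'BEGIN {
    in_rate  = 1 / 1000000    # ~$1 per 1M input tokens
    out_rate = 5 / 1000000    # ~$5 per 1M output tokens
    printf "%.4f\n", in_tok * in_rate + out_tok * out_rate
  }'
}

# batch_cost ISSUES
# Prints the estimated cost for ISSUES typical issues (500 in, 100 out).
batch_cost() {
  awk -v n="$1" -v per_issue="$(issue_cost 500 100)" \
    'BEGIN { printf "%.2f\n", n * per_issue }'
}

issue_cost 500 100   # prints 0.0010
batch_cost 1000      # prints 1.00
```

The same helper covers yearly estimates: 50 issues/month is 600 issues/year, so `batch_cost 600` prints `0.60`.
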

### Cost Examples

| Issues | Est. Cost | Time (5 workers) |
|--------|-----------|------------------|
| 100    | $0.10     | ~2 minutes       |
| 1,000  | $1.00     | ~20 minutes      |
| 10,000 | $10.00    | ~3 hours         |

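
The time column implies roughly 6 worker-seconds per issue (100 issues in about 2 minutes across 5 workers). Assuming throughput scales linearly with the worker count, which is an assumption and will not hold once API rate limits are hit, a rough wall-clock estimate can be sketched as follows. `estimate_minutes` is a hypothetical helper, not a `bd` feature.

```bash
# estimate_minutes ISSUES WORKERS
# Prints an estimated run time in whole minutes, rounded up.
estimate_minutes() {
  local issues=$1 workers=$2
  local secs_per_issue=6   # assumption derived from the table above
  local total=$((issues * secs_per_issue))
  local per_minute=$((workers * 60))   # worker-seconds available per minute
  echo $(( (total + per_minute - 1) / per_minute ))
}

estimate_minutes 100 5      # prints 2
estimate_minutes 10000 5    # prints 200
```

The 10,000-issue estimate of 200 minutes is in line with the table's "~3 hours".
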

### Monthly Cost Estimate

If you close 50 issues/month and compact monthly:
- **Monthly cost**: $0.05
- **Annual cost**: $0.60

Even large teams (500 issues/month) pay ~$6/year.

### Space Savings

| Database Size | Issues  | After Tier 1  | After Tier 2  |
|---------------|---------|---------------|---------------|
| 10 MB         | 2,000   | 3 MB (-70%)   | 1 MB (-90%)   |
| 100 MB        | 20,000  | 30 MB (-70%)  | 10 MB (-90%)  |
| 1 GB          | 200,000 | 300 MB (-70%) | 100 MB (-90%) |

## Automation

### Monthly Cron Job

```bash
#!/bin/bash
# /etc/cron.monthly/bd-compact.sh

export ANTHROPIC_API_KEY="sk-ant-..."
# Stop if the repository directory is missing
cd /path/to/your/repo || exit 1

# Compact Tier 1
bd compact --all 2>&1 | tee -a ~/.bd-compact.log

# Commit results (skip the commit when compaction changed nothing)
git add .beads/issues.jsonl issues.db
if ! git diff --cached --quiet; then
  git commit -m "Monthly compaction: $(date +%Y-%m)"
  git push
fi
```

Make executable:
```bash
chmod +x /etc/cron.monthly/bd-compact.sh
```

### Automated Workflow Script

```bash
#!/bin/bash
# examples/compaction/workflow.sh

# Exit on error
set -e

echo "=== BD Compaction Workflow ==="
echo "Date: $(date)"
echo

# Check API key
if [ -z "$ANTHROPIC_API_KEY" ]; then
  echo "Error: ANTHROPIC_API_KEY not set"
  exit 1
fi

# Preview candidates
echo "--- Preview Tier 1 Candidates ---"
bd compact --dry-run --all

read -p "Proceed with Tier 1 compaction? (y/N) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "--- Running Tier 1 Compaction ---"
  bd compact --all
fi

# Preview Tier 2
echo
echo "--- Preview Tier 2 Candidates ---"
bd compact --dry-run --all --tier 2

read -p "Proceed with Tier 2 compaction? (y/N) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "--- Running Tier 2 Compaction ---"
  bd compact --all --tier 2
fi

# Show stats
echo
echo "--- Final Statistics ---"
bd compact --stats

echo
echo "=== Compaction Complete ==="
```

### Pre-commit Hook (Automatic)

```bash
#!/bin/bash
# .git/hooks/pre-commit

# Auto-compact before each commit (optional, experimental)
if command -v bd &> /dev/null && [ -n "$ANTHROPIC_API_KEY" ]; then
  # Count eligible issues from the dry-run JSON output
  ELIGIBLE=$(bd compact --dry-run --all --json 2>/dev/null | jq 'length')
  # Only compact if >10 eligible issues (treat an empty count as 0)
  if [ "${ELIGIBLE:-0}" -gt 10 ]; then
    echo "Auto-compacting $ELIGIBLE eligible issues..."
    bd compact --all
  fi
fi
```

## Safety & Recovery

### Git History

Compaction is permanent - the original content is discarded to save space. However, you can recover old versions from git history:

```bash
# View issue before compaction
git log -p -- .beads/issues.jsonl | grep -A 50 "bd-42"

# Checkout old version (replaces the whole file in your working tree)
git checkout <commit-hash> -- .beads/issues.jsonl

# Or use git show
git show <commit-hash>:.beads/issues.jsonl | grep -A 50 "bd-42"
```

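
When only one issue is needed, filtering the old file is less disruptive than checking out the whole thing. The sketch below assumes each line of `.beads/issues.jsonl` is one JSON object carrying a top-level `"id"` field; that layout is an assumption here, and `extract_issue` is a hypothetical helper. It is a plain-text match, so a nested `"id"` key with the same value would also match.

```bash
# extract_issue ISSUE_ID
# Reads JSONL on stdin and prints only the lines whose "id" field is ISSUE_ID.
# The closing quote in the pattern keeps bd-42 from matching bd-420.
extract_issue() {
  grep -E "\"id\"[[:space:]]*:[[:space:]]*\"$1\""
}
```

Usage: `git show <commit-hash>:.beads/issues.jsonl | extract_issue bd-42` prints the pre-compaction record for `bd-42` without touching the working tree.
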

### Verification

After compaction, verify with:

```bash
# Check compaction stats
bd compact --stats

# Spot-check compacted issues
bd show bd-42
```

## Troubleshooting

### "ANTHROPIC_API_KEY not set"

```bash
export ANTHROPIC_API_KEY="sk-ant-..."
# Add to ~/.zshrc or ~/.bashrc for persistence
```

### Rate Limit Errors

Reduce parallelism:
```bash
bd compact --all --workers 2 --batch-size 5
```

Built-in delays between batches are a possible future enhancement.

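
Until delays are built in, a wrapper script can retry a failed run with an increasing pause. The following is a minimal sketch of exponential backoff; `retry_with_backoff` is a hypothetical helper, and it assumes `bd compact` exits with a non-zero status when it hits a rate limit, which this guide does not confirm.

```bash
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the delay after every failed attempt.
retry_with_backoff() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

Usage: `retry_with_backoff 4 30 bd compact --all --workers 2 --batch-size 5` waits 30, 60, then 120 seconds between attempts. Re-running should be safe because already-compacted issues are no longer eligible.
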

### Issue Not Eligible

Check eligibility:
```bash
bd compact --dry-run --id bd-42
```

Force compact (if you know what you're doing):
```bash
bd compact --id bd-42 --force
```

## FAQ

### When should I compact?

- **Small projects (<500 issues)**: Rarely needed, maybe annually
- **Medium projects (500-5000 issues)**: Every 3-6 months
- **Large projects (5000+ issues)**: Monthly or quarterly
- **High-velocity teams**: Set up automated monthly compaction

### Can I recover compacted issues?

Compaction is permanent, but you can recover from git history:
```bash
git log -p -- .beads/issues.jsonl | grep -A 50 "bd-42"
```

### What happens to dependencies?

Dependencies are preserved. Compaction only affects the issue's text fields (description, design, notes, acceptance criteria).

### Does compaction affect git history?

No. Old versions of issues remain in git history. Compaction only affects the current state in `.beads/issues.jsonl` and `issues.db`.

### Should I commit compacted issues?

**Yes.** Compaction modifies both the database and JSONL. Commit and push:

```bash
git add .beads/issues.jsonl issues.db
git commit -m "Compact old closed issues"
git push
```

### What if my team disagrees on compaction frequency?

Use `bd compact --dry-run` to preview. Discuss the candidates before running. Since compaction is permanent, get team consensus first.

### Can I compact open issues?

No. Compaction only works on closed issues to ensure active work retains full detail.

### How does Tier 2 decide "rarely referenced"?

It checks:
1. Git commits mentioning the issue ID in last 90 days
2. Other issues referencing it in descriptions/notes

If references are low (< 5 commits or < 3 issues), it's eligible for Tier 2.

### Does compaction slow down queries?

No. Compaction reduces database size, making queries faster. Agents benefit from smaller context when reading issues.

### Can I customize the summarization prompt?

Not yet, but it's planned (bd-264). The current prompt is optimized for preserving key decisions and outcomes.

## Best Practices

1. **Start with dry-run**: Always preview before compacting
2. **Compact regularly**: Monthly or quarterly depending on project size
3. **Monitor costs**: Use `bd compact --stats` to track savings
4. **Automate it**: Set up cron jobs for hands-off maintenance
5. **Commit results**: Always commit and push after compaction
6. **Team communication**: Let team know before large compaction runs (it's permanent!)

## Examples

See [examples/compaction/](examples/compaction/) for:
- `workflow.sh` - Interactive compaction workflow
- `cron-compact.sh` - Automated monthly compaction
- `auto-compact.sh` - Smart auto-compaction with thresholds

## Related Documentation

- [README.md](README.md) - Quick start and overview
- [EXTENDING.md](EXTENDING.md) - Database schema and extensions
- [GIT_WORKFLOW.md](GIT_WORKFLOW.md) - Multi-machine collaboration

## Contributing

Found a bug or have ideas for improving compaction? Open an issue or PR!

Common enhancement requests:
- Custom summarization prompts (bd-264)
- Alternative LLM backends (local models)
- Configurable eligibility rules
- Compaction analytics dashboard
- Optional snapshot retention for restore (if requested)