# Context Engineering for beads-mcp

## Overview
This document describes the context engineering optimizations added to beads-mcp to reduce context window usage by ~80-90% while maintaining full functionality.
## The Problem
MCP servers load all tool schemas at startup, consuming significant context:
- Before: ~10-50k tokens for full beads tool schemas
- After: ~2-5k tokens with lazy loading and compaction
For coding agents operating in limited context windows (100k-200k tokens), this overhead leaves less room for:
- Code files and diffs
- Conversation history
- Task planning and reasoning
## Solutions Implemented

### 1. Lazy Tool Schema Loading
Instead of loading all tool schemas upfront, agents can discover tools on-demand:
```python
# Step 1: Discover available tools (lightweight - ~500 bytes)
discover_tools()
# Returns: { "tools": { "ready": "Find ready tasks", ... }, "count": 15 }

# Step 2: Get details for a specific tool (~300 bytes each)
get_tool_info("ready")
# Returns: { "name": "ready", "parameters": {...}, "example": "..." }
```

**Savings:** ~95% reduction in initial schema overhead
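The two-step flow above can be sketched as a simple server-side registry. This is an illustrative sketch, not the actual beads-mcp implementation: the `TOOL_REGISTRY` structure and its entries are hypothetical, and only two tools are shown.

```python
# Hypothetical registry mapping tool names to their full metadata.
# Only the lightweight pieces are sent until a tool is actually requested.
TOOL_REGISTRY = {
    "ready": {
        "description": "Find ready tasks",
        "parameters": {"limit": "int", "priority": "int"},
        "example": "ready(limit=10, priority=1)",
    },
    "show": {
        "description": "Get full details for one issue",
        "parameters": {"issue_id": "str"},
        "example": "show(issue_id='bd-a1b2')",
    },
}

def discover_tools() -> dict:
    """Return only names and one-line descriptions (tens of bytes per tool)."""
    return {
        "tools": {name: meta["description"] for name, meta in TOOL_REGISTRY.items()},
        "count": len(TOOL_REGISTRY),
    }

def get_tool_info(name: str) -> dict:
    """Return the full schema for a single tool, on demand."""
    meta = TOOL_REGISTRY[name]
    return {"name": name, **meta}
```

The savings come from the asymmetry: `discover_tools()` sends one short string per tool, while the full parameter schemas stay server-side until a single `get_tool_info()` call asks for them.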
### 2. Minimal Issue Models

List operations now return `IssueMinimal` instead of the full `Issue`:
```python
# IssueMinimal (~80 bytes per issue)
{
    "id": "bd-a1b2",
    "title": "Fix auth bug",
    "status": "open",
    "priority": 1,
    "issue_type": "bug",
    "assignee": "alice",
    "labels": ["backend"],
    "dependency_count": 2,
    "dependent_count": 0
}

# vs Full Issue (~400 bytes per issue)
{
    "id": "bd-a1b2",
    "title": "Fix auth bug",
    "description": "Long description...",
    "design": "Design notes...",
    "acceptance_criteria": "...",
    "notes": "...",
    "status": "open",
    "priority": 1,
    "issue_type": "bug",
    "created_at": "2024-01-01T...",
    "updated_at": "2024-01-02T...",
    "closed_at": null,
    "assignee": "alice",
    "labels": ["backend"],
    "dependencies": [...],
    "dependents": [...],
    ...
}
```

**Savings:** ~80% reduction per issue in list views
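The relationship between the two shapes can be sketched with plain dataclasses. This is an assumption-laden sketch: field names follow the JSON above, but the real server's models (and whether they use dataclasses, Pydantic, or something else) may differ, and the `to_minimal` helper is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Issue:
    """Full model: long text fields plus dependency lists (illustrative)."""
    id: str
    title: str
    status: str
    priority: int
    issue_type: str
    assignee: str = ""
    labels: list = field(default_factory=list)
    description: str = ""
    dependencies: list = field(default_factory=list)
    dependents: list = field(default_factory=list)

@dataclass
class IssueMinimal:
    """List-view model: no long text, dependency lists replaced by counts."""
    id: str
    title: str
    status: str
    priority: int
    issue_type: str
    assignee: str
    labels: list
    dependency_count: int
    dependent_count: int

def to_minimal(issue: Issue) -> IssueMinimal:
    """Drop long text fields; summarize dependency lists as counts."""
    return IssueMinimal(
        id=issue.id, title=issue.title, status=issue.status,
        priority=issue.priority, issue_type=issue.issue_type,
        assignee=issue.assignee, labels=issue.labels,
        dependency_count=len(issue.dependencies),
        dependent_count=len(issue.dependents),
    )
```

The counts preserve the one signal a triage pass actually needs (is this issue blocked or blocking?) without shipping the full dependency objects.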
### 3. Result Compaction

When results exceed the threshold (default: 20 issues), the server returns a preview plus metadata:
```python
# Request: list(status="open")
# Response when >20 results:
{
    "compacted": true,
    "total_count": 47,
    "preview": [/* first 5 issues */],
    "preview_count": 5,
    "hint": "Use show(issue_id) for full details or add filters"
}
```

**Savings:** Prevents unbounded context growth from large queries
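The compaction rule itself is small. Here is a minimal sketch, assuming the default constants; the function name `compact_result` is illustrative, not the server's actual API.

```python
COMPACTION_THRESHOLD = 20  # compact when more than this many results
PREVIEW_COUNT = 5          # how many results to keep in the preview

def compact_result(issues: list) -> object:
    """Return the list unchanged if small; otherwise a preview + metadata."""
    if len(issues) <= COMPACTION_THRESHOLD:
        return issues
    return {
        "compacted": True,
        "total_count": len(issues),
        "preview": issues[:PREVIEW_COUNT],
        "preview_count": PREVIEW_COUNT,
        "hint": "Use show(issue_id) for full details or add filters",
    }
```

Note the consequence for clients: the return type changes shape at the threshold (list vs dict), which is why the handling patterns later in this document always check the `compacted` field first.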
## Usage Patterns

### Efficient Workflow (Recommended)
```python
# 1. Set context once
set_context(workspace_root="/path/to/project")

# 2. Get ready work (minimal format)
issues = ready(limit=10, priority=1)

# 3. Pick an issue and get full details only when needed
full_issue = show(issue_id="bd-a1b2")

# 4. Do work...

# 5. Close when done
close(issue_id="bd-a1b2", reason="Fixed in PR #123")
```
### Tool Discovery Workflow

```python
# First time using beads? Discover tools efficiently:
tools = discover_tools()
# → {"tools": {"ready": "...", "list": "...", ...}, "count": 15}

# Need to know how to use a specific tool?
info = get_tool_info("create")
# → {"parameters": {...}, "example": "create(title='...', ...)"}
```
### Handling Large Result Sets

When a query returns more than 20 results, the response switches to the `CompactedResult` format. This section explains how to detect and handle compacted responses.
#### CompactedResult Schema

```python
# Response when >20 results
{
    "compacted": True,
    "total_count": 47,  # Total matching issues (not all shown)
    "preview": [
        {
            "id": "bd-a1b2",
            "title": "Fix auth bug",
            "status": "open",
            "priority": 1,
            "issue_type": "bug",
            "assignee": "alice",
            "labels": ["backend"],
            "dependency_count": 2,
            "dependent_count": 0
        },
        # ... 4 more issues (PREVIEW_COUNT=5)
    ],
    "preview_count": 5,
    "hint": "Use show(issue_id) for full details or add filters"
}
```
#### Detecting Compacted Results

Check the `compacted` field in the response:
```python
def handle_issue_list(response):
    """Handle both list and compacted responses."""
    if isinstance(response, dict) and response.get("compacted"):
        # Compacted response
        total = response["total_count"]
        shown = response["preview_count"]
        issues = response["preview"]
        print(f"Showing {shown} of {total} issues")
        return issues
    else:
        # Regular list response (all results included)
        return response
```
#### Getting Full Results When Compacted

When you get a compacted response, you have several options:

**Option 1: Use `show()` for Specific Issues**

```python
# Get full details for a specific issue
full_issue = show(issue_id="bd-a1b2")
# Returns the complete Issue model with dependencies, description, etc.
```
**Option 2: Narrow Your Query**

Add filters to reduce the result set:

```python
# Instead of: list(status="open")  # Returns 47+ results
# Try:
issues = list(status="open", priority=0)  # Returns 8 results
```
**Option 3: Check Type Hints**

The response type tells you what to expect:

```python
from typing import Union

def handle_response(response: Union[list, dict]):
    """Properly typed response handling."""
    if isinstance(response, dict) and response.get("compacted"):
        # Handle CompactedResult
        for issue in response["preview"]:
            process_minimal_issue(issue)
        print(f"Note: {response['total_count']} total issues exist")
    else:
        # Handle list[IssueMinimal]
        for issue in response:
            process_minimal_issue(issue)
```
### Python Client Example

Here's a complete example handling both response types:

```python
class BeadsClient:
    """Example client with proper compaction handling."""

    def get_all_ready_work(self):
        """Safely get ready work, handling compaction."""
        response = self.ready(limit=10, priority=1)
        # Check if compacted
        if isinstance(response, dict) and response.get("compacted"):
            print(f"Warning: Showing {response['preview_count']} "
                  f"of {response['total_count']} ready items")
            print(f"Hint: {response['hint']}")
            return response["preview"]
        # Full list returned
        return response

    def list_with_fallback(self, **filters):
        """List issues, with automatic filter refinement on compaction."""
        response = self.list(**filters)
        if isinstance(response, dict) and response.get("compacted"):
            # Too many results - add priority filter to narrow down
            if "priority" not in filters:
                print(f"Too many results ({response['total_count']}). "
                      "Filtering by priority=1...")
                filters["priority"] = 1
                return self.list_with_fallback(**filters)
            else:
                # Can't narrow further, return preview
                return response["preview"]
        return response

    def show_full_issue(self, issue_id: str):
        """Always get full issue details (never compacted)."""
        return self.show(issue_id=issue_id)
```
## Migration Guide for Clients

If your client was written expecting `list()` and `ready()` to always return `list[Issue]`, follow these steps:

### Step 1: Update Type Hints
```python
# OLD (incorrect with new server)
def process_issues(issues: list[Issue]) -> None:
    for issue in issues:
        print(issue.description)

# NEW (handles both cases)
from typing import Union

IssueListOrCompacted = Union[list[IssueMinimal], CompactedResult]

def process_issues(response: IssueListOrCompacted) -> None:
    if isinstance(response, dict) and response.get("compacted"):
        issues = response["preview"]
        print(f"Note: Only showing preview of {response['total_count']} total")
    else:
        issues = response
    for issue in issues:
        print(f"{issue.id}: {issue.title}")  # Works with IssueMinimal
```
### Step 2: Handle Missing Fields

`IssueMinimal` doesn't include `description`, `design`, or `dependencies`. Adjust your code:
```python
# OLD (would fail if using IssueMinimal)
for issue in issues:
    print(f"{issue.title}\n{issue.description}")

# NEW (only use available fields)
for issue in issues:
    print(f"{issue.title}")
    if hasattr(issue, 'description'):
        print(issue.description)
    elif need_description:
        full = show(issue_id=issue.id)
        print(full.description)
```
### Step 3: Use show() for Full Details

When you need dependencies or detailed information:

```python
# Get minimal info for listing
ready_issues = ready(limit=20)

# Unwrap a compacted response before iterating
if isinstance(ready_issues, dict):
    ready_issues = ready_issues.get("preview", [])

# For detailed work, fetch the full issue
for minimal_issue in ready_issues:
    full_issue = show(issue_id=minimal_issue.id)
    print(f"Dependencies: {full_issue.dependencies}")
```
## Configuration

### Environment Variables (v0.29.0+)

Compaction behavior can be tuned via environment variables:
```shell
# Set custom compaction threshold
export BEADS_MCP_COMPACTION_THRESHOLD=50

# Set custom preview size
export BEADS_MCP_PREVIEW_COUNT=10
```
**Environment Variables:**

| Variable | Default | Purpose | Constraints |
|---|---|---|---|
| `BEADS_MCP_COMPACTION_THRESHOLD` | 20 | Compact results with more than N issues | Must be ≥ 1 |
| `BEADS_MCP_PREVIEW_COUNT` | 5 | Show first N issues in preview | Must be ≥ 1 and ≤ threshold |
**Examples:**

```shell
# Disable compaction by setting a high threshold
BEADS_MCP_COMPACTION_THRESHOLD=10000 beads-mcp

# Show more preview items in compacted results
BEADS_MCP_PREVIEW_COUNT=10 beads-mcp

# Show fewer items for limited context windows
BEADS_MCP_COMPACTION_THRESHOLD=10 BEADS_MCP_PREVIEW_COUNT=3 beads-mcp
```
**Default Values:**

If not set, the server uses:

```python
COMPACTION_THRESHOLD = 20  # Compact results with more than 20 issues
PREVIEW_COUNT = 5          # Show first 5 issues in preview
```
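Reading the two variables and enforcing the constraints from the table above might look like the following sketch. The environment variable names match the documented ones, but the `load_compaction_config` function itself is illustrative, not the server's actual code.

```python
import os

def load_compaction_config() -> tuple:
    """Read compaction settings from the environment, applying defaults
    and validating the documented constraints."""
    threshold = int(os.environ.get("BEADS_MCP_COMPACTION_THRESHOLD", "20"))
    preview = int(os.environ.get("BEADS_MCP_PREVIEW_COUNT", "5"))
    if threshold < 1:
        raise ValueError("BEADS_MCP_COMPACTION_THRESHOLD must be >= 1")
    if preview < 1 or preview > threshold:
        raise ValueError("BEADS_MCP_PREVIEW_COUNT must be >= 1 and <= threshold")
    return threshold, preview
```

Validating `preview <= threshold` at startup matters: a preview larger than the threshold can never fire, so failing fast surfaces the misconfiguration instead of silently ignoring it.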
**Use Cases:**

- **Tight context windows (100k tokens):** Reduce threshold and preview count
  `BEADS_MCP_COMPACTION_THRESHOLD=10 BEADS_MCP_PREVIEW_COUNT=3`
- **Plenty of context (200k+ tokens):** Increase both settings or disable compaction
  `BEADS_MCP_COMPACTION_THRESHOLD=1000 BEADS_MCP_PREVIEW_COUNT=20`
- **Debugging/Testing:** Disable compaction entirely
  `BEADS_MCP_COMPACTION_THRESHOLD=999999 beads-mcp`
## Comparison
| Scenario | Before | After | Savings |
|---|---|---|---|
| Tool schemas (all) | ~15,000 bytes | ~500 bytes | 97% |
| List 50 issues | ~20,000 bytes | ~4,000 bytes | 80% |
| Ready work (10) | ~4,000 bytes | ~800 bytes | 80% |
| Single show() | ~400 bytes | ~400 bytes | 0% (full details) |
## Design Principles

- **Lazy Loading**: Only fetch what you need, when you need it
- **Minimal by Default**: List views use lightweight models
- **Full Details On-Demand**: Use `show()` for complete information
- **Graceful Degradation**: Large results auto-compact with hints
- **Backward Compatible**: Existing workflows continue to work
## Credits
Inspired by:
- MCP Bridge - Context engineering for MCP servers
- Manus Context Engineering - Compaction and offloading patterns
- Anthropic's Context Engineering Guide