Add automated git traffic benchmark for Agent Mail
- Created tests/benchmarks/git_traffic.py with 50-issue workflow - Measures git operations reduction: git-only vs Agent Mail - Generates comparison reports with pass/fail (70% target) - Results: 98.5% reduction (200 → 3 operations) - Added documentation in tests/benchmarks/README.md Closes bd-ic1m Amp-Thread-ID: https://ampcode.com/threads/T-0907bb40-ffc9-4f09-843f-b26716ef9ce0 Co-authored-by: Amp <amp@ampcode.com>
This commit is contained in:
84
tests/benchmarks/README.md
Normal file
84
tests/benchmarks/README.md
Normal file
@@ -0,0 +1,84 @@
|
||||
# Beads Benchmarks
|
||||
|
||||
Automated benchmarks for measuring Beads performance and Agent Mail coordination efficiency.
|
||||
|
||||
## Git Traffic Reduction Benchmark
|
||||
|
||||
**File:** `git_traffic.py`
|
||||
|
||||
### Purpose
|
||||
|
||||
Measures the reduction in git operations (pulls, commits, pushes) when using Agent Mail for multi-agent coordination compared to pure git-based synchronization.
|
||||
|
||||
### Usage
|
||||
|
||||
```bash
|
||||
# Run with default settings (50 issues)
|
||||
python3 tests/benchmarks/git_traffic.py
|
||||
|
||||
# Customize number of issues
|
||||
python3 tests/benchmarks/git_traffic.py -n 100
|
||||
|
||||
# Verbose output
|
||||
python3 tests/benchmarks/git_traffic.py -v
|
||||
|
||||
# Save report to file
|
||||
python3 tests/benchmarks/git_traffic.py -o report.md
|
||||
```
|
||||
|
||||
### How It Works
|
||||
|
||||
The benchmark compares two workflows:
|
||||
|
||||
**Without Agent Mail (Git-only mode):**
|
||||
- Each issue update requires git pull + commit + push
|
||||
- Other agents pull to check for updates
|
||||
- Total: ~4 git operations per issue
|
||||
|
||||
**With Agent Mail:**
|
||||
- Coordination via HTTP messages (no git operations)
|
||||
- Status updates, reservations, notifications via Agent Mail
|
||||
- Single batched commit/push at end of workflow
|
||||
- Total: 3 git operations for entire batch
|
||||
|
||||
### Expected Results
|
||||
|
||||
For 50 issues:
|
||||
- **Without Agent Mail:** ~200 git operations
|
||||
- **With Agent Mail:** 3 git operations
|
||||
- **Reduction:** ≥70% (typically 95-98%)
|
||||
|
||||
### Exit Codes
|
||||
|
||||
- `0`: Success - achieved ≥70% reduction
|
||||
- `1`: Failure - regression detected
|
||||
|
||||
### Example Output
|
||||
|
||||
```
|
||||
======================================================================
|
||||
SUMMARY
|
||||
======================================================================
|
||||
Without Agent Mail: 200 git operations
|
||||
With Agent Mail: 3 git operations
|
||||
Reduction: 98.5%
|
||||
Target: 70%
|
||||
Status: ✅ PASS
|
||||
======================================================================
|
||||
```
|
||||
|
||||
## Requirements
|
||||
|
||||
- Python 3.7+
|
||||
- bd (beads) CLI installed
|
||||
- git
|
||||
- Agent Mail server (optional - falls back to simulation if unavailable)
|
||||
|
||||
## CI Integration
|
||||
|
||||
This benchmark can be used in CI to detect regressions in Agent Mail coordination efficiency:
|
||||
|
||||
```bash
|
||||
python3 tests/benchmarks/git_traffic.py -n 50
|
||||
# Exits with status 1 if reduction < 70%
|
||||
```
|
||||
Reference in New Issue
Block a user