Add enterprise framing for HOP-aligned features

- New: why-these-features.md explaining enterprise justification for each feature
- Updated: understanding-gas-town.md with "Why Gas Town Exists" and A/B testing section
- Updated: identity.md with "Why Identity Matters" and enterprise use cases
- Updated: federation.md with "Why Federation?" and enterprise benefits table

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-30 21:04:13 -08:00
parent 38fc95d6a7
commit 45b021cf7f
4 changed files with 372 additions and 4 deletions
--- a/docs/understanding-gas-town.md
+++ b/docs/understanding-gas-town.md
@@ -3,6 +3,19 @@
 This document provides a conceptual overview of Gas Town's architecture, focusing on
 the role taxonomy and how different agents interact.

+## Why Gas Town Exists
+
+As AI agents become central to engineering workflows, teams face new challenges:
+
+- **Accountability:** Who did what? Which agent introduced this bug?
+- **Quality:** Which agents are reliable? Which need tuning?
+- **Efficiency:** How do you route work to the right agent?
+- **Scale:** How do you coordinate agents across repos and teams?
+
+Gas Town is an orchestration layer that treats AI agent work as structured data.
+Every action is attributed. Every agent has a track record. Every piece of work
+has provenance. See [Why These Features](why-these-features.md) for the full rationale.
+
 ## Role Taxonomy

 Gas Town has several agent types, each with distinct responsibilities and lifecycles.
@@ -186,6 +199,27 @@ All Gas Town agents follow the same core principle:
 This applies regardless of role. The hook is your assignment. Execute it immediately
 without waiting for confirmation. Gas Town is a steam engine - agents are pistons.

+## Model Evaluation and A/B Testing
+
+Gas Town's attribution and work history features enable objective model comparison:
+
+```bash
+# Deploy different models on similar tasks
+gt sling gt-abc gastown --model=claude-sonnet
+gt sling gt-def gastown --model=gpt-4
+
+# Compare outcomes
+bd stats --actor=gastown/polecats/* --group-by=model
+```
+
+Because every task has completion time, quality signals, and revision count,
+you can make data-driven decisions about which models to deploy where.
+
+This is particularly valuable for:
+- **Model selection:** Which model handles your codebase best?
+- **Capability mapping:** Claude for architecture, GPT for tests?
+- **Cost optimization:** When is a smaller model sufficient?
+
 ## Common Mistakes

 1. **Using dogs for user work**: Dogs are Deacon infrastructure. Use crew or polecats.