feat: Add code-review convoy formula with leg prompt templates (gt-f0701)

Introduces the leg prompt template system for convoy-style formulas:
- Base prompt template with context injection (PR, files, formula name)
- 7 specialized review legs (correctness, performance, security, elegance,
  resilience, style, smells)
- Per-leg focus and description for polecat instructions
- Output format specification with structured findings template
- Synthesis step configuration for combining leg outputs

This formula demonstrates the template system pattern for parallel
polecat execution in convoy workflows.
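The pattern this commit describes (a shared base prompt specialized per leg, legs run in parallel, findings gathered for a synthesis pass) can be sketched roughly as follows. Every name below is illustrative only; gt's actual runtime and template engine are not part of this commit:

```python
# Hypothetical sketch of the convoy pattern: render a per-leg prompt
# from a shared base template, fan the legs out in parallel, then hand
# all findings to a synthesis step.
from concurrent.futures import ThreadPoolExecutor

BASE = "## Your focus: {focus}\nWrite findings to {out}"

LEGS = {
    "correctness": "Logical correctness and edge case handling",
    "security": "Security vulnerabilities and attack surface",
}

def run_leg(leg_id: str, focus: str) -> tuple[str, str]:
    # A real polecat would execute the rendered prompt; here we just
    # return the designated output path and the rendered prompt text.
    out = f".reviews/{leg_id}-findings.md"
    return out, BASE.format(focus=focus, out=out)

with ThreadPoolExecutor() as pool:
    # Maps each leg's output path to its rendered prompt.
    findings = dict(pool.map(lambda kv: run_leg(*kv), LEGS.items()))

# The synthesis step consumes every leg's output once all legs finish.
synthesis_inputs = sorted(findings)
```

The real formula drives seven legs, but the fan-out/collect shape is the same: each leg writes to its own `{{leg.id}}-findings.md` and synthesis blocks on all of them.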

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in: slit
Date: 2026-01-01 14:50:54 -08:00
Committed by: Steve Yegge
Parent: f9f5517027
Commit: 8726a6d493


@@ -0,0 +1,318 @@
# Code Review Convoy Formula
#
# A convoy-style formula that spawns multiple polecats in parallel,
# each focusing on a different review aspect. Results are collected
# and synthesized into a unified review.
#
# Usage:
#   gt formula run code-review --pr=123
#   gt formula run code-review --files="src/*.go"
description = """
Comprehensive code review via parallel specialized reviewers.

Each leg examines the code from a different perspective. Findings are
collected and synthesized into a prioritized, actionable review.

## Legs (parallel execution)

- **correctness**: Logic errors, bugs, edge cases
- **performance**: Bottlenecks, efficiency issues
- **security**: Vulnerabilities, OWASP concerns
- **elegance**: Design clarity, abstraction quality
- **resilience**: Error handling, failure modes
- **style**: Convention compliance, consistency
- **smells**: Anti-patterns, technical debt

## Execution Model

1. Each leg spawns as a separate polecat
2. Polecats work in parallel
3. Each writes findings to its designated output
4. Synthesis step combines all findings into a unified review
"""
formula = "code-review"
type = "convoy"
version = 1
# Input variables - provided at runtime
[inputs]

[inputs.pr]
description = "Pull request number to review"
type = "number"
required_unless = ["files", "branch"]

[inputs.files]
description = "File glob pattern to review"
type = "string"
required_unless = ["pr", "branch"]

[inputs.branch]
description = "Branch name to review (diff against main)"
type = "string"
required_unless = ["pr", "files"]
# Base prompt template - injected into all leg prompts
[prompts]
base = """
# Code Review Assignment

You are a specialized code reviewer participating in a convoy review.

## Context

- **Formula**: {{formula_name}}
- **Review target**: {{target_description}}
- **Your focus**: {{leg.focus}}
- **Leg ID**: {{leg.id}}

## Files Under Review

{{#if pr_number}}
PR #{{pr_number}}: {{pr_title}}
Changed files:
{{#each changed_files}}
- {{this.path}} (+{{this.additions}}/-{{this.deletions}})
{{/each}}
{{else}}
{{#each files}}
- {{this}}
{{/each}}
{{/if}}

## Your Task

{{leg.description}}

## Output Requirements

Write your findings to: **{{output_path}}**

Structure your output as follows:

```markdown
# {{leg.title}} Review

## Summary
(1-2 paragraph overview of findings)

## Critical Issues
(P0 - Must fix before merge)
- Issue description with file:line reference
- Explanation of impact
- Suggested fix

## Major Issues
(P1 - Should fix before merge)
- ...

## Minor Issues
(P2 - Nice to fix)
- ...

## Observations
(Non-blocking notes and suggestions)
- ...
```

Use specific file:line references. Be actionable. Prioritize impact.
"""
# Output configuration
[output]
directory = ".reviews/{{review_id}}"
leg_pattern = "{{leg.id}}-findings.md"
synthesis = "review-summary.md"
# Leg definitions - each spawns a parallel polecat
[[legs]]
id = "correctness"
title = "Correctness Review"
focus = "Logical correctness and edge case handling"
description = """
Review the code for logical errors and edge case handling.
**Look for:**
- Logic errors and bugs
- Off-by-one errors
- Null/nil/undefined handling
- Unhandled edge cases
- Race conditions in concurrent code
- Dead code or unreachable branches
- Incorrect assumptions in comments vs code
- Integer overflow/underflow potential
- Floating point comparison issues
**Questions to answer:**
- Does the code do what it claims to do?
- What inputs could cause unexpected behavior?
- Are all code paths tested or obviously correct?
"""
[[legs]]
id = "performance"
title = "Performance Review"
focus = "Performance bottlenecks and efficiency"
description = """
Review the code for performance issues.
**Look for:**
- O(n²) or worse algorithms where O(n) is possible
- Unnecessary allocations in hot paths
- Missing caching opportunities
- N+1 query patterns (database or API)
- Blocking operations in async contexts
- Memory leaks or unbounded growth
- Excessive string concatenation
- Unoptimized regex or parsing
**Questions to answer:**
- What happens at 10x, 100x, 1000x scale?
- Are there obvious optimizations being missed?
- Is performance being traded for readability appropriately?
"""
[[legs]]
id = "security"
title = "Security Review"
focus = "Security vulnerabilities and attack surface"
description = """
Review the code for security vulnerabilities.
**Look for:**
- Input validation gaps
- Authentication/authorization bypasses
- Injection vulnerabilities (SQL, XSS, command, LDAP)
- Sensitive data exposure (logs, errors, responses)
- Hardcoded secrets or credentials
- Insecure cryptographic usage
- Path traversal vulnerabilities
- SSRF (Server-Side Request Forgery)
- Deserialization vulnerabilities
- OWASP Top 10 concerns
**Questions to answer:**
- What can a malicious user do with this code?
- What data could be exposed if this fails?
- Are there defense-in-depth gaps?
"""
[[legs]]
id = "elegance"
title = "Elegance Review"
focus = "Design clarity and abstraction quality"
description = """
Review the code for design quality.
**Look for:**
- Unclear abstractions or naming
- Functions doing too many things
- Missing or over-engineered abstractions
- Coupling that should be loose
- Dependencies that flow the wrong direction
- Unclear data flow or control flow
- Magic numbers/strings without explanation
- Inconsistent design patterns
- Violation of SOLID principles
- Reinventing existing utilities
**Questions to answer:**
- Would a new team member understand this?
- Does the structure match the problem domain?
- Is the complexity justified?
"""
[[legs]]
id = "resilience"
title = "Resilience Review"
focus = "Error handling and failure modes"
description = """
Review the code for resilience and error handling.
**Look for:**
- Swallowed errors or empty catch blocks
- Missing error propagation
- Unclear error messages
- Insufficient retry/backoff logic
- Missing timeout handling
- Resource cleanup on failure (files, connections)
- Partial failure states
- Missing circuit breakers for external calls
- Unhelpful panic/crash behavior
- Recovery path gaps
**Questions to answer:**
- What happens when external services fail?
- Can the system recover from partial failures?
- Are errors actionable for operators?
"""
[[legs]]
id = "style"
title = "Style Review"
focus = "Convention compliance and consistency"
description = """
Review the code for style and convention compliance.
**Look for:**
- Naming convention violations
- Formatting inconsistencies
- Import organization issues
- Comment quality (missing, outdated, or obvious)
- Documentation gaps for public APIs
- Inconsistent patterns within the codebase
- Lint/format violations
- Test naming and organization
- Log message quality and levels
**Questions to answer:**
- Does this match the rest of the codebase?
- Would the style guide approve?
- Is the code self-documenting where possible?
"""
[[legs]]
id = "smells"
title = "Code Smells Review"
focus = "Anti-patterns and technical debt"
description = """
Review the code for code smells and anti-patterns.
**Look for:**
- Long methods (>50 lines is suspicious)
- Deep nesting (>3 levels)
- Shotgun surgery patterns
- Feature envy
- Data clumps
- Primitive obsession
- Temporary fields
- Refused bequest
- Speculative generality
- God classes/functions
- Copy-paste code (DRY violations)
- TODO/FIXME accumulation
**Questions to answer:**
- What will cause pain during the next change?
- What would you refactor if you owned this code?
- Is technical debt being added or paid down?
"""
# Synthesis step - combines all leg outputs
[synthesis]
title = "Review Synthesis"
description = """
Combine all leg findings into a unified, prioritized review.

**Your input:**
All leg findings from: {{output.directory}}/

**Your output:**
A synthesized review at: {{output.directory}}/{{output.synthesis}}

**Structure:**

1. **Executive Summary** - Overall assessment, merge recommendation
2. **Critical Issues** - P0 items from all legs, deduplicated
3. **Major Issues** - P1 items, grouped by theme
4. **Minor Issues** - P2 items, briefly listed
5. **Positive Observations** - What's done well
6. **Recommendations** - Actionable next steps

Deduplicate issues found by multiple legs (note which legs found them).
Prioritize by impact and effort. Be actionable.
"""
depends_on = ["correctness", "performance", "security", "elegance", "resilience", "style", "smells"]
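The `required_unless` constraints on the three inputs amount to "at least one of pr/files/branch must be supplied". A minimal sketch of that check, assuming nothing about gt's real input validator (the function and its error message are hypothetical):

```python
# Hypothetical validation of the formula's `required_unless` rule:
# pr, files, and branch are each required unless another member of the
# group was supplied, i.e. at least one of the three must be present.
def validate_inputs(provided: dict) -> None:
    group = ("pr", "files", "branch")
    if not any(k in provided for k in group):
        raise ValueError(
            f"one of {group} is required to select a review target"
        )

validate_inputs({"pr": 123})            # accepted
validate_inputs({"files": "src/*.go"})  # accepted
try:
    validate_inputs({})                 # rejected: no target given
except ValueError as e:
    print("rejected:", e)
```

Supplying two targets at once (say `--pr` and `--branch`) still passes this check; the formula as written only forbids supplying none.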