From 6ac42499bfd67318c1419bc03d832d38339d73f4 Mon Sep 17 00:00:00 2001
From: Steve Yegge <stevey@sourcegraph.com>
Date: Tue, 28 Oct 2025 19:31:52 -0700
Subject: [PATCH] removed obsolete/implemented designs

---
 n-way-collision-convergence.md |   95 ---
 repair_commands.md             | 1398 --------------------------------
 2 files changed, 1493 deletions(-)
 delete mode 100644 n-way-collision-convergence.md
 delete mode 100644 repair_commands.md

diff --git a/n-way-collision-convergence.md b/n-way-collision-convergence.md
deleted file mode 100644
index 553ab70c..00000000
--- a/n-way-collision-convergence.md
+++ /dev/null
@@ -1,95 +0,0 @@
-# N-Way Collision Convergence Problem
-
-## Summary
-
-The current collision resolution implementation (`--resolve-collisions`) works correctly for 2-way collisions but **does not converge** for 3-way (and by extension N-way) collisions. This is a critical limitation for parallel worker scenarios where multiple agents file issues simultaneously.
-
-## Test Evidence
-
-`TestThreeCloneCollision` in `beads_twoclone_test.go` demonstrates the problem with 3 clones creating the same issue ID (`test-1`) with different content.
-
-### Observed Behavior
-
-**Sync Order A→B→C:**
-- Clone A: 0 issues (empty database after final pull)
-- Clone B: 2 issues (missing "Issue from clone C")
-- Clone C: 3 issues (has all issues)
-
-**Sync Order C→A→B:**
-- Clone A: 2 issues (missing "Issue from clone B")
-- Clone B: 3 issues (has all issues)
-- Clone C: 0 issues (empty database after final pull)
-
-**Pattern:** The last clone in the sync order ends up with all issues, while the first clone is left empty and the middle clone is missing one issue. This behavior is **100% reproducible** across all test runs.
-
-## Root Cause Analysis
-
-When the third clone pulls and resolves collisions:
-1. It correctly remaps its conflicting issue to a new ID (e.g., `test-1` → `test-3`)
-2. It imports the issues from the other two clones
-3. It pushes the merged state
-
-However, when the first clone pulls this merged state:
-1. The import sees new issues that collide with its local database
-2. The resolution logic doesn't properly handle issues that were already remapped upstream
-3. The database ends up in an inconsistent state (often empty or partially populated)
-
-## Why This Matters
-
-This prevents reliable N-way parallel worker scenarios:
-- Multiple AI agents filing issues simultaneously
-- Distributed teams working on different clones
-- CI/CD systems creating issues in parallel builds
-
-**Current workaround:** Limit parallel work to 2 workers, or create issues sequentially; collision resolution is only reliable in those cases.
-
-## What Needs To Be Fixed
-
-### 1. Import Logic Enhancement
-The `--resolve-collisions` import needs to:
-- Detect when incoming issues were already remapped upstream
-- Preserve the remapping chain (track `test-1` → `test-2` → `test-3`)
-- Not re-remap already-remapped issues
-
-### 2. Convergence Algorithm
-Implement a proper convergence algorithm that ensures:
-- All clones eventually have the same complete set of issues
-- Idempotent imports (importing the same JSONL multiple times is safe)
-- Transitive collision resolution (if A remaps to B, and B exists, handle gracefully)
-
-### 3. Test Requirements
-The fix should make `TestThreeCloneCollision` pass without skipping:
-- All three clones must have all three issues (by title)
-- Content must match across all clones (ignoring timestamps and specific ID assignments)
-- Must work for both sync orders (A→B→C and C→A→B)
-
-### 4. Extend to N-Way
-Once 3-way works, verify it generalizes to N workers:
-- Test with 5+ clones
-- Test with different sync order permutations
-- Ensure convergence time is bounded
-
-## Files To Examine
-
-- **`beads_twoclone_test.go`**: Contains `TestThreeCloneCollision` that reproduces the issue
-- **`cmd/bd/import.go`**: Import logic with `--resolve-collisions` flag
-- **`internal/storage/sqlite/sqlite.go`**: Database operations for collision detection
-- **`cmd/bd/sync.go`**: Sync workflow that calls import/export
-
-## Success Criteria
-
-1. `TestThreeCloneCollision` passes without skipping
-2. All clones converge to identical content after final pull
-3. No data loss (all issues present in all clones)
-4. ID assignments can be non-deterministic, but content must match
-5. Works for N workers (extend test to 5+ clones)
-
-## Current Test Status
-
-```bash
-go test -v -run TestThreeCloneCollision
-# Both subtests SKIP with message:
-# "KNOWN LIMITATION: 3-way collisions may require additional resolution logic"
-```
-
-The test is designed to skip when convergence fails, so it won't break CI, but it documents the limitation clearly.
diff --git a/repair_commands.md b/repair_commands.md
deleted file mode 100644
index 1ed9df9c..00000000
--- a/repair_commands.md
+++ /dev/null
@@ -1,1398 +0,0 @@
-# Repair Commands & AI-Assisted Tooling
-
-**Status:** Design Proposal
-**Author:** AI Assistant
-**Date:** 2025-10-28
-**Context:** Reduce agent repair burden by providing specialized repair tools
-
-## Executive Summary
-
-Agents spend significant time repairing beads databases due to:
-1. Git merge conflicts in JSONL
-2. Duplicate issues from parallel work
-3. Semantic inconsistencies (labeling, dependencies)
-4. Orphaned references after deletions
-
-**Solution:** Add dedicated repair commands that agents (and humans) can invoke instead of manually fixing these issues. Some commands use AI for semantic understanding; others are purely mechanical checks.
-
-## Problem Analysis
-
-### Current Repair Scenarios
-
-Based on codebase analysis and commit history:
-
-#### 1. Git Merge Conflicts (High Frequency)
-
-**Scenario:**
-```bash
-# Feature branch creates bd-42
-git checkout -b feature
-bd create "Add authentication"  # Creates bd-42
-
-# Meanwhile, main branch also creates bd-42
-git checkout main
-bd create "Fix logging"  # Also creates bd-42
-
-# Merge creates conflict
-git checkout feature
-git merge main
-```
-
-**JSONL conflict:**
-```json
-<<<<<<< HEAD
-{"id":"bd-42","title":"Add authentication",...}
-=======
-{"id":"bd-42","title":"Fix logging",...}
->>>>>>> main
-```
-
-**Current fix:** Agent manually parses conflict markers, remaps IDs, and updates references
-
-**Pain points:**
-- Time-consuming (5-10 minutes per conflict)
-- Error-prone (easy to miss references)
-- Repetitive (same logic every time)
-
-#### 2. Semantic Duplicates (Medium Frequency)
-
-**Scenario:**
-```bash
-# Agent A creates issue
-bd create "Fix memory leak in parser"  # bd-42
-
-# Agent B creates similar issue (different session)
-bd create "Parser memory leak needs fixing"  # bd-87
-
-# Human notices: "These are the same issue!"
-```
-
-**Current fix:** Agent manually:
-1. Reads both issues
-2. Determines they're duplicates
-3. Picks canonical one
-4. Closes duplicate with reference
-5. Moves comments/dependencies
-
-**Pain points:**
-- Requires reading full issue text
-- Subjective judgment (are they really duplicates?)
-- Manual reference updates
-
-#### 3. Test Pollution (Low Frequency Now, High Impact)
-
-**Scenario:**
-```bash
-# Test creates 1044 issues in production DB
-go test ./internal/rpc/...  # Oops, no isolation
-
-bd list
-# Shows 1044 issues with titles like "test-issue-1", "benchmark-issue-42"
-```
-
-**Recent occurrence:** Commits 78e8cb9, d1d3fcd (Oct 2025)
-
-**Current fix:** Agent manually:
-1. Identifies test issues by pattern matching
-2. Bulk closes with `bd close bd-1 bd-2 ... bd-1044`
-3. Archives or deletes
-
-**Pain points:**
-- Hard to distinguish test vs. real issues
-- Risk of deleting real issues
-- No automated recovery
-
-#### 4. Orphaned Dependencies (Medium Frequency)
-
-**Scenario:**
-```bash
-bd create "Implement feature X"  # bd-42
-bd create "Test feature X" --depends bd-42  # bd-43 depends on bd-42
-
-bd delete bd-42  # User deletes parent
-
-bd show bd-43
-# Depends: bd-42 (orphaned - issue doesn't exist!)
-```
-
-**Current fix:** Agent manually updates dependencies
-
-**Pain points:**
-- Silent corruption (no warning on delete)
-- Hard to find orphans (requires DB query)
-
-## Proposed Commands
-
-### 1. `bd resolve-conflicts` - Git Merge Conflict Resolver
-
-**Purpose:** Automatically resolve JSONL merge conflicts
-
-**Usage:**
-```bash
-# Detect conflicts
-bd resolve-conflicts
-
-# Auto-resolve with AI
-bd resolve-conflicts --auto --ai
-
-# Manual conflict resolution
-bd resolve-conflicts --interactive
-```
-
-**Implementation:**
-
-```go
-// cmd/bd/resolve_conflicts.go (new file)
-package main
-
-import (
-    "bufio"
-    "encoding/json"
-    "fmt"
-    "os"
-    "strings"
-
-    "github.com/steveyegge/beads/internal/types"
-)
-
-type ConflictBlock struct {
-    HeadIssues []types.Issue
-    BaseIssues []types.Issue
-    LineStart  int
-    LineEnd    int
-}
-
-func detectConflicts(jsonlPath string) ([]ConflictBlock, error) {
-    file, err := os.Open(jsonlPath)
-    if err != nil {
-        return nil, err
-    }
-    defer file.Close()
-
-    var conflicts []ConflictBlock
-    var current *ConflictBlock
-    inConflict := false
-    inHead := false
-    lineNum := 0
-
-    scanner := bufio.NewScanner(file)
-    for scanner.Scan() {
-        line := scanner.Text()
-        lineNum++
-
-        switch {
-        case strings.HasPrefix(line, "<<<<<<<"):
-            // Start of conflict
-            inConflict = true
-            inHead = true
-            current = &ConflictBlock{LineStart: lineNum}
-
-        case strings.HasPrefix(line, "======="):
-            // Switch from HEAD to base
-            inHead = false
-
-        case strings.HasPrefix(line, ">>>>>>>") && current != nil:
-            // End of conflict (the nil check skips a stray end marker with no matching start)
-            inConflict = false
-            current.LineEnd = lineNum
-            conflicts = append(conflicts, *current)
-            current = nil
-
-        case inConflict && inHead:
-            // Parse issue in HEAD section
-            issue, err := parseIssueLine(line)
-            if err == nil {
-                current.HeadIssues = append(current.HeadIssues, issue)
-            }
-
-        case inConflict && !inHead:
-            // Parse issue in base section
-            issue, err := parseIssueLine(line)
-            if err == nil {
-                current.BaseIssues = append(current.BaseIssues, issue)
-            }
-        }
-    }
-
-    if err := scanner.Err(); err != nil {
-        return nil, err
-    }
-
-    return conflicts, nil
-}
-
-func resolveConflictsAuto(conflicts []ConflictBlock, useAI bool) ([]Resolution, error) {
-    var resolutions []Resolution
-
-    for _, conflict := range conflicts {
-        if useAI {
-            // Use AI to determine resolution
-            resolution, err := resolveConflictWithAI(conflict)
-            if err != nil {
-                return nil, err
-            }
-            resolutions = append(resolutions, resolution)
-        } else {
-            // Mechanical resolution: remap duplicate IDs
-            resolution := resolveConflictMechanical(conflict)
-            resolutions = append(resolutions, resolution)
-        }
-    }
-
-    return resolutions, nil
-}
-
-type Resolution struct {
-    Action string       `json:"action"` // "remap", "merge", "keep-head", "keep-base"
-    OldID  string
-    NewID  string
-    Reason string       `json:"reason"`
-    Merged *types.Issue `json:"merged_issue,omitempty"` // If action="merge"
-}
-
-func resolveConflictMechanical(conflict ConflictBlock) Resolution {
-    // Mechanical strategy: Keep HEAD, remap base to new IDs
-    // This matches current auto-import collision resolution
-
-    headIDs := make(map[string]bool)
-    for _, issue := range conflict.HeadIssues {
-        headIDs[issue.ID] = true
-    }
-
-    // Simplified for example: report only the first colliding ID.
-    for _, issue := range conflict.BaseIssues {
-        if headIDs[issue.ID] {
-            // ID collision: remap base issue to next available ID
-            return Resolution{
-                Action: "remap",
-                OldID:  issue.ID,
-                NewID:  getNextAvailableID(),
-                Reason: fmt.Sprintf("ID %s exists in both branches", issue.ID),
-            }
-        }
-    }
-
-    // No ID collision in this block: the empty Action is a no-op in applyResolutions.
-    return Resolution{}
-}
-
-func resolveConflictWithAI(conflict ConflictBlock) (Resolution, error) {
-    // Call AI to analyze conflict and suggest resolution
-
-    prompt := fmt.Sprintf(`
-You are resolving a git merge conflict in a beads issue tracker JSONL file.
-
-HEAD issues (current branch):
-%s
-
-BASE issues (incoming branch):
-%s
-
-Analyze these conflicts and suggest ONE of:
-1. "remap" - Issues are different, keep both but remap IDs
-2. "merge" - Issues are similar, merge into one
-3. "keep-head" - HEAD version is correct, discard BASE
-4. "keep-base" - BASE version is correct, discard HEAD
-
-Respond in JSON format:
-{
-    "action": "remap|merge|keep-head|keep-base",
-    "reason": "explanation",
-    "merged_issue": {...}  // Only if action=merge
-}
-`, formatIssues(conflict.HeadIssues), formatIssues(conflict.BaseIssues))
-
-    // Call AI (via environment-configured API)
-    response, err := callAIAPI(prompt)
-    if err != nil {
-        return Resolution{}, err
-    }
-
-    // Parse response
-    var resolution Resolution
-    if err := json.Unmarshal([]byte(response), &resolution); err != nil {
-        return Resolution{}, err
-    }
-
-    return resolution, nil
-}
-
-func applyResolutions(jsonlPath string, conflicts []ConflictBlock, resolutions []Resolution) error {
-    // Read entire JSONL
-    allIssues, err := readJSONL(jsonlPath)
-    if err != nil {
-        return err
-    }
-
-    // Apply resolutions
-    for i, resolution := range resolutions {
-        conflict := conflicts[i]
-
-        switch resolution.Action {
-        case "remap":
-            // Remap IDs and update references
-            remapIssueID(allIssues, resolution.OldID, resolution.NewID)
-
-        case "merge":
-            // Replace both with merged issue
-            replaceIssues(allIssues, conflict.HeadIssues, conflict.BaseIssues, resolution.Merged)
-
-        case "keep-head":
-            // Remove base issues
-            removeIssues(allIssues, conflict.BaseIssues)
-
-        case "keep-base":
-            // Remove head issues
-            removeIssues(allIssues, conflict.HeadIssues)
-        }
-    }
-
-    // Write back to JSONL (atomic)
-    return writeJSONL(jsonlPath, allIssues)
-}
-```
-
-**AI Integration:**
-
-```go
-// internal/ai/client.go (new package)
-package ai
-
-import (
-    "context"
-    "fmt"
-    "os"
-)
-
-type Client struct {
-    provider string // "anthropic", "openai", "ollama"
-    apiKey   string
-    model    string
-}
-
-func NewClient() (*Client, error) {
-    provider := os.Getenv("BEADS_AI_PROVIDER") // "anthropic" (default)
-    apiKey := os.Getenv("BEADS_AI_API_KEY")    // Required for cloud providers
-    model := os.Getenv("BEADS_AI_MODEL")       // "claude-3-5-sonnet-20241022" (default)
-
-    if provider == "" {
-        provider = "anthropic"
-    }
-
-    if apiKey == "" && provider != "ollama" {
-        return nil, fmt.Errorf("BEADS_AI_API_KEY required for provider %s", provider)
-    }
-
-    return &Client{
-        provider: provider,
-        apiKey:   apiKey,
-        model:    model,
-    }, nil
-}
-
-func (c *Client) Complete(ctx context.Context, prompt string) (string, error) {
-    switch c.provider {
-    case "anthropic":
-        return c.callAnthropic(ctx, prompt)
-    case "openai":
-        return c.callOpenAI(ctx, prompt)
-    case "ollama":
-        return c.callOllama(ctx, prompt)
-    default:
-        return "", fmt.Errorf("unknown AI provider: %s", c.provider)
-    }
-}
-
-func (c *Client) callAnthropic(ctx context.Context, prompt string) (string, error) {
-    // Use anthropic-go SDK
-    // Implementation omitted for brevity
-    return "", nil
-}
-```
-
-**Configuration:**
-
-```bash
-# ~/.config/beads/ai.conf (optional)
-BEADS_AI_PROVIDER=anthropic
-BEADS_AI_API_KEY=sk-ant-...
-BEADS_AI_MODEL=claude-3-5-sonnet-20241022
-
-# Or use local Ollama
-BEADS_AI_PROVIDER=ollama
-BEADS_AI_MODEL=llama2
-```
-
-**Example usage:**
-
-```bash
-# Detect conflicts (shows summary, doesn't modify)
-$ bd resolve-conflicts
-Found 3 conflicts in beads.jsonl:
-
-Conflict 1 (lines 42-47):
-  HEAD: bd-42 "Add authentication" (created by alice)
-  BASE: bd-42 "Fix logging" (created by bob)
-  → Recommendation: REMAP (different issues, same ID)
-
-Conflict 2 (lines 103-108):
-  HEAD: bd-87 "Update docs for API"
-  BASE: bd-87 "Update docs for API v2"
-  → Recommendation: MERGE (similar, minor differences)
-
-Conflict 3 (lines 234-239):
-  HEAD: bd-156 "Refactor parser"
-  BASE: bd-156 "Refactor parser" (identical)
-  → Recommendation: KEEP-HEAD (identical content)
-
-Run 'bd resolve-conflicts --auto' to apply recommendations.
-Run 'bd resolve-conflicts --interactive' to review each conflict.
-
-# Auto-resolve with AI
-$ bd resolve-conflicts --auto --ai
-Resolving 3 conflicts...
-✓ Conflict 1: Remapped bd-42 (BASE) → bd-200
-✓ Conflict 2: Merged into bd-87 (combined descriptions)
-✓ Conflict 3: Kept HEAD version (identical)
-
-Updated beads.jsonl (conflicts resolved)
-Next steps:
-  1. Review changes: git diff beads.jsonl
-  2. Import to database: bd import
-  3. Commit resolution: git add beads.jsonl && git commit
-
-# Interactive mode
-$ bd resolve-conflicts --interactive
-Conflict 1 of 3 (lines 42-47):
-
-  HEAD: bd-42 "Add authentication"
-    Created: 2025-10-20 by alice
-    Status: in_progress
-    Labels: feature, security
-
-  BASE: bd-42 "Fix logging"
-    Created: 2025-10-21 by bob
-    Status: open
-    Labels: bug, logging
-
-AI Recommendation: REMAP (different issues, same ID)
-Reason: Issues have different topics (auth vs logging) and authors
-
-Choose action:
-  1) Remap BASE to new ID (recommended)
-  2) Merge into one issue
-  3) Keep HEAD, discard BASE
-  4) Keep BASE, discard HEAD
-  5) Skip (resolve manually)
-
-Your choice [1-5]: 1
-
-✓ Will remap BASE bd-42 → bd-200
-
-Continue to next conflict? [Y/n]:
-```
-
-### 2. `bd find-duplicates` - AI-Powered Duplicate Detection
-
-**Purpose:** Find semantically duplicate issues across the database
-
-**Usage:**
-```bash
-# Find all duplicates
-bd find-duplicates
-
-# Find duplicates with specific threshold
-bd find-duplicates --threshold 0.8
-
-# Auto-merge duplicates (requires confirmation)
-bd find-duplicates --merge
-```
-
-**Implementation:**
-
-```go
-// cmd/bd/find_duplicates.go (new file)
-package main
-
-import (
-    "context"
-    "encoding/json"
-    "fmt"
-    "github.com/steveyegge/beads/internal/ai"
-    "github.com/steveyegge/beads/internal/storage"
-    "github.com/steveyegge/beads/internal/types"
-)
-
-type DuplicateGroup struct {
-    Issues     []*types.Issue
-    Similarity float64
-    Reason     string
-}
-
-func findDuplicates(ctx context.Context, store storage.Storage, useAI bool, threshold float64) ([]DuplicateGroup, error) {
-    // Get all open issues
-    issues, err := store.ListIssues(ctx, storage.ListOptions{
-        Status: []string{"open", "in_progress"},
-    })
-    if err != nil {
-        return nil, err
-    }
-
-    if !useAI {
-        // Mechanical approach: exact title match
-        return findDuplicatesMechanical(issues), nil
-    }
-
-    // AI approach: semantic similarity
-    return findDuplicatesWithAI(ctx, issues, threshold)
-}
-
-func findDuplicatesMechanical(issues []*types.Issue) []DuplicateGroup {
-    // Group by normalized title
-    titleMap := make(map[string][]*types.Issue)
-
-    for _, issue := range issues {
-        normalized := normalizeTitle(issue.Title)
-        titleMap[normalized] = append(titleMap[normalized], issue)
-    }
-
-    var groups []DuplicateGroup
-    for _, group := range titleMap {
-        if len(group) > 1 {
-            groups = append(groups, DuplicateGroup{
-                Issues:     group,
-                Similarity: 1.0, // Exact match
-                Reason:     "Identical titles",
-            })
-        }
-    }
-
-    return groups
-}
-
-func findDuplicatesWithAI(ctx context.Context, issues []*types.Issue, threshold float64) ([]DuplicateGroup, error) {
-    aiClient, err := ai.NewClient()
-    if err != nil {
-        return nil, fmt.Errorf("AI client unavailable: %v (set BEADS_AI_API_KEY)", err)
-    }
-
-    var groups []DuplicateGroup
-
-    // Compare all pairs (N^2, but issues typically <1000)
-    for i := 0; i < len(issues); i++ {
-        for j := i + 1; j < len(issues); j++ {
-            similarity, reason, err := compareIssues(ctx, aiClient, issues[i], issues[j])
-            if err != nil {
-                continue // Skip on error
-            }
-
-            if similarity >= threshold {
-                groups = append(groups, DuplicateGroup{
-                    Issues:     []*types.Issue{issues[i], issues[j]},
-                    Similarity: similarity,
-                    Reason:     reason,
-                })
-            }
-        }
-    }
-
-    return groups, nil
-}
-
-func compareIssues(ctx context.Context, client *ai.Client, issue1, issue2 *types.Issue) (float64, string, error) {
-    prompt := fmt.Sprintf(`
-Compare these two issues and determine if they are duplicates.
-
-Issue 1: %s
-Title: %s
-Description: %s
-Labels: %v
-Status: %s
-
-Issue 2: %s
-Title: %s
-Description: %s
-Labels: %v
-Status: %s
-
-Respond in JSON:
-{
-    "similarity": 0.0-1.0,
-    "reason": "explanation",
-    "is_duplicate": true/false
-}
-`, issue1.ID, issue1.Title, issue1.Description, issue1.Labels, issue1.Status,
-   issue2.ID, issue2.Title, issue2.Description, issue2.Labels, issue2.Status)
-
-    response, err := client.Complete(ctx, prompt)
-    if err != nil {
-        return 0, "", err
-    }
-
-    var result struct {
-        Similarity  float64 `json:"similarity"`
-        Reason      string  `json:"reason"`
-        IsDuplicate bool    `json:"is_duplicate"`
-    }
-
-    if err := json.Unmarshal([]byte(response), &result); err != nil {
-        return 0, "", err
-    }
-
-    return result.Similarity, result.Reason, nil
-}
-```
-
-**Optimization for large databases:**
-
-For databases with >1000 issues, one AI call per pair (N^2 calls) is too slow. Use **embedding-based similarity** instead: one embedding call per issue, then cheap local pairwise comparison:
-
-```go
-// Use OpenAI embeddings or local model
-func findDuplicatesWithEmbeddings(ctx context.Context, issues []*types.Issue, threshold float64) ([]DuplicateGroup, error) {
-    // 1. Generate embeddings for all issues
-    embeddings := make([][]float64, len(issues))
-    for i, issue := range issues {
-        text := fmt.Sprintf("%s\n%s", issue.Title, issue.Description)
-        embedding, err := generateEmbedding(ctx, text)
-        if err != nil {
-            return nil, err
-        }
-        embeddings[i] = embedding
-    }
-
-    // 2. Find similar pairs using cosine similarity
-    var groups []DuplicateGroup
-    for i := 0; i < len(embeddings); i++ {
-        for j := i + 1; j < len(embeddings); j++ {
-            similarity := cosineSimilarity(embeddings[i], embeddings[j])
-            if similarity >= threshold {
-                groups = append(groups, DuplicateGroup{
-                    Issues:     []*types.Issue{issues[i], issues[j]},
-                    Similarity: similarity,
-                    Reason:     "Semantic similarity via embeddings",
-                })
-            }
-        }
-    }
-
-    return groups, nil
-}
-
-func generateEmbedding(ctx context.Context, text string) ([]float64, error) {
-    // Use OpenAI text-embedding-3-small or local model
-    // Returns 1536-dimensional vector
-    return nil, nil
-}
-
-func cosineSimilarity(a, b []float64) float64 {
-    var dotProduct, normA, normB float64
-    for i := range a {
-        dotProduct += a[i] * b[i]
-        normA += a[i] * a[i]
-        normB += b[i] * b[i]
-    }
-    return dotProduct / (math.Sqrt(normA) * math.Sqrt(normB))
-}
-```
-
-**Example usage:**
-
-```bash
-# Find duplicates (mechanical, no AI)
-$ bd find-duplicates --no-ai
-Found 2 potential duplicate groups:
-
-Group 1 (Similarity: 100%):
-  bd-42: "Fix memory leak in parser"
-  bd-87: "Fix memory leak in parser"
-  Reason: Identical titles
-
-Group 2 (Similarity: 100%):
-  bd-103: "Update documentation"
-  bd-145: "Update documentation"
-  Reason: Identical titles
-
-# Find duplicates with AI (semantic)
-$ bd find-duplicates --ai --threshold 0.75
-Found 4 potential duplicate groups:
-
-Group 1 (Similarity: 95%):
-  bd-42: "Fix memory leak in parser"
-  bd-87: "Parser memory leak needs fixing"
-  Reason: Same issue described differently
-
-Group 2 (Similarity: 88%):
-  bd-103: "Update API documentation"
-  bd-145: "Document new API endpoints"
-  Reason: Both about API docs, overlapping scope
-
-Group 3 (Similarity: 82%):
-  bd-200: "Optimize database queries"
-  bd-234: "Improve query performance"
-  Reason: Same goal (performance), different wording
-
-Group 4 (Similarity: 76%):
-  bd-301: "Add user authentication"
-  bd-312: "Implement login system"
-  Reason: Authentication and login are related features
-
-# Merge duplicates interactively
-$ bd find-duplicates --merge
-Found 2 duplicate groups. Review each:
-
-Group 1 (Similarity: 95%):
-  bd-42: "Fix memory leak in parser" (alice, 2025-10-20)
-    Status: in_progress
-    Labels: bug, performance
-    Comments: 3
-
-  bd-87: "Parser memory leak needs fixing" (bob, 2025-10-21)
-    Status: open
-    Labels: bug
-    Comments: 1
-
-Merge these issues? [y/N] y
-
-Choose canonical issue:
-  1) bd-42 (more activity, earlier)
-  2) bd-87
-Your choice [1-2]: 1
-
-✓ Merged bd-87 → bd-42
-  - Moved 1 comment from bd-87
-  - Added note: "Duplicate of bd-42"
-  - Closed bd-87 with reason: "duplicate"
-
-Continue to next group? [Y/n]:
-```
-
-### 3. `bd detect-pollution` - Test Issue Detector
-
-**Purpose:** Identify and clean up test issues that leaked into production database
-
-**Usage:**
-```bash
-# Detect test issues
-bd detect-pollution
-
-# Auto-delete with confirmation
-bd detect-pollution --clean
-
-# Export pollution report
-bd detect-pollution --report pollution.json
-```
-
-**Implementation:**
-
-```go
-// cmd/bd/detect_pollution.go (new file)
-package main
-
-import (
-    "context"
-    "regexp"
-    "strings"
-
-    "github.com/steveyegge/beads/internal/storage"
-    "github.com/steveyegge/beads/internal/types"
-)
-
-type PollutionIndicator struct {
-    Pattern string
-    Weight  float64
-}
-
-var pollutionPatterns = []PollutionIndicator{
-    {Pattern: `^test[-_]`, Weight: 0.9},                    // "test-issue-1"
-    {Pattern: `^benchmark[-_]`, Weight: 0.95},              // "benchmark-issue-42"
-    {Pattern: `^(?i)test\s+issue`, Weight: 0.85},           // "Test Issue 123"
-    {Pattern: `^(?i)dummy`, Weight: 0.8},                   // "Dummy issue"
-    {Pattern: `^(?i)sample`, Weight: 0.7},                  // "Sample issue"
-    {Pattern: `^(?i)todo.*test`, Weight: 0.75},             // "TODO test something"
-    {Pattern: `^issue\s+\d+$`, Weight: 0.6},                // "issue 123"
-    {Pattern: `^[A-Z]{4,}-\d+$`, Weight: 0.5},              // "JIRA-123" (might be import)
-}
-
-func detectPollution(ctx context.Context, store storage.Storage, useAI bool) ([]*types.Issue, error) {
-    allIssues, err := store.ListIssues(ctx, storage.ListOptions{})
-    if err != nil {
-        return nil, err
-    }
-
-    if !useAI {
-        // Mechanical approach: pattern matching
-        return detectPollutionMechanical(allIssues), nil
-    }
-
-    // AI approach: semantic classification
-    return detectPollutionWithAI(ctx, allIssues)
-}
-
-func detectPollutionMechanical(issues []*types.Issue) []*types.Issue {
-    var polluted []*types.Issue
-
-    for _, issue := range issues {
-        score := 0.0
-
-        // Check title against patterns
-        for _, indicator := range pollutionPatterns {
-            matched, _ := regexp.MatchString(indicator.Pattern, issue.Title)
-            if matched {
-                score = max(score, indicator.Weight)
-            }
-        }
-
-        // Additional heuristics
-        if len(issue.Title) < 10 {
-            score += 0.2 // Very short titles suspicious
-        }
-
-        if issue.Description == "" || issue.Description == issue.Title {
-            score += 0.1 // No description
-        }
-
-        if strings.Count(issue.Title, "test") > 1 {
-            score += 0.2 // Multiple "test" occurrences
-        }
-
-        // Threshold: 0.7
-        if score >= 0.7 {
-            polluted = append(polluted, issue)
-        }
-    }
-
-    return polluted
-}
-
-func detectPollutionWithAI(ctx context.Context, issues []*types.Issue) ([]*types.Issue, error) {
-    aiClient, err := ai.NewClient()
-    if err != nil {
-        return nil, err
-    }
-
-    // Batch issues for efficiency (classify 50 at a time)
-    batchSize := 50
-    var polluted []*types.Issue
-
-    for i := 0; i < len(issues); i += batchSize {
-        end := min(i+batchSize, len(issues))
-        batch := issues[i:end]
-
-        prompt := buildPollutionPrompt(batch)
-        response, err := aiClient.Complete(ctx, prompt)
-        if err != nil {
-            return nil, err
-        }
-
-        // Parse response: list of issue IDs classified as test pollution
-        pollutedIDs, err := parsePollutionResponse(response)
-        if err != nil {
-            continue
-        }
-
-        for _, issue := range batch {
-            for _, id := range pollutedIDs {
-                if issue.ID == id {
-                    polluted = append(polluted, issue)
-                }
-            }
-        }
-    }
-
-    return polluted, nil
-}
-
-func buildPollutionPrompt(issues []*types.Issue) string {
-    var builder strings.Builder
-    builder.WriteString("Identify test pollution in this issue list. Test issues have patterns like:\n")
-    builder.WriteString("- Titles starting with 'test', 'benchmark', 'sample'\n")
-    builder.WriteString("- Sequential numbering (test-1, test-2, ...)\n")
-    builder.WriteString("- Generic descriptions or no description\n")
-    builder.WriteString("- Created in rapid succession\n\n")
-    builder.WriteString("Issues:\n")
-
-    for _, issue := range issues {
-        fmt.Fprintf(&builder, "%s: %s (created: %s)\n", issue.ID, issue.Title, issue.CreatedAt)
-    }
-
-    builder.WriteString("\nRespond with JSON list of polluted issue IDs: {\"polluted\": [\"bd-1\", \"bd-2\"]}")
-    return builder.String()
-}
-```
-
-**Example usage:**
-
-```bash
-# Detect pollution
-$ bd detect-pollution
-Scanning 523 issues for test pollution...
-
-Found 47 potential test issues:
-
-High Confidence (score ≥ 0.9):
-  bd-100: "test-issue-1"
-  bd-101: "test-issue-2"
-  ...
-  bd-146: "benchmark-create-47"
-  (Total: 45 issues)
-
-Medium Confidence (score 0.7-0.9):
-  bd-200: "Quick test"
-  bd-301: "sample issue for testing"
-  (Total: 2 issues)
-
-Recommendation: Review and clean up these issues.
-Run 'bd detect-pollution --clean' to delete them (with confirmation).
-
-# Clean up
-$ bd detect-pollution --clean
-Found 47 test issues. Delete them? [y/N] y
-
-Deleting 47 issues...
-✓ Deleted bd-100 through bd-146
-✓ Deleted bd-200, bd-301
-
-Cleanup complete. Exported deleted issues to .beads/pollution-backup.jsonl
-(Run 'bd import .beads/pollution-backup.jsonl' to restore if needed)
-```
-
-### 4. `bd repair-deps` - Orphaned Dependency Cleaner
-
-**Purpose:** Find and fix orphaned dependency references
-
-**Usage:**
-```bash
-# Find orphans
-bd repair-deps
-
-# Auto-fix (remove orphaned references)
-bd repair-deps --fix
-
-# Interactive
-bd repair-deps --interactive
-```
-
-**Implementation:**
-
-```go
-// cmd/bd/repair_deps.go (new file)
-package main
-
-import (
-    "context"
-    "fmt"
-
-    "github.com/steveyegge/beads/internal/storage"
-    "github.com/steveyegge/beads/internal/types"
-)
-
-type OrphanedDependency struct {
-    Issue      *types.Issue
-    OrphanedID string
-}
-
-func findOrphanedDeps(ctx context.Context, store storage.Storage) ([]OrphanedDependency, error) {
-    allIssues, err := store.ListIssues(ctx, storage.ListOptions{})
-    if err != nil {
-        return nil, err
-    }
-
-    // Build ID existence map
-    existingIDs := make(map[string]bool)
-    for _, issue := range allIssues {
-        existingIDs[issue.ID] = true
-    }
-
-    // Find orphans
-    var orphaned []OrphanedDependency
-    for _, issue := range allIssues {
-        for _, depID := range issue.DependsOn {
-            if !existingIDs[depID] {
-                orphaned = append(orphaned, OrphanedDependency{
-                    Issue:      issue,
-                    OrphanedID: depID,
-                })
-            }
-        }
-    }
-
-    return orphaned, nil
-}
-
-func repairOrphanedDeps(ctx context.Context, store storage.Storage, orphaned []OrphanedDependency, autoFix bool) error {
-    for _, o := range orphaned {
-        if autoFix {
-            // Remove orphaned dependency
-            newDeps := removeString(o.Issue.DependsOn, o.OrphanedID)
-            o.Issue.DependsOn = newDeps
-
-            if err := store.UpdateIssue(ctx, o.Issue); err != nil {
-                return err
-            }
-
-            fmt.Printf("✓ Removed orphaned dependency %s from %s\n", o.OrphanedID, o.Issue.ID)
-        } else {
-            fmt.Printf("Found orphan: %s depends on non-existent %s\n", o.Issue.ID, o.OrphanedID)
-        }
-    }
-
-    return nil
-}
-```
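-
-`repairOrphanedDeps` calls a `removeString` helper that the listing above does not define. A minimal sketch of what such a helper might look like, assuming `DependsOn` is a `[]string` (the helper's name comes from the listing; its body here is an assumption, not the beads implementation):
-
-```go
-package main
-
-import "fmt"
-
-// removeString returns a copy of items without any element equal to target.
-// Hypothetical helper: it builds a new slice so the caller's backing array
-// is left untouched.
-func removeString(items []string, target string) []string {
-    out := make([]string, 0, len(items))
-    for _, s := range items {
-        if s != target {
-            out = append(out, s)
-        }
-    }
-    return out
-}
-
-func main() {
-    deps := []string{"bd-10", "bd-25", "bd-31"}
-    fmt.Println(removeString(deps, "bd-25")) // prints [bd-10 bd-31]
-}
-```
-
-Copying rather than filtering in place keeps the original slice intact if the subsequent `UpdateIssue` call fails.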
-
-**Example usage:**
-
-```bash
-# Find orphaned deps
-$ bd repair-deps
-Scanning dependencies...
-
-Found 3 orphaned dependencies:
-
-  bd-42: depends on bd-10 (deleted)
-  bd-87: depends on bd-25 (deleted)
-  bd-103: depends on bd-25 (deleted)
-
-Run 'bd repair-deps --fix' to remove these references.
-
-# Auto-fix
-$ bd repair-deps --fix
-✓ Removed bd-10 from bd-42 dependencies
-✓ Removed bd-25 from bd-87 dependencies
-✓ Removed bd-25 from bd-103 dependencies
-
-Repaired 3 issues.
-```
-
-### 5. `bd validate` - Comprehensive Health Check
-
-**Purpose:** Run all validation checks in one command
-
-**Usage:**
-```bash
-# Run all checks
-bd validate
-
-# Auto-fix all issues
-bd validate --fix-all
-
-# Specific checks
-bd validate --checks=duplicates,orphans,pollution
-```
-
-**Implementation:**
-
-```go
-// cmd/bd/validate.go (new file)
-package main
-
-import (
-    "context"
-    "fmt"
-
-    "github.com/steveyegge/beads/internal/storage"
-)
-
-func runValidation(ctx context.Context, store storage.Storage, checks []string, autoFix bool) error {
-    results := ValidationResults{}
-
-    for _, check := range checks {
-        switch check {
-        case "duplicates":
-            groups, err := findDuplicates(ctx, store, false, 1.0)
-            if err != nil {
-                return err
-            }
-            results.Duplicates = len(groups)
-
-        case "orphans":
-            orphaned, err := findOrphanedDeps(ctx, store)
-            if err != nil {
-                return err
-            }
-            results.Orphans = len(orphaned)
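-            if autoFix {
-                // Propagate repair failures instead of silently dropping them
-                if err := repairOrphanedDeps(ctx, store, orphaned, true); err != nil {
-                    return err
-                }
-            }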
-
-        case "pollution":
-            polluted, err := detectPollution(ctx, store, false)
-            if err != nil {
-                return err
-            }
-            results.Pollution = len(polluted)
-
-        case "conflicts":
-            jsonlPath := findJSONLPath()
-            conflicts, err := detectConflicts(jsonlPath)
-            if err != nil {
-                return err
-            }
-            results.Conflicts = len(conflicts)
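-
-        default:
-            // Reject unrecognized check names rather than skipping them silently
-            return fmt.Errorf("unknown validation check: %q", check)
-        }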
-    }
-
-    results.Print()
-    return nil
-}
-
-type ValidationResults struct {
-    Duplicates int
-    Orphans    int
-    Pollution  int
-    Conflicts  int
-}
-
-func (r ValidationResults) Print() {
-    fmt.Println("\nValidation Results:")
-    fmt.Println("===================")
-    fmt.Printf("Duplicates:    %d\n", r.Duplicates)
-    fmt.Printf("Orphans:       %d\n", r.Orphans)
-    fmt.Printf("Pollution:     %d\n", r.Pollution)
-    fmt.Printf("Conflicts:     %d\n", r.Conflicts)
-
-    total := r.Duplicates + r.Orphans + r.Pollution + r.Conflicts
-    if total == 0 {
-        fmt.Println("\n✓ Database is healthy!")
-    } else {
-        fmt.Printf("\n⚠ Found %d issues to fix\n", total)
-    }
-}
-```
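-
-`runValidation` takes its `checks` as a slice, while the CLI accepts them as a comma-separated `--checks` value. One way the flag value could be turned into that slice is sketched below. `parseChecks` and `knownChecks` are hypothetical names, and treating an empty value as "run everything" is an assumption:
-
-```go
-package main
-
-import (
-    "fmt"
-    "strings"
-)
-
-// knownChecks mirrors the case labels in runValidation's switch.
-var knownChecks = map[string]bool{
-    "duplicates": true, "orphans": true, "pollution": true, "conflicts": true,
-}
-
-// parseChecks splits a comma-separated --checks value and rejects unknown names.
-// An empty value selects every check (assumed default).
-func parseChecks(flagValue string) ([]string, error) {
-    if strings.TrimSpace(flagValue) == "" {
-        return []string{"duplicates", "orphans", "pollution", "conflicts"}, nil
-    }
-    var checks []string
-    for _, name := range strings.Split(flagValue, ",") {
-        name = strings.TrimSpace(name)
-        if !knownChecks[name] {
-            return nil, fmt.Errorf("unknown check %q", name)
-        }
-        checks = append(checks, name)
-    }
-    return checks, nil
-}
-
-func main() {
-    checks, err := parseChecks("duplicates,orphans,pollution")
-    fmt.Println(checks, err) // prints [duplicates orphans pollution] <nil>
-}
-```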
-
-**Example usage:**
-
-```bash
-$ bd validate
-Running validation checks...
-
-✓ Checking for duplicates... found 2 groups
-✓ Checking for orphaned dependencies... found 3
-✓ Checking for test pollution... found 0
-✓ Checking for git conflicts... found 1
-
-Validation Results:
-===================
-Duplicates:    2
-Orphans:       3
-Pollution:     0
-Conflicts:     1
-
-⚠ Found 6 issues to fix
-
-Recommendations:
-  - Run 'bd find-duplicates --merge' to handle duplicates
-  - Run 'bd repair-deps --fix' to remove orphaned dependencies
-  - Run 'bd resolve-conflicts' to resolve git conflicts
-
-$ bd validate --fix-all
-Running validation with auto-fix...
-✓ Fixed 3 orphaned dependencies
-✓ Resolved 1 git conflict (mechanical)
-
-2 duplicate groups require manual review.
-Run 'bd find-duplicates --merge' to handle them interactively.
-```
-
-## Agent Integration
-
-### MCP Server Functions
-
-Add these as MCP functions for easy agent access:
-
-```python
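-# integrations/beads-mcp/src/beads_mcp/server.py
-
-import json
-import subprocess
-
-# `server` is assumed to be the MCP server instance created elsewhere in this module.
-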
-@server.call_tool()
-async def beads_resolve_conflicts(auto: bool = False, ai: bool = True) -> list:
-    """Resolve git merge conflicts in JSONL file"""
-    result = subprocess.run(
-        ["bd", "resolve-conflicts"] +
-        (["--auto"] if auto else []) +
-        (["--ai"] if ai else []) +
-        ["--json"],
-        capture_output=True,
-        text=True
-    )
-    return json.loads(result.stdout)
-
-@server.call_tool()
-async def beads_find_duplicates(ai: bool = True, threshold: float = 0.8) -> list:
-    """Find duplicate issues using AI or mechanical matching"""
-    result = subprocess.run(
-        ["bd", "find-duplicates"] +
-        (["--ai"] if ai else ["--no-ai"]) +
-        ["--threshold", str(threshold), "--json"],
-        capture_output=True,
-        text=True
-    )
-    return json.loads(result.stdout)
-
-@server.call_tool()
-async def beads_detect_pollution() -> list:
-    """Detect test issues that leaked into production"""
-    result = subprocess.run(
-        ["bd", "detect-pollution", "--json"],
-        capture_output=True,
-        text=True
-    )
-    return json.loads(result.stdout)
-
-@server.call_tool()
-async def beads_validate(fix_all: bool = False) -> dict:
-    """Run all validation checks"""
-    result = subprocess.run(
-        ["bd", "validate"] +
-        (["--fix-all"] if fix_all else []) +
-        ["--json"],
-        capture_output=True,
-        text=True
-    )
-    return json.loads(result.stdout)
-```
-
-### Agent Workflow
-
-**Typical agent repair workflow:**
-
-```
-1. Agent notices issue (e.g., git merge conflict error)
-2. Agent calls: mcp__beads__resolve_conflicts(auto=True, ai=True)
-3. If successful:
-   - Agent reports: "Resolved 3 conflicts, remapped 1 ID"
-   - Agent continues work
-4. If it fails:
-   - Agent calls: mcp__beads__resolve_conflicts() for report
-   - Agent asks user for guidance
-```
-
-**Proactive validation:**
-
-```
-At session start, the agent can:
-1. Call: mcp__beads__validate()
-2. If issues found:
-   - Report to user: "Found 3 orphaned deps and 2 duplicates"
-   - Ask: "Should I fix these?"
-3. If user approves:
-   - Call: mcp__beads__validate(fix_all=True)
-   - Report: "Fixed 3 orphans, 2 duplicates need manual review"
-```
-
-## Cost Considerations
-
-### AI API Costs
-
-**Claude 3.5 Sonnet pricing (2025):**
-- Input: $3.00 / 1M tokens
-- Output: $15.00 / 1M tokens
-
-**Typical usage:**
-
-1. **Resolve conflicts** (~500 tokens per conflict)
-   - Cost: ~$0.0075 per conflict (all tokens priced at the $15/1M output rate)
-   - 10 conflicts/day = $0.075/day = $2.25/month
-
-2. **Find duplicates** (~200 tokens per pairwise comparison)
-   - Cost: ~$0.003 per comparison (same output-rate basis)
-   - 100 issues = 100 × 99 / 2 = 4,950 comparisons ≈ $15/run
-   - **Too expensive!** Use embeddings instead
-
-3. **Embeddings approach** (text-embedding-3-small)
-   - $0.02 / 1M tokens
-   - 100 issues × 100 tokens = 10K tokens = $0.0002/run
-   - **Much cheaper!**
-
-**Recommendations:**
-- Use AI for conflict resolution (low frequency, high value)
-- Use embeddings for duplicate detection (high frequency, needs scale)
-- Use mechanical checks by default, AI as opt-in
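-
-The gap between the two duplicate-detection approaches comes from quadratic versus linear scaling: pairwise LLM comparison grows with n(n-1)/2, while embeddings need one call per issue. The sketch below reproduces the figures above, using the token counts and prices quoted in this section (the function names are illustrative only):
-
-```go
-package main
-
-import "fmt"
-
-// pairwiseComparisons returns n*(n-1)/2, the number of unordered issue pairs.
-func pairwiseComparisons(n int) int {
-    return n * (n - 1) / 2
-}
-
-// costUSD prices a token count at a per-million-token rate.
-func costUSD(tokens int, ratePerMillion float64) float64 {
-    return float64(tokens) * ratePerMillion / 1_000_000
-}
-
-func main() {
-    const issues = 100
-    pairs := pairwiseComparisons(issues)
-    llm := costUSD(pairs*200, 15.00)   // 200 tokens per comparison at the output rate
-    embed := costUSD(issues*100, 0.02) // 100 tokens per issue
-    fmt.Printf("pairs=%d llm=$%.2f embeddings=$%.4f\n", pairs, llm, embed)
-    // prints pairs=4950 llm=$14.85 embeddings=$0.0002
-}
-```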
-
-### Local AI Option
-
-For users who want to avoid API costs:
-
-```bash
-# Use Ollama (free, local)
-BEADS_AI_PROVIDER=ollama
-BEADS_AI_MODEL=llama3.2
-
-# Or use local embedding model
-BEADS_EMBEDDING_PROVIDER=local
-BEADS_EMBEDDING_MODEL=all-MiniLM-L6-v2  # 384-dimensional, fast
-```
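-
-A sketch of how these variables might be read on the Go side, with AI left off when no provider is set (in line with the "mechanical by default" recommendation). `AIConfig`, `loadAIConfig`, and `Enabled` are hypothetical names, not part of the existing codebase:
-
-```go
-package main
-
-import (
-    "fmt"
-    "os"
-)
-
-// AIConfig holds the provider settings read from the environment (hypothetical type).
-type AIConfig struct {
-    Provider string // empty means mechanical checks only
-    Model    string
-}
-
-// loadAIConfig reads the BEADS_AI_* variables through getenv so tests can inject values.
-func loadAIConfig(getenv func(string) string) AIConfig {
-    return AIConfig{
-        Provider: getenv("BEADS_AI_PROVIDER"),
-        Model:    getenv("BEADS_AI_MODEL"),
-    }
-}
-
-// Enabled reports whether the user opted in to an AI provider.
-func (c AIConfig) Enabled() bool { return c.Provider != "" }
-
-func main() {
-    cfg := loadAIConfig(os.Getenv)
-    if !cfg.Enabled() {
-        fmt.Println("AI disabled; using mechanical checks")
-        return
-    }
-    fmt.Printf("AI provider=%s model=%s\n", cfg.Provider, cfg.Model)
-}
-```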
-
-## Implementation Roadmap
-
-### Phase 1: Mechanical Commands (2-3 weeks)
-- [ ] `bd repair-deps` (orphaned dependency cleaner)
-- [ ] `bd detect-pollution` (pattern-based test detection)
-- [ ] `bd resolve-conflicts` (mechanical ID remapping)
-- [ ] `bd validate` (run all checks)
-
-### Phase 2: AI Integration (2-3 weeks)
-- [ ] Add `internal/ai` package
-- [ ] Implement Anthropic, OpenAI, Ollama providers
-- [ ] Add `--ai` flag to commands
-- [ ] Test with real conflicts/duplicates
-
-### Phase 3: Embeddings (1-2 weeks)
-- [ ] Add embedding generation
-- [ ] Implement cosine similarity search
-- [ ] Optimize for large databases (>1K issues)
-- [ ] Benchmark performance
-
-### Phase 4: MCP Integration (1 week)
-- [ ] Add MCP functions for all repair commands
-- [ ] Update beads-mcp documentation
-- [ ] Add examples to AGENTS.md
-
-### Phase 5: Polish (1 week)
-- [ ] Add `--json` output for all commands
-- [ ] Improve error messages
-- [ ] Add progress indicators for slow operations
-- [ ] Write comprehensive tests
-
-**Total timeline: 7-10 weeks**
-
-## Success Metrics
-
-### Quantitative
-- ✅ Agent repair time reduced by >50%
-- ✅ Manual interventions reduced by >70%
-- ✅ Conflict resolution time <30 seconds
-- ✅ Duplicate detection accuracy >90%
-
-### Qualitative
-- ✅ Agents report fewer "stuck" situations
-- ✅ Users spend less time on database maintenance
-- ✅ Fewer support requests about database issues
-
-## Open Questions
-
-1. **Should repair commands auto-run in daemon?**
-   - Recommendation: No, too risky. On-demand only.
-
-2. **Should agents proactively run validation?**
-   - Recommendation: Yes, at session start (with user notification)
-
-3. **What AI provider should be default?**
-   - Recommendation: None (mechanical by default), user opts in
-
-4. **Should duplicate detection be continuous?**
-   - Recommendation: No, run on-demand or weekly scheduled
-
-5. **How to handle false positives in pollution detection?**
-   - Recommendation: Always confirm before deleting, backup to JSONL
-
-## Conclusion
-
-Repair commands address the **root cause of agent repair burden**: the lack of specialized tools for common maintenance tasks. By providing `bd resolve-conflicts`, `bd find-duplicates`, `bd detect-pollution`, `bd repair-deps`, and `bd validate`, we:
-
-✅ Reduce agent time from 5-10 minutes to <30 seconds per repair
-✅ Provide consistent repair logic across sessions
-✅ Enable proactive validation instead of reactive fixing
-✅ Allow AI assistance where valuable (conflicts, duplicates) while keeping mechanical checks fast
-
-Combined with the event-driven daemon (instant feedback), these tools should significantly reduce the "not as much in the background as I'd like" pain.