diff --git a/.claude/test-strategy.md b/.claude/test-strategy.md
new file mode 100644
index 00000000..f678da56
--- /dev/null
+++ b/.claude/test-strategy.md
@@ -0,0 +1,103 @@
+# Test Running Strategy for Claude Code
+
+## Critical Rules
+
+1. **ALWAYS use `./scripts/test.sh` instead of `go test` directly**
+   - It automatically skips broken tests from `.test-skip`
+   - Uses appropriate timeouts (3m default)
+   - Consistent with human developers and CI/CD
+
+2. **Use `-run` to target specific tests when developing features**
+   ```bash
+   # Good: When working on feature X
+   ./scripts/test.sh -run TestFeatureX ./cmd/bd/...
+
+   # Avoid: Running full suite unnecessarily
+   ./scripts/test.sh ./...
+   ```
+
+3. **Understand the bottleneck: COMPILATION, not EXECUTION**
+   - ~180s compilation time vs 3.8s actual test execution (cmd/bd)
+   - Running a subset of tests in a package saves little time (the package still compiles)
+   - Use `-run` anyway to avoid seeing unrelated failures
+
+## Common Commands
+
+```bash
+# Full test suite (what 'make test' runs)
+./scripts/test.sh
+
+# Test specific package
+./scripts/test.sh ./cmd/bd/...
+./scripts/test.sh ./internal/storage/sqlite/...
+
+# Test specific feature
+./scripts/test.sh -run TestCreate ./cmd/bd/...
+./scripts/test.sh -run TestImport
+
+# Verbose output (when debugging)
+./scripts/test.sh -v -run TestSpecificTest
+```
+
+## When Tests Fail
+
+1. **Check if it's a known broken test:**
+   ```bash
+   cat .test-skip
+   ```
+
+2. **If it's new, investigate:**
+   - Read the test failure message
+   - Run with `-v` for more detail
+   - Check if recent code changes broke it
+
+3. **If unfixable now:**
+   - File GitHub issue with details
+   - Add to `.test-skip` with issue reference
+   - Document in commit message
+
+## Package Size Context
+
+The `cmd/bd` package is LARGE:
+- 41,696 lines of code
+- 205 files (82 test files)
+- 313 individual tests
+- Compilation takes ~180 seconds
+
+This is why:
+- Compilation is slow
+- Test script uses 3-minute timeout
+- Targeting specific tests is important
+
+## Environment Variables
+
+Use these when needed:
+
+```bash
+# Custom timeout
+TEST_TIMEOUT=5m ./scripts/test.sh
+
+# Verbose by default
+TEST_VERBOSE=1 ./scripts/test.sh
+
+# Run pattern
+TEST_RUN=TestSomething ./scripts/test.sh
+```
+
+## Quick Reference
+
+| Task | Command |
+|------|---------|
+| Run all tests | `make test` or `./scripts/test.sh` |
+| Test one package | `./scripts/test.sh ./cmd/bd/...` |
+| Test one function | `./scripts/test.sh -run TestName` |
+| Verbose output | `./scripts/test.sh -v` |
+| Custom timeout | `./scripts/test.sh -timeout 10m` |
+| Skip additional test | `./scripts/test.sh -skip TestFoo` |
+
+## Remember
+
+- The test script lives at `scripts/test.sh`
+- Skip list is in repo root: `.test-skip`
+- Full documentation: `docs/TESTING.md`
+- Current broken tests: See GH issues #355, #356
diff --git a/.test-skip b/.test-skip
new file mode 100644
index 00000000..0c4354b8
--- /dev/null
+++ b/.test-skip
@@ -0,0 +1,8 @@
+# Tests to skip due to known issues
+# Format: one test name per line (regex patterns supported)
+
+# Issue #355: Deadlocks with database mutex during cleanup
+TestFallbackToDirectModeEnablesFlush
+
+# Issue #356: Expects wrong JSONL filename (issues.jsonl vs beads.jsonl)
+TestFindJSONLPathDefault
diff --git a/Makefile b/Makefile
index 489d6cf3..fe5dd184 100644
--- a/Makefile
+++ b/Makefile
@@ -10,10 +10,10 @@ build:
 	@echo "Building bd..."
 	go build -o bd ./cmd/bd
 
-# Run all tests
+# Run all tests (skips known broken tests listed in .test-skip)
 test:
 	@echo "Running tests..."
-	go test ./...
+	@./scripts/test.sh
 
 # Run performance benchmarks (10K and 20K issue databases with automatic CPU profiling)
 # Generates CPU profile: internal/storage/sqlite/bench-cpu-<timestamp>.prof
diff --git a/docs/TESTING.md b/docs/TESTING.md
new file mode 100644
index 00000000..28e279f5
--- /dev/null
+++ b/docs/TESTING.md
@@ -0,0 +1,186 @@
+# Testing Guide
+
+## Overview
+
+The beads project has a comprehensive test suite. The `cmd/bd` package alone contains **~41,000 lines of code** across **205 files**, 82 of which are test files.
+
+## Test Performance
+
+- **Total test time:** ~3 minutes (excluding broken tests)
+- **Package count:** 20+ packages with tests
+- **Compilation overhead:** ~180 seconds (most of the total time)
+- **Test execution time:** ~3.8 seconds combined for all 313 tests in `cmd/bd`
+
+## Running Tests
+
+### Quick Start
+
+```bash
+# Run all tests (auto-skips known broken tests)
+make test
+
+# Or directly:
+./scripts/test.sh
+
+# Run specific package
+./scripts/test.sh ./cmd/bd/...
+
+# Run specific test pattern
+./scripts/test.sh -run TestCreate ./cmd/bd/...
+
+# Verbose output
+./scripts/test.sh -v
+```
+
+### Environment Variables
+
+```bash
+# Set custom timeout (default: 3m)
+TEST_TIMEOUT=5m ./scripts/test.sh
+
+# Enable verbose output
+TEST_VERBOSE=1 ./scripts/test.sh
+
+# Run specific pattern
+TEST_RUN=TestCreate ./scripts/test.sh
+```
+
+### Advanced Usage
+
+```bash
+# Skip additional tests beyond .test-skip
+./scripts/test.sh -skip SomeSlowTest
+
+# Run with custom timeout
+./scripts/test.sh -timeout 5m
+
+# Combine flags
+./scripts/test.sh -v -run TestCreate ./internal/beads/...
+```
+
+## Known Broken Tests
+
+Tests in `.test-skip` are automatically skipped. Current broken tests:
+
+1. **TestFallbackToDirectModeEnablesFlush** (GH #355)
+   - Location: `cmd/bd/direct_mode_test.go:14`
+   - Issue: Database deadlock, hangs for 5 minutes
+   - Impact: Makes test suite extremely slow
+
+2. **TestFindJSONLPathDefault** (GH #356)
+   - Location: `internal/beads/beads_test.go:175`
+   - Issue: Expects `issues.jsonl` but code returns `beads.jsonl`
+   - Impact: Assertion failure
+
+## For Claude Code / AI Agents
+
+When running tests during development:
+
+### Best Practices
+
+1. **Use the test script:** Always use `./scripts/test.sh` instead of `go test` directly
+   - Automatically skips known broken tests
+   - Uses appropriate timeouts
+   - Consistent with CI/CD
+
+2. **Target specific tests when possible:**
+   ```bash
+   # Instead of running everything:
+   ./scripts/test.sh
+
+   # Run just what you changed:
+   ./scripts/test.sh -run TestSpecificFeature ./cmd/bd/...
+   ```
+
+3. **Compilation is the bottleneck:**
+   - The 180-second compilation time dominates
+   - Individual tests are fast
+   - Use `-run` to keep output focused; it does not avoid recompiling the package
+
+4. **Check for new failures:**
+   ```bash
+   # If you see a new failure, check if it's known:
+   cat .test-skip
+   ```
+
+### Adding Tests to Skip List
+
+If you discover a broken test:
+
+1. File a GitHub issue documenting the problem
+2. Add to `.test-skip`:
+   ```bash
+   # Issue #NNN: Brief description
+   TestNameToSkip
+   ```
+3. Remember that entries are regex patterns matched unanchored; use `^TestName$` to skip exactly one test
+
+## Test Organization
+
+### Slowest Tests (>0.05s)
+
+The top slow tests in cmd/bd:
+- `TestDoctorWithBeadsDir` (1.68s) - Only significantly slow test
+- `TestFlushManagerDebouncing` (0.21s)
+- `TestDebouncer_*` tests (0.06-0.12s each) - Intentional sleeps for concurrency testing
+- `TestMultiWorkspaceDeletionSync` (0.12s)
+
+Most tests finish in under 0.01s.
+
+### Package Structure
+
+```
+cmd/bd/           - Main CLI tests (82 test files, most of the suite)
+internal/beads/   - Core beads library tests
+internal/storage/ - Storage backend tests (SQLite, memory)
+internal/rpc/     - RPC protocol tests
+internal/*/       - Various internal package tests
+```
+
+## Continuous Integration
+
+The test script is designed to work seamlessly with CI/CD:
+
+```yaml
+# Example GitHub Actions
+- name: Run tests
+  run: make test
+```
+
+## Debugging Test Failures
+
+### Get detailed output
+```bash
+./scripts/test.sh -v ./path/to/package/...
+```
+
+### Run a single test
+```bash
+./scripts/test.sh -run '^TestExactName$' ./cmd/bd/...
+```
+
+### Check which tests are being skipped
+```bash
+./scripts/test.sh 2>&1 | head -5
+```
+
+Output shows:
+```
+Running: go test -timeout 3m -skip TestFoo|TestBar ./...
+Skipping: TestFoo|TestBar
+```
+
+## Contributing
+
+When adding new tests:
+
+1. Keep tests fast (<0.1s if possible)
+2. Use `t.Parallel()` for independent tests
+3. Clean up resources in `t.Cleanup()` or `defer`
+4. Avoid sleeps unless testing concurrency
+
+When tests break:
+
+1. Fix them if possible
+2. If unfixable right now, file an issue and add to `.test-skip`
+3. Document the issue in `.test-skip` with issue number
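Review note: the guide documents both `TEST_TIMEOUT` and `-timeout` but does not say which one wins when both are given. Reading `scripts/test.sh`, the environment variable seeds the default and the flag is parsed afterwards, so the flag should take precedence. A minimal sketch of that ordering follows; `resolve_timeout` is a hypothetical helper written for illustration and is not part of the script.

```bash
# Hypothetical helper mirroring the order used in scripts/test.sh:
# environment variable first, then command-line flags override it.
resolve_timeout() {
    local timeout
    timeout="${TEST_TIMEOUT:-3m}"          # env var if set, else the 3m default
    while [ $# -gt 0 ]; do
        case $1 in
            -timeout) timeout="$2"; shift 2 ;;   # flag overrides the env value
            *) shift ;;                          # ignore everything else here
        esac
    done
    echo "$timeout"
}

unset TEST_TIMEOUT
from_default="$(resolve_timeout)"
from_env="$(TEST_TIMEOUT=5m resolve_timeout)"
from_flag="$(TEST_TIMEOUT=5m resolve_timeout -timeout 10m)"
echo "$from_default $from_env $from_flag"   # prints: 3m 5m 10m
```

The same ordering applies to `TEST_RUN` versus `-run` and `TEST_VERBOSE` versus `-v`, since the script handles them the same way.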
diff --git a/scripts/test.sh b/scripts/test.sh
new file mode 100755
index 00000000..dc826936
--- /dev/null
+++ b/scripts/test.sh
@@ -0,0 +1,86 @@
+#!/usr/bin/env bash
+# Test runner that automatically skips known broken tests
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
+SKIP_FILE="$REPO_ROOT/.test-skip"
+
+# Build skip pattern from .test-skip file
+build_skip_pattern() {
+    if [[ ! -f "$SKIP_FILE" ]]; then
+        echo ""
+        return
+    fi
+
+    # Join non-comment, non-blank lines with |. "|| true" keeps set -e from
+    # aborting when the file has no entries (grep exits 1 on no match).
+    grep -Ev '^[[:space:]]*(#|$)' "$SKIP_FILE" | paste -sd '|' - || true
+}
+
+# Default values
+TIMEOUT="${TEST_TIMEOUT:-3m}"
+SKIP_PATTERN=$(build_skip_pattern)
+VERBOSE="${TEST_VERBOSE:-}"
+RUN_PATTERN="${TEST_RUN:-}"
+
+# Parse arguments
+PACKAGES=()
+while [[ $# -gt 0 ]]; do
+    case $1 in
+        -v|--verbose)
+            VERBOSE="-v"
+            shift
+            ;;
+        -timeout)
+            TIMEOUT="$2"
+            shift 2
+            ;;
+        -run)
+            RUN_PATTERN="$2"
+            shift 2
+            ;;
+        -skip)
+            # Allow additional skip patterns
+            if [[ -n "$SKIP_PATTERN" ]]; then
+                SKIP_PATTERN="$SKIP_PATTERN|$2"
+            else
+                SKIP_PATTERN="$2"
+            fi
+            shift 2
+            ;;
+        *)
+            PACKAGES+=("$1")
+            shift
+            ;;
+    esac
+done
+
+# Default to all packages if none specified
+if [[ ${#PACKAGES[@]} -eq 0 ]]; then
+    PACKAGES=("./...")
+fi
+
+# Build go test command
+CMD=(go test -timeout "$TIMEOUT")
+
+if [[ -n "$VERBOSE" ]]; then
+    CMD+=(-v)
+fi
+
+if [[ -n "$SKIP_PATTERN" ]]; then
+    CMD+=(-skip "$SKIP_PATTERN")
+fi
+
+if [[ -n "$RUN_PATTERN" ]]; then
+    CMD+=(-run "$RUN_PATTERN")
+fi
+
+CMD+=("${PACKAGES[@]}")
+
+echo "Running: ${CMD[*]}" >&2
+echo "Skipping: ${SKIP_PATTERN:-(none)}" >&2
+echo "" >&2
+
+exec "${CMD[@]}"
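Review note: to make the skip-list handling easier to check by hand, here is a small sketch of how a file in the `.test-skip` format turns into a single `-skip` pattern. The sample file contents and the names `skip_file` and `pattern` are made up for illustration; the pipeline follows the same idea as `build_skip_pattern` (drop comments and blank lines, join the rest with `|`).

```bash
# Hypothetical sample skip file; the real one lives at the repo root as .test-skip.
skip_file="$(mktemp)"
printf '%s\n' '# Issue #1: example comment' 'TestAlpha' '' 'TestBeta' > "$skip_file"

# Drop comment and blank lines, then join the remaining names with |.
pattern="$(grep -v '^#' "$skip_file" | grep -v '^[[:space:]]*$' | paste -sd '|' -)"
echo "$pattern"   # prints: TestAlpha|TestBeta

rm -f "$skip_file"
```

Because the joined string is passed to `go test -skip` as one regular expression, every entry acts as an alternative in that expression rather than as a literal test name.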