Add GitHub Issues migration script (bd-68)

- New gh2jsonl.py script supports GitHub API and JSON file import
- Maps GitHub labels to bd priority/type/status
- Preserves metadata, assignees, timestamps, external refs
- Auto-detects cross-references and creates dependencies
- Production-ready: User-Agent, rate limit handling, UTF-8 support
- Comprehensive README with examples and troubleshooting
- Tested and reviewed

Amp-Thread-ID: https://ampcode.com/threads/T-2fc85f05-302b-4fc9-8cac-63ac0e03c9af
Co-authored-by: Amp <amp@ampcode.com>
Steve Yegge
2025-10-17 23:55:51 -07:00
parent 9fb46d41b8
commit 56a379dc5a
5 changed files with 744 additions and 6 deletions


@@ -36,9 +36,7 @@
{"id":"bd-132","title":"Batch test 1","description":"","status":"open","priority":3,"issue_type":"task","created_at":"2025-10-17T21:01:21.047341-07:00","updated_at":"2025-10-17T21:01:21.047341-07:00"}
{"id":"bd-133","title":"Batch test 2","description":"","status":"open","priority":3,"issue_type":"task","created_at":"2025-10-17T21:01:21.055026-07:00","updated_at":"2025-10-17T21:01:21.055026-07:00"}
{"id":"bd-134","title":"Batch test 3","description":"","status":"open","priority":3,"issue_type":"task","created_at":"2025-10-17T21:01:21.055526-07:00","updated_at":"2025-10-17T21:01:21.055526-07:00"}
{"id":"bd-139","title":"Child issue","description":"","status":"open","priority":2,"issue_type":"task","created_at":"2025-10-17T21:01:25.104232-07:00","updated_at":"2025-10-17T21:01:25.104232-07:00"}
{"id":"bd-14","title":"Add --resolve-collisions flag and user reporting","description":"Add import flags: --resolve-collisions (auto-fix) and --dry-run (preview). Display clear report: collisions detected, remappings applied (old→new with scores), reference counts updated. Default behavior: fail on collision (safe).","status":"closed","priority":1,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.645323-07:00","closed_at":"2025-10-16T10:07:34.003238-07:00","dependencies":[{"issue_id":"bd-14","depends_on_id":"bd-48","type":"parent-child","created_at":"2025-10-16T21:51:08.923374-07:00","created_by":"renumber"}]}
{"id":"bd-144","title":"Parent with desc","description":"See [deleted:bd-143] for details","status":"open","priority":2,"issue_type":"task","created_at":"2025-10-17T21:11:37.608527-07:00","updated_at":"2025-10-17T21:11:42.32958-07:00"}
{"id":"bd-148","title":"bd list shows 0 issues despite database containing 115 issues","description":"When running 'bd list --status all' it shows 'Found 0 issues' even though 'bd stats' shows 115 total issues and 'sqlite3 .beads/vc.db \"SELECT COUNT(*) FROM issues\"' returns 115.\n\nReproduction:\n1. cd ~/src/vc/vc\n2. bd stats # Shows 115 issues\n3. bd list --status all # Shows 0 issues\n4. sqlite3 .beads/vc.db 'SELECT COUNT(*) FROM issues;' # Shows 115\n\nExpected: bd list should show all 115 issues\nActual: Shows 'Found 0 issues:'\n\nThis occurs with both /opt/homebrew/bin/bd (v0.9.9) and ~/src/vc/adar/beads/bd (v0.9.10)","design":"Possible causes:\n- Default filter excluding all issues\n- Database query issue in list command\n- Auto-discovery finding wrong database (but stats works?)\n- Recent deletion operation corrupted some index","acceptance_criteria":"bd list --status all shows all issues that bd stats counts","status":"closed","priority":0,"issue_type":"bug","created_at":"2025-10-17T21:19:08.225181-07:00","updated_at":"2025-10-17T21:55:40.788625-07:00","closed_at":"2025-10-17T21:55:40.788625-07:00"}
{"id":"bd-149","title":"Confusing version mismatch warnings with contradictory messages","description":"The version mismatch warning shows contradictory messages depending on which binary version is used:\n\nWhen using v0.9.10 binary with v0.9.9 database:\n'Your bd binary (v0.9.10) differs from the database version (v0.9.9)'\n'Your binary appears to be OUTDATED.'\n\nWhen using v0.9.9 binary with v0.9.10 database:\n'Your bd binary (v0.9.9) differs from the database version (v0.9.10)'\n'Your binary appears NEWER than the database.'\n\nThe first message is incorrect - v0.9.10 \u003e v0.9.9, so the binary is NEWER, not outdated.\n\nReproduction:\n1. Use ~/src/vc/adar/beads/bd (v0.9.10) with a v0.9.9 database\n2. Observe warning says binary is OUTDATED when it's actually newer\n\nExpected: Correct version comparison\nActual: Inverted comparison logic","design":"Fix version comparison in warning message generation. Should compare semantic versions correctly.","acceptance_criteria":"Warning correctly identifies which component (binary vs database) is newer/older","status":"closed","priority":1,"issue_type":"bug","created_at":"2025-10-17T21:19:19.540274-07:00","updated_at":"2025-10-17T22:14:27.015397-07:00","closed_at":"2025-10-17T22:14:27.015397-07:00"}
{"id":"bd-15","title":"Write comprehensive collision resolution tests","description":"Test cases: simple collision, multiple collisions, dependency updates, text reference updates, chain dependencies, edge cases (partial ID matches, case sensitivity, triple merges). Add to import_test.go and collision_test.go.","status":"closed","priority":1,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.647268-07:00","closed_at":"2025-10-16T10:07:34.007864-07:00","dependencies":[{"issue_id":"bd-15","depends_on_id":"bd-48","type":"parent-child","created_at":"2025-10-16T21:51:08.917092-07:00","created_by":"renumber"}]}
@@ -51,8 +49,6 @@
{"id":"bd-159","title":"Global daemon still requires database and runs sync loop","description":"The --global flag skips git repo check (line 80) but runDaemonLoop still calls FindDatabasePath (line 500-507) and opens a store (line 512). It also runs the single-repo sync loop (lines 563-620).\n\nOracle correctly identified this violates the spec: 'Don't require being in a git repo when --global is used'.\n\nFix: Global mode should skip DB open and sync loop entirely. It should be a pure RPC router that uses per-request context (bd-115) to route to the correct repo's DB.\n\nImpact: Users can't run 'bd daemon --global' outside a repo, defeating the purpose.","status":"closed","priority":1,"issue_type":"bug","created_at":"2025-10-17T22:58:02.138008-07:00","updated_at":"2025-10-17T23:00:08.734632-07:00","closed_at":"2025-10-17T23:00:08.734632-07:00"}
{"id":"bd-16","title":"Update documentation for collision resolution","description":"Update README.md with collision resolution section. Update CLAUDE.md with new workflow. Document --resolve-collisions and --dry-run flags. Add example scenarios showing branch merge workflows.","status":"closed","priority":1,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.648113-07:00","closed_at":"2025-10-16T10:07:34.028648-07:00","dependencies":[{"issue_id":"bd-16","depends_on_id":"bd-48","type":"parent-child","created_at":"2025-10-16T21:51:08.924312-07:00","created_by":"renumber"}]}
{"id":"bd-160","title":"Global daemon should warn/reject --auto-commit and --auto-push","description":"When user runs 'bd daemon --global --auto-commit', it's unclear which repo the daemon will commit to (especially after fixing bd-122 where global daemon won't open a DB).\n\nOptions:\n1. Warn and ignore the flags in global mode\n2. Error out with clear message\n\nLine 87-91 already checks autoPush, but should skip check entirely for global mode. Add user-friendly messaging about flag incompatibility.","status":"closed","priority":3,"issue_type":"feature","created_at":"2025-10-17T22:58:02.137987-07:00","updated_at":"2025-10-17T23:04:30.223432-07:00","closed_at":"2025-10-17T23:04:30.223432-07:00"}
{"id":"bd-161","title":"Test A","description":"","status":"open","priority":2,"issue_type":"task","created_at":"2025-10-17T23:06:55.010454-07:00","updated_at":"2025-10-17T23:06:55.010454-07:00"}
{"id":"bd-162","title":"Test B","description":"","status":"open","priority":2,"issue_type":"task","created_at":"2025-10-17T23:06:55.060581-07:00","updated_at":"2025-10-17T23:06:55.060581-07:00"}
{"id":"bd-163","title":"Test A","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-17T23:06:59.59343-07:00","updated_at":"2025-10-17T23:06:59.740704-07:00","closed_at":"2025-10-17T23:06:59.740704-07:00","dependencies":[{"issue_id":"bd-163","depends_on_id":"bd-164","type":"blocks","created_at":"2025-10-17T23:06:59.668292-07:00","created_by":"daemon"}]}
{"id":"bd-164","title":"Test B","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-17T23:06:59.626612-07:00","updated_at":"2025-10-17T23:06:59.744519-07:00","closed_at":"2025-10-17T23:06:59.744519-07:00"}
{"id":"bd-17","title":"bd should auto-detect .beads/*.db in current directory","description":"When bd is run without --db flag, it defaults to beads' own database instead of looking for a .beads/*.db file in the current working directory. This causes confusion when working on other projects that use beads for issue tracking (like vc).\n\nExpected behavior: bd should search for .beads/*.db in cwd and use that if found, before falling back to default beads database.\n\nExample: Running 'bd ready' in /Users/stevey/src/vc/vc/ should automatically find and use .beads/vc.db without requiring --db flag every time.","status":"closed","priority":1,"issue_type":"bug","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.650584-07:00","closed_at":"2025-10-16T10:07:34.046944-07:00"}
@@ -111,14 +107,14 @@
{"id":"bd-65","title":"Test label dirty tracking","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.898188-07:00","closed_at":"2025-10-14T14:37:52.155733-07:00"}
{"id":"bd-66","title":"Test issue","description":"Testing prefix","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.902423-07:00","closed_at":"2025-10-14T13:39:55.828804-07:00","dependencies":[{"issue_id":"bd-66","depends_on_id":"bd-47","type":"parent-child","created_at":"2025-10-16T21:51:08.914487-07:00","created_by":"renumber"}]}
{"id":"bd-67","title":"Test hash-based import","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.905866-07:00","closed_at":"2025-10-14T13:39:56.958248-07:00"}
{"id":"bd-68","title":"Add migration scripts for GitHub Issues","description":"Create scripts to import from GitHub Issues API or exported JSON","status":"open","priority":2,"issue_type":"feature","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.91077-07:00","dependencies":[{"issue_id":"bd-68","depends_on_id":"bd-47","type":"parent-child","created_at":"2025-10-16T21:51:08.915539-07:00","created_by":"renumber"}]}
{"id":"bd-68","title":"Add migration scripts for GitHub Issues","description":"Create scripts to import from GitHub Issues API or exported JSON","status":"closed","priority":2,"issue_type":"feature","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T23:51:47.390748-07:00","closed_at":"2025-10-17T23:51:47.390748-07:00","dependencies":[{"issue_id":"bd-68","depends_on_id":"bd-47","type":"parent-child","created_at":"2025-10-16T21:51:08.915539-07:00","created_by":"renumber"}]}
{"id":"bd-69","title":"Add test for deep hierarchy blocking (50+ levels)","description":"Current tests verify 2-level depth (grandparent → parent → child). The depth limit is hardcoded to 50 in the recursive CTE, but we don't test edge cases near that limit.\n\n**Test cases needed:**\n1. Verify 50-level deep hierarchy works correctly\n2. Verify depth limit prevents runaway recursion\n3. Measure performance impact of deep hierarchies\n4. Consider if 50 is the right limit (why not 100? why not 20?)\n\n**Rationale:**\n- Most hierarchies are 2-5 levels deep\n- But pathological cases (malicious or accidental) could create 50+ level nesting\n- Need to ensure graceful degradation, not catastrophic failure\n\n**Implementation:**\nAdd TestDeepHierarchyBlocking to ready_test.go","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.915131-07:00","closed_at":"2025-10-14T13:12:16.610152-07:00"}
{"id":"bd-7","title":"Test auto-export timing","description":"","status":"open","priority":4,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.599423-07:00"}
{"id":"bd-70","title":"Document hierarchical blocking behavior in README","description":"The fix for bd-65 changes user-visible behavior: children of blocked epics are now automatically blocked.\n\n**What needs documenting:**\n1. README.md dependency section should explain blocking propagation\n2. Clarify that 'blocks' + 'parent-child' together create transitive blocking\n3. Note that 'related' and 'discovered-from' do NOT propagate blocking\n4. Add example showing epic → child blocking propagation\n\n**Example to add:**\n```bash\n# If epic is blocked, children are too\nbd create \"Epic 1\" -t epic -p 1\nbd create \"Task 1\" -t task -p 1\nbd dep add task-1 epic-1 --type parent-child\n\n# Block the epic\nbd create \"Blocker\" -t task -p 0\nbd dep add epic-1 blocker-1 --type blocks\n\n# Now both epic-1 AND task-1 are blocked\nbd ready # Neither will show up\n```","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:01.020334-07:00","closed_at":"2025-10-14T13:10:38.482538-07:00"}
{"id":"bd-71","title":"Document versioning and release strategy","description":"Create comprehensive versioning strategy for beads ecosystem.\n\nComponents to document:\n1. bd CLI (Go binary) - main version number\n2. Plugin (Claude Code) - tracks CLI version\n3. MCP server (Python) - bundled with plugin\n4. Release workflow - how to sync all three\n\nDecisions to make:\n- Should plugin.json auto-update from bd CLI version?\n- Should we have a VERSION file at repo root?\n- How to handle breaking changes across components?\n- What's the update notification strategy?\n\nReferences:\n- plugin.json engines field now requires bd \u003e=0.9.0\n- /bd-version command added for checking compatibility\n- PLUGIN.md now documents update workflow","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.91836-07:00","closed_at":"2025-10-14T13:55:59.178075-07:00"}
{"id":"bd-72","title":"Create version bump script","description":"Create scripts/bump-version.sh to automate version syncing across all components.\n\nThe script should:\n1. Take a version number as argument (e.g., ./scripts/bump-version.sh 0.9.3)\n2. Update all version files:\n - cmd/bd/version.go (Version constant)\n - .claude-plugin/plugin.json (version field)\n - .claude-plugin/marketplace.json (plugins[].version)\n - integrations/beads-mcp/pyproject.toml (version field)\n - README.md (Alpha version mention)\n - PLUGIN.md (version requirements)\n3. Validate semantic versioning format\n4. Show diff preview before applying\n5. Optionally create git commit with standard message\n\nThis prevents the version mismatch issue that occurred when only version.go was updated.\n\nRelated: bd-73 (version sync issue)","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:01.02371-07:00","closed_at":"2025-10-14T13:49:22.368581-07:00"}
{"id":"bd-73","title":"Add system-wide/multi-repo support for beads","description":"GitHub issue #4 requests ability to use beads across multiple projects and for system-wide task tracking.\n\nCurrent limitation: beads is per-repository isolated. Each project has its own .beads/ directory and issues cannot reference issues in other projects.\n\nPotential approaches:\n1. Global beads instance in ~/.beads/global.db for cross-project work\n2. Project references - allow issues to link across repos\n3. Multi-project workspace support - one beads instance managing multiple repos\n4. Integration with existing MCP server to provide remote multi-project access\n\nUse cases:\n- System administrators tracking work across multiple machines/repos\n- Developers working on a dozen+ projects simultaneously\n- Cross-cutting concerns that span multiple repositories\n- Global todo list with project-specific subtasks\n\nRelated:\n- GitHub issue #4: https://github.com/steveyegge/beads/issues/4\n- Comparison to membank MCP which already supports multi-project via centralized server\n- MCP server at integrations/beads-mcp/ could be extended for this\n\nSee also: Testing framework for plugins (also from GH #4)","notes":"Multi-repo support status update:\n\n✅ **COMPLETED (P1 - Core functionality):**\n- bd-121: --global daemon flag ✅ \n- bd-122: Multi-repo documentation ✅\n- bd-115: Per-request context routing ✅\n\n**REMAINING (Optional enhancements):**\n- bd-123 (P2): 'bd repos' command - nice-to-have for UX\n- bd-124 (P2): Daemon auto-start - convenience feature\n- bd-125 (P3): Workspace config - alternative approach\n- bd-126 (P4): Cross-repo references - future feature\n\n**Decision:** Core multi-repo support is COMPLETE and working. Remaining items are independent enhancements, not blockers. \n\nRecommend closing bd-73 as complete. \n\nOpen new issues for specific enhancements if needed.","status":"closed","priority":2,"issue_type":"feature","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T23:15:17.705446-07:00","closed_at":"2025-10-17T23:15:17.705446-07:00"}
{"id":"bd-74","title":"Implement storage driver interface for pluggable backends","description":"Create abstraction layer for storage to support multiple backends (SQLite, Postgres, Turso, in-memory testing, etc.).\n\n**Current state:** All storage logic hardcoded to SQLite in internal/storage/sqlite/sqlite.go\n\n**Proposed design:**\n\n```go\n// internal/storage/storage.go\ntype Store interface {\n // Issue CRUD\n CreateIssue(issue *Issue) error\n GetIssue(id string) (*Issue, error)\n UpdateIssue(id string, updates *Issue) error\n DeleteIssue(id string) error\n ListIssues(filter *Filter) ([]*Issue, error)\n \n // Dependencies\n AddDependency(from, to string, depType DependencyType) error\n RemoveDependency(from, to string, depType DependencyType) error\n GetDependencies(id string) ([]*Dependency, error)\n \n // Counters, stats\n GetNextID(prefix string) (string, error)\n GetStats() (*Stats, error)\n \n Close() error\n}\n```\n\n**Benefits:**\n- Better testing (mock/in-memory stores)\n- Future flexibility (Postgres, cloud APIs, etc.)\n- Clean architecture (business logic decoupled from storage)\n- Enable Turso or other backends without refactoring everything\n\n**Implementation steps:**\n1. Define Store interface in internal/storage/storage.go\n2. Refactor SQLiteStore to implement interface\n3. Update all commands to use interface, not concrete type\n4. Add MemoryStore for testing\n5. Add driver selection via config (storage.driver = sqlite|turso|postgres)\n6. Update tests to use interface\n\n**Note:** This is valuable even without adopting Turso. Good architecture practice.\n\n**Context:** From GH issue #2 RFC evaluation. Driver interface is low-cost, high-value regardless of whether we add alternative backends.","status":"open","priority":2,"issue_type":"feature","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.920278-07:00"}
{"id":"bd-74","title":"Implement storage driver interface for pluggable backends","description":"Create abstraction layer for storage to support multiple backends (SQLite, Postgres, Turso, in-memory testing, etc.).\n\n**Current state:** All storage logic hardcoded to SQLite in internal/storage/sqlite/sqlite.go\n\n**Proposed design:**\n\n```go\n// internal/storage/storage.go\ntype Store interface {\n // Issue CRUD\n CreateIssue(issue *Issue) error\n GetIssue(id string) (*Issue, error)\n UpdateIssue(id string, updates *Issue) error\n DeleteIssue(id string) error\n ListIssues(filter *Filter) ([]*Issue, error)\n \n // Dependencies\n AddDependency(from, to string, depType DependencyType) error\n RemoveDependency(from, to string, depType DependencyType) error\n GetDependencies(id string) ([]*Dependency, error)\n \n // Counters, stats\n GetNextID(prefix string) (string, error)\n GetStats() (*Stats, error)\n \n Close() error\n}\n```\n\n**Benefits:**\n- Better testing (mock/in-memory stores)\n- Future flexibility (Postgres, cloud APIs, etc.)\n- Clean architecture (business logic decoupled from storage)\n- Enable Turso or other backends without refactoring everything\n\n**Implementation steps:**\n1. Define Store interface in internal/storage/storage.go\n2. Refactor SQLiteStore to implement interface\n3. Update all commands to use interface, not concrete type\n4. Add MemoryStore for testing\n5. Add driver selection via config (storage.driver = sqlite|turso|postgres)\n6. Update tests to use interface\n\n**Note:** This is valuable even without adopting Turso. Good architecture practice.\n\n**Context:** From GH issue #2 RFC evaluation. Driver interface is low-cost, high-value regardless of whether we add alternative backends.","status":"closed","priority":2,"issue_type":"feature","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T23:46:22.447301-07:00","closed_at":"2025-10-17T23:46:22.447301-07:00"}
{"id":"bd-75","title":"Test issue with --deps flag","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.921065-07:00","closed_at":"2025-10-16T10:07:34.027923-07:00"}
{"id":"bd-76","title":"Fix: bd init --prefix test -q flag not recognized","description":"The init command doesn't recognize the -q flag. When running 'bd init --prefix test -q', it fails silently or behaves unexpectedly. The flag should either be implemented for quiet mode or removed from documentation if not supported.","status":"closed","priority":2,"issue_type":"bug","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.921744-07:00","closed_at":"2025-10-17T00:09:18.921816-07:00"}
{"id":"bd-77","title":"Improve session management","description":"Current session management is basic. Need to improve with better expiration handling.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-10-16T20:46:08.971822-07:00","updated_at":"2025-10-17T01:32:00.922344-07:00","closed_at":"2025-10-14T14:37:17.463188-07:00"}


@@ -7,6 +7,7 @@ This directory contains examples of how to integrate bd with AI agents and workf
- **[python-agent/](python-agent/)** - Simple Python agent that discovers ready work and completes tasks
- **[bash-agent/](bash-agent/)** - Bash script showing the full agent workflow
- **[markdown-to-jsonl/](markdown-to-jsonl/)** - Convert markdown planning docs to bd issues
- **[github-import/](github-import/)** - Import issues from GitHub repositories
- **[git-hooks/](git-hooks/)** - Pre-configured git hooks for automatic export/import
- **[branch-merge/](branch-merge/)** - Branch merge workflow with collision resolution
- **[claude-desktop-mcp/](claude-desktop-mcp/)** - MCP server for Claude Desktop integration


@@ -0,0 +1,303 @@
# GitHub Issues to bd Importer
Import issues from GitHub repositories into `bd`.
## Overview
This tool converts GitHub Issues to bd's JSONL format, supporting both:
1. **GitHub API** - Fetch issues directly from a repository
2. **JSON Export** - Parse manually exported GitHub issues
## Features
- **Fetch from GitHub API** - Direct import from any public/private repo
- **JSON file import** - Parse exported GitHub issues JSON
- **Label mapping** - Auto-map GitHub labels to bd priority/type
- **Preserve metadata** - Keep assignees, timestamps, descriptions
- **Cross-references** - Convert `#123` references to dependencies
- **External links** - Preserve URLs back to original GitHub issues
- **Filter PRs** - Automatically excludes pull requests
## Installation
No external dependencies required: the script uses only the Python 3 standard library.
For API access, set up a GitHub token:
```bash
# Create token at: https://github.com/settings/tokens
# Permissions needed: public_repo (or repo for private repos)
export GITHUB_TOKEN=ghp_your_token_here
```
**Security Note:** Use the `GITHUB_TOKEN` environment variable instead of `--token` flag when possible. The `--token` flag may appear in shell history and process listings.
## Usage
### From GitHub API
```bash
# Fetch all issues from a repository
python gh2jsonl.py --repo owner/repo | bd import
# Save to file first (recommended)
python gh2jsonl.py --repo owner/repo > issues.jsonl
bd import -i issues.jsonl --dry-run # Preview
bd import -i issues.jsonl # Import
# Fetch only open issues
python gh2jsonl.py --repo owner/repo --state open
# Fetch only closed issues
python gh2jsonl.py --repo owner/repo --state closed
```
### From JSON File
Export issues from GitHub (via API or manually), then:
```bash
# Single issue
curl -H "Authorization: token $GITHUB_TOKEN" \
https://api.github.com/repos/owner/repo/issues/123 > issue.json
python gh2jsonl.py --file issue.json | bd import
# Multiple issues
curl -H "Authorization: token $GITHUB_TOKEN" \
https://api.github.com/repos/owner/repo/issues > issues.json
python gh2jsonl.py --file issues.json | bd import
```
### Custom Options
```bash
# Use custom prefix (instead of 'bd')
python gh2jsonl.py --repo owner/repo --prefix myproject
# Start numbering from specific ID
python gh2jsonl.py --repo owner/repo --start-id 100
# Pass token directly (instead of env var)
python gh2jsonl.py --repo owner/repo --token ghp_...
```
## Label Mapping
The script maps GitHub labels to bd fields:
### Priority Mapping
| GitHub Labels | bd Priority |
|--------------|-------------|
| `critical`, `p0`, `urgent` | 0 (Critical) |
| `high`, `p1`, `important` | 1 (High) |
| (default) | 2 (Medium) |
| `low`, `p3`, `minor` | 3 (Low) |
| `backlog`, `p4`, `someday` | 4 (Backlog) |
### Type Mapping
| GitHub Labels | bd Type |
|--------------|---------|
| `bug`, `defect` | bug |
| `feature`, `enhancement` | feature |
| `epic`, `milestone` | epic |
| `chore`, `maintenance`, `dependencies` | chore |
| (default) | task |
### Status Mapping
| GitHub State | GitHub Labels | bd Status |
|-------------|---------------|-----------|
| closed | (any) | closed |
| open | `in progress`, `in-progress`, `wip` | in_progress |
| open | `blocked` | blocked |
| open | (default) | open |
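
Taken together, the three tables above reduce to a handful of set-membership checks. A minimal sketch of the combined mapping (illustrative only; the real logic lives in `map_priority()`, `map_issue_type()`, and `map_status()` in `gh2jsonl.py`, which also handles labels given as plain strings):

```python
def map_labels(labels):
    """Map GitHub label dicts to (priority, issue_type, status) per the tables above."""
    names = {label["name"].lower() for label in labels}
    # Priority: first matching tier wins, default is 2 (Medium)
    if names & {"critical", "p0", "urgent"}:
        priority = 0
    elif names & {"high", "p1", "important"}:
        priority = 1
    elif names & {"low", "p3", "minor"}:
        priority = 3
    elif names & {"backlog", "p4", "someday"}:
        priority = 4
    else:
        priority = 2
    # Type: default is "task"
    if names & {"bug", "defect"}:
        issue_type = "bug"
    elif names & {"feature", "enhancement"}:
        issue_type = "feature"
    elif names & {"epic", "milestone"}:
        issue_type = "epic"
    elif names & {"chore", "maintenance", "dependencies"}:
        issue_type = "chore"
    else:
        issue_type = "task"
    # Status for open issues (closed issues map to "closed" regardless of labels)
    if names & {"in progress", "in-progress", "wip"}:
        status = "in_progress"
    elif "blocked" in names:
        status = "blocked"
    else:
        status = "open"
    return priority, issue_type, status
```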
### Labels
All other labels are preserved in the `labels` field. Labels used for mapping (priority, type, status) are filtered out to avoid duplication.
## Field Mapping
| GitHub Field | bd Field | Notes |
|--------------|----------|-------|
| `number` | (internal mapping) | GH#123 → bd-1, etc. |
| `title` | `title` | Direct copy |
| `body` | `description` | Direct copy |
| `state` | `status` | See status mapping |
| `labels` | `priority`, `issue_type`, `labels` | See label mapping |
| `assignee.login` | `assignee` | First assignee only |
| `created_at` | `created_at` | ISO 8601 timestamp |
| `updated_at` | `updated_at` | ISO 8601 timestamp |
| `closed_at` | `closed_at` | ISO 8601 timestamp |
| `html_url` | `external_ref` | Link back to GitHub |
## Cross-References
Issue references in the body text are converted to dependencies:
**GitHub:**
```markdown
This depends on #123 and fixes #456.
See also owner/other-repo#789.
```
**Result:**
- If GH#123 was imported, creates `related` dependency to its bd ID
- If GH#456 was imported, creates `related` dependency to its bd ID
- Cross-repo references (e.g. `owner/other-repo#789`) are ignored (unless those issues were also imported)
**Note:** Dependency records use `"issue_id": ""` format, which the bd importer automatically fills. This matches the behavior of the markdown-to-jsonl converter.
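
The detection itself can be as small as one regular expression. A sketch of the plain `#123` case (the lookbehind rejects `owner/repo#789` forms; the script's actual pattern may differ):

```python
import re

# Match #123 but not owner/repo#789: the char before '#' must not be a word char or '/'
REF_PATTERN = re.compile(r"(?<![\w/])#(\d+)")

def extract_refs(body):
    """Return GitHub issue numbers referenced as plain #N in body text."""
    return [int(n) for n in REF_PATTERN.findall(body or "")]
```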
## Examples
### Example 1: Import Active Issues
```bash
# Import only open issues for active work
export GITHUB_TOKEN=ghp_...
python gh2jsonl.py --repo mycompany/myapp --state open > open-issues.jsonl
# Preview
cat open-issues.jsonl | jq .
# Import
bd import -i open-issues.jsonl
bd ready # See what's ready to work on
```
### Example 2: Full Repository Migration
```bash
# Import all issues (open and closed)
python gh2jsonl.py --repo mycompany/myapp > all-issues.jsonl
# Preview import (check for collisions)
bd import -i all-issues.jsonl --dry-run
# Import with collision resolution if needed
bd import -i all-issues.jsonl --resolve-collisions
# View stats
bd stats
```
### Example 3: Partial Import from JSON
```bash
# Manually export specific issues via GitHub API
gh api repos/owner/repo/issues?labels=p1,bug > high-priority-bugs.json
# Import
python gh2jsonl.py --file high-priority-bugs.json | bd import
```
## Customization
The script is intentionally simple to customize for your workflow:
### 1. Adjust Label Mappings
Edit `map_priority()`, `map_issue_type()`, and `map_status()` to match your label conventions:
```python
def map_priority(self, labels: List[str]) -> int:
label_names = [label.get("name", "").lower() if isinstance(label, dict) else label.lower() for label in labels]
# Add your custom mappings
if any(l in label_names for l in ["sev1", "emergency"]):
return 0
# ... etc
```
### 2. Add Custom Fields
Map additional GitHub fields to bd:
```python
def convert_issue(self, gh_issue: Dict[str, Any]) -> Dict[str, Any]:
# ... existing code ...
# Add milestone to design field
if gh_issue.get("milestone"):
issue["design"] = f"Milestone: {gh_issue['milestone']['title']}"
return issue
```
### 3. Enhanced Dependency Detection
Parse more dependency patterns from body text:
```python
def extract_dependencies_from_body(self, body: str) -> List[str]:
# ... existing code ...
# Add: "Blocks: #123, #456"
blocks_pattern = r'Blocks:\s*((?:#\d+(?:\s*,\s*)?)+)'
# ... etc
```
## Limitations
- **Single assignee**: GitHub supports multiple assignees; bd supports one
- **No milestones**: GitHub milestones aren't mapped (consider using design field)
- **Simple cross-refs**: Only basic `#123` patterns detected
- **No comments**: Issue comments aren't imported (only the body)
- **No reactions**: GitHub reactions/emoji aren't imported
- **No projects**: GitHub project board info isn't imported
## API Rate Limits
GitHub API has rate limits:
- **Authenticated**: 5,000 requests/hour
- **Unauthenticated**: 60 requests/hour
This script uses 1 request per 100 issues (pagination), so:
- Can fetch ~500,000 issues/hour (authenticated)
- Can fetch ~6,000 issues/hour (unauthenticated)
For large repositories (>1000 issues), authentication is recommended.
**Note:** The script automatically includes a `User-Agent` header (required by GitHub) and provides actionable error messages when rate limits are exceeded, including the reset timestamp.
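
The arithmetic above follows from `per_page=100` pagination. A sketch of the fetch loop under those assumptions (`fetch_issues` is a hypothetical helper, not the script's actual function; the real script additionally surfaces rate-limit errors with the reset timestamp):

```python
import json
import os
from urllib.request import Request, urlopen

def fetch_issues(repo, state="all", token=None):
    """Yield issue dicts from the GitHub API, 100 per request."""
    token = token or os.environ.get("GITHUB_TOKEN")
    page = 1
    while True:
        url = (f"https://api.github.com/repos/{repo}/issues"
               f"?state={state}&per_page=100&page={page}")
        headers = {"User-Agent": "gh2jsonl"}  # GitHub rejects requests without one
        if token:
            headers["Authorization"] = f"token {token}"
        with urlopen(Request(url, headers=headers)) as resp:
            batch = json.load(resp)
        if not batch:  # empty page means we've seen every issue
            return
        for issue in batch:
            if "pull_request" not in issue:  # the issues endpoint also returns PRs
                yield issue
        page += 1
```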
## Troubleshooting
### "GitHub token required"
Set the `GITHUB_TOKEN` environment variable:
```bash
export GITHUB_TOKEN=ghp_your_token_here
```
Or pass directly:
```bash
python gh2jsonl.py --repo owner/repo --token ghp_...
```
### "GitHub API error: 404"
- Check repository name format: `owner/repo`
- Check repository exists and is accessible
- For private repos, ensure token has `repo` scope
### "GitHub API error: 403"
- Rate limit exceeded (wait or use authentication)
- Token doesn't have required permissions
- Repository requires different permissions
### Issue numbers don't match
This is expected! GitHub issue numbers (e.g., #123) are mapped to bd IDs (e.g., bd-1) based on import order. The original GitHub URL is preserved in `external_ref`.
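
The remapping can be pictured as an order-preserving dictionary (illustrative only; `assign_ids` is a hypothetical helper, not a function in `gh2jsonl.py`):

```python
def assign_ids(numbers, prefix="bd", start=1):
    """Map GitHub issue numbers to sequential bd IDs in import order."""
    return {n: f"{prefix}-{start + i}" for i, n in enumerate(numbers)}
```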
## See Also
- [bd README](../../README.md) - Main documentation
- [Markdown Import Example](../markdown-to-jsonl/) - Import from markdown
- [TEXT_FORMATS.md](../../TEXT_FORMATS.md) - Understanding bd's JSONL format
- [JSONL Import Guide](../../README.md#import) - Import collision handling

[
  {
    "number": 42,
    "title": "Add user authentication",
    "body": "Implement JWT-based authentication.\n\nThis blocks #43 and is related to #44.",
    "state": "open",
    "labels": [
      {"name": "feature"},
      {"name": "high"},
      {"name": "security"}
    ],
    "assignee": {
      "login": "alice"
    },
    "created_at": "2025-01-15T10:00:00Z",
    "updated_at": "2025-01-16T14:30:00Z",
    "html_url": "https://github.com/example/repo/issues/42"
  },
  {
    "number": 43,
    "title": "Add API rate limiting",
    "body": "Implement rate limiting for API endpoints.\n\nDepends on authentication (#42) being completed first.",
    "state": "open",
    "labels": [
      {"name": "feature"},
      {"name": "p1"}
    ],
    "assignee": {
      "login": "bob"
    },
    "created_at": "2025-01-15T11:00:00Z",
    "updated_at": "2025-01-15T11:00:00Z",
    "html_url": "https://github.com/example/repo/issues/43"
  },
  {
    "number": 44,
    "title": "Fix login redirect bug",
    "body": "Login page redirects to wrong URL after authentication.",
    "state": "closed",
    "labels": [
      {"name": "bug"},
      {"name": "critical"}
    ],
    "assignee": {
      "login": "charlie"
    },
    "created_at": "2025-01-10T09:00:00Z",
    "updated_at": "2025-01-12T16:00:00Z",
    "closed_at": "2025-01-12T16:00:00Z",
    "html_url": "https://github.com/example/repo/issues/44"
  }
]

#!/usr/bin/env python3
"""
Convert GitHub Issues to bd JSONL format.

Supports two input modes:
1. GitHub API - Fetch issues directly from a repository
2. JSON Export - Parse exported GitHub issues JSON

Usage:
    # From GitHub API
    export GITHUB_TOKEN=ghp_your_token_here
    python gh2jsonl.py --repo owner/repo | bd import

    # From exported JSON file
    python gh2jsonl.py --file issues.json | bd import

    # Save to file first
    python gh2jsonl.py --repo owner/repo > issues.jsonl
"""

import json
import os
import re
import sys
from pathlib import Path
from typing import List, Dict, Any, Optional
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError


class GitHubToBeads:
    """Convert GitHub Issues to bd JSONL format."""

    def __init__(self, prefix: str = "bd", start_id: int = 1):
        self.prefix = prefix
        self.issue_counter = start_id
        self.issues: List[Dict[str, Any]] = []
        self.gh_id_to_bd_id: Dict[int, str] = {}

    def fetch_from_api(self, repo: str, token: Optional[str] = None, state: str = "all"):
        """Fetch issues from the GitHub API."""
        if not token:
            token = os.getenv("GITHUB_TOKEN")
        if not token:
            raise ValueError(
                "GitHub token required. Set GITHUB_TOKEN env var or pass --token"
            )

        # Parse repo
        if "/" not in repo:
            raise ValueError("Repository must be in format: owner/repo")

        # Fetch all issues (paginated)
        page = 1
        per_page = 100
        all_issues = []
        while True:
            url = (
                f"https://api.github.com/repos/{repo}/issues"
                f"?state={state}&per_page={per_page}&page={page}"
            )
            headers = {
                "Authorization": f"token {token}",
                "Accept": "application/vnd.github.v3+json",
                "User-Agent": "bd-gh-import/1.0",
            }
            try:
                req = Request(url, headers=headers)
                with urlopen(req) as response:
                    data = json.loads(response.read().decode())
                if not data:
                    break
                # Filter out pull requests (they appear in the issues endpoint)
                issues = [issue for issue in data if "pull_request" not in issue]
                all_issues.extend(issues)
                if len(data) < per_page:
                    break
                page += 1
            except HTTPError as e:
                error_body = e.read().decode(errors="replace")
                remaining = e.headers.get("X-RateLimit-Remaining")
                reset = e.headers.get("X-RateLimit-Reset")
                msg = f"GitHub API error: {e.code} - {error_body}"
                if e.code == 403 and remaining == "0":
                    msg += f"\nRate limit exceeded. Resets at Unix timestamp: {reset}"
                raise RuntimeError(msg) from e
            except URLError as e:
                raise RuntimeError(f"Network error calling GitHub: {e.reason}") from e

        print(f"Fetched {len(all_issues)} issues from {repo}", file=sys.stderr)
        return all_issues

    def parse_json_file(self, filepath: Path) -> List[Dict[str, Any]]:
        """Parse GitHub issues from a JSON file."""
        with open(filepath, 'r', encoding='utf-8') as f:
            try:
                data = json.load(f)
            except json.JSONDecodeError as e:
                raise ValueError(f"Invalid JSON in {filepath}: {e}") from e

        # Handle both a single issue and an array of issues
        if isinstance(data, dict):
            # Filter out PRs
            if "pull_request" in data:
                return []
            return [data]
        elif isinstance(data, list):
            # Filter out PRs
            return [issue for issue in data if "pull_request" not in issue]
        else:
            raise ValueError("JSON must be a single issue object or array of issues")

    def map_priority(self, labels: List[str]) -> int:
        """Map GitHub labels to bd priority."""
        label_names = [
            label.get("name", "").lower() if isinstance(label, dict) else label.lower()
            for label in labels
        ]
        # Priority labels (customize for your repo)
        if any(l in label_names for l in ["critical", "p0", "urgent"]):
            return 0
        elif any(l in label_names for l in ["high", "p1", "important"]):
            return 1
        elif any(l in label_names for l in ["low", "p3", "minor"]):
            return 3
        elif any(l in label_names for l in ["backlog", "p4", "someday"]):
            return 4
        else:
            return 2  # Default medium

    def map_issue_type(self, labels: List[str]) -> str:
        """Map GitHub labels to bd issue type."""
        label_names = [
            label.get("name", "").lower() if isinstance(label, dict) else label.lower()
            for label in labels
        ]
        # Type labels (customize for your repo)
        if any(l in label_names for l in ["bug", "defect"]):
            return "bug"
        elif any(l in label_names for l in ["feature", "enhancement"]):
            return "feature"
        elif any(l in label_names for l in ["epic", "milestone"]):
            return "epic"
        elif any(l in label_names for l in ["chore", "maintenance", "dependencies"]):
            return "chore"
        else:
            return "task"

    def map_status(self, state: str, labels: List[str]) -> str:
        """Map GitHub state to bd status."""
        label_names = [
            label.get("name", "").lower() if isinstance(label, dict) else label.lower()
            for label in labels
        ]
        if state == "closed":
            return "closed"
        elif any(l in label_names for l in ["in progress", "in-progress", "wip"]):
            return "in_progress"
        elif "blocked" in label_names:
            return "blocked"
        else:
            return "open"

    def extract_labels(self, gh_labels: List) -> List[str]:
        """Extract label names from GitHub label objects."""
        # Filter out labels we already use for mapping
        skip_labels = {
            "bug", "feature", "epic", "chore", "enhancement", "defect",
            "critical", "high", "low", "p0", "p1", "p2", "p3", "p4",
            "urgent", "important", "minor", "backlog", "someday",
            "in progress", "in-progress", "wip", "blocked"
        }
        labels = []
        for label in gh_labels:
            if isinstance(label, dict):
                name = label.get("name", "")
            else:
                name = str(label)
            if name.lower() not in skip_labels:
                labels.append(name)
        return labels

    def extract_dependencies_from_body(self, body: str) -> List[int]:
        """Extract issue references from body text."""
        if not body:
            return []
        refs = []
        # Pattern: #123 or owner/repo#123
        issue_pattern = r'(?:^|\s)#(\d+)|(?:[\w-]+/[\w-]+)#(\d+)'
        for match in re.finditer(issue_pattern, body):
            issue_num = match.group(1) or match.group(2)
            if issue_num:
                refs.append(int(issue_num))
        return list(set(refs))  # Deduplicate

    def convert_issue(self, gh_issue: Dict[str, Any]) -> Dict[str, Any]:
        """Convert a single GitHub issue to bd format."""
        gh_id = gh_issue["number"]
        bd_id = f"{self.prefix}-{self.issue_counter}"
        self.issue_counter += 1
        # Store mapping
        self.gh_id_to_bd_id[gh_id] = bd_id

        labels = gh_issue.get("labels", [])

        # Build bd issue
        issue = {
            "id": bd_id,
            "title": gh_issue["title"],
            "description": gh_issue.get("body") or "",
            "status": self.map_status(gh_issue["state"], labels),
            "priority": self.map_priority(labels),
            "issue_type": self.map_issue_type(labels),
            "created_at": gh_issue["created_at"],
            "updated_at": gh_issue["updated_at"],
        }
        # Add external reference
        issue["external_ref"] = gh_issue["html_url"]
        # Add assignee if present
        if gh_issue.get("assignee"):
            issue["assignee"] = gh_issue["assignee"]["login"]
        # Add labels (filtered)
        bd_labels = self.extract_labels(labels)
        if bd_labels:
            issue["labels"] = bd_labels
        # Add closed timestamp if closed
        if gh_issue.get("closed_at"):
            issue["closed_at"] = gh_issue["closed_at"]
        return issue

    def add_dependencies(self):
        """Add dependencies based on issue references in body text."""
        for gh_issue_data in getattr(self, '_gh_issues', []):
            gh_id = gh_issue_data["number"]
            bd_id = self.gh_id_to_bd_id.get(gh_id)
            if not bd_id:
                continue
            body = gh_issue_data.get("body") or ""
            referenced_gh_ids = self.extract_dependencies_from_body(body)
            dependencies = []
            for ref_gh_id in referenced_gh_ids:
                ref_bd_id = self.gh_id_to_bd_id.get(ref_gh_id)
                if ref_bd_id and ref_bd_id != bd_id:  # Skip self-references
                    dependencies.append({
                        "issue_id": bd_id,
                        "depends_on_id": ref_bd_id,
                        "type": "related"
                    })
            # Find the bd issue and attach its dependencies
            if dependencies:
                for issue in self.issues:
                    if issue["id"] == bd_id:
                        issue["dependencies"] = dependencies
                        break

    def convert(self, gh_issues: List[Dict[str, Any]]):
        """Convert all GitHub issues to bd format."""
        # Store for dependency processing
        self._gh_issues = gh_issues
        # Sort by issue number for consistent ID assignment
        sorted_issues = sorted(gh_issues, key=lambda x: x["number"])
        # Convert each issue
        for gh_issue in sorted_issues:
            self.issues.append(self.convert_issue(gh_issue))
        # Add cross-references
        self.add_dependencies()
        first_gh = min(self.gh_id_to_bd_id.keys())
        print(
            f"Converted {len(self.issues)} issues. "
            f"Mapping: GH #{first_gh} -> {self.gh_id_to_bd_id[first_gh]}",
            file=sys.stderr
        )

    def to_jsonl(self) -> str:
        """Convert issues to JSONL format."""
        return '\n'.join(json.dumps(issue, ensure_ascii=False) for issue in self.issues)


def main():
    """Main entry point."""
    import argparse

    parser = argparse.ArgumentParser(
        description="Convert GitHub Issues to bd JSONL format",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
    # From GitHub API
    export GITHUB_TOKEN=ghp_...
    python gh2jsonl.py --repo owner/repo | bd import

    # From JSON file
    python gh2jsonl.py --file issues.json > issues.jsonl

    # Fetch only open issues
    python gh2jsonl.py --repo owner/repo --state open

    # Custom prefix and start ID
    python gh2jsonl.py --repo owner/repo --prefix myproject --start-id 100
"""
    )
    parser.add_argument(
        "--repo",
        help="GitHub repository (owner/repo)"
    )
    parser.add_argument(
        "--file",
        type=Path,
        help="JSON file containing GitHub issues export"
    )
    parser.add_argument(
        "--token",
        help="GitHub personal access token (or set GITHUB_TOKEN env var)"
    )
    parser.add_argument(
        "--state",
        choices=["open", "closed", "all"],
        default="all",
        help="Issue state to fetch (default: all)"
    )
    parser.add_argument(
        "--prefix",
        default="bd",
        help="Issue ID prefix (default: bd)"
    )
    parser.add_argument(
        "--start-id",
        type=int,
        default=1,
        help="Starting issue number (default: 1)"
    )
    args = parser.parse_args()

    # Validate inputs
    if not args.repo and not args.file:
        parser.error("Either --repo or --file is required")
    if args.repo and args.file:
        parser.error("Cannot use both --repo and --file")

    # Create converter
    converter = GitHubToBeads(prefix=args.prefix, start_id=args.start_id)

    # Load issues
    if args.repo:
        gh_issues = converter.fetch_from_api(args.repo, args.token, args.state)
    else:
        gh_issues = converter.parse_json_file(args.file)
    if not gh_issues:
        print("No issues found", file=sys.stderr)
        sys.exit(0)

    # Convert
    converter.convert(gh_issues)

    # Output JSONL
    print(converter.to_jsonl())


if __name__ == "__main__":
    main()