GitHub Issues to bd Importer
Import issues from GitHub repositories into bd.
Overview
This tool converts GitHub Issues to bd's JSONL format, supporting both:
- GitHub API - Fetch issues directly from a repository
- JSON Export - Parse manually exported GitHub issues
Features
- ✅ Fetch from GitHub API - Direct import from any public/private repo
- ✅ JSON file import - Parse exported GitHub issues JSON
- ✅ Label mapping - Auto-map GitHub labels to bd priority/type
- ✅ Preserve metadata - Keep assignees, timestamps, descriptions
- ✅ Cross-references - Convert #123 references to dependencies
- ✅ External links - Preserve URLs back to original GitHub issues
- ✅ Filter PRs - Automatically excludes pull requests
Installation
No dependencies required! Uses Python 3 standard library.
For API access, set up a GitHub token:
# Create token at: https://github.com/settings/tokens
# Permissions needed: public_repo (or repo for private repos)
export GITHUB_TOKEN=ghp_your_token_here
Security Note: Use the GITHUB_TOKEN environment variable instead of --token flag when possible. The --token flag may appear in shell history and process listings.
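The sketch below illustrates one way a script can pick up the token from either source. The function name resolve_token and the precedence (an explicit --token wins, otherwise GITHUB_TOKEN is used) are assumptions for illustration, not a description of how gh2jsonl.py is written.

```python
import argparse
import os
from typing import List, Mapping, Optional


def resolve_token(argv: List[str], environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the token from --token if given, else from GITHUB_TOKEN, else None."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--token", default=None)
    # parse_known_args ignores the other flags (--repo, --state, ...)
    args, _unknown = parser.parse_known_args(argv)
    return args.token or environ.get("GITHUB_TOKEN")
```

With this shape, the environment variable keeps the secret out of the command line, and the flag remains available as an override.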
Usage
From GitHub API
# Fetch all issues from a repository
python gh2jsonl.py --repo owner/repo | bd import
# Save to file first (recommended)
python gh2jsonl.py --repo owner/repo > issues.jsonl
bd import -i issues.jsonl --dry-run # Preview
bd import -i issues.jsonl # Import
# Fetch only open issues
python gh2jsonl.py --repo owner/repo --state open
# Fetch only closed issues
python gh2jsonl.py --repo owner/repo --state closed
From JSON File
Export issues from GitHub (via API or manually), then:
# Single issue
curl -H "Authorization: token $GITHUB_TOKEN" \
https://api.github.com/repos/owner/repo/issues/123 > issue.json
python gh2jsonl.py --file issue.json | bd import
# Multiple issues
curl -H "Authorization: token $GITHUB_TOKEN" \
https://api.github.com/repos/owner/repo/issues > issues.json
python gh2jsonl.py --file issues.json | bd import
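The two curl commands above produce different JSON shapes: a single object for one issue and an array for a list. A minimal sketch of how both shapes could be normalized into a list, using a hypothetical helper name load_issues (the script's own loader may differ):

```python
import json
from typing import Any, Dict, List


def load_issues(text: str) -> List[Dict[str, Any]]:
    """Accept either one issue object or an array of issue objects."""
    data = json.loads(text)
    if isinstance(data, dict):
        return [data]  # single issue: wrap it so callers always get a list
    if isinstance(data, list):
        return data
    raise ValueError("expected a JSON object or an array of objects")


single = load_issues('{"number": 123, "title": "One issue"}')
many = load_issues('[{"number": 1}, {"number": 2}]')
```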
Custom Options
# Use custom prefix (instead of 'bd')
python gh2jsonl.py --repo owner/repo --prefix myproject
# Start numbering from specific ID
python gh2jsonl.py --repo owner/repo --start-id 100
# Pass token directly (instead of env var)
python gh2jsonl.py --repo owner/repo --token ghp_...
Label Mapping
The script maps GitHub labels to bd fields:
Priority Mapping
| GitHub Labels | bd Priority |
|---|---|
| critical, p0, urgent | 0 (Critical) |
| high, p1, important | 1 (High) |
| (default) | 2 (Medium) |
| low, p3, minor | 3 (Low) |
| backlog, p4, someday | 4 (Backlog) |
Type Mapping
| GitHub Labels | bd Type |
|---|---|
| bug, defect | bug |
| feature, enhancement | feature |
| epic, milestone | epic |
| chore, maintenance, dependencies | chore |
| (default) | task |
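The priority and type tables can be read as simple lookups. The sketch below encodes them as dictionaries; the function names priority_for and type_for are hypothetical, and the tie-breaking rules (the most urgent priority wins, the first matching type in table order wins) are assumptions, since the tables do not say what happens when several labels match. It takes plain label names, whereas the GitHub API returns label objects with a name key.

```python
from typing import Dict, Iterable

PRIORITY_LABELS: Dict[str, int] = {
    "critical": 0, "p0": 0, "urgent": 0,
    "high": 1, "p1": 1, "important": 1,
    "low": 3, "p3": 3, "minor": 3,
    "backlog": 4, "p4": 4, "someday": 4,
}

# Insertion order follows the type table: bug, feature, epic, chore
TYPE_LABELS: Dict[str, str] = {
    "bug": "bug", "defect": "bug",
    "feature": "feature", "enhancement": "feature",
    "epic": "epic", "milestone": "epic",
    "chore": "chore", "maintenance": "chore", "dependencies": "chore",
}


def priority_for(labels: Iterable[str]) -> int:
    """Return the most urgent matching priority, or 2 (Medium) by default."""
    matches = [PRIORITY_LABELS[name.lower()] for name in labels if name.lower() in PRIORITY_LABELS]
    return min(matches) if matches else 2


def type_for(labels: Iterable[str]) -> str:
    """Return the first matching type in table order, or 'task' by default."""
    names = {name.lower() for name in labels}
    for label, issue_type in TYPE_LABELS.items():
        if label in names:
            return issue_type
    return "task"
```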
Status Mapping
| GitHub State | GitHub Labels | bd Status |
|---|---|---|
| closed | (any) | closed |
| open | in progress, in-progress, wip | in_progress |
| open | blocked | blocked |
| open | (default) | open |
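Read top to bottom, the status table amounts to a short chain of checks. The sketch below follows the row order of the table; the name status_for is hypothetical, and treating in-progress labels as taking precedence over blocked is an assumption based only on that row order.

```python
from typing import Iterable

IN_PROGRESS_LABELS = {"in progress", "in-progress", "wip"}


def status_for(state: str, labels: Iterable[str]) -> str:
    """Derive a bd status from the GitHub state plus label names."""
    if state == "closed":
        return "closed"  # closed issues stay closed regardless of labels
    names = {name.lower() for name in labels}
    if names & IN_PROGRESS_LABELS:
        return "in_progress"
    if "blocked" in names:
        return "blocked"
    return "open"
```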
Labels
All other labels are preserved in the labels field. Labels used for mapping (priority, type, status) are filtered out to avoid duplication.
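As a rough illustration of that filtering step, the sketch below drops every label named in the three mapping tables and keeps the rest in their original order. The names MAPPING_LABELS and remaining_labels are hypothetical, and case-insensitive matching is an assumption.

```python
from typing import Iterable, List

# Every label that the priority, type, and status tables consume
MAPPING_LABELS = frozenset({
    "critical", "p0", "urgent", "high", "p1", "important",
    "low", "p3", "minor", "backlog", "p4", "someday",
    "bug", "defect", "feature", "enhancement", "epic", "milestone",
    "chore", "maintenance", "dependencies",
    "in progress", "in-progress", "wip", "blocked",
})


def remaining_labels(labels: Iterable[str]) -> List[str]:
    """Keep labels that were not used for priority, type, or status."""
    return [name for name in labels if name.lower() not in MAPPING_LABELS]


kept = remaining_labels(["bug", "P1", "docs", "WIP", "good first issue"])
```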
Field Mapping
| GitHub Field | bd Field | Notes |
|---|---|---|
| number | (internal mapping) | GH#123 → bd-1, etc. |
| title | title | Direct copy |
| body | description | Direct copy |
| state | status | See status mapping |
| labels | priority, issue_type, labels | See label mapping |
| assignee.login | assignee | First assignee only |
| created_at | created_at | ISO 8601 timestamp |
| updated_at | updated_at | ISO 8601 timestamp |
| closed_at | closed_at | ISO 8601 timestamp |
| html_url | external_ref | Link back to GitHub |
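To make the table concrete, here is a minimal sketch that applies the direct-copy rows to one GitHub issue and serializes the result as a single JSONL line. The function name to_bd_record and the id key are assumptions for illustration; label-derived fields (priority, issue_type, labels) are left out to keep the example short.

```python
import json
from typing import Any, Dict


def to_bd_record(gh_issue: Dict[str, Any], bd_id: str) -> Dict[str, Any]:
    """Map the GitHub fields from the table onto a bd-style record."""
    assignee = gh_issue.get("assignee") or {}  # GitHub sends null when unassigned
    return {
        "id": bd_id,  # assumed key name for the bd ID
        "title": gh_issue["title"],
        "description": gh_issue.get("body") or "",
        "status": "closed" if gh_issue.get("state") == "closed" else "open",
        "assignee": assignee.get("login"),
        "created_at": gh_issue.get("created_at"),
        "updated_at": gh_issue.get("updated_at"),
        "closed_at": gh_issue.get("closed_at"),
        "external_ref": gh_issue.get("html_url"),
    }


example = {
    "number": 123,
    "title": "Crash on startup",
    "body": None,
    "state": "open",
    "assignee": {"login": "octocat"},
    "created_at": "2024-01-02T03:04:05Z",
    "updated_at": "2024-01-03T00:00:00Z",
    "closed_at": None,
    "html_url": "https://github.com/owner/repo/issues/123",
}
jsonl_line = json.dumps(to_bd_record(example, "bd-1"))
```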
Cross-References
Issue references in the body text are converted to dependencies:
GitHub:
This depends on #123 and fixes #456.
See also owner/other-repo#789.
Result:
- If GH#123 was imported, creates related dependency to its bd ID
- If GH#456 was imported, creates related dependency to its bd ID
- Cross-repo references (#789) are ignored (unless those issues were also imported)
Note: Dependency records use "issue_id": "" format, which the bd importer automatically fills. This matches the behavior of the markdown-to-jsonl converter.
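The behavior described above can be sketched with one regular expression and a lookup table. This is an illustration under stated assumptions: the helper names, the depends_on_id and type keys, and the rule that a reference preceded by a word character, slash, or hyphen counts as cross-repo are all hypothetical. Only the empty issue_id comes from the note above.

```python
import re
from typing import Dict, List

# "#123" not preceded by a word character, "/" or "-", so "owner/other-repo#789" is skipped
SAME_REPO_REF = re.compile(r"(?<![\w/-])#(\d+)")


def find_refs(body: str) -> List[int]:
    """Return same-repo issue numbers mentioned in the body, in order."""
    return [int(number) for number in SAME_REPO_REF.findall(body)]


def build_dependencies(body: str, gh_to_bd: Dict[int, str]) -> List[Dict[str, str]]:
    """Create a 'related' dependency for each referenced issue that was imported."""
    return [
        {"issue_id": "", "depends_on_id": gh_to_bd[number], "type": "related"}
        for number in find_refs(body)
        if number in gh_to_bd  # references to issues that were not imported are dropped
    ]


body = "This depends on #123 and fixes #456.\nSee also owner/other-repo#789."
deps = build_dependencies(body, {123: "bd-1"})
```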
Examples
Example 1: Import Active Issues
# Import only open issues for active work
export GITHUB_TOKEN=ghp_...
python gh2jsonl.py --repo mycompany/myapp --state open > open-issues.jsonl
# Preview
cat open-issues.jsonl | jq .
# Import
bd import -i open-issues.jsonl
bd ready # See what's ready to work on
Example 2: Full Repository Migration
# Import all issues (open and closed)
python gh2jsonl.py --repo mycompany/myapp > all-issues.jsonl
# Preview import (check for new issues and updates)
bd import -i all-issues.jsonl --dry-run
# Import issues
bd import -i all-issues.jsonl
# View stats
bd stats
Example 3: Partial Import from JSON
# Manually export specific issues via GitHub API
# Quote the path so the shell does not treat "?" as a glob character
gh api "repos/owner/repo/issues?labels=p1,bug" > high-priority-bugs.json
# Import
python gh2jsonl.py --file high-priority-bugs.json | bd import
Customization
The script is intentionally simple to customize for your workflow:
1. Adjust Label Mappings
Edit map_priority(), map_issue_type(), and map_status() to match your label conventions:
def map_priority(self, labels: List[str]) -> int:
    label_names = [label.get("name", "").lower() if isinstance(label, dict) else label.lower() for label in labels]
    # Add your custom mappings
    if any(l in label_names for l in ["sev1", "emergency"]):
        return 0
    # ... etc
2. Add Custom Fields
Map additional GitHub fields to bd:
def convert_issue(self, gh_issue: Dict[str, Any]) -> Dict[str, Any]:
    # ... existing code ...
    # Add milestone to design field
    if gh_issue.get("milestone"):
        issue["design"] = f"Milestone: {gh_issue['milestone']['title']}"
    return issue
3. Enhanced Dependency Detection
Parse more dependency patterns from body text:
def extract_dependencies_from_body(self, body: str) -> List[str]:
    # ... existing code ...
    # Add: "Blocks: #123, #456"
    blocks_pattern = r'Blocks:\s*((?:#\d+(?:\s*,\s*)?)+)'
    # ... etc
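To show what the Blocks pattern above captures, here is a standalone sketch that applies the same regular expression and then pulls the individual issue numbers out of the captured group. The function name extract_blocks is hypothetical; how the results would feed into the script's dependency records is left open.

```python
import re
from typing import List

# Same pattern as in the snippet above
BLOCKS_PATTERN = re.compile(r'Blocks:\s*((?:#\d+(?:\s*,\s*)?)+)')


def extract_blocks(body: str) -> List[int]:
    """Return issue numbers listed after 'Blocks:' lines."""
    numbers: List[int] = []
    for match in BLOCKS_PATTERN.finditer(body):
        # group(1) holds the whole comma-separated run, e.g. "#123, #456"
        numbers.extend(int(n) for n in re.findall(r'#(\d+)', match.group(1)))
    return numbers


blocked = extract_blocks("Blocks: #123, #456\nSee #789 for context.")
```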
Limitations
- Single assignee: GitHub supports multiple assignees, bd supports one
- No milestones: GitHub milestones aren't mapped (consider using design field)
- Simple cross-refs: Only basic #123 patterns detected
- No comments: Issue comments aren't imported (only the body)
- No reactions: GitHub reactions/emoji aren't imported
- No projects: GitHub project board info isn't imported
API Rate Limits
GitHub API has rate limits:
- Authenticated: 5,000 requests/hour
- Unauthenticated: 60 requests/hour
This script uses 1 request per 100 issues (pagination), so:
- Can fetch ~500,000 issues/hour (authenticated)
- Can fetch ~6,000 issues/hour (unauthenticated)
For large repositories (>1000 issues), authentication is recommended.
Note: The script automatically includes a User-Agent header (required by GitHub) and provides actionable error messages when rate limits are exceeded, including the reset timestamp.
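The arithmetic above, and the kind of message a script can build from GitHub's rate-limit response headers, can be sketched as follows. X-RateLimit-Reset is a real GitHub header carrying the reset time in UTC epoch seconds; the function names and the message wording are hypothetical, not the script's actual output.

```python
import math
from datetime import datetime, timezone
from typing import Mapping

PER_PAGE = 100  # issues returned per request, as stated above


def requests_needed(issue_count: int) -> int:
    """Number of paginated requests needed to fetch issue_count issues."""
    return math.ceil(issue_count / PER_PAGE)


def rate_limit_message(headers: Mapping[str, str]) -> str:
    """Build a human-readable message from the X-RateLimit-Reset header."""
    reset = datetime.fromtimestamp(int(headers["X-RateLimit-Reset"]), tz=timezone.utc)
    return f"Rate limit exceeded; resets at {reset.isoformat()}"


authenticated_capacity = 5000 * PER_PAGE   # 500,000 issues/hour
unauthenticated_capacity = 60 * PER_PAGE   # 6,000 issues/hour
```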
Troubleshooting
"GitHub token required"
Set the GITHUB_TOKEN environment variable:
export GITHUB_TOKEN=ghp_your_token_here
Or pass directly:
python gh2jsonl.py --repo owner/repo --token ghp_...
"GitHub API error: 404"
- Check repository name format: owner/repo
- Check repository exists and is accessible
- For private repos, ensure token has repo scope
"GitHub API error: 403"
- Rate limit exceeded (wait or use authentication)
- Token doesn't have required permissions
- Repository requires different permissions
Issue numbers don't match
This is expected! GitHub issue numbers (e.g., #123) are mapped to bd IDs (e.g., bd-1) based on import order. The original GitHub URL is preserved in external_ref.
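A minimal sketch of order-based ID assignment, assuming IDs take the form prefix-N and that the prefix and start_id parameters correspond to the --prefix and --start-id options. The function name assign_bd_ids is hypothetical, and the order in which the script actually processes issues is not specified here.

```python
from typing import Dict, Iterable


def assign_bd_ids(numbers: Iterable[int], prefix: str = "bd", start_id: int = 1) -> Dict[int, str]:
    """Map GitHub issue numbers to sequential bd IDs in the order given."""
    return {number: f"{prefix}-{start_id + offset}" for offset, number in enumerate(numbers)}


default_ids = assign_bd_ids([123, 7, 456])
custom_ids = assign_bd_ids([123, 7], prefix="myproject", start_id=100)
```

Because the bd ID depends only on position in the import, GH#123 becomes bd-1 here simply because it comes first.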
See Also
- bd README - Main documentation
- Markdown Import Example - Import from markdown
- TEXT_FORMATS.md - Understanding bd's JSONL format
- JSONL Import Guide - Import collision handling