feat: Add markdown-to-jsonl converter example [addresses #9]

Add lightweight example script for converting markdown planning docs
to bd JSONL format. This addresses #9 without adding complexity to
bd core.

Features:
- YAML frontmatter parsing (priority, type, assignee)
- Headings converted to issues
- Task lists extracted as sub-issues
- Dependency parsing (blocks: bd-10, etc.)
- Fully customizable by users

This demonstrates the "lightweight extension pattern" - keeping bd
core minimal while providing examples users can adapt for their needs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Steve Yegge
2025-10-14 13:24:03 -07:00
parent 92885bb7a3
commit e4fba408f3
4 changed files with 468 additions and 0 deletions


@@ -0,0 +1,165 @@
# Markdown to JSONL Converter
Convert markdown planning documents into `bd` issues.
## Overview
This example shows how to bridge the gap between markdown planning docs and tracked issues, without adding complexity to the `bd` core tool.
The converter script (`md2jsonl.py`) parses markdown files and outputs JSONL that can be imported into `bd`.
## Features
- **YAML Frontmatter** - Extract metadata (priority, type, assignee)
- **Headings as Issues** - Each heading (H1 through H6 by default) becomes an issue
- **Task Lists** - Markdown checklist items become separate issues
- **Dependency Parsing** - Extract "blocks: bd-10" references
- **Customizable** - Modify the script for your conventions
## Usage
### Basic conversion
```bash
python md2jsonl.py feature.md | bd import
```
### Save to file first
```bash
python md2jsonl.py feature.md > issues.jsonl
bd import -i issues.jsonl
```
### Preview before importing
```bash
python md2jsonl.py feature.md | jq .
```
## Markdown Format
### Frontmatter (Optional)
```markdown
---
priority: 1
type: feature
assignee: alice
---
```
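A minimal sketch of how flat `key: value` frontmatter like the block above can be read without a YAML library. The function name `read_frontmatter` is illustrative and is not part of `md2jsonl.py`; nested YAML, quoting, and lists are not handled.

```python
from typing import Dict, Tuple


def read_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a document into (metadata, body); metadata is empty without frontmatter."""
    if not text.startswith("---\n"):
        return {}, text
    # Look for the closing delimiter, starting at the newline after the opening one
    end = text.find("\n---\n", 3)
    if end == -1:
        return {}, text
    metadata = {}
    for line in text[4:end].split("\n"):
        if ":" in line:
            key, value = line.split(":", 1)
            metadata[key.strip()] = value.strip()
    return metadata, text[end + 5:]


doc = "---\npriority: 1\ntype: feature\nassignee: alice\n---\n# Title\nBody\n"
meta, body = read_frontmatter(doc)
print(meta)
print(body)
```

All values come back as strings, so a numeric field such as `priority` still needs an explicit `int(...)` conversion before it is written to JSONL.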
### Headings
Each heading becomes an issue:
```markdown
# Main Feature
Description of the feature...
## Sub-task 1
Details about sub-task...
## Sub-task 2
More details...
```
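The heading rule can be sketched as a single pass that starts a new section at every heading line. This is an illustration under the assumption that any ATX heading (`#` through `######`) opens a section; `split_sections` is a hypothetical helper name, and headings inside fenced code blocks are not special-cased.

```python
import re
from typing import List, Tuple

HEADING = re.compile(r"^(#{1,6})\s+(.+)$")


def split_sections(markdown: str) -> List[Tuple[int, str, str]]:
    """Return (level, title, body) for each heading, in document order."""
    sections = []
    level, title, body = 0, None, []
    for line in markdown.split("\n"):
        match = HEADING.match(line)
        if match:
            # Close the section that was open before this heading
            if title is not None:
                sections.append((level, title, "\n".join(body).strip()))
            level, title, body = len(match.group(1)), match.group(2), []
        else:
            body.append(line)
    # Close the final section
    if title is not None:
        sections.append((level, title, "\n".join(body).strip()))
    return sections


sample = "# Main Feature\nDescription of the feature...\n## Sub-task 1\nDetails about sub-task...\n"
for level, title, body in split_sections(sample):
    print(level, title, "->", body)
```

Narrowing `#{1,6}` to `#{1,2}` would restrict issues to H1 and H2, with deeper headings kept as part of the enclosing section's body.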
### Task Lists
Task lists are converted to separate issues:
```markdown
## Setup Tasks
- [ ] Install dependencies
- [x] Configure database
- [ ] Set up CI/CD
```
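Creates 3 issues, one per checklist item (the second is marked as closed). In the current script, a section that contains a task list produces only these task issues; the `Setup Tasks` heading itself is not emitted as a separate issue.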
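A small sketch of the checklist matching, assuming `- [ ]` and `- [x]` markers only. `read_tasks` is an illustrative name rather than the script's API, and it returns plain tuples instead of full issue records.

```python
import re
from typing import List, Tuple

TASK = re.compile(r"^-\s+\[([ x])\]\s+(.+)$")


def read_tasks(section: str) -> List[Tuple[str, str]]:
    """Return (title, status) for each checklist line in a section."""
    tasks = []
    for line in section.split("\n"):
        match = TASK.match(line.strip())
        if match:
            status = "closed" if match.group(1) == "x" else "open"
            tasks.append((match.group(2), status))
    return tasks


section = "- [ ] Install dependencies\n- [x] Configure database\n- [ ] Set up CI/CD"
print(read_tasks(section))
```

Lines that use `*` bullets or an uppercase `[X]` do not match this pattern; widening the regex is one of the simpler customizations.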
### Dependencies
Reference other issues in the description:
```markdown
## Implement API
This task requires the database schema to be ready first.
Dependencies:
- blocks: bd-5
- related: bd-10, bd-15
```
The script extracts these and creates dependency records.
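As a rough illustration of regex-based extraction, the sketch below pulls `type: id, id` references out of free text. The field names `depends_on_id` and `type` mirror what `md2jsonl.py` emits; `read_dependencies` is a hypothetical helper, and the hard-coded `bd-` prefix is an assumption.

```python
import re
from typing import Dict, List

DEP = re.compile(
    r"(blocks|related|parent-child|discovered-from):\s*((?:bd-\d+(?:\s*,\s*)?)+)",
    re.IGNORECASE,
)
ISSUE_ID = re.compile(r"bd-\d+", re.IGNORECASE)


def read_dependencies(text: str) -> List[Dict[str, str]]:
    """Return one record per referenced issue ID."""
    deps = []
    for match in DEP.finditer(text):
        dep_type = match.group(1).lower()
        # Re-scan the matched ID list so trailing commas never yield empty IDs
        for dep_id in ISSUE_ID.findall(match.group(2)):
            deps.append({"depends_on_id": dep_id, "type": dep_type})
    return deps


text = "Dependencies:\n- blocks: bd-5\n- related: bd-10, bd-15\n"
print(read_dependencies(text))
```

A reference needs the `type:` prefix to be picked up; a phrase such as "Related to bd-10" has no colon and is ignored.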
## Example
See `example-feature.md` for a complete example.
```bash
# Convert the example
python md2jsonl.py example-feature.md > example-issues.jsonl
# View the output
cat example-issues.jsonl | jq .
# Import into bd
bd import -i example-issues.jsonl
```
## Customization
The script is intentionally simple so you can customize it for your needs:
1. **Different heading levels** - Modify which headings become issues (H1 only? H1-H3?)
2. **Custom metadata** - Parse additional frontmatter fields
3. **Labels** - Extract hashtags or keywords as labels
4. **Epic detection** - Top-level headings become epics
5. **Issue templates** - Map different markdown structures to issue types
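As one example of such a customization, hashtags in a description could be collected as labels. This is a sketch only: `extract_labels` is a hypothetical helper, and whether `bd import` accepts a labels field in JSONL is an assumption to check against the JSONL format documentation.

```python
import re
from typing import List

# A hashtag must start a word and begin with a letter, so "# Heading" is not matched
HASHTAG = re.compile(r"(?<!\S)#([A-Za-z][\w-]*)")


def extract_labels(text: str) -> List[str]:
    """Return unique hashtags in first-seen order, lowercased."""
    seen = []
    for tag in HASHTAG.findall(text):
        tag = tag.lower()
        if tag not in seen:
            seen.append(tag)
    return seen


print(extract_labels("Harden login #security #backend, see also #Security"))
```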
## Limitations
This is a simple example, not a production tool:
- Basic YAML parsing (no nested structures)
- Simple dependency extraction (regex-based)
- No validation of referenced issue IDs
- Doesn't handle all markdown edge cases
For production use, you might want to:
- Use a proper YAML parser (`pip install pyyaml`)
- Use a markdown parser (`pip install markdown` or `pip install markdown2`)
- Add validation and error handling
- Support more dependency formats
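A starting point for the validation step could look like the sketch below, which checks converter output before it is piped to `bd import`. The `REQUIRED` tuple lists the fields that `md2jsonl.py` writes for every issue; it is not taken from a `bd` schema, and `validate_jsonl` is an illustrative name.

```python
import json
from typing import List

# Fields md2jsonl.py writes for every issue (an assumption, not a bd schema)
REQUIRED = ("id", "title", "status", "priority", "issue_type")


def validate_jsonl(text: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        for field in REQUIRED:
            if field not in record:
                problems.append(f"line {number}: missing field '{field}'")
    return problems


good = '{"id": "bd-1", "title": "T", "status": "open", "priority": 2, "issue_type": "task"}'
for problem in validate_jsonl(good + "\n" + '{"id": "bd-2"}'):
    print(problem)
```

Collecting all problems before reporting, rather than stopping at the first, tends to be friendlier when a long planning document is converted in one go.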
## Philosophy
This example demonstrates the **lightweight extension pattern**:
- ✅ Keep `bd` core focused and minimal
- ✅ Let users customize for their workflows
- ✅ Use existing import infrastructure
- ✅ Easy to understand and modify
Rather than adding markdown support to `bd` core (800+ LOC + dependencies + maintenance), we provide a simple converter that users can adapt.
## Contributing
Have improvements? Found a bug? This is just an example, but contributions are welcome!
Consider:
- Better error messages
- More markdown patterns
- Integration with popular markdown formats
- Support for GFM (GitHub Flavored Markdown) extensions
## See Also
- [bd README](../../README.md) - Main documentation
- [Python Agent Example](../python-agent/) - Full agent workflow
- [JSONL Format](../../TEXT_FORMATS.md) - Understanding bd's JSONL structure


@@ -0,0 +1,49 @@
---
priority: 1
type: feature
assignee: alice
---
# User Authentication System
Implement a complete user authentication system with login, signup, and password recovery.
This is a critical feature for the application. The authentication should be secure and follow best practices.
**Dependencies:**
- blocks: bd-5 (database schema must be ready first)
## Login Flow
Implement the login page with email/password authentication. Should support:
- Email validation
- Password hashing (bcrypt)
- Session management
- Remember me functionality
## Signup Flow
Create new user registration with validation:
- Email uniqueness check
- Password strength requirements
- Email verification
- Terms of service acceptance
## Password Recovery
Allow users to reset forgotten passwords:
- [ ] Send recovery email
- [ ] Generate secure reset tokens
- [x] Create reset password form
- [ ] Expire tokens after 24 hours
## Session Management
Handle user sessions securely:
- JWT tokens
- Refresh token rotation
- Session timeout after 30 days
- Logout functionality
Related to bd-10 (API endpoints) and discovered-from: bd-2 (security audit).


@@ -0,0 +1,253 @@
#!/usr/bin/env python3
"""
Convert markdown files to bd JSONL format.

This is a simple example converter that demonstrates the pattern.
Users can customize this for their specific markdown conventions.

Supported markdown patterns:
1. YAML frontmatter for metadata
2. Headings (H1-H6) as issue titles
3. Task lists as separate issues
4. Inline issue references (e.g., "blocks: bd-10")

Usage:
    python md2jsonl.py feature.md | bd import
    python md2jsonl.py feature.md > issues.jsonl
"""
import json
import re
import sys
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Dict, Any, Optional
class MarkdownToIssues:
    """Convert markdown to bd JSONL format."""

    def __init__(self, prefix: str = "bd"):
        self.prefix = prefix
        self.issue_counter = 1
        self.issues: List[Dict[str, Any]] = []
    def parse_frontmatter(self, content: str) -> tuple[Optional[Dict], str]:
        """Extract YAML frontmatter if present."""
        # Simple frontmatter detection (--- ... ---)
        if not content.startswith('---\n'):
            return None, content

        end = content.find('\n---\n', 4)
        if end == -1:
            return None, content

        frontmatter_text = content[4:end]
        body = content[end + 5:]

        # Parse simple YAML (key: value)
        metadata = {}
        for line in frontmatter_text.split('\n'):
            line = line.strip()
            if ':' in line:
                key, value = line.split(':', 1)
                metadata[key.strip()] = value.strip()

        return metadata, body
    def extract_issue_from_heading(
        self,
        heading: str,
        level: int,
        content: str,
        metadata: Optional[Dict] = None
    ) -> Dict[str, Any]:
        """Create an issue from a markdown heading and its content."""
        # Generate ID
        issue_id = f"{self.prefix}-{self.issue_counter}"
        self.issue_counter += 1

        # Extract title (remove markdown formatting)
        title = heading.strip('#').strip()

        # Parse metadata from frontmatter or defaults
        if metadata is None:
            metadata = {}

        # Build issue
        issue = {
            "id": issue_id,
            "title": title,
            "description": content.strip(),
            "status": metadata.get("status", "open"),
            "priority": int(metadata.get("priority", 2)),
            "issue_type": metadata.get("type", "task"),
            "created_at": datetime.now(timezone.utc).isoformat().replace('+00:00', 'Z'),
            "updated_at": datetime.now(timezone.utc).isoformat().replace('+00:00', 'Z'),
        }

        # Optional fields
        if "assignee" in metadata:
            issue["assignee"] = metadata["assignee"]
        if "design" in metadata:
            issue["design"] = metadata["design"]

        # Extract dependencies from description
        dependencies = self.extract_dependencies(content)
        if dependencies:
            issue["dependencies"] = dependencies

        return issue
    def extract_dependencies(self, text: str) -> List[Dict[str, str]]:
        """Extract dependency references from text."""
        dependencies = []

        # Pattern: "blocks: bd-10" or "related: bd-5, bd-6"
        # Pattern: "discovered-from: bd-20"
        dep_pattern = r'(blocks|related|parent-child|discovered-from):\s*((?:bd-\d+(?:\s*,\s*)?)+)'

        for match in re.finditer(dep_pattern, text, re.IGNORECASE):
            dep_type = match.group(1).lower()
            # Skip empty entries left behind by a trailing comma (e.g. "bd-5, and ...")
            dep_ids = [part.strip() for part in match.group(2).split(',') if part.strip()]

            for dep_id in dep_ids:
                dependencies.append({
                    "issue_id": "",  # Will be filled by import
                    "depends_on_id": dep_id,
                    "type": dep_type
                })

        return dependencies
    def parse_task_list(self, content: str) -> List[Dict[str, Any]]:
        """Extract task list items as separate issues."""
        issues = []

        # Pattern: - [ ] Task or - [x] Task
        task_pattern = r'^-\s+\[([ x])\]\s+(.+)$'

        for line in content.split('\n'):
            match = re.match(task_pattern, line.strip())
            if match:
                is_done = match.group(1) == 'x'
                task_text = match.group(2)

                issue_id = f"{self.prefix}-{self.issue_counter}"
                self.issue_counter += 1

                issue = {
                    "id": issue_id,
                    "title": task_text,
                    "description": "",
                    "status": "closed" if is_done else "open",
                    "priority": 2,
                    "issue_type": "task",
                    "created_at": datetime.now(timezone.utc).isoformat().replace('+00:00', 'Z'),
                    "updated_at": datetime.now(timezone.utc).isoformat().replace('+00:00', 'Z'),
                }
                issues.append(issue)

        return issues
    def parse_markdown(self, content: str, global_metadata: Optional[Dict] = None):
        """Parse markdown content into issues."""
        # Extract frontmatter
        frontmatter, body = self.parse_frontmatter(content)

        # Merge metadata (copy first so the caller's dict is not mutated)
        metadata = dict(global_metadata or {})
        if frontmatter:
            metadata.update(frontmatter)

        # Split by headings
        heading_pattern = r'^(#{1,6})\s+(.+)$'
        lines = body.split('\n')

        current_heading = None
        current_level = 0
        current_content = []

        for line in lines:
            match = re.match(heading_pattern, line)
            if match:
                # Save previous section
                if current_heading:
                    content_text = '\n'.join(current_content)

                    # Check for task lists; a section with tasks yields only task issues
                    task_issues = self.parse_task_list(content_text)
                    if task_issues:
                        self.issues.extend(task_issues)
                    else:
                        # Create issue from heading
                        issue = self.extract_issue_from_heading(
                            current_heading,
                            current_level,
                            content_text,
                            metadata
                        )
                        self.issues.append(issue)

                # Start new section
                current_level = len(match.group(1))
                current_heading = match.group(2)
                current_content = []
            else:
                current_content.append(line)

        # Save final section
        if current_heading:
            content_text = '\n'.join(current_content)
            task_issues = self.parse_task_list(content_text)
            if task_issues:
                self.issues.extend(task_issues)
            else:
                issue = self.extract_issue_from_heading(
                    current_heading,
                    current_level,
                    content_text,
                    metadata
                )
                self.issues.append(issue)
    def to_jsonl(self) -> str:
        """Convert issues to JSONL format."""
        lines = []
        for issue in self.issues:
            lines.append(json.dumps(issue, ensure_ascii=False))
        return '\n'.join(lines)
def main():
    """Main entry point."""
    if len(sys.argv) < 2:
        print("Usage: python md2jsonl.py <markdown-file>", file=sys.stderr)
        print("", file=sys.stderr)
        print("Examples:", file=sys.stderr)
        print("  python md2jsonl.py feature.md | bd import", file=sys.stderr)
        print("  python md2jsonl.py feature.md > issues.jsonl", file=sys.stderr)
        sys.exit(1)

    markdown_file = Path(sys.argv[1])
    if not markdown_file.exists():
        print(f"Error: File not found: {markdown_file}", file=sys.stderr)
        sys.exit(1)

    # Read markdown (explicit UTF-8 so non-ASCII text is read the same on every platform)
    content = markdown_file.read_text(encoding="utf-8")

    # Convert to issues
    converter = MarkdownToIssues(prefix="bd")
    converter.parse_markdown(content)

    # Output JSONL
    print(converter.to_jsonl())


if __name__ == "__main__":
    main()