When using --id-mode hash, the script now tracks generated IDs and
retries with increasing nonce (0-9) then increasing length (up to 8)
if a collision is detected within the same import batch.
This matches the collision handling behavior in the Go implementation
(internal/storage/sqlite/ids.go).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Adds --id-mode flag to gh2jsonl.py with support for both sequential
and hash-based IDs, matching the algorithm in internal/storage/sqlite/ids.go.
New features:
- encode_base36() function for base36 encoding
- generate_hash_id() matching Go implementation (SHA256 + base36)
- --id-mode {sequential|hash} CLI flag (default: sequential)
- --hash-length {3,4,5,6,7,8} for configurable hash length (default: 6)
Hash IDs are deterministic and content-based, using title, description,
creator, and timestamp. Sequential mode remains the default for backward
compatibility.
Examples:
python gh2jsonl.py --repo owner/repo --id-mode hash | bd import
python gh2jsonl.py --file issues.json --id-mode hash --hash-length 4