Fix doctor incorrectly diagnosing hash IDs as sequential (issue #322)

- Enhanced checkIDFormat to sample multiple issues instead of just one
- Added detectHashBasedIDs function with robust multi-heuristic detection:
  * Checks for child_counters table (hash ID schema indicator)
  * Detects letters in IDs (base36 encoding)
  * Identifies leading zeros (common in hash IDs, rare in sequential)
  * Analyzes variable length patterns (adaptive hash IDs)
  * Checks for non-sequential numeric ordering
- Added comprehensive test coverage (16 new test cases)
- Fixes false positives for numeric-only hash IDs like 'pf-0088'

Closes #322
This commit is contained in:
Charles P. Cross
2025-11-18 05:20:11 -05:00
parent aeea3532cd
commit 8c1f865e23
3 changed files with 391 additions and 12 deletions

View File

@@ -251,13 +251,33 @@ func TestIsHashID(t *testing.T) {
id string
expected bool
}{
// Sequential IDs (numeric only, short)
{"bd-1", false},
{"bd-123", false},
{"bd-9999", false},
// Hash IDs with letters
{"bd-a3f8e9a2", true},
{"bd-abc123", true},
{"bd-123abc", true},
{"bd-a3f8e9a2.1", true},
{"bd-a3f8e9a2.1.2", true},
// Hash IDs that are numeric but 5+ characters (likely hash)
{"bd-12345", true},
{"bd-0088", false}, // 4 chars, all numeric - ambiguous, defaults to false
{"bd-00880", true}, // 5+ chars, likely hash
// Base36 hash IDs with letters
{"bd-5n3", true},
{"bd-65w", true},
{"bd-jmx", true},
{"bd-4rt", true},
// Edge cases
{"bd-", false}, // Empty suffix
{"invalid", false}, // No dash
{"bd-0", false}, // Single digit
}
for _, tt := range tests {