Short IDs become the canonical identifier in URLs (/i/:short_id), MinIO/R2 storage keys, and all API responses. Hash-based deduplication is preserved. Includes two-phase Alembic migration (003 adds nullable column, 004 enforces NOT NULL) with a backfill script to copy storage objects and populate short_id for existing images.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Research: Short Image IDs
Short ID Generation
Decision: Build each ID from 8 characters, each picked with secrets.choice from string.ascii_letters + string.digits (base62, a 62-character alphabet).
Rationale: secrets.choice draws from the operating system's CSPRNG and selects uniformly, avoiding the modulo bias of naive approaches such as reducing random bytes modulo 62. Base62 (a–z, A–Z, 0–9) is URL-safe without percent-encoding. 8 characters gives 62⁸ ≈ 218 trillion combinations, so the chance that a newly generated ID collides with an existing one stays negligible even at millions of images (about 4.6 × 10⁻⁹ per attempt at one million).
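As a minimal sketch of the decision above, the generator and the keyspace arithmetic could look like this. The names generate_short_id, ALPHABET, and per_attempt_collision are illustrative, not taken from the codebase.

```python
import secrets
import string

# Base62 alphabet: a-z, A-Z, 0-9 (62 symbols, all URL-safe).
ALPHABET = string.ascii_letters + string.digits
SHORT_ID_LENGTH = 8


def generate_short_id(length: int = SHORT_ID_LENGTH) -> str:
    """Return a random base62 string; each character comes from the OS CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Number of distinct IDs: 62 ** 8, roughly 2.18e14.
KEYSPACE = len(ALPHABET) ** SHORT_ID_LENGTH


def per_attempt_collision(existing_ids: int) -> float:
    """Probability that one freshly generated ID equals any of `existing_ids` IDs."""
    return existing_ids / KEYSPACE
```

per_attempt_collision(10_000) evaluates to roughly 4.6e-11, and per_attempt_collision(1_000_000) to roughly 4.6e-9.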
Alternatives considered:
- secrets.token_urlsafe(6): includes "-" and "_", not pure alphanumeric
- UUID truncation (first 8 chars of hex): only 16 chars of alphabet (hex), dramatically fewer combinations than base62
- nanoid (npm): JavaScript library, requires a separate dependency for Python
Collision retry: On insert, if a UniqueConstraint violation is raised on short_id, generate a new one and retry (up to a configurable limit, e.g., 10 attempts). At 10,000 images the per-attempt collision probability is ~4.6 × 10⁻¹¹; retries are a pure safety measure.
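The retry loop can be sketched as follows. This is an illustration only: it uses the standard-library sqlite3 module as a stand-in for the real database layer, and the images table, the insert_image helper, and the injectable generate parameter are assumptions made for the example.

```python
import secrets
import sqlite3
import string

ALPHABET = string.ascii_letters + string.digits
MAX_ATTEMPTS = 10  # the configurable retry limit


def generate_short_id() -> str:
    return "".join(secrets.choice(ALPHABET) for _ in range(8))


def insert_image(conn, content_hash, generate=generate_short_id,
                 max_attempts=MAX_ATTEMPTS):
    """Insert a row, generating a new short_id each time the UNIQUE constraint fires."""
    for _ in range(max_attempts):
        short_id = generate()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO images (short_id, content_hash) VALUES (?, ?)",
                    (short_id, content_hash),
                )
            return short_id
        except sqlite3.IntegrityError:
            # In this sketch short_id is the only constraint that can fail.
            # Real code should check which constraint was violated before retrying.
            continue
    raise RuntimeError(f"no unique short_id after {max_attempts} attempts")


# Hypothetical schema, reduced to the columns the example needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (short_id TEXT NOT NULL UNIQUE, content_hash TEXT NOT NULL)"
)
```

Passing generate as a parameter makes the collision path testable: a test can supply a generator that repeats a value and confirm that the second insert lands on the next one.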
Alembic Two-Phase Migration Strategy
Decision: Two separate Alembic migrations (003 + 004), with the Python migration script run between them.
Rationale: The short_id column must start nullable so that it can be added while existing rows still have no value. The migration script then fills all existing rows. Once the backfill is confirmed complete, a second migration adds the NOT NULL constraint. Running both as one migration would require a complex inline Python script in Alembic (fragile, untestable). Two migrations with a script in between is the standard approach for backfill + constraint change.
Migration 003: ADD COLUMN short_id VARCHAR(8) NULL UNIQUE. In PostgreSQL the UNIQUE constraint is backed by a B-tree index, which also serves lookups by short_id.
Script: Fill all rows, idempotent (skip rows where short_id IS NOT NULL).
Migration 004: ALTER COLUMN short_id SET NOT NULL.
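The three steps can be rehearsed end to end with a small sketch. It uses an in-memory SQLite database as a stand-in, so the DDL differs from the real Alembic migrations: SQLite needs a separate CREATE UNIQUE INDEX and cannot apply SET NOT NULL in place, so phase 2 is represented only by its precondition check. The table layout, index name, and backfill_short_ids helper are assumptions; collision retry is omitted for brevity.

```python
import secrets
import sqlite3
import string

ALPHABET = string.ascii_letters + string.digits


def backfill_short_ids(conn: sqlite3.Connection) -> int:
    """Fill short_id for rows that lack one; return how many rows were processed."""
    rows = conn.execute("SELECT id FROM images WHERE short_id IS NULL").fetchall()
    for (image_id,) in rows:
        short_id = "".join(secrets.choice(ALPHABET) for _ in range(8))
        with conn:  # one transaction per row, so an interrupted run can resume
            conn.execute(
                "UPDATE images SET short_id = ? WHERE id = ? AND short_id IS NULL",
                (short_id, image_id),
            )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO images (id) VALUES (?)", [("a",), ("b",), ("c",)])

# Phase 1 (stand-in for migration 003): nullable column plus unique index.
conn.execute("ALTER TABLE images ADD COLUMN short_id TEXT")
conn.execute("CREATE UNIQUE INDEX ux_images_short_id ON images (short_id)")

first_run = backfill_short_ids(conn)   # processes all three rows
second_run = backfill_short_ids(conn)  # idempotent: nothing left to fill

# Precondition for phase 2 (migration 004): no NULL short_id remains.
remaining_nulls = conn.execute(
    "SELECT COUNT(*) FROM images WHERE short_id IS NULL"
).fetchone()[0]
```

Checking remaining_nulls before applying migration 004 is the "once confirmed" step: the NOT NULL constraint would fail if any row were still unfilled.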
Storage Object Copy Strategy
Decision: Copy-then-verify-then-delete (not atomic rename), using the MinIO/S3 copy_object API followed by a delete_object call.
Rationale: S3-compatible object stores do not support atomic renames. The safe approach is: copy to new key, verify new object exists (head_object), update DB, delete old key. If interrupted after copy but before delete, the old object remains — wasted storage but no data loss. The migration is idempotent: if short_id is already set on a row, the script skips it.
Alternatives considered:
- mc mv (MinIO client CLI): simpler but harder to script transactionally with DB updates
- Direct Python with aiobotocore: chosen; same library already used by the storage backend
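The ordering of the copy, verify, DB update, and delete steps can be sketched as below. To keep the example runnable without a storage server, it uses FakeS3, a hypothetical in-memory stand-in that mimics the keyword arguments of the S3 client's copy_object, head_object, and delete_object calls. The move_object helper, the bucket name, and the key names are likewise assumptions; the real key layout is not specified here.

```python
import asyncio
from typing import Awaitable, Callable


class FakeS3:
    """In-memory stand-in exposing the three S3 calls the strategy relies on."""

    def __init__(self) -> None:
        self.objects = {}  # maps (bucket, key) to object bytes

    async def copy_object(self, *, Bucket, Key, CopySource) -> None:
        source = (CopySource["Bucket"], CopySource["Key"])
        self.objects[(Bucket, Key)] = self.objects[source]

    async def head_object(self, *, Bucket, Key) -> dict:
        if (Bucket, Key) not in self.objects:
            raise KeyError(Key)  # a real client raises a client error instead
        return {"ContentLength": len(self.objects[(Bucket, Key)])}

    async def delete_object(self, *, Bucket, Key) -> None:
        self.objects.pop((Bucket, Key), None)


async def move_object(client, bucket: str, old_key: str, new_key: str,
                      update_db: Callable[[], Awaitable[None]]) -> None:
    """Copy, verify, record in the DB, then delete the old key, in that order."""
    await client.copy_object(Bucket=bucket, Key=new_key,
                             CopySource={"Bucket": bucket, "Key": old_key})
    await client.head_object(Bucket=bucket, Key=new_key)  # raises if copy is missing
    await update_db()  # e.g. set short_id on the row
    await client.delete_object(Bucket=bucket, Key=old_key)  # last, so no data loss


async def demo() -> dict:
    s3 = FakeS3()
    s3.objects[("images", "old-key")] = b"png-bytes"
    db = {}

    async def update_db() -> None:
        db["short_id"] = "aB3dE9xZ"

    await move_object(s3, "images", "old-key", "aB3dE9xZ", update_db)
    return {"keys": sorted(key for _, key in s3.objects), "db": db}


result = asyncio.run(demo())
```

Because the delete runs last, a crash at any earlier point leaves the original object in place, which matches the "wasted storage but no data loss" property described above.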
API Route Parameter Change
Decision: Change all image route parameters from image_id: uuid.UUID to short_id: str with manual length/charset validation.
Rationale: FastAPI's uuid.UUID type annotation rejects non-UUID strings at the path-parsing stage, so the existing routes cannot accept short IDs without a type change. Switching to str with a custom validator (8 alphanumeric chars) is minimal and clear.
Impact: All routes under /api/v1/images/{id} change to accept an 8-char string. The id field in API responses is retained as the UUID; short_id is added as a new field. The UI switches to using short_id for all navigation and API calls.
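A minimal sketch of the manual length/charset check, independent of the web framework. The is_valid_short_id name is an assumption; a route handler would call something like it and return an error response when it fails.

```python
import re

# Exactly eight ASCII letters or digits. fullmatch requires the whole string
# to match, so a trailing newline or extra characters are rejected.
SHORT_ID_RE = re.compile(r"[A-Za-z0-9]{8}")


def is_valid_short_id(value: str) -> bool:
    """Return True only for an 8-character base62 string."""
    return SHORT_ID_RE.fullmatch(value) is not None
```

An explicit character class is used rather than str.isalnum(), because isalnum() also accepts non-ASCII letters and digits that the generator never produces.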
Response Schema: Additive Change
Decision: Add short_id as a new field to the image response dict. The existing id (UUID) field is retained.
Rationale: Adding a field is non-breaking per §3.1. Removing id would be a breaking change. Retaining both allows any internal tooling or API consumers that already use id to continue working. The UI transitions to using short_id for routing and API calls, but the UUID remains queryable if needed.
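For illustration, the additive change amounts to one extra key in the response dict. The Image dataclass and image_to_response function below are hypothetical, and any other response fields are omitted.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Image:  # hypothetical model, reduced to the two identifiers
    id: uuid.UUID
    short_id: str


def image_to_response(image: Image) -> dict:
    return {
        "id": str(image.id),         # existing UUID field, unchanged
        "short_id": image.short_id,  # new field, added alongside it
    }


example = image_to_response(
    Image(id=uuid.UUID("123e4567-e89b-12d3-a456-426614174000"), short_id="aB3dE9xZ")
)
```

Consumers that read only id see the same value as before; consumers that know about short_id can start using it.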