Feat: Replace UUID image identifiers with 8-character base62 short IDs

Short IDs become the canonical identifier in URLs (/i/:short_id),
MinIO/R2 storage keys, and all API responses. Hash-based deduplication
is preserved. Includes two-phase Alembic migration (003 adds nullable
column, 004 enforces NOT NULL) with a backfill script to copy storage
objects and populate short_id for existing images.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-05-10 00:13:55 +00:00
parent 87eb2703f5
commit 61d923d5be
41 changed files with 1445 additions and 137 deletions

View File

@@ -0,0 +1,198 @@
# Implementation Plan: Short Image IDs
**Branch**: `017-short-id-migration` | **Date**: 2026-05-09 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from `specs/017-short-id-migration/spec.md`
## Summary
Replace hash-based storage keys and UUID-based URL routing with 8-character base62 short IDs. The short ID becomes the canonical identifier in URLs (`/i/:short_id`), storage keys (`{short_id}` / `{short_id}-thumb`), and all API responses. Hash-based deduplication is preserved unchanged. A Python migration script handles existing images: generates short IDs, copies storage objects to new keys, updates DB records, deletes old keys.
## Technical Context
**Language/Version**: Python 3.12+ (API), TypeScript strict (UI)
**Primary Dependencies**: FastAPI, SQLAlchemy 2.x async, Alembic, aiobotocore/boto3, Angular (latest stable)
**Storage**: PostgreSQL (DB), S3-compatible via boto3 (MinIO local / CDN in prod)
**Testing**: pytest + pytest-asyncio (API unit + integration), Karma/Jasmine (Angular)
**Target Platform**: Linux server (k3s), browser SPA
**Project Type**: Web application (FastAPI API + Angular SPA)
**Performance Goals**: Migration script should process all existing images without timeout; no user-facing performance change
**Constraints**: Migration must be idempotent; no data loss; copy-before-delete for all storage operations
**Scale/Scope**: Personal collection (~hundreds to low thousands of images); collision probability negligible
## Constitution Check
*GATE: Must pass before implementation.*
| Principle | Status | Notes |
|-----------|--------|-------|
| §2.5 DB abstraction — all queries through repos | ✅ PASS | New `get_by_short_id()` added to `ImageRepository`; no raw SQL outside repo |
| §2.6 No speculative abstraction | ✅ PASS | `generate_short_id()` is a concrete utility; no new interfaces |
| §3.1 Routes prefixed `/api/v1/` | ✅ PASS | All routes remain under `/api/v1/images/` |
| §3.1 Adding fields is non-breaking | ✅ PASS | `short_id` is additive; `id` UUID retained |
| §4.2 Images immutable after upload | ✅ PASS | File content is copied, not replaced; the operation changes the storage key, not the bytes |
| §4.3 Deduplication by content hash | ✅ PASS | `hash` column retained; `get_by_hash` unchanged |
| §5.1 Tests alongside every implementation task | ✅ PASS | Each task includes tests |
| §5.2 Integration tests use real PostgreSQL + MinIO | ✅ PASS | Existing integration test infrastructure reused |
| §8 Scope boundaries | ✅ PASS | No multi-user, no public sharing feature, no OR/NOT tag logic |
**No violations. Implementation may proceed.**
## Project Structure
### Documentation (this feature)
```text
specs/017-short-id-migration/
├── plan.md ← this file
├── research.md ← short ID generation, migration strategy
├── data-model.md ← Image schema changes, Alembic migrations
├── contracts/
│ └── image-api.md ← updated ImageRecord schema, route changes
├── quickstart.md ← manual test scenarios
└── tasks.md ← generated by /speckit-tasks
```
### Source Code Changes
```text
api/
├── app/
│ ├── models.py # Add Image.short_id column
│ ├── utils.py # Add generate_short_id()
│ ├── repositories/
│ │ └── image_repo.py # Add get_by_short_id(), update create()
│ └── routers/
│ └── images.py # Path params uuid→str, add short_id to response
├── alembic/versions/
│ ├── 003_add_short_id.py # ADD COLUMN short_id VARCHAR(8) NULLABLE UNIQUE
│ └── 004_short_id_not_null.py # SET NOT NULL (run after migration script)
├── scripts/
│ └── migrate_to_short_ids.py # Backfill existing images
└── tests/
├── unit/
│ ├── test_hashing.py # Add generate_short_id() tests
│ ├── test_url_construction.py # Update mock images to include short_id
│ └── test_short_id.py # NEW: collision retry, charset validation
└── integration/
├── test_upload.py # Assert short_id in response
├── test_search.py # Update {id} → {short_id} in route calls
├── test_delete.py # Update route params
├── test_serving.py # Update route params
└── test_tags.py # Update route params
ui/src/app/
├── app.routes.ts # 'images/:id' → 'i/:id'
├── services/
│ └── image.service.ts # Add short_id to ImageRecord, update service calls
├── library/
│ └── library.component.ts # Navigate to ['/i', img.short_id]
├── upload/
│ └── upload.component.ts # Navigate to ['/i', res.short_id] after upload
└── detail/
└── detail.component.ts # (no route change needed; reads :id param same way)
```
**Structure Decision**: Existing web application layout. API changes are concentrated in models, repository, router, and a new migration script. UI changes are confined to routes, image service interface, and two navigation calls.
## Implementation Phases
### Phase 1: Backend — Short ID Infrastructure
1. Add `generate_short_id()` to `api/app/utils.py`
- Base62 charset: `string.ascii_letters + string.digits`
- Uses `secrets.choice` for cryptographic randomness
- Returns 8-character string
2. Add Alembic migration `003_add_short_id.py`
- `ADD COLUMN short_id VARCHAR(8) NULL`
- `CREATE UNIQUE INDEX ix_images_short_id ON images (short_id)`
3. Update `api/app/models.py`
- Add `short_id: Mapped[str | None] = mapped_column(String(8), unique=True, nullable=True, index=True)`
4. Update `api/app/repositories/image_repo.py`
- Add `get_by_short_id(short_id: str) -> Image | None`
- Update `create()` to accept and persist `short_id` parameter
5. Update `api/app/routers/images.py`
- Change all `image_id: uuid.UUID` path params to `short_id: str`
- Add `_validate_short_id(short_id: str)` helper (8 alphanumeric chars, else 422)
- Replace `get_by_id` calls with `get_by_short_id`
- Update `_image_to_dict` to include `"short_id": image.short_id` in response
- Update upload handler: generate `short_id` with collision retry, use as storage key
### Phase 2: Migration Script
`api/scripts/migrate_to_short_ids.py`:
```
for each image where short_id IS NULL:
generate short_id (retry on DB collision)
copy {hash} → {short_id} in storage
if thumbnail_key IS NOT NULL:
copy {hash}-thumb → {short_id}-thumb in storage
verify new objects exist (head_object)
UPDATE images SET short_id={sid}, storage_key={sid}, thumbnail_key={sid}-thumb WHERE id={id}
delete {hash} from storage
if thumbnail_key was not null:
delete {hash}-thumb from storage
log: "migrated {id} → {short_id}"
print summary: N migrated, M skipped (already had short_id)
```
After script runs with 0 remaining `NULL` short_ids, apply migration `004_short_id_not_null.py`.
### Phase 3: Frontend
1. `app.routes.ts`: `path: 'images/:id'``path: 'i/:id'`
2. `image.service.ts`: add `short_id: string` to `ImageRecord`
3. `library.component.ts`: `router.navigate(['/images', img.id])``router.navigate(['/i', img.short_id])`
4. `upload.component.ts`: `router.navigate(['/images', res.id])``router.navigate(['/i', res.short_id])`
### Phase 4: Polish
- Update all existing API integration tests to use `short_id` in route paths
- Run `ng lint` and `ruff check` across modified files
- Verify `ng build --configuration production` succeeds
- Run full test suites: `make test-unit && make test-integration`
## Key Implementation Notes
### Collision Retry Pattern (upload)
```python
MAX_RETRIES = 10
for attempt in range(MAX_RETRIES):
short_id = generate_short_id()
try:
image = await image_repo.create(..., short_id=short_id)
break
except IntegrityError: # short_id collision
await db.rollback()
if attempt == MAX_RETRIES - 1:
raise RuntimeError("Could not generate unique short_id")
```
### Route Validation
```python
import re
_SHORT_ID_RE = re.compile(r'^[a-zA-Z0-9]{8}$')
def _validate_short_id(short_id: str) -> None:
if not _SHORT_ID_RE.match(short_id):
raise HTTPException(422, detail={"detail": "Invalid image ID", "code": "invalid_short_id"})
```
### `_image_to_dict` Update
Add `"short_id": image.short_id` to the returned dict. The `file_url` and `thumbnail_url` generation already uses `image.storage_key` which will now equal `image.short_id` — no formula change needed.
### Migration Script Entry Point
```bash
cd api && python -m scripts.migrate_to_short_ids
```
Reads DB URL and storage config from environment variables (same as the application).