Feat: Pre-generate WebP thumbnails on upload for faster library load
- Add Pillow dependency and thumbnail.py with generate_thumbnail() — produces
WebP ≤400px, preserves aspect ratio, never upscales, handles GIF frame 0
- Alembic migration 002 adds nullable thumbnail_key column to images table
- Upload route generates thumbnail via asyncio.to_thread (non-blocking),
stores at {hash}-thumb; failure is tolerated and upload succeeds with null key
- New GET /api/v1/images/{id}/thumbnail endpoint: serves WebP thumbnail or
falls back to original for pre-feature images; ETag + immutable cache headers
- Delete route cleans up thumbnail storage object alongside original
- Library grid switches from /file to /thumbnail for all image src bindings
- 59 tests passing (46 existing + 13 new across unit, upload, serving, delete)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
specs/003-upload-thumbnails/research.md
@@ -0,0 +1,130 @@
# Research: Upload Thumbnails

**Branch**: `003-upload-thumbnails` | **Date**: 2026-05-03

## Decision 1: Image processing library

**Decision**: Add `Pillow` as a runtime dependency to the API.

**Rationale**: Pillow is the de facto standard Python image processing library. It
supports reading JPEG, PNG, GIF (frame extraction), and WebP, and can encode output
as WebP. It handles aspect-ratio-preserving resize natively via `Image.thumbnail()`.
No other dependency is needed.

**Alternatives considered**:
- `wand` (ImageMagick binding): More powerful but much heavier; overkill for a
  fixed-size resize operation.
- `opencv-python`: Computer-vision-focused, large binary; not justified for a simple resize.
- Pure `aiobotocore` + external service: Adds operational complexity with no benefit
  over a local library for a single-user app.

---

## Decision 2: Thumbnail dimensions and format

**Decision**: Longest side ≤ 400 px, WebP output, aspect ratio preserved. This
matches FR-003 and FR-004 exactly, as well as the user's stated preference.

**Rationale**: WebP typically produces smaller files than JPEG or PNG at equivalent
visual quality. 400 px covers a typical grid thumbnail at 1× and 2× display density
without being oversized. Pillow's `Image.thumbnail((400, 400))` implements this
constraint directly: it shrinks the image to fit within the bounding box and never
upscales.
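
The sizing rule described above can be sketched without Pillow. This is only an illustration of the "fit within 400 px, never upscale" constraint; `fit_within` is a hypothetical helper, and Pillow's own rounding may differ by a pixel on some inputs.

```python
def fit_within(width: int, height: int, max_side: int = 400) -> tuple[int, int]:
    """Scale (width, height) so the longest side is at most max_side.

    Hypothetical helper for illustration; it mirrors the rule in Decision 2,
    not Pillow's exact rounding behaviour.
    """
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        # Already small enough: never upscale.
        return width, height
    scale = max_side / longest
    # Keep at least 1 px on the short side for extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1600, 1200))  # landscape original, shrunk to (400, 300)
print(fit_within(200, 100))    # smaller than the box, left as (200, 100)
```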

**Alternatives considered**:
- JPEG thumbnails: Larger file sizes; no alpha channel support.
- Multiple sizes: Out of scope for v1 per spec Assumptions.
- On-demand resize: Rejected by user in favour of pre-generation.

---

## Decision 3: Thumbnail storage key convention

**Decision**: `{sha256_hash}-thumb`, that is, the 64-character hex hash plus the
literal `-thumb` suffix, giving a 70-character key. Stored in the same S3 bucket as
originals.

**Rationale**: The key is deterministic from the image hash: no new random state is
needed, and it can always be reconstructed from the `Image.hash` field. The `-thumb`
suffix clearly distinguishes it from the original key. It fits within a `String(70)`
column.
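
A minimal sketch of the key convention, assuming the hash is a lowercase SHA-256 hex digest. The helper name `thumbnail_key_for` and the validation it performs are illustrative and not taken from the codebase.

```python
import hashlib

THUMB_SUFFIX = "-thumb"


def thumbnail_key_for(image_hash: str) -> str:
    """Derive the thumbnail storage key from a SHA-256 hex digest (hypothetical helper)."""
    is_hex = all(c in "0123456789abcdef" for c in image_hash)
    if len(image_hash) != 64 or not is_hex:
        raise ValueError("expected a 64-character lowercase hex SHA-256 digest")
    return image_hash + THUMB_SUFFIX


digest = hashlib.sha256(b"example image bytes").hexdigest()
key = thumbnail_key_for(digest)
print(len(key))  # 64 hex characters + 6 suffix characters = 70
```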

**Alternatives considered**:
- Separate bucket for thumbnails: More complex bucket policy management with no benefit
  for a single-user app.
- UUID-based key: Non-deterministic; requires an extra DB round-trip to look up.
- `{hash}/thumb.webp` (path prefix): Works, but adds key parsing complexity for no gain.

---

## Decision 4: Database schema change

**Decision**: Add a nullable `thumbnail_key: String(70)` column to the `images`
table. `NULL` means no thumbnail exists (either generation failed or the image
pre-dates this feature). Add a new Alembic migration `002_add_thumbnail_key.py`.

**Rationale**: Explicitly tracking the thumbnail key in the DB makes the "does a
thumbnail exist?" question a simple `IS NOT NULL` check rather than an S3 `HEAD`
request. It also lets the delete route skip the thumbnail delete when the column
is `NULL`, avoiding a storage error for legacy images.
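
The effect of a nullable column can be illustrated with the standard-library `sqlite3` module. This is a stand-in for demonstration only: the project uses an Alembic migration, the table here has just the columns needed for the example, and both `keys_to_delete` and the assumption that the original object is stored under the bare hash are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, hash TEXT NOT NULL)")
# A row that pre-dates the feature.
conn.execute("INSERT INTO images (hash) VALUES (?)", ("a" * 64,))

# Stand-in for the migration: a nullable column, so existing rows become NULL.
conn.execute("ALTER TABLE images ADD COLUMN thumbnail_key VARCHAR(70)")
conn.execute(
    "INSERT INTO images (hash, thumbnail_key) VALUES (?, ?)",
    ("b" * 64, "b" * 64 + "-thumb"),
)


def keys_to_delete(db: sqlite3.Connection, image_id: int) -> list[str]:
    """Return the storage keys a delete should remove (hypothetical helper)."""
    row = db.execute(
        "SELECT hash, thumbnail_key FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        return []
    original_key, thumb_key = row  # assumption: original is stored under its hash
    # Skip the thumbnail when the column is NULL (legacy or failed generation).
    return [original_key] if thumb_key is None else [original_key, thumb_key]


with_thumbs = conn.execute(
    "SELECT COUNT(*) FROM images WHERE thumbnail_key IS NOT NULL"
).fetchone()[0]
print(with_thumbs)
print(keys_to_delete(conn, 1))
```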

**Alternatives considered**:
- Derive key at runtime from `image.hash + "-thumb"` without a DB column: Simpler but
  means no way to distinguish "thumbnail was generated" from "thumbnail was never
  attempted", and delete would need a conditional S3 `HEAD` request.
- Separate `thumbnails` table: Over-engineered; one thumbnail per image with no
  additional attributes doesn't warrant its own table.

---

## Decision 5: Where thumbnail generation lives in the code

**Decision**: A standalone async function `generate_thumbnail(data: bytes, mime_type: str) -> bytes`
in a new module `api/app/thumbnail.py`. It is called from the upload route after the
original is stored, before the Image record is created.

**Rationale**: Keeps the thumbnail logic self-contained and independently testable.
The upload route calls it but doesn't own it. Constitution §2.6 allows concrete
functions when no second implementation is needed, so no abstract interface is warranted.

**Alternatives considered**:
- Method on `StorageBackend`: Wrong layer; storage knows nothing about image content.
- Inline in the upload route: Makes the route harder to test and read.
- A `ThumbnailService` class: No justification for a class when a module-level function suffices.

---

## Decision 6: Failure handling during upload

**Decision**: If `generate_thumbnail()` raises, log the exception, set `thumbnail_key`
to `NULL` on the Image record, and continue. The upload response succeeds. The
`GET /api/v1/images/{id}/thumbnail` endpoint falls back to the original when
`thumbnail_key` is NULL (FR-009).
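
The tolerant flow might look roughly like the sketch below. It is not the actual upload route: storage is a plain `dict`, `store_with_thumbnail` and the two stand-in generators are hypothetical, and the generator is treated as a plain synchronous function because the commit message describes running it via `asyncio.to_thread`.

```python
import asyncio
import logging
from typing import Callable, Optional

logger = logging.getLogger("upload")


async def store_with_thumbnail(
    data: bytes,
    image_hash: str,
    storage: dict[str, bytes],
    make_thumbnail: Callable[[bytes], bytes],
) -> Optional[str]:
    """Store the original, then try the thumbnail; return its key or None."""
    storage[image_hash] = data  # the original is stored first
    thumb_key = f"{image_hash}-thumb"
    try:
        # Run the CPU-bound work in a worker thread so the event loop stays free.
        storage[thumb_key] = await asyncio.to_thread(make_thumbnail, data)
    except Exception:
        # Tolerated: the upload still succeeds, with a NULL thumbnail_key.
        logger.exception("thumbnail generation failed for %s", image_hash)
        return None
    return thumb_key


def fake_thumbnail(data: bytes) -> bytes:
    """Stand-in generator that pretends to encode WebP."""
    return b"webp:" + data[:4]


def broken_thumbnail(data: bytes) -> bytes:
    """Stand-in generator that always fails."""
    raise ValueError("unreadable image")


store: dict[str, bytes] = {}
print(asyncio.run(store_with_thumbnail(b"rawbytes", "h1", store, fake_thumbnail)))
print(asyncio.run(store_with_thumbnail(b"rawbytes", "h2", store, broken_thumbnail)))
```

The second call logs the exception and returns `None`, which is the value the caller would write to `thumbnail_key`.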

**Rationale**: A thumbnail failure should not block the upload; the user still gets
their image in the library. The fallback in the thumbnail endpoint ensures the grid
still renders something.

---

## Decision 7: Thumbnail endpoint response

**Decision**: `GET /api/v1/images/{id}/thumbnail` follows the same pattern as
`GET /api/v1/images/{id}/file`:
- Returns `200` with binary content, `Content-Type: image/webp` (or original
  `mime_type` when falling back to original), `ETag`, and
  `Cache-Control: public, max-age=31536000, immutable`.
- Returns `404` with `{"detail": "...", "code": "image_not_found"}` if the image
  does not exist.
- Falls back to the original when `thumbnail_key IS NULL`.
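
The selection logic in the list above can be sketched framework-free. `ImageRecord` and `thumbnail_response_plan` are hypothetical names; the sketch also assumes the original object is stored under the bare hash and that the `ETag` is the quoted storage key, neither of which is confirmed by this document.

```python
from dataclasses import dataclass
from typing import Optional

CACHE_CONTROL = "public, max-age=31536000, immutable"


@dataclass
class ImageRecord:
    """Minimal stand-in for the Image row (only the fields used here)."""
    hash: str
    mime_type: str
    thumbnail_key: Optional[str]


def thumbnail_response_plan(image: ImageRecord) -> tuple[str, dict[str, str]]:
    """Pick the storage key to serve and the response headers to send."""
    if image.thumbnail_key is not None:
        key, content_type = image.thumbnail_key, "image/webp"
    else:
        # Fallback for pre-feature images or failed generation.
        key, content_type = image.hash, image.mime_type
    headers = {
        "Content-Type": content_type,
        "ETag": f'"{key}"',  # assumption: ETag derived from the storage key
        "Cache-Control": CACHE_CONTROL,
    }
    return key, headers


legacy = ImageRecord(hash="a" * 64, mime_type="image/png", thumbnail_key=None)
key, headers = thumbnail_response_plan(legacy)
print(headers["Content-Type"])  # falls back to the original's mime type
```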

**Rationale**: Consistent with the existing `/file` endpoint pattern established in
feature 002. The UI only needs to know one URL per image for the grid.

---

## Decision 8: GIF handling

**Decision**: For GIF uploads, `generate_thumbnail()` extracts frame 0 via
`Image.seek(0)` before resizing. The output is always WebP (static, not animated).

**Rationale**: Matches spec assumption: "Animated GIF thumbnails capture only the
first frame; animation is not preserved in the thumbnail." Pillow supports this
with `im.seek(0)`.
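
A minimal sketch of what such a function could look like with Pillow, assuming Pillow is installed with WebP support. It is not the contents of `api/app/thumbnail.py`: it is written as a synchronous function, and the explicit mode conversion before encoding is an added assumption.

```python
from io import BytesIO

from PIL import Image

MAX_SIDE = 400


def generate_thumbnail(data: bytes, mime_type: str) -> bytes:
    """Return a static WebP thumbnail whose longest side is at most MAX_SIDE."""
    with Image.open(BytesIO(data)) as im:
        if mime_type == "image/gif":
            im.seek(0)  # use the first frame only; animation is dropped
        # Normalise the mode so palette or greyscale inputs encode cleanly.
        has_alpha = im.mode in ("RGBA", "LA") or "transparency" in im.info
        frame = im.convert("RGBA" if has_alpha else "RGB")
        # thumbnail() resizes in place, keeps aspect ratio, and never upscales.
        frame.thumbnail((MAX_SIDE, MAX_SIDE))
        out = BytesIO()
        frame.save(out, format="WEBP")
        return out.getvalue()


# Build a two-frame animated GIF in memory to exercise the GIF path.
frames = [Image.new("RGB", (800, 600), colour) for colour in ("red", "blue")]
gif = BytesIO()
frames[0].save(gif, format="GIF", save_all=True, append_images=frames[1:])

thumb = generate_thumbnail(gif.getvalue(), "image/gif")
with Image.open(BytesIO(thumb)) as result:
    print(result.format, result.size)
```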