agatha f953c88984 Feat: Pre-generate WebP thumbnails on upload for faster library load
- Add Pillow dependency and thumbnail.py with generate_thumbnail() — produces
  WebP ≤400px, preserves aspect ratio, never upscales, handles GIF frame 0
- Alembic migration 002 adds nullable thumbnail_key column to images table
- Upload route generates thumbnail via asyncio.to_thread (non-blocking),
  stores at {hash}-thumb; failure is tolerated and upload succeeds with null key
- New GET /api/v1/images/{id}/thumbnail endpoint: serves WebP thumbnail or
  falls back to original for pre-feature images; ETag + immutable cache headers
- Delete route cleans up thumbnail storage object alongside original
- Library grid switches from /file to /thumbnail for all image src bindings
- 59 tests passing (46 existing + 13 new across unit, upload, serving, delete)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-03 17:26:16 +00:00


Research: Upload Thumbnails

Branch: 003-upload-thumbnails | Date: 2026-05-03

Decision 1: Image processing library

Decision: Add Pillow as a runtime dependency to the API.

Rationale: Pillow is the standard Python image processing library. It supports reading JPEG, PNG, GIF (frame extraction), and WebP, and can encode output as WebP. It handles aspect-ratio-preserving resize natively via Image.thumbnail(). No other dependency is needed.

Alternatives considered:

  • wand (ImageMagick binding): More powerful but much heavier; overkill for a fixed-size resize operation.
  • opencv-python: ML-focused, large binary; not justified for simple resize.
  • Pure aiobotocore + external service: Adds operational complexity with no benefit over a local library for a single-user app.

Decision 2: Thumbnail dimensions and format

Decision: Longest side ≤ 400 px, WebP output, aspect ratio preserved. This matches both FR-003 and FR-004 and the user's stated preference.

Rationale: WebP produces smaller files than JPEG/PNG at equivalent visual quality. 400 px covers a typical grid thumbnail at 1× and 2× display density without being oversized. Pillow's Image.thumbnail((400, 400)) implements this constraint directly (it shrinks to fit within the bounding box, never upscaling).
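
The sizing rule can be sketched without Pillow, as plain arithmetic. This is an illustration only: fit_within is a hypothetical helper, not part of the codebase, and Pillow's own rounding inside Image.thumbnail() may differ from it by a pixel on some inputs.

```python
def fit_within(width: int, height: int, max_side: int = 400) -> tuple[int, int]:
    """Return the size an image would have after shrinking to fit a square box.

    Hypothetical helper that mirrors the rule described above: the longest
    side ends up at most max_side, the aspect ratio is preserved, and images
    that already fit are returned unchanged (no upscaling).
    """
    longest = max(width, height)
    if longest <= max_side:
        # Already small enough: never upscale.
        return width, height
    # Scale both sides by the same factor; clamp to 1 px for extreme ratios.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height
```

For example, a 1600×1200 original maps to 400×300, while a 300×200 original keeps its size.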

Alternatives considered:

  • JPEG thumbnails: Larger file sizes; no alpha channel support.
  • Multiple sizes: Out of scope for v1 per spec Assumptions.
  • On-demand resize: Rejected by user in favour of pre-generation.

Decision 3: Thumbnail storage key convention

Decision: {sha256_hash}-thumb, i.e. the 64-character hex digest followed by the literal 6-character suffix -thumb, giving a 70-character key. Stored in the same S3 bucket as the originals.

Rationale: Deterministic from the image hash — no new random state needed and the key can always be reconstructed from the Image.hash field. The -thumb suffix clearly distinguishes it from the original key. Fits within a String(70) column.
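
A minimal sketch of the key derivation, assuming the hash is a lowercase SHA-256 hex digest. The names thumbnail_key and THUMB_SUFFIX are hypothetical and chosen for illustration only.

```python
import hashlib

THUMB_SUFFIX = "-thumb"  # literal suffix from the decision above


def thumbnail_key(image_hash: str) -> str:
    """Derive the thumbnail storage key from an image's SHA-256 hex digest.

    Hypothetical helper: the key is a pure function of the hash, so it can
    be rebuilt at any time without extra state.
    """
    if len(image_hash) != 64:
        raise ValueError("expected a 64-character SHA-256 hex digest")
    return f"{image_hash}{THUMB_SUFFIX}"


# Usage: 64 hex characters plus the 6-character suffix gives 70 characters.
example_hash = hashlib.sha256(b"example image bytes").hexdigest()
example_key = thumbnail_key(example_hash)
```

The length check documents why String(70) is sufficient for the column.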

Alternatives considered:

  • Separate bucket for thumbnails: More complex bucket policy management with no benefit for a single-user app.
  • UUID-based key: Non-deterministic; requires an extra DB round-trip to look up.
  • {hash}/thumb.webp (path prefix): Works, but adds key parsing complexity for no gain.

Decision 4: Database schema change

Decision: Add a nullable thumbnail_key: String(70) column to the images table. NULL means no thumbnail exists (either generation failed or the image pre-dates this feature). Add a new Alembic migration 002_add_thumbnail_key.py.

Rationale: Explicitly tracking the thumbnail key in the DB turns the "does a thumbnail exist?" question into a simple IS NOT NULL check rather than an S3 head request. It also lets the delete route skip the thumbnail delete when the column is NULL, avoiding a storage error for legacy images.
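
The effect of the nullable column can be illustrated with the standard-library sqlite3 module. This is a stand-in, not the Alembic migration itself: the table here is reduced to id and hash, and the function name is hypothetical.

```python
import sqlite3


def demo_nullable_thumbnail_key() -> list[tuple[int, bool]]:
    """Show how a nullable thumbnail_key separates legacy rows from new ones."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the images table before the migration.
    conn.execute(
        "CREATE TABLE images (id INTEGER PRIMARY KEY, hash VARCHAR(64) NOT NULL)"
    )
    legacy_hash = "a" * 64
    conn.execute("INSERT INTO images (id, hash) VALUES (1, ?)", (legacy_hash,))

    # The schema change: a nullable column, so existing rows become NULL.
    conn.execute("ALTER TABLE images ADD COLUMN thumbnail_key VARCHAR(70)")

    # A row uploaded after the feature stores its thumbnail key.
    new_hash = "b" * 64
    conn.execute(
        "INSERT INTO images (id, hash, thumbnail_key) VALUES (2, ?, ?)",
        (new_hash, new_hash + "-thumb"),
    )

    # "Does a thumbnail exist?" becomes an IS NOT NULL check.
    rows = conn.execute(
        "SELECT id, thumbnail_key IS NOT NULL FROM images ORDER BY id"
    ).fetchall()
    conn.close()
    return [(image_id, bool(has_thumb)) for image_id, has_thumb in rows]
```

Calling demo_nullable_thumbnail_key() reports the legacy row as having no thumbnail and the new row as having one, without any storage request.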

Alternatives considered:

  • Derive key at runtime from image.hash + "-thumb" without a DB column: Simpler, but there is no way to distinguish "a thumbnail exists" from "generation failed or was never attempted", and delete would need a conditional S3 head request.
  • Separate thumbnails table: Over-engineered; one thumbnail per image with no additional attributes doesn't warrant its own table.

Decision 5: Where thumbnail generation lives in the code

Decision: A standalone async function generate_thumbnail(data: bytes, mime_type: str) -> bytes in a new module api/app/thumbnail.py. Called from the upload route after the original is stored, before the Image record is created.

Rationale: Keeps the thumbnail logic self-contained and independently testable. The upload route calls it but doesn't own it. Constitution §2.6 allows concrete functions when no second implementation is needed — no abstract interface is warranted.

Alternatives considered:

  • Method on StorageBackend: Wrong layer; storage knows nothing about image content.
  • Inline in the upload route: Makes the route harder to test and read.
  • A ThumbnailService class: No justification for a class when a module-level function suffices.

Decision 6: Failure handling during upload

Decision: If generate_thumbnail() raises, log the exception, set thumbnail_key to NULL on the Image record, and continue. The upload response succeeds. The GET /api/v1/images/{id}/thumbnail endpoint falls back to the original when thumbnail_key is NULL (FR-009).

Rationale: A thumbnail failure should not block the upload — the user still gets their image in the library. The fallback in the thumbnail endpoint ensures the grid still renders something.
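
A sketch of the tolerant call site, under stated assumptions: the generator is a synchronous callable run via asyncio.to_thread (as the commit message describes), and thumbnail_key_or_none, make_thumbnail, and put_object are hypothetical names rather than the real route code.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional

logger = logging.getLogger(__name__)


async def thumbnail_key_or_none(
    data: bytes,
    image_hash: str,
    make_thumbnail: Callable[[bytes], bytes],
    put_object: Callable[[str, bytes], Awaitable[None]],
) -> Optional[str]:
    """Return the stored thumbnail key, or None if generation failed.

    make_thumbnail stands in for the synchronous image work and put_object
    for the storage backend; both are assumptions made for this sketch.
    """
    try:
        # Run the CPU-bound work in a worker thread so the event loop stays free.
        thumb = await asyncio.to_thread(make_thumbnail, data)
    except Exception:
        # Log and carry on: the upload itself must still succeed.
        logger.exception("thumbnail generation failed for %s", image_hash)
        return None
    key = f"{image_hash}-thumb"
    await put_object(key, thumb)
    return key
```

The caller would write the returned value straight into thumbnail_key, so a failure ends up as NULL on the Image record.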


Decision 7: Thumbnail endpoint response

Decision: GET /api/v1/images/{id}/thumbnail follows the same pattern as GET /api/v1/images/{id}/file:

  • Returns 200 with binary content, Content-Type: image/webp (or original mime_type when falling back to original), ETag, and Cache-Control: public, max-age=31536000, immutable.
  • Returns 404 with {"detail": "...", "code": "image_not_found"} if the image does not exist.
  • Falls back to the original when thumbnail_key IS NULL.

Rationale: Consistent with the existing /file endpoint pattern established in feature 002. The UI only needs to know one URL per image for the grid.
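
The fallback rule and headers above can be sketched as a small framework-free function. Several details are assumptions made for illustration: that the original is stored under the bare hash, that the ETag is the quoted storage key, and the names ImageRecord and plan_thumbnail_response.

```python
from dataclasses import dataclass
from typing import Optional

CACHE_CONTROL = "public, max-age=31536000, immutable"


@dataclass
class ImageRecord:
    """Hypothetical, reduced view of an images row."""
    hash: str
    mime_type: str
    thumbnail_key: Optional[str]


def plan_thumbnail_response(image: ImageRecord) -> tuple[str, dict[str, str]]:
    """Pick the storage key to serve and the headers to send with it."""
    if image.thumbnail_key is not None:
        # A thumbnail exists: always WebP.
        key, content_type = image.thumbnail_key, "image/webp"
    else:
        # Legacy image or failed generation: serve the original.
        # Assumption: the original object is stored under the bare hash.
        key, content_type = image.hash, image.mime_type
    headers = {
        "Content-Type": content_type,
        # Assumption: the ETag is the quoted storage key.
        "ETag": f'"{key}"',
        "Cache-Control": CACHE_CONTROL,
    }
    return key, headers
```

The 404 case is left out of the sketch because it applies before this function would be reached, when the image lookup itself fails.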


Decision 8: GIF handling

Decision: For GIF uploads, generate_thumbnail() extracts frame 0 via Image.seek(0) before resizing. The output is always WebP (static, not animated).

Rationale: Matches spec assumption: "Animated GIF thumbnails capture only the first frame; animation is not preserved in the thumbnail." Pillow supports this with im.seek(0).