reactbin/specs/003-upload-thumbnails/research.md
agatha f953c88984 Feat: Pre-generate WebP thumbnails on upload for faster library load
- Add Pillow dependency and thumbnail.py with generate_thumbnail() — produces
  WebP ≤400px, preserves aspect ratio, never upscales, handles GIF frame 0
- Alembic migration 002 adds nullable thumbnail_key column to images table
- Upload route generates thumbnail via asyncio.to_thread (non-blocking),
  stores at {hash}-thumb; failure is tolerated and upload succeeds with null key
- New GET /api/v1/images/{id}/thumbnail endpoint: serves WebP thumbnail or
  falls back to original for pre-feature images; ETag + immutable cache headers
- Delete route cleans up thumbnail storage object alongside original
- Library grid switches from /file to /thumbnail for all image src bindings
- 59 tests passing (46 existing + 13 new across unit, upload, serving, delete)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-03 17:26:16 +00:00

# Research: Upload Thumbnails
**Branch**: `003-upload-thumbnails` | **Date**: 2026-05-03
## Decision 1: Image processing library
**Decision**: Add `Pillow` as a runtime dependency to the API.
**Rationale**: Pillow is the de facto standard Python image processing library. It supports
reading JPEG, PNG, GIF (frame extraction), and WebP, and can encode output as WebP.
It handles aspect-ratio-preserving resize natively via `Image.thumbnail()`. No
other dependency is needed.
**Alternatives considered**:
- `wand` (ImageMagick binding): More powerful but much heavier; overkill for a
fixed-size resize operation.
- `opencv-python`: Computer-vision-focused, large binary; not justified for a simple resize.
- Pure `aiobotocore` + external service: Adds operational complexity with no benefit
over a local library for a single-user app.
---
## Decision 2: Thumbnail dimensions and format
**Decision**: Longest side ≤ 400 px, WebP output, aspect ratio preserved. This
satisfies FR-003 and FR-004 and matches the user's stated preference.
**Rationale**: WebP typically produces smaller files than JPEG/PNG at equivalent visual
quality. 400 px covers a typical grid thumbnail at 1× and 2× display density without
being oversized. Pillow's `Image.thumbnail((400, 400))` implements this constraint directly
(it resizes the image in place to fit within the bounding box and never upscales).
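
As an illustration of the bounding-box rule described above, the following sketch computes target dimensions in plain Python. It is not Pillow's implementation (Pillow's own rounding may differ by a pixel in edge cases), and `fit_within` is a hypothetical helper name used only here.

```python
def fit_within(width: int, height: int, max_side: int = 400) -> tuple[int, int]:
    """Scale (width, height) so the longest side is at most max_side.

    Aspect ratio is preserved and images that already fit are returned
    unchanged, i.e. the function never upscales.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough: leave as-is
    scale = max_side / longest
    # Clamp to 1 px so extreme aspect ratios never produce a zero dimension.
    return max(1, round(width * scale)), max(1, round(height * scale))


# Landscape, portrait, and already-small inputs.
landscape = fit_within(1600, 1200)
portrait = fit_within(1000, 4000)
small = fit_within(300, 200)
```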
**Alternatives considered**:
- JPEG thumbnails: Larger file sizes; no alpha channel support.
- Multiple sizes: Out of scope for v1 per spec Assumptions.
- On-demand resize: Rejected by user in favour of pre-generation.
---
## Decision 3: Thumbnail storage key convention
**Decision**: `{sha256_hash}-thumb` (i.e., the 64-char hex hash string + the 6-char
literal `-thumb`, giving a 70-char key). Stored in the same S3 bucket as originals.
**Rationale**: Deterministic from the image hash — no new random state needed and the
key can always be reconstructed from the `Image.hash` field. The `-thumb` suffix
clearly distinguishes it from the original key. Fits within a `String(70)` column.
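
A minimal sketch of the key derivation, assuming the image hash is the lowercase hex SHA-256 digest of the uploaded bytes. `thumbnail_key_for` is a hypothetical helper name, not necessarily what the codebase uses.

```python
import hashlib

THUMB_SUFFIX = "-thumb"  # 6 characters
SHA256_HEX_LEN = 64


def thumbnail_key_for(image_hash: str) -> str:
    """Derive the deterministic thumbnail storage key from an image hash."""
    if len(image_hash) != SHA256_HEX_LEN:
        raise ValueError("expected a 64-character SHA-256 hex digest")
    return f"{image_hash}{THUMB_SUFFIX}"


# Example: hash some bytes and derive the key; the result fits String(70).
example_hash = hashlib.sha256(b"example image bytes").hexdigest()
example_key = thumbnail_key_for(example_hash)
```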
**Alternatives considered**:
- Separate bucket for thumbnails: More complex bucket policy management with no benefit
for a single-user app.
- UUID-based key: Non-deterministic; requires an extra DB round-trip to look up.
- `{hash}/thumb.webp` (path prefix): Works, but adds key parsing complexity for no gain.
---
## Decision 4: Database schema change
**Decision**: Add a nullable `thumbnail_key: String(70)` column to the `images`
table. `NULL` means no thumbnail exists (either generation failed or the image
pre-dates this feature). Add a new Alembic migration `002_add_thumbnail_key.py`.
**Rationale**: Explicitly tracking the thumbnail key in the DB makes the "does a
thumbnail exist?" question a simple `IS NOT NULL` check rather than an S3 head
request. Also allows the delete route to skip the thumbnail delete if the column
is NULL, avoiding a storage error for legacy images.
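
The delete-side benefit can be sketched as follows. `ImageRecord`, its `storage_key` field, and `keys_to_delete` are hypothetical stand-ins for the real model and route code; only `thumbnail_key` and its NULL semantics come from this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageRecord:
    """Hypothetical, simplified stand-in for a row in the images table."""
    hash: str
    storage_key: str                      # assumed name for the original's key
    thumbnail_key: Optional[str] = None   # None mirrors SQL NULL


def keys_to_delete(image: ImageRecord) -> list[str]:
    """Return the storage objects to remove when an image is deleted."""
    keys = [image.storage_key]
    # Legacy images and failed generations have no thumbnail object,
    # so no delete (and no S3 head request) is issued for them.
    if image.thumbnail_key is not None:
        keys.append(image.thumbnail_key)
    return keys


legacy = ImageRecord(hash="abc", storage_key="abc")
current = ImageRecord(hash="def", storage_key="def", thumbnail_key="def-thumb")
```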
**Alternatives considered**:
- Derive key at runtime from `image.hash + "-thumb"` without a DB column: Simpler, but
leaves no way to distinguish "thumbnail exists" from "no thumbnail" (generation failed
or was never attempted), and delete would need a conditional S3 head request.
- Separate `thumbnails` table: Over-engineered; one thumbnail per image with no
additional attributes doesn't warrant its own table.
---
## Decision 5: Where thumbnail generation lives in the code
**Decision**: A standalone function `generate_thumbnail(data: bytes, mime_type: str) -> bytes`
in a new module `api/app/thumbnail.py`, written as a plain synchronous function so the
upload route can run it off the event loop via `asyncio.to_thread`. Called from the upload
route after the original is stored, before the Image record is created.
**Rationale**: Keeps the thumbnail logic self-contained and independently testable.
The upload route calls it but doesn't own it. Constitution §2.6 allows concrete
functions when no second implementation is needed — no abstract interface is warranted.
**Alternatives considered**:
- Method on `StorageBackend`: Wrong layer; storage knows nothing about image content.
- Inline in the upload route: Makes the route harder to test and read.
- A `ThumbnailService` class: No justification for a class when a module-level function suffices.
---
## Decision 6: Failure handling during upload
**Decision**: If `generate_thumbnail()` raises, log the exception, set `thumbnail_key`
to `NULL` on the Image record, and continue. The upload response succeeds. The
`GET /api/v1/images/{id}/thumbnail` endpoint falls back to the original when
`thumbnail_key` is NULL (FR-009).
**Rationale**: A thumbnail failure should not block the upload — the user still gets
their image in the library. The fallback in the thumbnail endpoint ensures the grid
still renders something.
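
A rough sketch of the tolerant upload flow, under several assumptions: the original is stored under its hash, storage exposes an async writer (`put_object` here is hypothetical), and the thumbnail function is synchronous so it can run in a worker thread. This version also treats a failed thumbnail write as a thumbnail failure, which goes slightly beyond the decision text.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional

logger = logging.getLogger(__name__)


async def store_with_thumbnail(
    data: bytes,
    mime_type: str,
    image_hash: str,
    put_object: Callable[[str, bytes], Awaitable[None]],  # hypothetical storage writer
    make_thumbnail: Callable[[bytes, str], bytes],         # e.g. generate_thumbnail
) -> Optional[str]:
    """Store the original, then attempt the thumbnail.

    Returns the thumbnail key, or None when generation or storage fails.
    The caller would write that value to thumbnail_key on the Image record.
    """
    await put_object(image_hash, data)  # the original is always stored first
    try:
        # Image resizing is CPU-bound; keep it off the event loop.
        thumb = await asyncio.to_thread(make_thumbnail, data, mime_type)
        key = f"{image_hash}-thumb"
        await put_object(key, thumb)
        return key
    except Exception:
        logger.exception("Thumbnail generation failed for %s", image_hash)
        return None  # upload still succeeds with a NULL thumbnail_key
```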
---
## Decision 7: Thumbnail endpoint response
**Decision**: `GET /api/v1/images/{id}/thumbnail` follows the same pattern as
`GET /api/v1/images/{id}/file`:
- Returns `200` with binary content, `Content-Type: image/webp` (or original
`mime_type` when falling back to original), `ETag`, and
`Cache-Control: public, max-age=31536000, immutable`.
- Returns `404` with `{"detail": "...", "code": "image_not_found"}` if the image
does not exist.
- Falls back to the original when `thumbnail_key IS NULL`.
**Rationale**: Consistent with the existing `/file` endpoint pattern established in
feature 002. The UI only needs to know one URL per image for the grid.
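
The selection and header logic above might look like the sketch below. `StoredImage`, `storage_key`, and `thumbnail_response_parts` are hypothetical names, and using the quoted storage key as the `ETag` value is an assumption; this document does not say how the ETag is derived.

```python
from dataclasses import dataclass
from typing import Optional

CACHE_CONTROL = "public, max-age=31536000, immutable"


@dataclass
class StoredImage:
    """Hypothetical, simplified view of an Image record."""
    storage_key: str                      # assumed name for the original's key
    mime_type: str
    thumbnail_key: Optional[str] = None   # None mirrors SQL NULL


def thumbnail_response_parts(image: StoredImage) -> tuple[str, dict[str, str]]:
    """Pick the object to serve and build the response headers."""
    if image.thumbnail_key is not None:
        key, content_type = image.thumbnail_key, "image/webp"
    else:
        # Fallback for legacy images or failed generation: serve the original.
        key, content_type = image.storage_key, image.mime_type
    headers = {
        "Content-Type": content_type,
        "ETag": f'"{key}"',  # assumption: content-addressed key as the ETag
        "Cache-Control": CACHE_CONTROL,
    }
    return key, headers
```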
---
## Decision 8: GIF handling
**Decision**: For GIF uploads, `generate_thumbnail()` extracts frame 0 via
`Image.seek(0)` before resizing. The output is always WebP (static, not animated).
**Rationale**: Matches spec assumption: "Animated GIF thumbnails capture only the
first frame; animation is not preserved in the thumbnail." Pillow supports this
with `im.seek(0)`.
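
Putting Decisions 2 and 8 together, a minimal sketch of `generate_thumbnail()` could look like this. It assumes a Pillow build with WebP support and a synchronous function; converting to RGBA before resizing is a choice made for this sketch (so palette-based GIFs resize cleanly and transparency is kept), not something the spec mandates.

```python
from io import BytesIO

from PIL import Image

MAX_SIDE = 400  # longest side of the thumbnail, in pixels


def generate_thumbnail(data: bytes, mime_type: str) -> bytes:
    """Return a static WebP thumbnail whose longest side is at most MAX_SIDE."""
    with Image.open(BytesIO(data)) as im:
        if mime_type == "image/gif":
            im.seek(0)  # animated GIF: use the first frame only
        # Convert to RGBA so palette images resize with proper filtering
        # and any transparency survives in the WebP output.
        frame = im.convert("RGBA")
    # thumbnail() resizes in place, keeps aspect ratio, and never upscales.
    frame.thumbnail((MAX_SIDE, MAX_SIDE))
    out = BytesIO()
    frame.save(out, format="WEBP")
    return out.getvalue()
```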