Feat: Proxy image content through the API instead of redirecting to MinIO

Replace the presigned-URL redirect (302) in GET /api/v1/images/{id}/file
with a direct proxy that fetches bytes from S3 server-side and returns them
to the client. The browser never contacts the storage backend, eliminating
the /etc/hosts workaround needed in local development.

- StorageBackend: swap get_presigned_url for get(key) -> bytes
- S3StorageBackend: implement get() via aiobotocore get_object
- serve_image_file: return Response with ETag + Cache-Control: immutable
- test_serving: assert 200 + content-type + ETag; add no-storage-details test
- Spec Kit artifacts for feature 002-api-image-proxy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-03 16:36:43 +00:00
parent 1cee6adc68
commit cd89ba5dea
13 changed files with 688 additions and 25 deletions

@@ -0,0 +1,56 @@
# Research: API Image Proxy
**Branch**: `002-api-image-proxy` | **Date**: 2026-05-03
## Decision 1: Storage retrieval method
**Decision**: Add `async def get(key: str) -> bytes` to `StorageBackend` and remove `get_presigned_url`.
**Rationale**: The only consumer of `get_presigned_url` is the `serve_image_file` route. Once the route stops redirecting and returns the bytes itself, `get_presigned_url` has no call sites. Per §2.6 (no speculative abstraction), it must be removed. A simple `get → bytes` method is consistent with `put`, which already operates on `bytes`, and is straightforward to implement and test. At a maximum file size of 50 MB, loading the full object into memory in a single call is acceptable; the same pattern is already used on upload.
**Alternatives considered**:
- `async def stream(key) -> AsyncIterator[bytes]`: True streaming avoids buffering but complicates the abstract interface (an `async def` method that yields returns an async generator directly rather than an awaitable, so the abstract declaration and every implementation must agree on the calling convention). Deferred; can be introduced later if memory pressure is observed.
- Keep `get_presigned_url` and add `get`: Violates §2.6 since `get_presigned_url` would then have no callers.
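
The interface shape chosen in this decision can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the exact signature of `put` is an assumption (the text only says it operates on `bytes`), and `InMemoryStorageBackend` is a hypothetical test double standing in for `S3StorageBackend`.

```python
import asyncio
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Abstract storage interface; only put/get are sketched here."""

    @abstractmethod
    async def put(self, key: str, data: bytes) -> None:
        """Store an object (signature assumed, not taken from the commit)."""

    @abstractmethod
    async def get(self, key: str) -> bytes:
        """Return the full object for `key`, buffered in memory."""


class InMemoryStorageBackend(StorageBackend):
    """Hypothetical test double; the real backend talks to S3/MinIO."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    async def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    async def get(self, key: str) -> bytes:
        # Raises KeyError for unknown keys; the real error type is not
        # specified in the source.
        return self._objects[key]


async def roundtrip(payload: bytes) -> bytes:
    """Store a payload and read it back through the abstract interface."""
    backend: StorageBackend = InMemoryStorageBackend()
    await backend.put("example-key", payload)
    return await backend.get("example-key")


if __name__ == "__main__":
    print(asyncio.run(roundtrip(b"\x89PNG fake image bytes")))
```

Because `get` mirrors `put` (both deal in whole `bytes` objects), a test double like the one above needs only a dictionary, which is part of what makes the buffered design easy to test.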
---
## Decision 2: HTTP response type for the proxy endpoint
**Decision**: Return `fastapi.Response(content=data, media_type=mime_type)` with the image bytes directly.
**Rationale**: FastAPI's `Response` with raw bytes is the simplest correct approach when the full content is already in memory. The `mime_type` field is already stored on the `Image` database record, so the router can set `Content-Type` from it without re-inspecting the file.
**Alternatives considered**:
- `StreamingResponse` with an async generator: Appropriate for true streaming but adds complexity with no benefit when content is already loaded as `bytes`.
- `FileResponse`: For local file paths only, not applicable.
---
## Decision 3: Caching headers
**Decision**: Add `ETag: "<sha256-hash>"` and `Cache-Control: public, max-age=31536000, immutable` to the content response.
**Rationale**: Image files are immutable after upload (constitution §4.2). The SHA-256 hash is already stored on the `Image` record. An `ETag` allows conditional `GET` requests (`If-None-Match`) so browsers skip re-downloading unchanged content. `Cache-Control: immutable` tells browsers the content will never change for this URL, eliminating speculative revalidation. Together these satisfy SC-004 (browser caching).
**Alternatives considered**:
- `Last-Modified` header: Less precise than ETag for binary content.
- No caching headers: Fails SC-004.
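
As a rough illustration of the headers described in this decision, the sketch below builds them from a stored SHA-256 hex digest using only the standard library. The function names are hypothetical. The `etag_matches` helper shows how an `If-None-Match` value could be compared against the current tag; whether the route in this commit actually answers conditional requests with `304` is not stated in the source, so treat that part as an assumption.

```python
import hashlib
from typing import Dict, Optional

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000


def caching_headers(sha256_hex: str) -> Dict[str, str]:
    """Build caching headers for an image that never changes after upload."""
    return {
        # Entity tags are sent as quoted strings.
        "ETag": f'"{sha256_hex}"',
        "Cache-Control": f"public, max-age={ONE_YEAR_SECONDS}, immutable",
    }


def etag_matches(if_none_match: Optional[str], sha256_hex: str) -> bool:
    """Return True if an If-None-Match header value matches the current tag.

    Hypothetical helper: handles a comma-separated list, the `*` wildcard,
    and the weak-validator prefix `W/` (If-None-Match uses weak comparison).
    """
    if if_none_match is None:
        return False
    current = f'"{sha256_hex}"'
    for candidate in if_none_match.split(","):
        candidate = candidate.strip()
        if candidate == "*":
            return True
        if candidate.startswith("W/"):
            candidate = candidate[2:]
        if candidate == current:
            return True
    return False


if __name__ == "__main__":
    digest = hashlib.sha256(b"example image bytes").hexdigest()
    print(caching_headers(digest))
    print(etag_matches(f'"{digest}"', digest))
```

Reusing the stored hash as the tag means no extra hashing at request time; the digest computed here with `hashlib` only stands in for the value already held on the `Image` record.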
---
## Decision 4: Endpoint path
**Decision**: Keep the existing path `GET /api/v1/images/{image_id}/file`. Change the response from `302 RedirectResponse` to `200` with binary content.
**Rationale**: The path already exists, is already referenced by the UI's `getFileUrl()` method, and appears in the existing OpenAPI contract. Changing the response (status code and body) but not the path means the UI requires no URL-construction changes. The existing integration tests (`test_serving.py`) must be updated to assert `200` instead of `302`.
**Alternatives considered**:
- New path `/api/v1/images/{id}/content`: Would require UI changes with no benefit; the existing path already semantically expresses "the file content for this image".
---
## Decision 5: Removal of `storage_key` from public API response
**Decision**: Out of scope for this feature. `storage_key` in the image metadata response is an internal S3 key (SHA-256 hex string), but it is not an actionable credential or hostname. Removal is a separate API-breaking change.
**Rationale**: The spec explicitly scopes this feature to how image *content* is delivered. Metadata response shape changes are independent and require their own spec and contract update.