agatha cd89ba5dea Feat: Proxy image content through the API instead of redirecting to MinIO
Replace the presigned-URL redirect (302) in GET /api/v1/images/{id}/file
with a direct proxy that fetches bytes from S3 server-side and returns them
to the client. The browser never contacts the storage backend, eliminating
the /etc/hosts workaround needed in local development.

- StorageBackend: swap get_presigned_url for get(key) -> bytes
- S3StorageBackend: implement get() via aiobotocore get_object
- serve_image_file: return Response with ETag + Cache-Control: immutable
- test_serving: assert 200 + content-type + ETag; add no-storage-details test
- Spec Kit artifacts for feature 002-api-image-proxy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-03 16:36:43 +00:00


# Feature Specification: API Image Proxy
**Feature Branch**: `002-api-image-proxy`
**Created**: 2026-05-03
**Status**: Draft
**Input**: User description: "Instead of directly exposing the Minio storage, proxy fetching images through the API. The API server should talk directly to the S3 backend."
## User Scenarios & Testing *(mandatory)*
### User Story 1 — View Images Without Storage Configuration (Priority: P1)
A user opens the application on any network without any special DNS or host-file
configuration. All images — both thumbnails in the library grid and the full-size
image on the detail page — load correctly.
**Why this priority**: This is the core motivator for the feature. Direct storage
exposure required `/etc/hosts` hacks to resolve the internal storage hostname;
the proxy removes this friction entirely and makes the application work
out-of-the-box.
**Independent Test**: Start the application on a machine with no custom
`/etc/hosts` entries for the storage backend. Open the library and verify all
thumbnails load. Open an image detail page and verify the full-size image loads.
Confirm no network errors related to image fetching appear in the browser console.
**Acceptance Scenarios**:
1. **Given** the application is running, **When** the user opens the library,
**Then** all image thumbnails display correctly with no host-file configuration
required.
2. **Given** the user opens an image detail page, **When** the page loads,
**Then** the full-size image displays correctly.
3. **Given** an image identifier that does not exist, **When** the client
requests its content, **Then** a not-found response is returned with no
storage-specific details exposed.
---
### User Story 2 — Storage Backend Remains Private (Priority: P1)
The image storage system is never directly reachable or identifiable from a
client browser. Image URLs in the application reference only the API, never the
storage backend's hostname, credentials, or bucket names.
**Why this priority**: Security and portability. A private storage backend means
credentials cannot be extracted from the browser and the storage layer can be
reconfigured without affecting clients.
**Independent Test**: Open browser developer tools while browsing the
application. Inspect all image `src` attributes, network requests, and API
responses. Confirm that no request goes directly to the storage backend and no
storage-specific URL, hostname, or credential appears in the browser's network
tab or page source.
**Acceptance Scenarios**:
1. **Given** the user is browsing the application, **When** they inspect network
traffic, **Then** all image content is fetched through the application's API
domain — no request goes directly to storage infrastructure.
2. **Given** any API response containing image metadata, **When** it is
inspected, **Then** no storage-specific URLs, hostnames, bucket names, or
credentials appear in the response body.
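
The second acceptance scenario lends itself to an automated check. The following is a minimal sketch, not the project's actual test: the marker patterns (`minio`, `images-bucket`) and the sample payloads are hypothetical and would need to be replaced with the real deployment's storage hostname and bucket name.

```python
import json
import re

# Hypothetical markers: swap in the hostnames, bucket names, and signing
# parameters used by the actual deployment.
STORAGE_MARKERS = [
    r"minio",             # assumed fragment of the internal storage hostname
    r"X-Amz-[A-Za-z-]+",  # query parameters that appear in S3 presigned URLs
    r"s3://",             # raw S3 URIs
    r"images-bucket",     # assumed bucket name
]


def find_storage_leaks(payload: object) -> list[str]:
    """Return every marker pattern that matches the serialized payload."""
    text = json.dumps(payload)
    return [p for p in STORAGE_MARKERS if re.search(p, text, re.IGNORECASE)]


# A response that references only the API path passes the check.
clean = {"id": "abc123", "file_url": "/api/v1/images/abc123/file"}

# A response that still carries a presigned storage URL is flagged.
leaky = {
    "id": "abc123",
    "file_url": "http://minio:9000/images-bucket/abc123?X-Amz-Signature=deadbeef",
}

print(find_storage_leaks(clean))
print(find_storage_leaks(leaky))
```

Scanning the serialized body catches leaks in any field, including nested ones, at the cost of possible false positives if a marker string appears in user-supplied data such as tags.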
---
### Edge Cases
- What happens when the storage backend is temporarily unavailable? → The proxy
returns a generic server-error response; no storage internals are disclosed.
- What happens when the image exists in the database but the file is missing from
storage? → The proxy returns a not-found response with a generic message.
- What happens when image content is large (up to 50 MB)? → Content streams to
the client without exhausting server memory.
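
The first two edge cases amount to translating storage failures into generic client-facing errors. A minimal sketch of that mapping follows. The exception classes, the `ProxyError` type, and the choice of 404 and 500 are assumptions for illustration; the specification only asks for a "not-found" and a "generic server-error" response.

```python
from dataclasses import dataclass


class StorageObjectMissing(Exception):
    """Hypothetical: the key is absent from the storage backend."""


class StorageUnavailable(Exception):
    """Hypothetical: the storage backend could not be reached."""


@dataclass(frozen=True)
class ProxyError:
    status: int
    detail: str


def to_client_error(exc: Exception) -> ProxyError:
    """Map a storage failure to a response that discloses no internals.

    The original exception text is deliberately never copied into the
    client-facing detail, since it may contain bucket names or hostnames.
    """
    if isinstance(exc, StorageObjectMissing):
        return ProxyError(status=404, detail="Image not found")
    # Anything else, including StorageUnavailable, becomes a generic error.
    return ProxyError(status=500, detail="Internal server error")


missing = to_client_error(StorageObjectMissing("NoSuchKey: bucket=images key=abc"))
down = to_client_error(StorageUnavailable("connect to minio:9000 refused"))
print(missing, down)
```

The full exception would typically still be logged server-side so operators can diagnose the failure.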
## Requirements *(mandatory)*
### Functional Requirements
- **FR-001**: The API MUST expose an endpoint that returns image binary content
when given a valid image identifier.
- **FR-002**: The API MUST serve image content with the correct file-type
indicator so that browsers render it natively.
- **FR-003**: The image content endpoint MUST be the sole mechanism by which
clients retrieve image binary data; the API and UI MUST NOT present any other
image content URL format to clients.
- **FR-004**: The storage backend MUST NOT be directly reachable by client
browsers; all image content MUST flow through the API.
- **FR-005**: The API MUST return a not-found response when content is requested
for a non-existent image.
- **FR-006**: The API MUST handle image content requests for files up to 50 MB
without exhausting server memory.
- **FR-007**: The image content endpoint MUST include caching metadata in its
response so that browsers can avoid redundant fetches for the same image.
- **FR-008**: The UI MUST construct all image display URLs using the API's image
content endpoint, for both library thumbnails and the detail view.
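
FR-002, FR-006, and FR-007 can be illustrated together. This is a sketch under stated assumptions, not the implemented handler: the 64 KiB chunk size, the one-year `max-age`, and deriving the ETag from the image identifier are all illustrative choices (the last one relies on files being immutable after upload).

```python
import hashlib
import io
from typing import BinaryIO, Iterator

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune for the deployment


def iter_chunks(body: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the body in fixed-size chunks so a 50 MB file is never held whole."""
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break
        yield chunk


def image_headers(content_type: str, image_id: str) -> dict[str, str]:
    """Build response headers for the file-type indicator and caching metadata."""
    # Assumption: content never changes for a given identifier, so the
    # identifier alone is enough to derive a stable ETag.
    etag = hashlib.sha256(image_id.encode("utf-8")).hexdigest()[:16]
    return {
        "Content-Type": content_type,
        "ETag": f'"{etag}"',
        "Cache-Control": "public, max-age=31536000, immutable",
    }


# Stand-in for a storage response body: two full chunks plus 10 bytes.
fake_body = io.BytesIO(b"x" * (CHUNK_SIZE * 2 + 10))
chunks = list(iter_chunks(fake_body))
headers = image_headers("image/png", "abc123")
print(len(chunks), headers)
```

In a web framework, the generator would be handed to a streaming response type rather than collected into a list as done here for demonstration.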
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: A user can view all images in the library and on detail pages on a
freshly configured machine with no custom DNS or host-file entries for the
storage backend.
- **SC-002**: No image-related network request in the browser targets the storage
backend directly — verifiable by inspecting browser developer tools.
- **SC-003**: Images up to 50 MB load completely without a server-side memory or
timeout error.
- **SC-004**: A browser that has already loaded an image does not re-download it
on a second visit to the same page (browser caching headers are respected).
- **SC-005**: Image load time via the proxy is no more than 20% longer than the
load time when served directly from storage on a local network connection.
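
SC-005 can be turned into a pass/fail check once timings are collected. The sketch below assumes the median of several samples is the comparison statistic; the specification does not say how load time should be aggregated, and the sample values are made up.

```python
from statistics import median
from typing import Sequence


def within_overhead_budget(
    proxy_ms: Sequence[float],
    direct_ms: Sequence[float],
    budget: float = 0.20,
) -> bool:
    """Return True if the proxy median is at most (1 + budget) x the direct median."""
    return median(proxy_ms) <= median(direct_ms) * (1 + budget)


# Illustrative timings in milliseconds, not measurements.
direct = [100.0, 100.0, 100.0]
acceptable_proxy = [110.0, 115.0, 118.0]  # median 115, limit 120
slow_proxy = [125.0, 130.0, 128.0]        # median 128, limit 120

print(within_overhead_budget(acceptable_proxy, direct))
print(within_overhead_budget(slow_proxy, direct))
```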
## Assumptions
- The storage backend is accessible from the API server via an internal network
hostname that is not reachable from client browsers.
- Image files are immutable after upload; once a client has the content for a
given image identifier it need not be re-fetched.
- No authentication is required to access the image content endpoint in Phase 1,
consistent with the application's no-auth Phase 1 stance.
- This feature changes how image content is delivered but does not alter any
image metadata endpoints, the upload flow, or tag management behaviour.
- No new database entities or schema changes are required; only how the existing
stored file is retrieved and served changes.
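
The immutability assumption is what makes aggressive caching safe: an ETag that matched once will always match. As a sketch of how a server could use this to answer revalidation requests, the hypothetical helper below decides between 200 and 304 from an `If-None-Match` header. It uses weak comparison, as HTTP specifies for that header, and splits on commas, which is a simplification that would mishandle an ETag containing a comma.

```python
from typing import Optional


def _strip_weak(tag: str) -> str:
    """Drop surrounding whitespace and a leading W/ weak-validator prefix."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def status_for(if_none_match: Optional[str], current_etag: str) -> int:
    """Return 304 if the client's cached ETag is still current, else 200."""
    if not if_none_match:
        return 200  # no validator sent: serve the full body
    if if_none_match.strip() == "*":
        return 304  # "*" matches any existing representation
    # Simplification: assumes no ETag value itself contains a comma.
    candidates = {_strip_weak(t) for t in if_none_match.split(",")}
    return 304 if _strip_weak(current_etag) in candidates else 200


print(status_for(None, '"abc"'))
print(status_for('"abc"', '"abc"'))
print(status_for('W/"abc", "def"', '"abc"'))
print(status_for('"zzz"', '"abc"'))
```

With `Cache-Control: immutable`, many browsers skip revalidation entirely while the entry is fresh, so this path mainly serves clients that ignore the directive or whose cache entry has expired.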