Files
reactbin/specs/002-api-image-proxy/tasks.md
agatha cd89ba5dea Feat: Proxy image content through the API instead of redirecting to MinIO
Replace the presigned-URL redirect (302) in GET /api/v1/images/{id}/file
with a direct proxy that fetches bytes from S3 server-side and returns them
to the client. The browser never contacts the storage backend, eliminating
the /etc/hosts workaround needed in local development.

- StorageBackend: swap get_presigned_url for get(key) -> bytes
- S3StorageBackend: implement get() via aiobotocore get_object
- serve_image_file: return Response with ETag + Cache-Control: immutable
- test_serving: assert 200 + content-type + ETag; add no-storage-details test
- Spec Kit artifacts for feature 002-api-image-proxy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-03 16:36:43 +00:00

136 lines
7.8 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Tasks: API Image Proxy
**Input**: Design documents from `specs/002-api-image-proxy/`
**Prerequisites**: plan.md ✅, spec.md ✅, research.md ✅, data-model.md ✅, contracts/api.md ✅
**TDD**: Tests are non-negotiable per constitution §5.1. Every test task MUST be written and confirmed failing before its implementation task runs.
**Organization**: Tasks are grouped by user story to enable independent implementation and testing.
## Format: `[ID] [P?] [Story] Description`
- **[P]**: Can run in parallel (different files, no dependencies on incomplete tasks)
- **[Story]**: Which user story this task belongs to (US1, US2)
---
## Phase 1: Setup
No setup required. The project is already initialized; the feature branch `002-api-image-proxy` is active.
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Update the `StorageBackend` interface before any route or test change can reference the new method. These are pure interface/plumbing tasks with no callers yet — no preceding failing test is required because the tasks themselves introduce no observable behaviour change.
**⚠️ CRITICAL**: Both T001 and T002 must be complete before US1 work begins.
- [x] T001 Remove abstract method `get_presigned_url` and add `async def get(self, key: str) -> bytes` abstract method (with docstring "Return object content as bytes.") to `StorageBackend` in `api/app/storage/backend.py`
- [x] T002 [P] Remove `get_presigned_url` implementation; add `async def get(self, key: str) -> bytes` that opens an aiobotocore client, calls `client.get_object(Bucket=self._settings.s3_bucket_name, Key=key)`, and returns `await response["Body"].read()` in `api/app/storage/s3_backend.py`
**Checkpoint**: `ruff check api/` passes; `pytest api/` passes (all existing tests green — `serve_image_file` route still compiles but will change in Phase 3).
---
## Phase 3: User Story 1 — View Images Without Storage Configuration (Priority: P1) 🎯 MVP
**Goal**: Replace the 302 presigned-URL redirect with a 200 binary response so images load without any DNS or `/etc/hosts` configuration.
**Independent Test**: Start the application with no custom host entries. Open the library and detail page. All images load. Browser network tab shows only requests to `/api/v1/` — none to the storage hostname.
### Tests for User Story 1 (write FIRST — must FAIL before T004) ⚠️
- [x] T003 [US1] In `api/tests/integration/test_serving.py`: rename `test_file_redirect_returns_302` to `test_file_returns_200_with_content`; remove `follow_redirects=False`; replace the 302/Location assertions with: `response.status_code == 200`, `response.headers["content-type"].startswith("image/")`, `response.headers["etag"] == f'"{uploaded_image_hash}"'` (extract hash from upload response), `"immutable" in response.headers["cache-control"]`, and `len(response.content) > 0`; run `pytest api/tests/integration/test_serving.py::test_file_returns_200_with_content` and confirm it **fails**
### Implementation for User Story 1
- [x] T004 [US1] In `api/app/routers/images.py`: remove `RedirectResponse` from the `fastapi.responses` import; update `serve_image_file` to (a) call `data = await storage.get(image.storage_key)` inside a `try/except Exception` that raises an HTTPException with status 500 and code `"storage_error"`, then (b) return `Response(content=data, media_type=image.mime_type, headers={"ETag": f'"{image.hash}"', "Cache-Control": "public, max-age=31536000, immutable"})`
- [x] T005 [US1] Run `pytest api/tests/integration/test_serving.py` and confirm both `test_file_returns_200_with_content` and `test_file_unknown_id_returns_404` pass; then run `pytest api/` to confirm no regressions across all test files
**Checkpoint**: User Story 1 is fully functional. Images load via the API proxy with correct caching headers. No redirect to storage occurs.
---
## Phase 4: User Story 2 — Storage Backend Remains Private (Priority: P1)
**Goal**: Verify, with an explicit test, that no storage-specific detail (hostname, presigned URL, bucket name) appears anywhere in any API response after the proxy change.
**Independent Test**: Run the new test suite. Inspect browser DevTools network tab: zero requests to the storage host; zero occurrences of "minio", "s3://", or "amazonaws.com" in any response header or body.
### Tests for User Story 2 (write FIRST — must FAIL on pre-T004 code) ⚠️
- [x] T006 [US2] In `api/tests/integration/test_serving.py`: add test `test_file_response_exposes_no_storage_details` that uploads an image, calls `GET /api/v1/images/{id}/file`, then asserts: (a) `"location" not in response.headers` (no redirect target), (b) `"minio" not in response.text.lower()`, (c) `"s3://" not in response.text.lower()`, (d) `"amazonaws.com" not in response.text.lower()`; run the test against the **pre-T004** code and confirm it fails on assertion (a) (`Location` header is present in the redirect response)
### Implementation for User Story 2
No new implementation — US2 is fully satisfied by the T004 router change. The test written in T006 must pass once T004 is complete.
- [x] T007 [US2] Run `pytest api/tests/integration/test_serving.py` and confirm `test_file_response_exposes_no_storage_details` passes; this confirms US2 is satisfied by the US1 implementation
**Checkpoint**: Both user stories verified. Storage backend is provably private. Full proxy behaviour confirmed with explicit assertions.
---
## Phase 5: Polish & Cross-Cutting Concerns
- [x] T008 [P] Run `ruff check api/ && ruff format --check api/` and fix any lint or formatting issues in `api/app/storage/backend.py`, `api/app/storage/s3_backend.py`, and `api/app/routers/images.py`
- [x] T009 Run full test suite `pytest api/ -v` and confirm all tests pass (expected: existing 45 tests + 2 new/updated tests = 47 total); record final count
---
## Dependencies & Execution Order
### Phase Dependencies
- **Foundational (Phase 2)**: No dependencies — start immediately
- **US1 (Phase 3)**: Depends on Phase 2 complete (T001 + T002 done)
- **US2 (Phase 4)**: Depends on Phase 3 complete (T004 done, since T006 tests against pre-T004 code first)
- **Polish (Phase 5)**: Depends on Phases 3 and 4 complete
### User Story Dependencies
- **US1**: Independent — requires only Phase 2
- **US2**: Logically depends on US1 (its implementation task is zero — it reuses T004's work; T006 test is written pre-T004 to confirm it fails, then passes after T004)
### Within Each User Story
- Test tasks MUST be written and confirmed FAILING before implementation tasks run (§5.1)
- T001 and T002 can run in parallel (different files)
- T003 must precede T004
- T005 validates T004
- T006 is written pre-T004 but validated post-T004
---
## Parallel Example: Phase 2 (Foundational)
```bash
# T001 and T002 touch different files — run in parallel:
Task: "Remove get_presigned_url and add get() abstract method in api/app/storage/backend.py"
Task: "Remove get_presigned_url impl and add get() body in api/app/storage/s3_backend.py"
```
---
## Implementation Strategy
### MVP (Full feature — both user stories together)
1. Complete Phase 2: T001, T002 (parallel)
2. Complete Phase 3: T003 (fail), T004 (implement), T005 (verify)
3. Complete Phase 4: T006 (fail → passes), T007 (verify)
4. Complete Phase 5: T008 (lint), T009 (final count)
5. **STOP and VALIDATE**: Confirm zero browser requests to storage host in DevTools
### Total tasks: 9 (T001T009)
---
## Notes
- [P] tasks touch different files and have no mutual dependencies
- T006 must be run against pre-T004 code to confirm it fails — this is the TDD "red" step for US2
- `storage_key` in image metadata responses is a SHA-256 hex string (not a URL); it does not constitute an exposed storage credential and is out of scope for this feature per research.md Decision 5
- After this feature, `get_presigned_url` has no callers anywhere in the codebase — its removal is safe