# Tasks: API Image Proxy **Input**: Design documents from `specs/002-api-image-proxy/` **Prerequisites**: plan.md ✅, spec.md ✅, research.md ✅, data-model.md ✅, contracts/api.md ✅ **TDD**: Tests are non-negotiable per constitution §5.1. Every test task MUST be written and confirmed failing before its implementation task runs. **Organization**: Tasks are grouped by user story to enable independent implementation and testing. ## Format: `[ID] [P?] [Story] Description` - **[P]**: Can run in parallel (different files, no dependencies on incomplete tasks) - **[Story]**: Which user story this task belongs to (US1, US2) --- ## Phase 1: Setup No setup required. The project is already initialized; the feature branch `002-api-image-proxy` is active. --- ## Phase 2: Foundational (Blocking Prerequisites) **Purpose**: Update the `StorageBackend` interface before any route or test change can reference the new method. These are pure interface/plumbing tasks with no callers yet — no preceding failing test is required because the tasks themselves introduce no observable behaviour change. **⚠️ CRITICAL**: Both T001 and T002 must be complete before US1 work begins. - [x] T001 Remove abstract method `get_presigned_url` and add `async def get(self, key: str) -> bytes` abstract method (with docstring "Return object content as bytes.") to `StorageBackend` in `api/app/storage/backend.py` - [x] T002 [P] Remove `get_presigned_url` implementation; add `async def get(self, key: str) -> bytes` that opens an aiobotocore client, calls `client.get_object(Bucket=self._settings.s3_bucket_name, Key=key)`, and returns `await response["Body"].read()` in `api/app/storage/s3_backend.py` **Checkpoint**: `ruff check api/` passes; `pytest api/` passes (all existing tests green — `serve_image_file` route still compiles but will change in Phase 3). --- ## Phase 3: User Story 1 — View Images Without Storage Configuration (Priority: P1) 🎯 MVP **Goal**: Replace the 302 presigned-URL redirect with a 200 binary response so images load without any DNS or `/etc/hosts` configuration. **Independent Test**: Start the application with no custom host entries. Open the library and detail page. All images load. Browser network tab shows only requests to `/api/v1/` — none to the storage hostname. ### Tests for User Story 1 (write FIRST — must FAIL before T004) ⚠️ - [x] T003 [US1] In `api/tests/integration/test_serving.py`: rename `test_file_redirect_returns_302` to `test_file_returns_200_with_content`; remove `follow_redirects=False`; replace the 302/Location assertions with: `response.status_code == 200`, `response.headers["content-type"].startswith("image/")`, `response.headers["etag"] == f'"{uploaded_image_hash}"'` (extract hash from upload response), `"immutable" in response.headers["cache-control"]`, and `len(response.content) > 0`; run `pytest api/tests/integration/test_serving.py::test_file_returns_200_with_content` and confirm it **fails** ### Implementation for User Story 1 - [x] T004 [US1] In `api/app/routers/images.py`: remove `RedirectResponse` from the `fastapi.responses` import; update `serve_image_file` to (a) call `data = await storage.get(image.storage_key)` inside a `try/except Exception` that raises an HTTPException with status 500 and code `"storage_error"`, then (b) return `Response(content=data, media_type=image.mime_type, headers={"ETag": f'"{image.hash}"', "Cache-Control": "public, max-age=31536000, immutable"})` - [x] T005 [US1] Run `pytest api/tests/integration/test_serving.py` and confirm both `test_file_returns_200_with_content` and `test_file_unknown_id_returns_404` pass; then run `pytest api/` to confirm no regressions across all test files **Checkpoint**: User Story 1 is fully functional. Images load via the API proxy with correct caching headers. No redirect to storage occurs. --- ## Phase 4: User Story 2 — Storage Backend Remains Private (Priority: P1) **Goal**: Verify, with an explicit test, that no storage-specific detail (hostname, presigned URL, bucket name) appears anywhere in any API response after the proxy change. **Independent Test**: Run the new test suite. Inspect browser DevTools network tab: zero requests to the storage host; zero occurrences of "minio", "s3://", or "amazonaws.com" in any response header or body. ### Tests for User Story 2 (write FIRST — must FAIL on pre-T004 code) ⚠️ - [x] T006 [US2] In `api/tests/integration/test_serving.py`: add test `test_file_response_exposes_no_storage_details` that uploads an image, calls `GET /api/v1/images/{id}/file`, then asserts: (a) `"location" not in response.headers` (no redirect target), (b) `"minio" not in response.text.lower()`, (c) `"s3://" not in response.text.lower()`, (d) `"amazonaws.com" not in response.text.lower()`; run the test against the **pre-T004** code and confirm it fails on assertion (a) (`Location` header is present in the redirect response) ### Implementation for User Story 2 No new implementation — US2 is fully satisfied by the T004 router change. The test written in T006 must pass once T004 is complete. - [x] T007 [US2] Run `pytest api/tests/integration/test_serving.py` and confirm `test_file_response_exposes_no_storage_details` passes; this confirms US2 is satisfied by the US1 implementation **Checkpoint**: Both user stories verified. Storage backend is provably private. Full proxy behaviour confirmed with explicit assertions. --- ## Phase 5: Polish & Cross-Cutting Concerns - [x] T008 [P] Run `ruff check api/ && ruff format --check api/` and fix any lint or formatting issues in `api/app/storage/backend.py`, `api/app/storage/s3_backend.py`, and `api/app/routers/images.py` - [x] T009 Run full test suite `pytest api/ -v` and confirm all tests pass (expected: existing 45 tests + 2 new/updated tests = 47 total); record final count --- ## Dependencies & Execution Order ### Phase Dependencies - **Foundational (Phase 2)**: No dependencies — start immediately - **US1 (Phase 3)**: Depends on Phase 2 complete (T001 + T002 done) - **US2 (Phase 4)**: Depends on Phase 3 complete (T004 done, since T006 tests against pre-T004 code first) - **Polish (Phase 5)**: Depends on Phases 3 and 4 complete ### User Story Dependencies - **US1**: Independent — requires only Phase 2 - **US2**: Logically depends on US1 (its implementation task is zero — it reuses T004's work; T006 test is written pre-T004 to confirm it fails, then passes after T004) ### Within Each User Story - Test tasks MUST be written and confirmed FAILING before implementation tasks run (§5.1) - T001 and T002 can run in parallel (different files) - T003 must precede T004 - T005 validates T004 - T006 is written pre-T004 but validated post-T004 --- ## Parallel Example: Phase 2 (Foundational) ```bash # T001 and T002 touch different files — run in parallel: Task: "Remove get_presigned_url and add get() abstract method in api/app/storage/backend.py" Task: "Remove get_presigned_url impl and add get() body in api/app/storage/s3_backend.py" ``` --- ## Implementation Strategy ### MVP (Full feature — both user stories together) 1. Complete Phase 2: T001, T002 (parallel) 2. Complete Phase 3: T003 (fail), T004 (implement), T005 (verify) 3. Complete Phase 4: T006 (fail → passes), T007 (verify) 4. Complete Phase 5: T008 (lint), T009 (final count) 5. **STOP and VALIDATE**: Confirm zero browser requests to storage host in DevTools ### Total tasks: 9 (T001–T009) --- ## Notes - [P] tasks touch different files and have no mutual dependencies - T006 must be run against pre-T004 code to confirm it fails — this is the TDD "red" step for US2 - `storage_key` in image metadata responses is a SHA-256 hex string (not a URL); it does not constitute an exposed storage credential and is out of scope for this feature per research.md Decision 5 - After this feature, `get_presigned_url` has no callers anywhere in the codebase — its removal is safe