Spec: bytesized.co - OpenAI Image Generation using SwiftWASM + Hummingbird

1. Objective

Implement a web app where:

  • The page loads in the browser as a SwiftWASM app.
  • The app automatically requests an image on the home page, article pages, and paginated archive pages.
  • A same-session revisit of the same page on the same UTC day reuses that page's last returned image from client-side session storage when available.
  • The backend persists a daily per-page image key so repeat requests for the same page, UTC day, and request country reuse the existing image instead of generating a new one.
  • Returning on the next UTC day uses a new page-cache key, causing the backend to generate or assign a new image for that page.
  • When the daily generation budget is exhausted, the backend returns a random previously generated image instead of requesting a new one.
  • The backend waits for image generation to finish before replying.
  • The final image is rendered from a public S3 HTTPS URL.
  • The image is generated by the OpenAI image generation API using the fixed model gpt-image-1.5.
  • The generation prompt should use the request's origin country when the server can resolve it from the client IP through country.is, and otherwise fall back to a generic worldwide prompt.
  • The backend performs country.is lookups and OpenAI image-generation requests with AsyncHTTPClient.
  • The backend uses a long-lived Hummingbird server.

2. System Overview

2.1 Components

  • Frontend: SwiftWASM app using Parcel for typed HTTP JSON requests in the browser.
  • Backend: Hummingbird server exposing a single synchronous POST endpoint.
  • Storage/delivery: A dedicated public S3 bucket exposing generated PNG objects under a dedicated prefix.

2.2 High-Level Flow

  1. The browser loads the SwiftWASM bundle and page HTML.
  2. The app reads the page context from the mount element.
  3. If session storage already contains an image URL for the same page path, page type, and UTC day, the app reuses that URL and skips the API call.
  4. Otherwise, the app calls POST <API_URL> with page context.
  5. The server derives the client IP from proxy forwarding headers, preferring X-Real-IP when present and otherwise falling back to X-Forwarded-For, then looks up the origin country with country.is.
  6. The server validates input and checks for a daily page-cache key derived from the current UTC date, page context, and resolved country before considering a new generation.
  7. If the page-cache key already exists, the server returns that image immediately.
  8. Otherwise, the server counts generated PNG objects already present under the current UTC day prefix in S3 to decide whether its soft daily generation budget has remaining capacity.
  9. If budget remains, the server creates a fresh unique dated image key, builds a country-aware prompt when country lookup succeeded, calls OpenAI, uploads the PNG to S3, writes the same image to the daily page-cache key, and returns 200 OK.
  10. If the daily budget is exhausted, the server selects a random existing generated PNG from S3, copies it to the daily page-cache key, and returns 200 OK with that page-cache image instead.
  11. The app swaps the placeholder image source to the returned or cached URL.
  12. On successful API responses, the app stores the returned image URL in session storage with the current UTC date for future visits to the same page on the same UTC day in the current browser session.
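Step 5 of the flow can be sketched as a small pure function. This is a minimal sketch, not the server's actual code: the function name resolveClientIP and the plain dictionary of headers are hypothetical, and taking the leftmost X-Forwarded-For entry is an assumption, since the spec does not say which entry to use.

```swift
import Foundation

/// Resolves the client IP from proxy headers: X-Real-IP wins when present,
/// otherwise X-Forwarded-For is consulted.
/// `headers` is a plain dictionary for illustration; a real server would read
/// from its own request header type.
func resolveClientIP(headers: [String: String]) -> String? {
    // HTTP header names are case-insensitive, so normalize before lookup.
    var normalized: [String: String] = [:]
    for (name, value) in headers {
        normalized[name.lowercased()] = value
    }

    // Treat missing or whitespace-only values as absent.
    func clean(_ raw: String?) -> String? {
        guard let raw else { return nil }
        let trimmed = raw.trimmingCharacters(in: .whitespaces)
        return trimmed.isEmpty ? nil : trimmed
    }

    if let realIP = clean(normalized["x-real-ip"]) {
        return realIP
    }

    // Assumption: the leftmost X-Forwarded-For entry is the original client.
    guard let forwarded = normalized["x-forwarded-for"],
          let first = forwarded.split(separator: ",").first else {
        return nil
    }
    return clean(String(first))
}
```

A nil result corresponds to the "country cannot be resolved" path, where the server skips the country.is lookup and uses the generic worldwide prompt.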

2.3 Published SwiftWASM Assets

  • The BytesizedCafe SwiftWASM package is built into the repo-root bytesized-cafe-app/ directory.
  • The site generator publishes that directory at /bytesized-cafe-app/.
  • Published asset paths must preserve the generated nested package layout, including /bytesized-cafe-app/platforms/browser.js.
  • Production deploys upload /bytesized-cafe-app/BytesizedCafe.wasm as gzipped content at the canonical .wasm key with Content-Type: application/wasm, Content-Encoding: gzip, and long-lived immutable caching.

3. S3 Design

3.1 Canonical Public Origin

The backend derives the public image origin from GENERATED_IMAGES_BUCKET and AWS_REGION as: https://<generated-images-bucket>.s3.<region>.amazonaws.com
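
As an illustration of the derivation above, the following sketch builds the origin and a full object URL. The function names publicImageOrigin and publicImageURL are hypothetical, and the bucket and region values in the usage note are placeholders.

```swift
/// Builds the public image origin from the bucket name and region,
/// following the pattern given in section 3.1.
func publicImageOrigin(bucket: String, region: String) -> String {
    "https://\(bucket).s3.\(region).amazonaws.com"
}

/// Joins the origin and an object key into the final public URL.
func publicImageURL(bucket: String, region: String, key: String) -> String {
    // Drop a leading slash so the result never contains "//" after the host.
    let trimmedKey = key.hasPrefix("/") ? String(key.dropFirst()) : key
    return "\(publicImageOrigin(bucket: bucket, region: region))/\(trimmedKey)"
}
```

For example, publicImageOrigin(bucket: "example-images", region: "us-east-1") yields https://example-images.s3.us-east-1.amazonaws.com. In the running server the two inputs would come from GENERATED_IMAGES_BUCKET and AWS_REGION.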

3.2 Bucket Access Model

  • Keep S3 Object Ownership set to Bucket owner enforced.
  • Keep object ACLs disabled.
  • Public read comes from bucket policy, not object ACLs.
  • Store generated images in a dedicated public S3 bucket separate from the static site bucket.
  • Keep generated images under IMAGE_GEN_PREFIX.
  • Grant anonymous s3:GetObject on arn:aws:s3:::<generated-images-bucket>/<IMAGE_GEN_PREFIX>/*.
  • Do not upload with public-read ACLs.

3.3 Object Key Format

  • Freshly generated image:
    • {IMAGE_GEN_PREFIX}/{YYYY}/{MM}/{DD}/{UUID}-{country-slug}.png when the request country is known
    • {IMAGE_GEN_PREFIX}/{YYYY}/{MM}/{DD}/{UUID}.png when the request country is not known
  • Daily page-cache image:
    • {IMAGE_GEN_PREFIX}/page-cache/{YYYY}/{MM}/{DD}/{pageType}/{normalized-page-path}-{country-slug}.png when the request country is known
    • {IMAGE_GEN_PREFIX}/page-cache/{YYYY}/{MM}/{DD}/{pageType}/{normalized-page-path}-anywhere.png when the request country is not known
  • Random fallback image:
    • Prefer an existing PNG under the current UTC date prefix whose key ends in the current request's -{country-slug}.png
    • Fall back to any existing PNG under the current UTC date prefix when no country-matching image is available

Rules:

  • Fresh generation keys must not be derived from page context.
  • Daily page-cache keys must be derived from the current UTC date, page context, and resolved country.
  • API responses should prefer the daily page-cache key whenever one exists or is created during the request.
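
The key formats and rules above can be sketched as pure functions. All function names here are hypothetical. The slug normalization (lowercase, non-alphanumerics collapsed to hyphens) and the "index" placeholder for an empty page path are assumptions, since the spec only says "normalized" without defining it.

```swift
import Foundation

/// Left-pads a number with zeros, e.g. 4 -> "04".
func zeroPadded(_ value: Int, width: Int) -> String {
    let digits = String(value)
    return String(repeating: "0", count: max(0, width - digits.count)) + digits
}

/// Formats a date as "YYYY/MM/DD" in UTC.
func utcDatePath(for date: Date) -> String {
    var calendar = Calendar(identifier: .gregorian)
    calendar.timeZone = TimeZone(secondsFromGMT: 0)!
    let parts = calendar.dateComponents([.year, .month, .day], from: date)
    return [
        zeroPadded(parts.year!, width: 4),
        zeroPadded(parts.month!, width: 2),
        zeroPadded(parts.day!, width: 2),
    ].joined(separator: "/")
}

/// Assumed normalization: lowercase, then collapse every run of characters
/// outside a-z and 0-9 into a single hyphen, with no leading or trailing hyphen.
func countrySlug(_ country: String) -> String {
    var slug = ""
    var pendingHyphen = false
    for scalar in country.lowercased().unicodeScalars {
        let isAlphanumeric = (scalar >= "a" && scalar <= "z") || (scalar >= "0" && scalar <= "9")
        if isAlphanumeric {
            if pendingHyphen && !slug.isEmpty { slug.append("-") }
            slug.append(Character(scalar))
            pendingHyphen = false
        } else {
            pendingHyphen = true
        }
    }
    return slug
}

/// Assumed normalization: strip leading, trailing, and repeated slashes.
/// The "index" placeholder for an empty path is hypothetical.
func normalizedPagePath(_ path: String) -> String {
    let joined = path.split(separator: "/").joined(separator: "/")
    return joined.isEmpty ? "index" : joined
}

/// Fresh generation key: derived from date, UUID, and country only, never page context.
func freshImageKey(prefix: String, date: Date, uuid: UUID, country: String?) -> String {
    let base = "\(prefix)/\(utcDatePath(for: date))/\(uuid.uuidString)"
    if let country { return "\(base)-\(countrySlug(country)).png" }
    return "\(base).png"
}

/// Daily page-cache key: deterministic for a given UTC date, page context, and country.
func pageCacheKey(prefix: String, date: Date, pageType: String,
                  pagePath: String, country: String?) -> String {
    let suffix = country.map(countrySlug) ?? "anywhere"
    let page = normalizedPagePath(pagePath)
    return "\(prefix)/page-cache/\(utcDatePath(for: date))/\(pageType)/\(page)-\(suffix).png"
}
```

Because the page-cache key contains no random component, two requests for the same page, UTC day, and country always map to the same object, which is what lets the server check for an existing image before generating.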

3.4 Object Metadata

When uploading a freshly generated image:

  • Content-Type: image/png
  • Cache-Control: public, max-age=31536000, immutable

3.5 Lifecycle Policy

Configure S3 Lifecycle expiration to prevent unbounded storage growth:

  • Expire IMAGE_GEN_PREFIX/ after 30 days if regeneration on cache miss is acceptable.

4. API Design

4.1 Routing

This API uses a single action endpoint:

  • POST /api/cafe/generate triggers generation or fallback selection.
  • OPTIONS /api/cafe/generate is handled by Hummingbird CORS middleware.

4.2 CORS

Enable CORS on the server endpoint for browser access:

  • Allowed methods: POST, OPTIONS
  • Allowed headers: Content-Type
  • Allowed origins: site origin(s) used to host the SwiftWASM app

5. API Contract

5.1 POST <API_URL>

Request JSON:

{
  "context": {
    "pagePath": "/posts/example-article",
    "pageType": "article"
  }
}

pageType must be one of:

  • index
  • article
  • archive

Response:

  • Status: 200 OK
  • Body:
{
  "url": "https://<public-base-domain>/generated/v2/page-cache/2026/04/23/article/posts/example-article-france.png"
}

Rules:

  • url is the final public image URL and must use the generated-images bucket public origin.
  • The returned url may point at a daily per-page cache object when the page already has an assigned image for the current UTC day.
  • Return 200 only after the image has been uploaded successfully or a random fallback image has been selected successfully.
  • Invalid input returns 4xx.
  • If the daily budget is exhausted and no fallback image exists, return 503.
  • Terminal upstream failures return 5xx.
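
The request and response shapes above map naturally onto Codable types. The following is a sketch only: the type names, the ValidationError enum, and the rule that pagePath must start with "/" are assumptions that do not come from the spec.

```swift
import Foundation

/// The three page types the contract allows; any other value fails decoding.
enum PageType: String, Codable {
    case index, article, archive
}

struct PageContext: Codable, Equatable {
    let pagePath: String
    let pageType: PageType
}

struct GenerateRequest: Codable, Equatable {
    let context: PageContext
}

struct GenerateResponse: Codable, Equatable {
    let url: String
}

/// Hypothetical validation error; a handler would map it to a 4xx response.
enum ValidationError: Error {
    case invalidPagePath
}

/// Decodes and validates a request body. Decoding errors (missing fields,
/// unknown pageType) and validation errors both represent invalid input.
func decodeGenerateRequest(_ body: Data) throws -> GenerateRequest {
    let request = try JSONDecoder().decode(GenerateRequest.self, from: body)
    // Assumption: page paths are site-relative and start with "/".
    guard request.context.pagePath.hasPrefix("/") else {
        throw ValidationError.invalidPagePath
    }
    return request
}
```

Modeling pageType as an enum means the allowed-values rule is enforced by the decoder itself, so the handler needs no separate string comparison.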

6. Server Behavior

6.1 Hummingbird Server Responsibilities

  • Parse and validate input JSON.
  • Encapsulate S3 operations behind one S3ImageStore client object that owns the bucket configuration and AWS client lifecycle for image upload and lookup operations.
  • Resolve the client IP address by preferring X-Real-IP when present and otherwise falling back to X-Forwarded-For.
  • Look up the request origin country with https://api.country.is/{ip} and convert the returned two-letter country code into an English country name when available.
  • Derive a daily page-cache key from the current UTC date, page context, and resolved country, and return it immediately when that object already exists in S3.
  • Check the soft daily generation budget by counting PNG objects already present under the current UTC date prefix in S3.
  • Build the public url.
  • When budget remains:
    • Generate a fresh unique image key.
    • Build the prompt as one randomly chosen prepared dish, snack, pastry, or street food genuinely eaten in the request country.
    • Include a normalized -{country-slug} suffix in the generated key when country lookup succeeds.
    • Instruct the model to prefer specific regional, city, market, bakery, holiday, breakfast, dessert, or home-cooked foods over the first national stereotype.
    • Instruct the model to show exactly one hero food item with no combo meals, side dishes, drinks, menus, flags, labels, or text.
    • Instruct the model to avoid defaulting to globally common fast food such as hamburgers, fries, pizza, or hot dogs unless the subject is a distinctive named local variation.
    • Fall back to the same prompt structure scoped to somewhere in the world when the client IP or country cannot be resolved.
    • Call the OpenAI image generation API with model gpt-image-1.5.
    • Upload the PNG to the generated image key used for the dated generation pool.
    • Upload the same PNG to the daily page-cache key.
    • Return the page-cache url.
  • When budget is exhausted:
    • Prefer a random existing generated PNG key from the current UTC date prefix whose key suffix matches the current request country.
    • Fall back to a random existing generated PNG key from the current UTC date prefix when no country-matching key is available.
    • Copy the selected fallback image to the daily page-cache key without calling OpenAI.
    • Return the page-cache url.
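
The branching described above (cache hit, fresh generation, fallback, or failure) can be sketched as a pure decision function, kept separate from the S3 and OpenAI calls. The Outcome enum, both function names, and the dailyBudget parameter are hypothetical. The spec describes a soft daily budget but does not name its value or where it is configured.

```swift
/// What the server should do for one request, decided before any upload or copy.
enum Outcome: Equatable {
    case returnExistingPageCache      // page-cache object already exists
    case generateFresh                // budget remains: call OpenAI, upload, write page cache
    case copyFallback(key: String)    // budget exhausted: copy this key to the page-cache key
    case unavailable                  // budget exhausted and nothing to fall back to (503)
}

/// Picks a random PNG key from today's pool, preferring keys whose suffix
/// matches the request country slug.
func pickFallbackKey<G: RandomNumberGenerator>(
    from todaysKeys: [String], countrySlug: String?, using rng: inout G
) -> String? {
    let pngKeys = todaysKeys.filter { $0.hasSuffix(".png") }
    if let slug = countrySlug {
        let matching = pngKeys.filter { $0.hasSuffix("-\(slug).png") }
        if let pick = matching.randomElement(using: &rng) { return pick }
    }
    return pngKeys.randomElement(using: &rng)
}

/// Applies the order of checks from section 6.1.
func decideOutcome<G: RandomNumberGenerator>(
    pageCacheExists: Bool, generatedToday: Int, dailyBudget: Int,
    todaysKeys: [String], countrySlug: String?, using rng: inout G
) -> Outcome {
    if pageCacheExists { return .returnExistingPageCache }
    if generatedToday < dailyBudget { return .generateFresh }
    if let key = pickFallbackKey(from: todaysKeys, countrySlug: countrySlug, using: &rng) {
        return .copyFallback(key: key)
    }
    return .unavailable
}
```

One consequence of matching on the key suffix is that a slug such as guinea would also match keys ending in -equatorial-guinea.png. Whether that matters depends on how the real implementation compares slugs, which the spec leaves open.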

7. Frontend Behavior

  • Show a loading placeholder immediately.
  • Read page context from the mount element.
  • If session storage contains a URL for the same page path, page type, and UTC day, reuse that URL and skip the API call.
  • Otherwise, start a single POST request to the configured API URL.
  • When the request succeeds, swap the placeholder image source to the returned url.
  • Persist the returned image URL in session storage keyed to the current page and UTC day so the next same-session visit of that page can reuse it only until the UTC day changes.
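
The session-storage reuse rule above can be sketched without any browser APIs by treating the stored value as a JSON string. The CachedImage type, the storage key format, and the function names are hypothetical. Reading and writing the browser's sessionStorage is left out, and using Foundation in the SwiftWASM bundle is an assumption made for brevity.

```swift
import Foundation

/// The value stored per page: the image URL plus the UTC day it was fetched.
struct CachedImage: Codable, Equatable {
    let url: String
    let utcDay: String  // "YYYY-MM-DD" in UTC
}

/// Formats a date as "YYYY-MM-DD" in UTC.
func utcDayString(for date: Date) -> String {
    var calendar = Calendar(identifier: .gregorian)
    calendar.timeZone = TimeZone(secondsFromGMT: 0)!
    let parts = calendar.dateComponents([.year, .month, .day], from: date)
    func pad(_ value: Int, _ width: Int) -> String {
        let digits = String(value)
        return String(repeating: "0", count: max(0, width - digits.count)) + digits
    }
    return "\(pad(parts.year!, 4))-\(pad(parts.month!, 2))-\(pad(parts.day!, 2))"
}

/// Hypothetical key format: one entry per page type and page path.
func sessionStorageKey(pagePath: String, pageType: String) -> String {
    "bytesized-cafe:\(pageType):\(pagePath)"
}

/// Returns the stored URL only when the entry parses and was saved on the current UTC day.
func reusableURL(from storedJSON: String?, now: Date) -> String? {
    guard let storedJSON,
          let entry = try? JSONDecoder().decode(CachedImage.self, from: Data(storedJSON.utf8)),
          entry.utcDay == utcDayString(for: now) else {
        return nil
    }
    return entry.url
}
```

Storing the UTC day inside the value, rather than only in the key, lets the app treat a stale entry from the previous day as a miss without having to enumerate or clean up old keys.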

8. Environment Variables

8.1 Hummingbird Server

  • GENERATED_IMAGES_BUCKET
  • OPENAI_API_KEY
  • OPENAI_IMAGE_MODEL
  • IMAGE_GEN_PREFIX
  • AWS_REGION
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • HOST
  • PORT

Local repo tooling may provide BACKEND_HOST and BACKEND_PORT as aliases for the backend runtime HOST and PORT values.

8.2 Site Build

  • BYTESIZED_CAFE_API_URL

8.3 Site Deploy

  • AWS_S3_BUCKET
  • CLOUDFRONT_DISTRIBUTION_ID

9. Validation

The implementation is considered complete when:

  • A same-session revisit of the same page on the same UTC day reuses the last returned image URL from session storage without making a new backend request.
  • A same-session revisit of the same page after the UTC day changes makes a backend request instead of reusing yesterday's session-storage URL.
  • A backend request for a page that already has a daily page-cache object returns that existing image URL without making a new OpenAI request.
  • The backend returns 200 only after a fresh image upload succeeds or a random fallback image has been selected.
  • When the daily budget is exhausted, the backend returns a random existing generated image instead of making a new OpenAI request.
  • Fresh generations use the request origin country in the prompt when the server can resolve it from the client IP, and otherwise fall back to the generic worldwide prompt.
  • Fresh generation prompts ask for one specific, non-stereotyped hero food item and discourage generic fast-food defaults and combo meals.
  • Fresh generations include a country slug suffix in the image key when the request country is known.
  • When the daily budget is exhausted, fallback selection prefers existing images whose keys match the current request country and otherwise falls back to any existing image.
  • The backend persists deterministic daily per-page cache keys separately from the dated generation pool.
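
The prompt-related criteria above could be checked against a prompt builder along these lines. The function name buildPrompt and the exact wording are illustrative assumptions. The spec lists what the prompt must ask for, not its literal text.

```swift
import Foundation

/// Builds a generation prompt scoped to a country when known,
/// and to "somewhere in the world" otherwise.
func buildPrompt(country: String?) -> String {
    let place = country.map { "in \($0)" } ?? "somewhere in the world"
    return [
        // One randomly chosen food genuinely eaten in the target place.
        "Depict one randomly chosen prepared dish, snack, pastry, or street food genuinely eaten \(place).",
        // Steer away from the first national stereotype.
        "Prefer a specific regional, city, market, bakery, holiday, breakfast, dessert, or home-cooked food over the first national stereotype.",
        // Exactly one hero item, nothing else in frame.
        "Show exactly one hero food item with no combo meals, side dishes, drinks, menus, flags, labels, or text.",
        // Discourage generic fast-food defaults.
        "Avoid globally common fast food such as hamburgers, fries, pizza, or hot dogs unless the subject is a distinctive named local variation.",
    ].joined(separator: " ")
}
```

Keeping the country-aware and worldwide prompts in one function means both paths share the same structure, so the fallback differs only in the place phrase.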

10. Deployment

10.1 Container Build

  • The backend container image is built from Backend/ using the checked-in Backend/Dockerfile.
  • The backend container build and runtime stages pin the official swift:6.3.0-bookworm and swift:6.3.0-bookworm-slim images.
  • The checked-in Backend/railway.toml codifies the Railway deploy settings that should live in source control, currently the Dockerfile builder and /health healthcheck.
  • The deployable product is the Server executable.
  • Railway builds and runs the production image from GitHub pushes, targeting the backend service with railway up Backend --ci --path-as-root.
  • Deployment config changes are validated with just validate-deployment, which delegates to ./Scripts/validate-deployment-config.sh to build the Docker image and validate workflow YAML parsing.

10.2 Railway Infrastructure

  • Railway hosts the public backend service, injects runtime environment variables, and exposes a healthchecked HTTPS endpoint for the Server container.
  • The Railway service should define a stable custom domain so the static site can build against a fixed BYTESIZED_CAFE_API_URL.
  • The GitHub Actions workflow under .github/workflows/deploy.yml is the production deployment path and is intended to run on pushes to the primary deployment branch.
  • The backend deploy job authenticates with a Railway project token, synchronizes the backend runtime variables into Railway, and deploys the Backend/ directory directly to the configured Railway project, environment, and service.
  • Railway service-level GitHub autodeploy should be disabled when the GitHub Actions workflow is the active deployment path, to avoid duplicate backend deployments from the same push.
  • GitHub Actions repository variables and secrets are the source of truth for the backend runtime variables GENERATED_IMAGES_BUCKET, OPENAI_API_KEY, OPENAI_IMAGE_MODEL, IMAGE_GEN_PREFIX, AWS_REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY.
  • The backend deploy workflow sets HOST=0.0.0.0 and PORT=8080 in Railway by default, unless the deploy job overrides RAILWAY_RUNTIME_HOST or RAILWAY_RUNTIME_PORT.
  • The site deploy job continues to sync Output/ to S3 using a fixed BYTESIZED_CAFE_API_URL.
  • The site deploy script excludes the raw SwiftWASM binary from the bulk sync, gzips it locally, and uploads the compressed bytes back to bytesized-cafe-app/BytesizedCafe.wasm with explicit wasm content metadata.
  • Paginated archive links use the literal deployed object paths under /page/<n>/index.html because the production S3 and CloudFront setup does not rewrite clean directory URLs to nested index.html objects.
  • After the S3 sync completes, the site deploy job invalidates the production CloudFront distribution with CLOUDFRONT_DISTRIBUTION_ID for /, /index.html, /page/*, /posts/*, /feed.rss, /bytesized-cafe-app/*, /css/*, /images/*, and /fonts/*.

11. Local Development

  • A repo-root justfile provides the primary entry point for common local tasks such as just wasm, just site, just site-local, just backend, and just local, along with deployment-oriented recipes like just site-release, just site-deploy, and just validate-deployment.
  • The repo's Swift package manifests target Swift tools version 6.3, the macOS GitHub Actions job installs Swift 6.3.0, and the SwiftWasm site build uses the compatible swift-6.3-RELEASE SDK tag.
  • Scripts/run-local.sh provides a one-command local stack for development and opens the local site in the default browser after the backend and static site server are ready.
  • The script rebuilds the BytesizedCafe SwiftWASM bundle, regenerates the site with BYTESIZED_CAFE_API_URL pointed at a localhost backend, prebuilds the backend to avoid counting SwiftPM compilation against the startup timeout, starts the Hummingbird server, and serves Output/ over a local static HTTP server.
  • Scripts/build-bytesized-cafe-app.sh prefers a SwiftWASM SDK ID matching the active swift --version release when multiple WASM SDKs are installed; SWIFT_WASM_SDK_ID or SWIFT_SDK_ID can still override the auto-detected SDK.
  • Scripts/build-bytesized-cafe-app.sh requires Binaryen's wasm-opt; PackageToJS performs its release optimization pass and the repo script follows with a final size-focused wasm-opt -Oz pass before the site generator copies the bundle into Output/.
  • Changes to shared browser dependencies such as Parcel must be validated through just wasm against the nested BytesizedCafe package; a green top-level swift run bytesized build alone does not prove the SwiftWASM app still resolves and compiles.