Reliability

Reliability expectations and practices for this project.

Health Checks

  • GET /health verifies B2 connectivity and returns healthy or degraded
  • The health endpoint itself stays available when B2 is down; it reports degraded instead of failing
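
The check above could be sketched as follows. This is a minimal illustration, not the project's actual handler: `health_status` and the `check_b2` probe are hypothetical names, and the response body shape is an assumption. The point is that a failing B2 probe changes the reported status but never prevents a response.

```python
from typing import Callable, Dict


def health_status(check_b2: Callable[[], None]) -> Dict[str, str]:
    """Build the body for GET /health.

    check_b2 is a hypothetical probe that raises any exception
    when B2 cannot be reached.
    """
    try:
        check_b2()
    except Exception as exc:
        # B2 is down or unreachable: still answer, but report degraded.
        # Only the exception class name is exposed, never a stack trace.
        return {"status": "degraded", "reason": type(exc).__name__}
    return {"status": "healthy"}
```

The HTTP status code to pair with a degraded body is left to the handler, since the deployment platform's health check may treat non-2xx responses as a failed instance.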

Error Handling

  • HTTP handlers return structured error responses with appropriate status codes
  • Failures in external services (B2) are caught and surfaced as 500 or 503 responses
  • No unhandled exceptions leak stack traces to clients
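
One way to get these properties is a single wrapper around every handler. The sketch below is framework-agnostic and assumes a hypothetical `B2UnavailableError` exception plus an error body shape (`error.code`, `error.message`) that the source does not specify. Details go to the server log; the client only sees a fixed message.

```python
import logging
from typing import Callable, Dict, Tuple

logger = logging.getLogger("app.errors")


class B2UnavailableError(Exception):
    """Hypothetical exception raised when B2 times out or is unreachable."""


def safe_call(handler: Callable[[], Dict]) -> Tuple[int, Dict]:
    """Run a handler and convert any failure into (status, structured body)."""
    try:
        return 200, handler()
    except B2UnavailableError:
        # Dependency outage: 503 tells clients that a retry may succeed.
        logger.warning("B2 unavailable", exc_info=True)
        return 503, {"error": {"code": "storage_unavailable",
                               "message": "Storage backend is temporarily unavailable."}}
    except Exception:
        # Anything else is a server bug: log the traceback, return a generic 500.
        logger.error("Unhandled error in handler", exc_info=True)
        return 500, {"error": {"code": "internal_error",
                               "message": "Internal server error."}}
```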

Logging

  • Structured JSON logging via Python stdlib
  • Every request gets a request_id for tracing
  • Log levels: ERROR for failures, WARNING for degraded state, INFO for requests
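
A minimal sketch of JSON logging with only the standard library is shown below. The field names in the payload and the use of `uuid4` for the request ID are assumptions for illustration; the project's real formatter may differ.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # request_id arrives via logging's `extra` mechanism, if provided.
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(payload)


def new_request_id() -> str:
    """Generate an opaque per-request identifier (assumed format: 32 hex chars)."""
    return uuid.uuid4().hex
```

To use it, attach the formatter to a handler with `handler.setFormatter(JsonFormatter())` and pass the ID on each call, for example `logger.info("request handled", extra={"request_id": rid})`.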

Observability

  • Request timing middleware logs duration for every request
  • /metrics endpoint exposes basic Prometheus-format counters
  • Upload success/failure counts tracked
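
The upload counters could be kept in memory and rendered in the Prometheus text exposition format roughly as below. The metric name `uploads_total` and the `result` label are hypothetical; the source only says that success and failure counts are tracked.

```python
import threading
from collections import Counter


class UploadMetrics:
    """Thread-safe in-memory counters for upload outcomes."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: Counter = Counter()

    def record_upload(self, success: bool) -> None:
        """Increment the success or failure counter."""
        with self._lock:
            self._counts["success" if success else "failure"] += 1

    def render(self) -> str:
        """Return the counters as Prometheus text exposition lines."""
        with self._lock:
            success = self._counts["success"]
            failure = self._counts["failure"]
        return (
            "# HELP uploads_total Number of upload attempts by result.\n"
            "# TYPE uploads_total counter\n"
            f'uploads_total{{result="success"}} {success}\n'
            f'uploads_total{{result="failure"}} {failure}\n'
        )
```

A `/metrics` handler would return `render()` as the response body. Counters held this way reset on restart and are per-process, which is usually acceptable for counters that Prometheus scrapes and rates.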

Graceful Degradation

  • File listing returns empty list (not error) when B2 has no objects
  • Metadata extraction failures don't block upload (return partial metadata)
  • Frontend shows skeleton states while loading, error states on failure
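
The partial-metadata behavior can be illustrated with a loop that isolates each extractor. This is a sketch under assumptions: `extract_metadata` and the idea of one callable per metadata field are hypothetical, chosen to show that one failing field does not discard the others or block the upload.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("app.metadata")


def extract_metadata(
    data: bytes,
    extractors: Dict[str, Callable[[bytes], Any]],
) -> Dict[str, Any]:
    """Run every extractor; keep the fields that succeed, skip the rest."""
    metadata: Dict[str, Any] = {}
    for field, extractor in extractors.items():
        try:
            metadata[field] = extractor(data)
        except Exception:
            # Degraded, not failed: log at WARNING and continue with other fields.
            logger.warning("metadata extraction failed for %s", field, exc_info=True)
    return metadata
```

With `{"size": len, "title": broken_parser}` as extractors, the result contains only `size`, and the caller proceeds with the upload using whatever was collected.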

Deployment

  • Railway health checks on /health
  • Zero-downtime deploys via rolling updates
  • Environment-specific configuration via env vars (no config files in prod)
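
Configuration from environment variables might be loaded once at startup, as in the sketch below. The variable names (`B2_BUCKET`, `LOG_LEVEL`, `PORT`) and defaults are assumptions for illustration, not the project's actual settings. Failing fast on a missing required variable keeps a misconfigured deploy from passing its health check.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    b2_bucket: str
    log_level: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment; raise if a required one is missing."""
    if "B2_BUCKET" not in env:  # hypothetical required variable
        raise RuntimeError("missing required environment variable: B2_BUCKET")
    return Settings(
        b2_bucket=env["B2_BUCKET"],
        log_level=env.get("LOG_LEVEL", "INFO"),
        port=int(env.get("PORT", "8000")),
    )
```

Passing the mapping as a parameter makes the loader easy to test without touching the real process environment.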