feat(quota): enforce monthly per-user API call quota (PR-M)#30

Merged
MrChengLen merged 1 commit into
mainfrom
pr-m-monthly-quota
May 11, 2026

@MrChengLen
Owner

Closes #215. The pricing page advertises 500 / 10 000 / 100 000 calls per month for Free / Pro / Business tiers, and these limits have been sitting in app/core/quotas.py marked "informational (used for UI display / future enforcement)" since they were defined. PR-M wires that future enforcement so the system actually keeps the promise the pricing page makes.

Architecture

app/core/usage.py is the single home for the writer + the gate. Both helpers own their AsyncSession (mirrors app/core/audit.py and app/core/metrics.py — request-path code does not thread db=).

  • record_usage(user_id, api_key_id, endpoint, file_size_bytes, duration_ms): writes one UsageRecord row on every successful /convert + /compress (single + batch). Fire-and-forget; a failed insert logs at WARNING but never breaks the request.
  • enforce_monthly_quota(user): counts the user's UsageRecord rows for the current calendar month (UTC) and raises HTTPException 429 with a Retry-After header pointing at the next-month boundary if the user is at or above their tier limit.

Time window: calendar month, UTC. Picked over rolling-30-day because it matches how the pricing page is read ("you get 10k per month") and gives users a single, predictable reset boundary they can read off their own calendar.

Counting rule: one HTTP call = one quota use, regardless of batch size. A 25-file batch counts as 1, matching the pricing-page wording "API calls per month". File-level counts go to the metrics table for the cockpit. Failed conversions do NOT count toward the quota — only completed work moves the user toward their limit.

Bypass paths:

  • Anonymous tier (user is None): exempt; per-IP rate-limiter (10/min) is the only constraint.
  • Enterprise tier (api_calls_per_month=None): unlimited.
  • Community Edition without DATABASE_URL: gate is a no-op (nothing to count against); writer is a no-op too.
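The bypass rules and the "at or above the limit" check can be read as one decision function. The sketch below is illustrative only: `User`, `TIER_LIMITS`, `QuotaExceeded`, and `check_quota` are hypothetical names, the tier keys are assumed, and the real gate raises HTTPException 429 after counting rows itself. The limits are the ones quoted from the pricing page.

```python
from dataclasses import dataclass
from typing import Optional

# Monthly limits as advertised; None means unlimited (Enterprise).
TIER_LIMITS = {"free": 500, "pro": 10_000, "business": 100_000, "enterprise": None}


@dataclass
class User:
    tier: str  # assumed to be one of the TIER_LIMITS keys


class QuotaExceeded(Exception):
    """Stand-in for the HTTP 429 the real gate raises."""

    def __init__(self, limit: int, used: int) -> None:
        super().__init__(f"monthly quota exceeded: {used}/{limit}")
        self.limit = limit
        self.used = used


def check_quota(user: Optional[User], used: int, db_configured: bool = True) -> None:
    """Return silently if the call may proceed, raise QuotaExceeded otherwise."""
    if user is None:        # anonymous: only the per-IP rate limiter applies
        return
    if not db_configured:   # Community Edition without DATABASE_URL: nothing to count
        return
    limit = TIER_LIMITS[user.tier]
    if limit is None:       # Enterprise: unlimited
        return
    if used >= limit:       # at or above the limit is refused
        raise QuotaExceeded(limit, used)
```

Under this reading, a Free user with 499 recorded calls passes and the same user with 500 is refused.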

Wired into:

  • app/api/routes/convert.py::_do_convert (single)
  • app/api/routes/convert.py::_do_convert_batch (batch)
  • app/api/routes/compress.py::_do_compress (single)
  • app/api/routes/compress.py::_do_compress_batch (batch)

The gate runs AFTER the concurrency-slot acquisition and AFTER the file-size check, BEFORE any disk I/O, so a refused request never touches the temp dir.

Database

Migration 007_usage_quota_index adds a composite index on usage(user_id, timestamp). The gate query COUNT(*) WHERE user_id=:uid AND timestamp >= :month_start becomes a fast index range scan even at 100 000 rows per Business user per month.

Without the index it sequentially scans the whole usage table on every /convert and /compress call — latency grows with total-rows-ever, not with current-month rows.
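The effect of the composite index can be reproduced in miniature with the standard-library sqlite3 module. This is only a sketch under assumptions: the production database engine is not stated in this PR, the index name `ix_usage_user_timestamp` and the trimmed-down table schema are made up for the demo, and timestamps are stored as ISO-8601 text so that string comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usage ("
    "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, timestamp TEXT NOT NULL)"
)
# Composite index in the same column order as the gate query's predicates.
conn.execute("CREATE INDEX ix_usage_user_timestamp ON usage (user_id, timestamp)")

rows = [
    (1, "2026-04-30T23:59:59"),  # last month: must not be counted
    (1, "2026-05-01T00:00:00"),  # exactly at the month boundary: counted
    (1, "2026-05-10T12:00:00"),
    (2, "2026-05-03T08:00:00"),  # different user: must not be counted
]
conn.executemany("INSERT INTO usage (user_id, timestamp) VALUES (?, ?)", rows)

QUERY = "SELECT COUNT(*) FROM usage WHERE user_id = ? AND timestamp >= ?"
params = (1, "2026-05-01T00:00:00")

count = conn.execute(QUERY, params).fetchone()[0]

# Each EXPLAIN QUERY PLAN row is (id, parent, notused, detail).
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, params))
print(count)
print(plan)
```

The plan text should name the index rather than report a full-table scan, which is the behaviour the migration is after.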

Tests (tests/test_monthly_quota.py — 15 cases)

  • _month_start, _next_month_start helpers (3 cases incl. Dec→Jan)
  • monthly_call_count: zero, current-month-only (last-month rows excluded)
  • enforce_monthly_quota: anonymous noop, enterprise noop, below-limit noop, at-limit raises 429 with Retry-After, pro tier 10k boundary, business tier 100k boundary (mocked count)
  • record_usage: inserts one row on success, anonymous noop
  • End-to-end /convert: returns 429 with Retry-After when user at limit, returns 200 + writes a UsageRecord row when below limit

Verification

pytest tests/test_monthly_quota.py -v → 15 passed
pytest tests/ → 554 passed (was 539)
ruff check + ruff format --check → clean

Docs

docs/api-reference.md "Rate Limiting" section now documents:

  • per-tier monthly quota table
  • what counts as one call (single + batch = 1 each)
  • 429 response shape with Retry-After + JSON body example
  • reset boundary (calendar-month UTC)
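On the client side, a caller that receives the 429 can work out how long to wait from the Retry-After header. The helper below is a hypothetical sketch, not part of this PR; since the header format is not spelled out here, it accepts both forms HTTP allows (delta-seconds and an HTTP-date).

```python
import math
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def seconds_until_retry(retry_after: str, now: datetime) -> int:
    """Convert a Retry-After header value into whole seconds to wait.

    `now` is expected to be timezone-aware.
    """
    value = retry_after.strip()
    if value.isdigit():
        # delta-seconds form, e.g. "120"
        return int(value)
    # HTTP-date form, e.g. "Mon, 01 Jun 2026 00:00:00 GMT"
    target = parsedate_to_datetime(value)
    if target.tzinfo is None:
        # Treat a date without zone information as UTC.
        target = target.replace(tzinfo=timezone.utc)
    return max(0, math.ceil((target - now).total_seconds()))
```

A client would typically sleep for that many seconds, or surface the reset time to the user, instead of retrying immediately.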

Out of scope (separate PRs)

  • Dashboard UI: "X / Y this month" progress bar (data is now available; render is cosmetic)
  • Cockpit per-user usage table (existing /cockpit/usage-summary is global-aggregate; per-user view is a follow-up)
  • 80% / 95% advisory headers ("X-Quota-Used: 9500/10000")
  • Email notification on hitting the limit

@MrChengLen MrChengLen merged commit 322d46b into main May 11, 2026
4 checks passed