Rate Limits

DOS AI enforces rate limits to ensure fair usage and platform stability for all users.

Default Limits

| Metric              | Limit                       |
|---------------------|-----------------------------|
| Requests per minute | 60                          |
| Window type         | Sliding window (60 seconds) |

The rate limiter uses a sliding window algorithm. This means the limit is calculated over a rolling 60-second period, not fixed calendar minutes.
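To make the rolling-window behavior concrete, here is a minimal client-side sketch of one common variant, a sliding window log. The server's actual implementation is not documented here, so the class name `SlidingWindowLimiter` and its parameters are illustrative assumptions only.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any rolling `window` seconds.

    Hypothetical model for illustration; not the service's own code.
    """

    def __init__(self, limit=60, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock          # injectable so the sketch can be tested
        self.timestamps = deque()   # times of requests still inside the window

    def allow(self):
        now = self.clock()
        # Drop requests that have aged out of the rolling window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Because the window rolls, capacity returns one request at a time as each earlier request turns 60 seconds old, rather than all at once at the top of a minute.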

Rate Limit Headers

Every API response includes headers that report your current rate limit status:

| Header                | Description                                    |
|-----------------------|------------------------------------------------|
| X-RateLimit-Limit     | Maximum requests allowed in the current window |
| X-RateLimit-Remaining | Requests remaining in the current window       |

Example Response Headers

HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 42
Content-Type: application/json

Use these headers to proactively manage your request rate and avoid hitting the limit.
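As a rough sketch, the two headers can be read into integers with a small helper. The function name `read_rate_limit` and the fallback of 60 are assumptions for illustration; the fallback simply mirrors the default limit stated above.

```python
def read_rate_limit(headers, default_limit=60):
    """Return (limit, remaining) parsed from a mapping of response headers."""

    def to_int(name, fallback):
        # Missing or malformed header values fall back instead of raising.
        try:
            return int(headers.get(name, fallback))
        except (TypeError, ValueError):
            return fallback

    limit = to_int("X-RateLimit-Limit", default_limit)
    remaining = to_int("X-RateLimit-Remaining", limit)
    return limit, remaining
```

With the `requests` library, `response.headers` can be passed in directly, since it is a case-insensitive mapping. A plain `dict` must use the exact header spelling.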

What Happens When You Hit the Limit

When you exceed the rate limit, the API returns a 429 Too Many Requests response:

{
  "error": {
    "message": "Rate limit exceeded. Please wait before making another request.",
    "type": "rate_limit_error",
    "code": 429
  }
}

The request is not processed and no credits are charged.
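A small helper can pull the message out of this payload so callers can log it or show it to users. This is a sketch under the assumption that the body always has the shape shown above; `rate_limit_message` is a hypothetical name, and the fallback string is made up for cases where the body is missing or unexpected.

```python
def rate_limit_message(status_code, body):
    """Return the error message for a 429 response, or None for any other status."""
    if status_code != 429:
        return None
    error = body.get("error") if isinstance(body, dict) else None
    if isinstance(error, dict) and isinstance(error.get("message"), str):
        return error["message"]
    # Fallback text (an assumption) when the body is missing or malformed.
    return "Rate limit exceeded."
```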

Handling 429 Errors

Exponential Backoff

The recommended strategy is exponential backoff with jitter. Each retry waits twice as long as the previous one, and a random offset keeps many clients from retrying at the same instant (the thundering herd problem).

Python Example

import time
import random
import requests

def call_with_backoff(url, headers, data, max_retries=5):
    """POST to the API, backing off exponentially whenever it returns 429."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)

        if response.status_code != 429:
            # Raise on other 4xx/5xx errors; any success status returns the body.
            response.raise_for_status()
            return response.json()

        # Skip the wait after the final attempt, since no retry follows it.
        if attempt < max_retries - 1:
            base_delay = 2 ** attempt  # 1, 2, 4, 8 seconds
            jitter = random.uniform(0, base_delay * 0.5)  # up to 50% extra
            wait_time = base_delay + jitter

            print(f"Rate limited. Retrying in {wait_time:.1f}s...")
            time.sleep(wait_time)

    raise RuntimeError("Max retries exceeded")

Node.js Example

async function callWithBackoff(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.ok) {
      return await response.json();
    }

    if (response.status === 429) {
      // Skip the wait after the final attempt, since no retry follows it.
      if (attempt === maxRetries - 1) break;

      const baseDelay = 2 ** attempt * 1000; // 1, 2, 4, 8 seconds, in ms
      const jitter = Math.random() * baseDelay * 0.5; // up to 50% extra
      const waitTime = baseDelay + jitter;

      console.log(`Rate limited. Retrying in ${(waitTime / 1000).toFixed(1)}s...`);
      await new Promise((resolve) => setTimeout(resolve, waitTime));
      continue;
    }

    throw new Error(`API error: ${response.status}`);
  }

  throw new Error("Max retries exceeded");
}

Proactive Rate Management

Use the rate limit headers to throttle requests before hitting the limit:

import time
import requests

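def call_with_throttle(url, headers, data):
    """POST to the API, then pause if the remaining quota is running low."""
    response = requests.post(url, headers=headers, json=data)
    response.raise_for_status()  # surface 429s and other errors to the caller

    # Assume a full window if the header is missing.
    remaining = int(response.headers.get("X-RateLimit-Remaining", 60))

    if remaining < 5:
        time.sleep(2)    # nearly out of quota: slow down sharply
    elif remaining < 10:
        time.sleep(0.5)  # getting close: slow down slightly

    return response.json()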

Tips for Staying Within Limits

  1. Pace your requests -- Spread requests evenly across the window rather than sending them all at once.
  2. Use streaming -- Streaming responses (stream: true) count as a single request regardless of response length.
  3. Cache responses -- Avoid making the same API call repeatedly.
  4. Monitor usage -- Check the X-RateLimit-Remaining header and slow down as you approach the limit.
  5. Use fewer, larger requests -- One comprehensive prompt is better than multiple small ones.
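The first tip, spacing requests out, can be sketched as a simple pacer that enforces a minimum gap of 60 / limit seconds between calls. The `Pacer` class, its parameters, and the injectable `clock` and `sleep` hooks are hypothetical names introduced for this example.

```python
import time


class Pacer:
    """Space calls at least 60 / requests_per_minute seconds apart."""

    def __init__(self, requests_per_minute=60, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute
        self.clock = clock      # injectable for testing
        self.sleep = sleep      # injectable for testing
        self.next_slot = None   # earliest time the next request may start

    def wait(self):
        """Block until the next request slot is available, then reserve it."""
        now = self.clock()
        if self.next_slot is not None and now < self.next_slot:
            self.sleep(self.next_slot - now)
            now = self.next_slot
        self.next_slot = now + self.interval
        return now
```

Call `pacer.wait()` immediately before each API request. At the default limit of 60 requests per minute this leaves one second between requests. Choosing a slightly lower `requests_per_minute` than the real limit leaves headroom for retries.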

Enterprise Custom Limits

For organizations with high-volume needs, we offer custom limits:

  • Higher request-per-minute caps based on your workload
  • Burst allowances for predictable traffic spikes
  • Dedicated capacity with guaranteed throughput

Contact support@dos.ai to discuss custom rate limits for your organization.