Rate Limiting and Throttling

1. API Call Limits (Max Requests)

To ensure fair usage and prevent system overload, the API enforces request limits:

  • 100 requests per minute for general API endpoints.
  • 10 requests per minute for sensitive endpoints (e.g., function creation).
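A client can stay under these limits by pacing its own requests. The following is a minimal sketch of a client-side sliding-window limiter, not part of the API itself. The class name `SlidingWindowLimiter` and its methods are hypothetical, and the sketch assumes the server counts requests over a rolling 60-second window, which the limits above do not specify.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` span.

    Hypothetical client-side helper; the server's actual windowing
    scheme (fixed or rolling) is an assumption.
    """

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent = deque()  # timestamps of recorded requests, oldest first

    def wait_time(self):
        """Return how many seconds to wait before the next request is allowed."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The window is full: wait until the oldest request expires.
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Record that a request was just sent."""
        self._sent.append(self.clock())


# One limiter per class of endpoint, matching the limits listed above.
general_limiter = SlidingWindowLimiter(100)
sensitive_limiter = SlidingWindowLimiter(10)
```

Before each call, the client would sleep for `wait_time()` seconds and then call `record()`. Keeping a separate limiter for sensitive endpoints prevents general traffic from being slowed to the stricter limit.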

2. Handling Rate Limit Exceedance (Retry, Backoff)

When the rate limit is exceeded, the server responds with a 429 Too Many Requests status. Handling strategies include:

  • Retry:
    Retry the request after the specified cooldown period (usually provided in the response headers, for example in the standard Retry-After header).

  • Backoff:
    Gradually increase the delay between retries (for example, by doubling it after each failed attempt) to avoid hitting the limit again.
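The two strategies above can be combined: honor the server's cooldown when it is given, and fall back to exponential backoff otherwise. The sketch below is one possible approach under stated assumptions. It assumes the cooldown arrives as a `Retry-After` header holding a number of seconds, which this API may or may not use. The names `retry_delay`, `call_with_retry`, and the `send` callable are hypothetical; `send` stands in for whatever function performs the HTTP request and returns `(status, headers, body)`.

```python
import time


def retry_delay(attempt, headers, base=1.0, cap=60.0):
    """Return the seconds to wait before retrying after failed attempt `attempt` (0-based)."""
    retry_after = headers.get("Retry-After")  # assumed header name
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; fall back to backoff
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(cap, base * 2 ** attempt)


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it stops returning 429 or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(attempt, headers))
    return status, body  # still rate limited after the final attempt
```

Capping the delay keeps a long run of failures from producing very long waits. When many clients retry at once, adding a small random jitter to each delay helps keep them from retrying in lockstep.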

3. Overload Handling Process

If the system experiences overload due to high traffic, it may return a 503 Service Unavailable status. Clients should:

  • Retry:
    Retry the request after a short delay, and lengthen the delay if the 503 response persists.

  • Throttling:
    Limit the number of simultaneous requests and keep the overall request rate within the limits defined in section 1.
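As an illustration of the overload handling described above, the sketch below caps the number of requests in flight with a semaphore and retries 503 responses after a short, growing delay. It is a sketch under stated assumptions. The class name `ThrottledClient`, the `send` callable (returning `(status, body)`), and the default values for `max_in_flight`, `retry_delay`, and `max_attempts` are all hypothetical; the documentation does not specify a concurrency limit or a retry interval.

```python
import threading
import time


class ThrottledClient:
    """Cap concurrent requests and retry on 503 Service Unavailable.

    Hypothetical helper; all defaults are illustrative assumptions.
    """

    def __init__(self, send, max_in_flight=4, retry_delay=0.5,
                 max_attempts=3, sleep=time.sleep):
        self._send = send
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._retry_delay = retry_delay
        self._max_attempts = max_attempts
        self._sleep = sleep

    def request(self):
        for attempt in range(self._max_attempts):
            # Hold a slot only while the request is actually in flight.
            with self._slots:
                status, body = self._send()
            if status != 503:
                return status, body
            if attempt < self._max_attempts - 1:
                # Wait outside the semaphore so other threads can proceed.
                self._sleep(self._retry_delay * (attempt + 1))
        return status, body  # still overloaded after the final attempt
```

Releasing the slot before sleeping means a thread that is waiting to retry does not block other threads from sending requests.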