Rate Limits

Rate limiting by tier and endpoint.

Forge enforces rate limits per API key, based on your subscription tier. Request limits apply per minute and per day, alongside caps on tokens per minute and concurrent requests, to ensure fair usage and platform stability.

Limits by Tier

Tier        Requests/min  Requests/day  Tokens/min  Concurrent
Free        10            100           10,000      2
Pro         100           10,000        100,000     10
Ultimate    1,000         Unlimited     1,000,000   50
Enterprise  Custom        Custom        Custom      Custom

Rate Limit Headers

Every response includes rate limit headers:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1709000060
X-RateLimit-Limit-Day: 10000
X-RateLimit-Remaining-Day: 9456
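
A client can read these headers to decide whether to slow down before it hits a limit. The sketch below is illustrative only: parseRateLimitHeaders is a hypothetical helper, not part of Forge, and it assumes X-RateLimit-Reset is a Unix timestamp in seconds, which the example value suggests but this page does not state.

```javascript
// Hypothetical helper: converts the rate limit headers into numbers.
// `headers` is anything with a get(name) method, such as the Headers
// object on a fetch() Response, or a Map as used in the example below.
function parseRateLimitHeaders(headers) {
  const num = (name) => {
    const value = headers.get(name);
    return value === null || value === undefined ? null : Number(value);
  };
  const reset = num("X-RateLimit-Reset");
  return {
    limit: num("X-RateLimit-Limit"),
    remaining: num("X-RateLimit-Remaining"),
    // Assumption: the reset value is Unix epoch seconds.
    resetAt: reset === null ? null : new Date(reset * 1000),
    limitDay: num("X-RateLimit-Limit-Day"),
    remainingDay: num("X-RateLimit-Remaining-Day"),
  };
}

// Example with the header values shown above.
const exampleHeaders = new Map([
  ["X-RateLimit-Limit", "100"],
  ["X-RateLimit-Remaining", "87"],
  ["X-RateLimit-Reset", "1709000060"],
  ["X-RateLimit-Limit-Day", "10000"],
  ["X-RateLimit-Remaining-Day", "9456"],
]);
const info = parseRateLimitHeaders(exampleHeaders);
console.log(info.remaining, info.remainingDay); // prints: 87 9456
```

Headers that are missing come back as null rather than 0, so a caller can tell "no information" apart from "no requests left".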

Handling Rate Limits

When you exceed the rate limit, Forge returns a 429 status code with a retry_after value in seconds. Implement exponential backoff with jitter for retry logic:

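async function requestWithRetry(fn, maxRetries = 3) {
  // One initial attempt, then up to maxRetries retries.
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Rethrow anything that is not a 429, and give up once the
      // retries are used up instead of silently returning undefined.
      if (err.status !== 429 || attempt >= maxRetries) throw err;
      // Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of jitter.
      const delay = Math.pow(2, attempt) * 1000 + Math.random() * 1000;
      await new Promise(r => setTimeout(r, delay));
    }
  }
}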
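
Because the 429 response carries a retry_after value, a client may want to wait at least that long rather than rely on backoff alone. The following is a minimal sketch under stated assumptions: backoffDelayMs is a hypothetical helper, and how retry_after reaches your code (response body, header, or a field on the error) depends on your HTTP client, so it is passed in as a plain number here.

```javascript
// Hypothetical helper: picks how long to wait, in milliseconds, before
// retry number `attempt` (0-based).
// retryAfterSeconds: the retry_after value from the 429 response, or null
//   if it could not be read (where it is exposed is an assumption).
// random: source of jitter in [0, 1); injectable so results can be tested.
function backoffDelayMs(attempt, retryAfterSeconds = null, random = Math.random) {
  // Exponential backoff: 1s, 2s, 4s, ... plus up to 1s of jitter.
  const backoff = Math.pow(2, attempt) * 1000 + random() * 1000;
  if (retryAfterSeconds === null) return backoff;
  // Never retry sooner than the server asked for.
  return Math.max(backoff, retryAfterSeconds * 1000);
}

console.log(backoffDelayMs(2, 10, () => 0.5));   // prints: 10000
console.log(backoffDelayMs(2, null, () => 0.5)); // prints: 4500
```

Taking the larger of the two values keeps the jittered backoff as a floor while still respecting a longer server-requested wait.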

Endpoint-Specific Limits

  • /v1/chat/completions: Standard tier limits apply
  • /v1/memory/*: 2x the standard rate limit
  • /v1/admin/*: 10 requests/min (admin keys only)
  • /v1/connect/actions/run: Subject to both Forge and provider rate limits
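
To illustrate how these rules combine with the tier table, the sketch below maps a request path to its requests/min budget. requestsPerMinuteFor is a hypothetical helper and /v1/memory/search is a made-up path used only for the example. It also assumes that /v1/connect/actions/run uses the standard tier limit on the Forge side; provider limits apply on top and are not modeled.

```javascript
// Hypothetical helper: requests/min for a path, given the tier's
// standard requests/min limit (for example 100 on Pro).
function requestsPerMinuteFor(path, tierRequestsPerMin) {
  // /v1/memory/*: twice the standard rate limit.
  if (path.startsWith("/v1/memory/")) return tierRequestsPerMin * 2;
  // /v1/admin/*: fixed at 10 requests/min (admin keys only).
  if (path.startsWith("/v1/admin/")) return 10;
  // Everything else, including /v1/chat/completions, uses the tier limit.
  // Assumption: /v1/connect/actions/run does too; provider limits may be lower.
  return tierRequestsPerMin;
}

console.log(requestsPerMinuteFor("/v1/memory/search", 100));    // prints: 200
console.log(requestsPerMinuteFor("/v1/chat/completions", 100)); // prints: 100
```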