Rate Limits

Rate limiting by tier and endpoint.

Forge enforces rate limits per API key based on your subscription tier. Request limits are applied per minute and per day, alongside a per-minute token limit and a cap on concurrent requests, to ensure fair usage and platform stability.

Limits by Tier

Tier	Requests/min	Requests/day	Tokens/min	Concurrent
Free	10	100	10,000	2
Pro	100	10,000	100,000	10
Ultimate	1,000	Unlimited	1,000,000	50
Enterprise	Custom	Custom	Custom	Custom
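
To stay under the per-minute request limit, a client can keep its own count before sending. The following is a minimal sketch of a sliding 60-second window, not Forge's server-side algorithm, which this page does not specify. The name createMinuteThrottle is hypothetical, and the limit passed in should come from the table above for your tier.

```javascript
// Client-side guard: allows at most `requestsPerMin` calls in any 60-second span.
// Assumption: this only approximates the server's window, whose exact shape is not documented here.
function createMinuteThrottle(requestsPerMin) {
  const windowMs = 60000;
  let timestamps = [];
  return {
    // Returns true and records the request if there is room, false otherwise.
    tryAcquire(nowMs = Date.now()) {
      // Drop requests that have aged out of the window.
      timestamps = timestamps.filter((t) => nowMs - t < windowMs);
      if (timestamps.length >= requestsPerMin) return false;
      timestamps.push(nowMs);
      return true;
    },
  };
}
```

For example, a Free-tier client would call createMinuteThrottle(10) and check tryAcquire() before each request, waiting when it returns false.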

Rate Limit Headers

Every response includes rate limit headers:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1709000060
X-RateLimit-Limit-Day: 10000
X-RateLimit-Remaining-Day: 9456
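
These headers can be read on the client to decide when to slow down. The sketch below parses them from any object with a get(name) method, such as the Headers object on a fetch response. It assumes X-RateLimit-Reset is a Unix timestamp in seconds, which the sample value suggests but this page does not state. The function names are hypothetical.

```javascript
// Parse the rate limit headers into numbers; missing headers become null.
function parseRateLimitHeaders(headers) {
  const num = (name) => {
    const value = headers.get(name);
    return value == null ? null : Number(value);
  };
  return {
    limit: num('X-RateLimit-Limit'),
    remaining: num('X-RateLimit-Remaining'),
    resetEpochSeconds: num('X-RateLimit-Reset'), // assumed: Unix seconds
    limitDay: num('X-RateLimit-Limit-Day'),
    remainingDay: num('X-RateLimit-Remaining-Day'),
  };
}

// Milliseconds until the window resets, never negative.
function msUntilReset(info, nowMs = Date.now()) {
  if (info.resetEpochSeconds === null) return 0;
  return Math.max(0, info.resetEpochSeconds * 1000 - nowMs);
}
```

A client might pause for msUntilReset(info) milliseconds once info.remaining reaches 0 instead of waiting for a 429.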

Handling Rate Limits

When you exceed the rate limit, Forge returns a 429 status code with a retry_after value in seconds. Implement exponential backoff with jitter for retry logic:

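// Calls fn, retrying on 429 responses with exponential backoff plus jitter.
// maxRetries is the total number of attempts, including the first.
async function requestWithRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (err) {
      // Rethrow non-429 errors, and the last 429 once attempts are exhausted,
      // so the caller never receives a silent undefined.
      if (err.status !== 429 || i === maxRetries - 1) {
        throw err;
      }
      // Backoff of 1s, 2s, 4s, ... plus up to 1s of random jitter.
      const delay = Math.pow(2, i) * 1000 + Math.random() * 1000;
      await new Promise(r => setTimeout(r, delay));
    }
  }
}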
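
The retry loop above computes its delay from the attempt number alone. Since the 429 response also carries retry_after, the delay could respect that value when it is larger. The helper below is a sketch under an assumption: this page does not say where retry_after appears in the response, so the caller is expected to extract it and pass it in as seconds. The name retryDelayMs is hypothetical.

```javascript
// Delay before the next attempt: the larger of exponential backoff with jitter
// and the server's retry_after hint (in seconds), if one was provided.
// `random` is injectable so the jitter can be made deterministic in tests.
function retryDelayMs(attempt, retryAfterSeconds, random = Math.random) {
  const backoff = Math.pow(2, attempt) * 1000 + random() * 1000;
  const serverHint = Number.isFinite(retryAfterSeconds)
    ? retryAfterSeconds * 1000
    : 0;
  return Math.max(backoff, serverHint);
}
```

Taking the maximum keeps the jittered backoff as a floor while never retrying earlier than the server asked.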

Endpoint-Specific Limits

/v1/chat/completions: Standard tier limits apply
/v1/memory/*: 2x the standard rate limit
/v1/admin/*: 10 requests/min (admin keys only)
/v1/connect/actions/run: Subject to both Forge and provider rate limits
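
As a worked example of these rules, the sketch below maps a request path to its per-minute request limit. It assumes "2x the standard rate limit" means twice the tier's Requests/min value, and it does not model provider limits for /v1/connect/actions/run, which depend on the provider. The function name is hypothetical.

```javascript
// Effective requests/min for a path, given the tier's standard requests/min.
function requestsPerMinuteFor(path, tierRequestsPerMin) {
  if (path.startsWith('/v1/admin/')) return 10; // fixed limit, admin keys only
  if (path.startsWith('/v1/memory/')) return tierRequestsPerMin * 2; // assumed: 2x requests/min
  return tierRequestsPerMin; // standard tier limit
}
```

On the Pro tier (100 requests/min), this gives 200 for /v1/memory/* paths, 10 for /v1/admin/* paths, and 100 for /v1/chat/completions.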
