Chat Completions
Full reference for POST /v1/chat/completions.
The Chat Completions endpoint is the primary interface for generating AI responses. It is fully compatible with the OpenAI Chat Completions API and adds Forge-specific parameters, passed in the forge object, for routing, memory, security, caching, and ensemble processing.
Endpoint
POST /v1/chat/completions
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID or "auto" for intelligent routing |
| messages | array | Yes | Array of message objects with role and content |
| temperature | number | No | Sampling temperature (0-2). Default: 1 |
| max_tokens | integer | No | Maximum tokens in the response |
| stream | boolean | No | Enable Server-Sent Events streaming. Default: false |
| top_p | number | No | Nucleus sampling parameter (0-1). Default: 1 |
| n | integer | No | Number of completions to generate. Default: 1 |
| stop | string or array | No | Stop sequences |
| tools | array | No | Function/tool definitions for tool calling |
| forge | object | No | Forge-specific extensions (see below) |
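The ranges in the table can be checked on the client before a request is sent. The following is a minimal sketch under that assumption: `build_chat_request` is a hypothetical helper, not part of any Forge SDK, and it relies only on the parameter names, ranges, and defaults listed above.

```python
from typing import Any


def build_chat_request(
    model: str,
    messages: list[dict[str, Any]],
    **options: Any,
) -> dict[str, Any]:
    """Assemble a request body and check the documented ranges (hypothetical helper)."""
    if not model:
        raise ValueError("model is required (a model ID or 'auto')")
    # Assumption: an empty messages array is not a useful request.
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")

    # Fall back to the documented defaults when validating unset options.
    temperature = options.get("temperature", 1)
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    top_p = options.get("top_p", 1)
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    # Only include what the caller set; unset options are left to the server.
    return {"model": model, "messages": messages, **options}


body = build_chat_request(
    "auto",
    [{"role": "user", "content": "What is the capital of France?"}],
    temperature=0.7,
    max_tokens=150,
)
```

Validating locally only saves a round trip that would otherwise end in a 400 response; the server remains the authority on what is accepted.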
Forge Extensions
{
  "forge": {
    "cache": {
      "enabled": true,
      "ttl": 3600,
      "namespace": "default"
    },
    "security": {
      "level": "standard",
      "pii": { "detect": true, "redact": true }
    },
    "memory": {
      "enabled": true,
      "userId": "user_123",
      "layers": ["vector", "graph", "state"]
    },
    "ensemble": {
      "enabled": false,
      "strategy": "best-of-n",
      "n": 3
    },
    "routing": {
      "costSensitivity": "medium",
      "failover": true,
      "maxRetries": 3
    }
  }
}
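Because the forge object is nested, an application that keeps its own baseline settings may want to override single keys per request without restating whole sections. The sketch below shows one way to do that with a recursive merge. `DEFAULT_FORGE_OPTIONS` and `merge_forge_options` are hypothetical names, and the merge happens entirely on the client; this reference does not say how the server treats omitted keys.

```python
import copy
from typing import Any

# Hypothetical application-level baseline, reusing keys from the example above.
DEFAULT_FORGE_OPTIONS: dict[str, Any] = {
    "cache": {"enabled": True, "ttl": 3600, "namespace": "default"},
    "security": {"level": "standard", "pii": {"detect": True, "redact": True}},
    "routing": {"costSensitivity": "medium", "failover": True, "maxRetries": 3},
}


def merge_forge_options(
    defaults: dict[str, Any], overrides: dict[str, Any]
) -> dict[str, Any]:
    """Return a new dict in which overrides replace defaults key by key."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them instead of replacing.
            merged[key] = merge_forge_options(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


forge_options = merge_forge_options(
    DEFAULT_FORGE_OPTIONS,
    {"cache": {"ttl": 600}, "memory": {"enabled": True, "userId": "user_123"}},
)
```

The result can be passed as the forge field of a request body. The baseline is deep-copied, so per-request overrides never leak into later requests.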
Response Format
{
  "id": "forge-abc123",
  "object": "chat.completion",
  "created": 1709000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  },
  "forge_metadata": {
    "provider": "openai",
    "routing_time_ms": 3,
    "security_scan": "passed",
    "cache_hit": false,
    "cost_usd": 0.00031,
    "trace_id": "trace_xyz789"
  }
}
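To show how the fields above fit together, here is a small sketch that pulls the assistant reply and the Forge metadata out of a decoded response body. `summarize_response` is a hypothetical helper, and the sample is a trimmed copy of the response shown above. Treating forge_metadata as optional is an assumption made so the function also works on plain OpenAI-style responses.

```python
import json
from typing import Any

# Trimmed copy of the example response above.
SAMPLE_RESPONSE = """
{
  "id": "forge-abc123",
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21},
  "forge_metadata": {"provider": "openai", "cache_hit": false, "trace_id": "trace_xyz789"}
}
"""


def summarize_response(response: dict[str, Any]) -> dict[str, Any]:
    """Collect the fields most callers need from a chat completion."""
    first_choice = response["choices"][0]
    metadata = response.get("forge_metadata", {})  # assumed optional
    return {
        "text": first_choice["message"]["content"],
        "finish_reason": first_choice["finish_reason"],
        "total_tokens": response["usage"]["total_tokens"],
        "provider": metadata.get("provider"),
        "cache_hit": metadata.get("cache_hit", False),
        "trace_id": metadata.get("trace_id"),
    }


summary = summarize_response(json.loads(SAMPLE_RESPONSE))
```

Logging trace_id alongside application errors is a reasonable habit, since it is the only identifier in the response that points at a specific routed request.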
Streaming
Set stream: true to receive Server-Sent Events. Each event contains a delta chunk:
data: {"id":"forge-abc123","choices":[{"delta":{"content":"Hello"},"index":0}]}
data: {"id":"forge-abc123","choices":[{"delta":{"content":"!"},"index":0}]}
data: {"id":"forge-abc123","choices":[{"delta":{},"finish_reason":"stop","index":0}]}
data: [DONE]
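A client reassembles the reply by concatenating the content of each delta until it sees the [DONE] sentinel. The sketch below does this over an iterable of already-decoded lines; `iter_deltas` is a hypothetical helper, and obtaining the lines from an HTTP response is left to whichever client library is in use.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from Server-Sent Events lines until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


# The four events shown above.
sample_stream = [
    'data: {"id":"forge-abc123","choices":[{"delta":{"content":"Hello"},"index":0}]}',
    'data: {"id":"forge-abc123","choices":[{"delta":{"content":"!"},"index":0}]}',
    'data: {"id":"forge-abc123","choices":[{"delta":{},"finish_reason":"stop","index":0}]}',
    "data: [DONE]",
]

full_text = "".join(iter_deltas(sample_stream))
```

The final event before [DONE] carries an empty delta and a finish_reason, so it contributes no text. This sketch merges fragments from all choices into one stream, which is only appropriate when n is 1.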
Error Codes
| Code | Description |
|---|---|
| 400 | Invalid request parameters |
| 401 | Invalid or missing API key |
| 402 | Payment required (x402) |
| 403 | Feature not available on current tier |
| 429 | Rate limit exceeded |
| 500 | Internal server error |
| 503 | All providers unavailable |
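Clients usually treat these codes in two groups: errors that a retry might resolve, and errors that need a change to the request, the key, or the account. This reference does not say which codes are safe to retry, so the split below is an assumption, as are the helper names `should_retry` and `backoff_delay` and the delay values.

```python
# Assumption: rate limits and server-side failures are worth retrying;
# 400, 401, 402, and 403 need the caller to change something first.
RETRYABLE_STATUS_CODES = {429, 500, 503}


def should_retry(status_code: int, attempt: int, max_retries: int = 3) -> bool:
    """Return True if a failed request should be sent again.

    attempt counts the retries already made, starting at 0.
    """
    return status_code in RETRYABLE_STATUS_CODES and attempt < max_retries


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff in seconds: base * 2**attempt, limited to cap."""
    return min(cap, base * (2 ** attempt))


# Delays a client would wait before retries 1, 2, and 3 of a 429 response.
delays = [backoff_delay(attempt) for attempt in range(3) if should_retry(429, attempt)]
```

In production, adding random jitter to each delay helps keep many clients from retrying in lockstep after a shared outage.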
curl Example
curl -X POST https://api.optima-forge.com/v1/chat/completions \
  -H "Authorization: Bearer $FORGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 150,
    "forge": {
      "cache": {"enabled": true},
      "security": {"level": "standard"},
      "memory": {"enabled": true, "userId": "user_123"}
    }
  }'
JavaScript Example
import { Forge } from "@optima-forge/sdk";

// Read the API key from the environment rather than hard-coding it.
const forge = new Forge({ apiKey: process.env.FORGE_API_KEY });

const response = await forge.chat.completions.create({
  model: "auto", // let Forge route to a model
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" },
  ],
  temperature: 0.7,
  max_tokens: 150,
  forge: {
    cache: { enabled: true },
    security: { level: "standard" },
    memory: { enabled: true, userId: "user_123" },
  },
});

console.log(response.choices[0].message.content);
Python Example
import os

from optima_forge import Forge

# Read the API key from the environment, as in the curl and JavaScript examples.
forge = Forge(api_key=os.environ["FORGE_API_KEY"])

response = forge.chat.completions.create(
    model="auto",  # let Forge route to a model
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    temperature=0.7,
    max_tokens=150,
    forge={
        "cache": {"enabled": True},
        "security": {"level": "standard"},
        "memory": {"enabled": True, "userId": "user_123"},
    },
)

print(response.choices[0].message.content)