Drop-in OpenAI-compatible API that works with LangChain, LlamaIndex, CrewAI, AutoGen, the Vercel AI SDK, and any other OpenAI-compatible framework. Change two lines of code. Keep everything else.
Forge exposes a fully OpenAI-compatible API at https://optimaforge.ai/v1. Any SDK, framework, or tool that supports OpenAI works with Forge by changing the base URL and API key. Your prompts, tools, streaming logic, function calling, and structured output all work identically. Behind the scenes, Forge adds intelligent routing, persistent memory, security scanning, and full observability to every request.
Integrating with Forge takes under a minute. Here are examples for the most popular SDKs and frameworks.
from openai import OpenAI

client = OpenAI(
    base_url="https://optimaforge.ai/v1",
    api_key="forge_sk_..."
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={"forge": {"cache": True}}
)

import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

// In the AI SDK, a custom endpoint is configured on the provider,
// not on the model call, via createOpenAI.
const forge = createOpenAI({
  baseURL: "https://optimaforge.ai/v1",
  apiKey: "forge_sk_..."
});

const { text } = await generateText({
  model: forge("auto"),
  prompt: "Explain quantum computing"
});

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://optimaforge.ai/v1",
    api_key="forge_sk_...",
    model="auto"
)

response = llm.invoke("Summarize this document")

Every framework that supports OpenAI-compatible APIs works with Forge. Here is a non-exhaustive list of the frameworks we actively test against and support.
Use ChatOpenAI with Forge's base URL. All chains, agents, and retrieval pipelines work without modification. Memory, tools, and callbacks pass through transparently.
Configure the OpenAI LLM class with Forge credentials. Query engines, chat engines, and data agents run against Forge's routing layer with full memory and security.
Set Forge as the LLM provider for all crew agents. Role-based agent orchestration works natively, with Forge adding cross-provider routing and persistent memory.
Point AutoGen's model client at Forge. Multi-agent conversations, code execution, and group chats route through Forge with automatic failover and cost tracking.
Use the OpenAI connector with Forge's endpoint. Planners, plugins, and memory connectors work out of the box. Microsoft Copilot-style applications benefit from Forge's security pipeline.
Use the OpenAI provider with a custom base URL. Streaming, tool calling, and structured output generation work identically through Forge's gateway.
Route OpenAI Agents through Forge for multi-provider failover, cost optimization, and security scanning. Agent handoffs and tool calls are fully supported.
Configure the OpenAIChatGenerator with Forge's API URL. Pipelines, retrievers, and generators route through Forge with tracing and memory enabled.
Forge can serve as a backend for LiteLLM or replace it entirely. Both expose OpenAI-compatible APIs, so migration is a base URL change.
Structured output extraction with Pydantic models works through Forge. The gateway preserves function calling and JSON mode for all supported providers.
Configure DSPy's language model to use Forge's endpoint. Signature-based programming, optimizers, and assertions run through Forge with cost tracking.
Use Forge as the LLM backend for Mastra agents, workflows, and tool integrations. The OpenAI-compatible API ensures full compatibility.
Configure the OpenAI model with Forge's base URL. Structured responses, tool calls, and streaming all work natively through the gateway.
Java and Kotlin applications use Spring AI's OpenAI auto-configuration with Forge's endpoint URL. Chat clients, embedding clients, and function calling are supported.
Forge routes requests across the live provider catalog through a unified interface. Add providers to your account via the dashboard, and Forge handles authentication, rate limiting, and failover automatically.
GPT-4o, GPT-4o-mini, o1, o3
Opus 4, Sonnet 4, Haiku
Gemini 2.5 Pro, Flash
Large, Medium, Small, Codestral
Llama 4 Scout, Maverick
Command R+, Command R
Sonar Pro, Sonar
Llama, Mixtral (ultra-fast inference)
Open-source models at scale
Optimized open-source serving
DeepSeek V3, Coder
Grok 3, Grok 3 Mini
Anthropic, Llama, Titan
GPT-4o, GPT-4o-mini (Azure-hosted)
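Because every entry in the catalog sits behind the same endpoint, the wire format is one JSON shape regardless of provider. A stdlib-only sketch that builds (but does not send) such a request; the forge extension field mirrors the earlier Python example, and build_forge_request is an illustrative helper, not part of any SDK:

```python
import json
import urllib.request

def build_forge_request(prompt: str, cache: bool = True) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat-completions request to Forge."""
    body = {
        "model": "auto",  # let Forge's router pick a provider from the catalog
        "messages": [{"role": "user", "content": prompt}],
        "forge": {"cache": cache},  # Forge-specific extension field
    }
    return urllib.request.Request(
        "https://optimaforge.ai/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": "Bearer forge_sk_...",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Swapping providers never changes this payload; only the model alias (or Forge's routing decision behind "auto") does.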
Replace your OpenAI base URL with Forge's endpoint. Your existing code, prompts, and integrations work immediately. Gain routing, memory, security, and observability with zero refactoring.