Route, secure, observe, and monetize your AI workloads across the live provider catalog. One API endpoint, enterprise-grade infrastructure, zero vendor lock-in.
Forge is a unified AI gateway that sits between your application and every major LLM provider. Instead of integrating directly with OpenAI, Anthropic, Google, Mistral, and dozens of others, you integrate once with Forge. It handles intelligent routing to pick the best model for each request, persistent memory so conversations carry context across providers, a seven-layer security pipeline to protect against prompt injection and data leakage, built-in observability for cost and quality tracking, and native micropayments so you can monetize your AI features from day one.
Think of it as a smart reverse proxy for LLMs, but with memory, security, billing, and orchestration built directly into the request path. Your existing code works immediately because Forge exposes a fully OpenAI-compatible API. Just change the base URL and API key.
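As a rough illustration of the "change the base URL and API key" claim, the sketch below builds the same chat request twice using only the Python standard library. The Forge base URL is taken from the curl example on this page; the helper name build_chat_request and the placeholder keys are assumptions made for illustration, and nothing is sent over the network.

```python
import json
import urllib.request

OPENAI_BASE_URL = "https://api.openai.com/v1"
FORGE_BASE_URL = "https://optimaforge.ai/v1"  # from the curl example on this page


def build_chat_request(base_url, api_key, model, messages):
    """Build (but do not send) an OpenAI-style chat completion request.

    Hypothetical helper: only base_url and api_key differ between providers.
    """
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


messages = [{"role": "user", "content": "Hello"}]

# Placeholder keys; real keys would come from the environment.
direct = build_chat_request(OPENAI_BASE_URL, "sk-placeholder", "gpt-4o", messages)
via_forge = build_chat_request(FORGE_BASE_URL, "forge-placeholder", "gpt-4o", messages)

# The request body is identical; only the URL and the bearer token change.
print(via_forge.full_url)
```

Passing either request to urllib.request.urlopen would send it; that step is left out here so the sketch runs without credentials.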
Every layer of Forge is purpose-built for production AI workloads. Each capability works independently and compounds when used together.
Route requests across the live provider catalog with cascading intent classification, ELO-scored quality routing, and automatic failover. RouteLLM delivers 85% cost reduction at sub-5ms latency via ONNX inference.
Persistent memory across conversations and providers. Vector search via Qdrant, graph relationships via Neo4j and Graphiti, and real-time state via Redis CRDTs. Graceful degradation to Turso when services are unavailable.
ForgeGuard pipeline with SpiceDB authorization, LlamaFirewall input scanning, DeBERTa-v3 semantic analysis, Presidio PII detection, Augustus adversarial probing, and MCP supply chain verification.
Native x402 V2 micropayments for per-request billing, plus a Stripe bridge for traditional subscriptions. Credit packs with volume bonuses, themed bundles, and à la carte billing for public MCP tools.
Self-hosted Langfuse tracing on every request. OpenTelemetry-compatible telemetry, cost analytics, latency percentiles, and per-agent dashboards. Integrated with Opik for experiment tracking.
Drop-in OpenAI-compatible API that works with LangChain, LlamaIndex, CrewAI, AutoGen, Semantic Kernel, and other major OpenAI-compatible frameworks. Switch providers without changing a single line of application code.
Forge exposes a fully OpenAI-compatible API. Set model to "auto" and Forge picks the optimal provider based on cost, quality, and latency. Alternatively, specify a model directly, such as "gpt-4o", for explicit routing.
The optional forge object lets you enable semantic caching, set security levels, attach memory sessions, and control routing priority. Every parameter is optional and backward-compatible.
curl -X POST https://optimaforge.ai/v1/chat/completions \
  -H "Authorization: Bearer $FORGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Analyze this contract"}],
    "forge": {
      "cache": true,
      "security": "strict",
      "memory": { "session": "ctx_abc123" },
      "priority": "quality"
    }
  }'
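The same request body can be assembled in Python. The sketch below is a minimal illustration, assuming the field names shown in the curl example (cache, security, memory.session, priority); the helper chat_payload is hypothetical. It leaves the forge key out entirely when no options are given, which is one way to read "optional and backward-compatible".

```python
import json


def chat_payload(messages, model="auto", forge=None):
    """Build an OpenAI-style chat payload with an optional forge object.

    Hypothetical helper: when forge is None or empty, the payload is a
    plain OpenAI-compatible body with no extra keys.
    """
    payload = {"model": model, "messages": messages}
    if forge:
        payload["forge"] = forge
    return payload


messages = [{"role": "user", "content": "Analyze this contract"}]

# Explicit routing to a named model, with no Forge-specific options.
plain = chat_payload(messages, model="gpt-4o")

# Automatic routing plus the options shown in the curl example.
tuned = chat_payload(
    messages,
    forge={
        "cache": True,
        "security": "strict",
        "memory": {"session": "ctx_abc123"},
        "priority": "quality",
    },
)

print(json.dumps(tuned, indent=2))
```

Here plain is the body an existing OpenAI client would already send, while tuned matches the JSON in the curl command.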
Dive into each subsystem to understand how Forge handles routing, memory, security, payments, observability, and more.
Cascading classifiers, RouteLLM, quality scoring, failover
Vector, graph, and state layers with cross-provider continuity
ForgeGuard 7-layer pipeline and OWASP Agentic Top 10
x402 micropayments, Stripe bridge, credit packs, bundles
Langfuse tracing, OpenTelemetry, cost and latency analytics
OpenAI-compatible API for major frameworks and the live provider catalog
SSO, Zanzibar permissions, compliance, data residency
Paid Forge subscriptions bundle the full managed engine layer. Tools and engines can still expose direct metered access for outside agents, enterprise buyers, and non-subscribers.