Infinite Context Engine
Compress massive contexts by 50-65%, saving tokens and cost on every LLM call.
ICE uses semantic compression algorithms to reduce context window usage by 50-65% while preserving meaning. Feed in massive documents, conversation histories, or code — ICE returns a compressed representation that LLMs understand perfectly. Stop paying for redundant tokens.
Features
- Semantic context compression (50-65% reduction)
- Auto-algorithm selection (entropy-based)
- Lossless decompression API
- Token count before/after metrics
- Cost savings calculation per request
- Batch compression support
- Multiple algorithm options (auto, extractive, abstractive)
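To give a feel for what the extractive option does, here is a toy sketch of frequency-based extractive compression: score each sentence by how common its words are in the document, then keep only the top-scoring sentences. This is an illustration only, not ICE's actual algorithm; the function name and scoring heuristic are our own.

```javascript
// Toy extractive compressor (illustrative only, not ICE's implementation).
// Keeps the highest-scoring sentences, in their original order.
function extractiveCompress(text, keepRatio = 0.5) {
  const sentences = text.match(/[^.!?]+[.!?]+/g) || [text];

  // Word frequencies across the whole document.
  const freq = {};
  for (const word of text.toLowerCase().match(/\w+/g) || []) {
    freq[word] = (freq[word] || 0) + 1;
  }

  // Score each sentence by its average word frequency.
  const scored = sentences.map((s, i) => {
    const words = s.toLowerCase().match(/\w+/g) || [];
    const score = words.reduce((sum, w) => sum + freq[w], 0) / (words.length || 1);
    return { s, i, score };
  });

  const keep = Math.max(1, Math.round(sentences.length * keepRatio));
  return scored
    .sort((a, b) => b.score - a.score) // highest score first
    .slice(0, keep)
    .sort((a, b) => a.i - b.i)         // restore document order
    .map(x => x.s.trim())
    .join(' ');
}
```

With keepRatio = 0.5, half the sentences are dropped, so output length roughly halves; production systems layer smarter scoring and abstractive rewriting on top of this basic idea.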
API Endpoints
POST /v1/tools/ice/compress - Compress context
POST /v1/tools/ice/decompress - Decompress context
GET /v1/tools/ice/stats - Usage statistics
GET /v1/tools/ice/health - Health check
Use Cases
- Fit more context into limited context windows
- Reduce LLM costs on high-volume applications
- Compress conversation history for long-running sessions
- Pre-process RAG chunks before LLM calls
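To estimate what a given compression ratio is worth before wiring up the API, you can do the arithmetic locally. The sketch below assumes the common ~4 characters per token heuristic and a hypothetical price per 1K input tokens; both numbers are placeholders, not ICE's pricing.

```javascript
// Rough savings estimator. Assumes ~4 chars/token (a common heuristic)
// and a hypothetical USD price per 1K input tokens -- adjust for your model.
function estimateSavings(context, compressionRatio, usdPer1kTokens = 0.01) {
  const tokensBefore = Math.ceil(context.length / 4);
  const tokensAfter = Math.ceil(tokensBefore * compressionRatio);
  const tokensSaved = tokensBefore - tokensAfter;
  return {
    tokensBefore,
    tokensAfter,
    tokensSaved,
    costSavedUsd: (tokensSaved / 1000) * usdPer1kTokens,
  };
}
```

For example, a 4,000-character document at a 0.38 compression ratio works out to roughly 620 tokens saved per call, which compounds quickly in high-volume applications.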
Quick Start
const res = await fetch('https://api.optima-forge.com/v1/tools/ice/compress', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer ftk_ice_your_key_here',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    context: longDocument,
    algorithm: 'auto',
  }),
});
const { compressed, compression_ratio, tokens_saved, cost_saved_usd } = await res.json();
// compression_ratio: 0.38 (62% reduction)
// tokens_saved: 4200
// cost_saved_usd: 0.01
Ready to start?
Get your API key in seconds. $10/month, cancel anytime.