# Calculation Methodology
Every number on Forge Planet is calculated using conservative lower-bound estimates from peer-reviewed research. This page documents every assumption, source, and formula.
## Guiding Principles
1. Conservative by default. When published estimates vary, we use the lower bound.
2. Citation-backed. Every constant cites a specific paper, report, or data source.
3. Transparent assumptions. Where data is unavailable, we state the assumption and explain why.
4. Round down. When in doubt, we round savings down, never up.
## Energy per 1,000 Tokens (kWh)
Conservative estimates based on Patterson et al. (2021) and Luccioni et al. (2022). Larger models consume more energy per token because they have more parameters and therefore require more compute per forward pass.
| Model | kWh / 1,000 tokens |
|---|---|
| GPT-4o | 0.002 |
| GPT-4o-mini | 0.0003 |
| Claude 3.5 Sonnet | 0.002 |
| Claude 3.5 Haiku | 0.0003 |
| Claude 3 Opus | 0.008 |
| Gemini 1.5 Pro | 0.002 |
| Gemini 1.5 Flash | 0.0003 |
| Gemini 1.5 Flash 8B | 0.0001 |
| DeepSeek Chat | 0.001 |
| Llama 3.3 70B | 0.001 |
| Mixtral 8x7B | 0.0008 |
| Default (unknown) | 0.001 |
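The table above can be expressed as a simple lookup with the conservative default for unknown models. A minimal sketch in Python; the dictionary keys and function name are illustrative, not official API model identifiers:

```python
# Per-model energy estimates (kWh per 1,000 tokens), taken from the table above.
# Keys are illustrative lowercase names, not official API model identifiers.
KWH_PER_1K_TOKENS = {
    "gpt-4o": 0.002,
    "gpt-4o-mini": 0.0003,
    "claude-3.5-sonnet": 0.002,
    "claude-3.5-haiku": 0.0003,
    "claude-3-opus": 0.008,
    "gemini-1.5-pro": 0.002,
    "gemini-1.5-flash": 0.0003,
    "gemini-1.5-flash-8b": 0.0001,
    "deepseek-chat": 0.001,
    "llama-3.3-70b": 0.001,
    "mixtral-8x7b": 0.0008,
}

DEFAULT_KWH_PER_1K = 0.001  # conservative default for unrecognized models

def kwh_per_1k(model: str) -> float:
    """Look up a model's kWh-per-1,000-tokens estimate, falling back to the default."""
    return KWH_PER_1K_TOKENS.get(model.lower(), DEFAULT_KWH_PER_1K)
```

Falling back to 0.001 kWh (a mid-range value in the table) keeps estimates for unrecognized models from being inflated.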
## Carbon Intensity by Region (g CO2e/kWh)
Different cloud providers and regions have different carbon intensities depending on their energy mix.
| Region | g CO2e/kWh | Source |
|---|---|---|
| US Grid Average | 386 | US EPA eGRID 2024 |
| EU Grid Average | 230 | IEA 2024 |
| Google Cloud | 50 | Google 2024 Environmental Report |
| AWS US East | 130 | AWS estimate |
| Oracle Cloud | 180 | Oracle estimate |
| Default | 386 | Conservative: US grid average |
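The carbon-intensity table maps to the same pattern. A sketch; the region keys are our shorthand, and a real deployment would normalize provider/region identifiers:

```python
# Carbon intensity (g CO2e/kWh) by region or provider, from the table above.
# Region keys are illustrative shorthand, not official provider region IDs.
CO2_G_PER_KWH = {
    "us-grid-average": 386,   # US EPA eGRID 2024
    "eu-grid-average": 230,   # IEA 2024
    "google-cloud": 50,       # Google 2024 Environmental Report
    "aws-us-east": 130,       # AWS estimate
    "oracle-cloud": 180,      # Oracle estimate
}

DEFAULT_CO2_G_PER_KWH = 386  # conservative: US grid average

def co2_intensity(region: str) -> int:
    """Grams of CO2e per kWh for a region, defaulting to the US grid average."""
    return CO2_G_PER_KWH.get(region, DEFAULT_CO2_G_PER_KWH)
```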
## Three Savings Mechanisms
### 1. ICE Compression

```
kWh_saved = tokens_saved * (kWh_per_1000_tokens / 1000)
```

Forge's Infinite Context Engine (ICE) compresses prompts before they reach the LLM; fewer tokens mean proportionally less inference energy.
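The compression formula is a one-liner. A sketch; the function name is ours, not part of any published API:

```python
def ice_kwh_saved(tokens_saved: int, kwh_per_1k_tokens: float) -> float:
    """Energy avoided by prompt compression: every token removed from the
    prompt saves that model's per-token inference energy."""
    return tokens_saved * (kwh_per_1k_tokens / 1000)
```

For example, compressing away 50,000 GPT-4o tokens at 0.002 kWh per 1,000 tokens saves 0.1 kWh.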
### 2. Semantic Cache

```
kWh_saved = cache_hits * 1000 * (kWh_per_1000_tokens / 1000)
```

When a query matches a cached response (i.e., its semantic similarity exceeds the cache threshold), no inference runs and no inference energy is consumed. We conservatively assume an average of 1,000 tokens per cached call.
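The cache formula can be sketched the same way, with the 1,000-token assumption made explicit as a named constant (the function name is ours):

```python
AVG_TOKENS_PER_CACHED_CALL = 1000  # conservative assumption stated above

def cache_kwh_saved(cache_hits: int, kwh_per_1k_tokens: float) -> float:
    """Energy avoided by cache hits: each hit skips a full inference call,
    assumed to average 1,000 tokens."""
    return cache_hits * AVG_TOKENS_PER_CACHED_CALL * (kwh_per_1k_tokens / 1000)
```

Note that the `* 1000` and `/ 1000` cancel: each cache hit saves exactly one model-call's worth of per-1,000-token energy.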
### 3. Smart Routing

```
kWh_saved = tokens_routed * (kWh_per_1000_tokens / 1000) * 0.7
```

When Forge routes a simple query from a large model (e.g., GPT-4o) to a smaller one (e.g., GPT-4o-mini), the smaller model uses at least roughly 70% less energy per token, so we apply a 70% savings factor. Per the table above the gap is often larger (GPT-4o-mini uses 85% less than GPT-4o), which makes 0.7 a conservative choice.
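A sketch of the routing formula; the function name and parameter names are ours. Note the rate passed in is the *large* model's, since that is what the tokens would otherwise have cost:

```python
ROUTING_SAVINGS_FACTOR = 0.7  # assume ~70% less energy on the smaller model

def routing_kwh_saved(tokens_routed: int, large_model_kwh_per_1k: float) -> float:
    """Energy avoided by routing tokens away from the large model:
    70% of the large model's per-token energy is assumed saved."""
    return tokens_routed * (large_model_kwh_per_1k / 1000) * ROUTING_SAVINGS_FACTOR
```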
The three mechanisms are then summed and converted to CO2e:

```
// Total savings
total_kwh = ice_kwh + cache_kwh + routing_kwh
total_co2_g = total_kwh * co2_g_per_kwh
total_co2_kg = total_co2_g / 1000
```
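The summation and unit conversion can be sketched as one function; the name and the returned dictionary shape are ours, and the 386 default is the conservative US grid average from the table above:

```python
def total_savings(ice_kwh: float, cache_kwh: float, routing_kwh: float,
                  co2_g_per_kwh: float = 386.0) -> dict:
    """Sum the three mechanisms and convert kWh to grams/kilograms of CO2e.
    co2_g_per_kwh defaults to the US grid average (conservative)."""
    total_kwh = ice_kwh + cache_kwh + routing_kwh
    total_co2_g = total_kwh * co2_g_per_kwh
    return {"kwh": total_kwh, "co2_g": total_co2_g, "co2_kg": total_co2_g / 1000}
```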
## Human-Readable Equivalents
| Equivalent | Formula | Source |
|---|---|---|
| Car miles avoided | co2_kg / 0.404 | EPA: 404 g CO2/mile |
| Tree-days equivalent | (co2_kg / 21) * 365 | 21 kg CO2/tree/year |
| Smartphone charges | kwh / 0.01 | ~0.01 kWh per charge |
| LED bulb hours | kwh / 0.01 | 10W LED = 0.01 kWh/hour |
| Home hours equivalent | (kwh / 28.77) * 24 | US avg 28.77 kWh/day |
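The equivalents table translates directly into code. A sketch; the function name and dictionary keys are ours:

```python
def equivalents(kwh: float, co2_kg: float) -> dict:
    """Translate raw savings into the human-readable equivalents tabulated above."""
    return {
        "car_miles_avoided": co2_kg / 0.404,   # EPA: 404 g CO2 per mile
        "tree_days": (co2_kg / 21) * 365,      # 21 kg CO2 per tree per year
        "smartphone_charges": kwh / 0.01,      # ~0.01 kWh per full charge
        "led_bulb_hours": kwh / 0.01,          # 10 W LED draws 0.01 kWh per hour
        "home_hours": (kwh / 28.77) * 24,      # US average home: 28.77 kWh per day
    }
```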
## Sources
- Patterson et al. (2021), *Carbon Emissions and Large Neural Network Training*. Comprehensive analysis of training and inference energy for large language models.
- Luccioni et al. (2022), *Estimating the Carbon Footprint of BLOOM*. Detailed breakdown of energy per token for a 176B-parameter model across training and inference.
- Strubell et al. (2019), *Energy and Policy Considerations for Deep Learning in NLP*. Foundational paper establishing per-token energy estimates for transformer models.
- US EPA (2024), *eGRID — Emissions & Generation Resource Integrated Database*. US grid average carbon intensity: 386 g CO2e/kWh, used as our conservative default.
- IEA (2024), *Global Energy & CO2 Status Report*. International grid carbon intensities by region.
- Google (2024), *Environmental Report*. Google Cloud carbon intensity: 50 g CO2e/kWh (heavily renewable-powered).
Questions about our methodology? We welcome scrutiny. Contact us.