# Calculation Methodology
Every number on Forge Planet is calculated using conservative lower-bound estimates from peer-reviewed research. This page documents every assumption, source, and formula.
## Guiding Principles
1. Conservative by default. When published estimates vary, we use the lower bound.
2. Citation-backed. Every constant cites a specific paper, report, or data source.
3. Transparent assumptions. Where data is unavailable, we state the assumption and explain why.
4. Round down. When in doubt, we round savings down, never up.
## Energy per 1,000 Tokens (kWh)
Conservative estimates based on Patterson et al. (2021) and Luccioni et al. (2022). Larger models consume more energy per token because they have more parameters and therefore require more compute per forward pass.
| Model | kWh / 1,000 tokens |
|---|---|
| GPT-4o | 0.002 |
| GPT-4o-mini | 0.0003 |
| Claude 3.5 Sonnet | 0.002 |
| Claude 3.5 Haiku | 0.0003 |
| Claude 3 Opus | 0.008 |
| Gemini 1.5 Pro | 0.002 |
| Gemini 1.5 Flash | 0.0003 |
| Gemini 1.5 Flash 8B | 0.0001 |
| DeepSeek Chat | 0.001 |
| Llama 3.3 70B | 0.001 |
| Mixtral 8x7B | 0.0008 |
| Default (unknown) | 0.001 |
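The table above can be expressed as a simple lookup with the conservative default for unknown models. A minimal sketch in Python; the dictionary keys and function name are illustrative, not official API model identifiers:

```python
# Per-model energy estimates (kWh per 1,000 tokens), taken from the table above.
# Keys are illustrative lowercase names, not official API model identifiers.
KWH_PER_1K_TOKENS = {
    "gpt-4o": 0.002,
    "gpt-4o-mini": 0.0003,
    "claude-3.5-sonnet": 0.002,
    "claude-3.5-haiku": 0.0003,
    "claude-3-opus": 0.008,
    "gemini-1.5-pro": 0.002,
    "gemini-1.5-flash": 0.0003,
    "gemini-1.5-flash-8b": 0.0001,
    "deepseek-chat": 0.001,
    "llama-3.3-70b": 0.001,
    "mixtral-8x7b": 0.0008,
}

DEFAULT_KWH_PER_1K = 0.001  # conservative default for unrecognized models

def kwh_per_1k(model: str) -> float:
    """Look up a model's kWh-per-1,000-tokens estimate, falling back to the default."""
    return KWH_PER_1K_TOKENS.get(model.lower(), DEFAULT_KWH_PER_1K)
```

Falling back to 0.001 kWh (a mid-range value in the table) keeps estimates for unrecognized models from being inflated.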
## Carbon Intensity by Region (g CO2e/kWh)
Different cloud providers and regions have different carbon intensities depending on their energy mix.
| Region | g CO2e/kWh | Source |
|---|---|---|
| US Grid Average | 386 | US EPA eGRID 2024 |
| EU Grid Average | 230 | IEA 2024 |
| Google Cloud | 50 | Google 2024 Environmental Report |
| AWS US East | 130 | AWS estimate |
| Oracle Cloud | 180 | Oracle estimate |
| Default | 386 | Conservative: US grid average |
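The carbon-intensity table maps to the same pattern. A sketch; the region keys are our shorthand, and a real deployment would normalize provider/region identifiers:

```python
# Carbon intensity (g CO2e/kWh) by region or provider, from the table above.
# Region keys are illustrative shorthand, not official provider region IDs.
CO2_G_PER_KWH = {
    "us-grid-average": 386,   # US EPA eGRID 2024
    "eu-grid-average": 230,   # IEA 2024
    "google-cloud": 50,       # Google 2024 Environmental Report
    "aws-us-east": 130,       # AWS estimate
    "oracle-cloud": 180,      # Oracle estimate
}

DEFAULT_CO2_G_PER_KWH = 386  # conservative: US grid average

def co2_intensity(region: str) -> int:
    """Grams of CO2e per kWh for a region, defaulting to the US grid average."""
    return CO2_G_PER_KWH.get(region, DEFAULT_CO2_G_PER_KWH)
```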
## Three Savings Mechanisms
### 1. ICE Compression

```
kWh_saved = tokens_saved * (kWh_per_1000_tokens / 1000)
```

Forge's Infinite Context Engine (ICE) compresses prompts before they reach the LLM; fewer tokens mean proportionally less inference energy.
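The compression formula is a one-liner. A sketch; the function name is ours, not part of any published API:

```python
def ice_kwh_saved(tokens_saved: int, kwh_per_1k_tokens: float) -> float:
    """Energy avoided by prompt compression: every token removed from the
    prompt saves that model's per-token inference energy."""
    return tokens_saved * (kwh_per_1k_tokens / 1000)
```

For example, compressing away 50,000 GPT-4o tokens at 0.002 kWh per 1,000 tokens saves 0.1 kWh.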
### 2. Semantic Cache

```
kWh_saved = cache_hits * 1000 * (kWh_per_1000_tokens / 1000)
```

When a query matches a cached response (i.e., its semantic similarity exceeds the cache threshold), no inference runs and no inference energy is consumed. We conservatively assume an average of 1,000 tokens per cached call.
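The cache formula can be sketched the same way, with the 1,000-token assumption made explicit as a named constant (the function name is ours):

```python
AVG_TOKENS_PER_CACHED_CALL = 1000  # conservative assumption stated above

def cache_kwh_saved(cache_hits: int, kwh_per_1k_tokens: float) -> float:
    """Energy avoided by cache hits: each hit skips a full inference call,
    assumed to average 1,000 tokens."""
    return cache_hits * AVG_TOKENS_PER_CACHED_CALL * (kwh_per_1k_tokens / 1000)
```

Note that the `* 1000` and `/ 1000` cancel: each cache hit saves exactly one model-call's worth of per-1,000-token energy.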
### 3. Smart Routing

```
kWh_saved = tokens_routed * (kWh_per_1000_tokens / 1000) * 0.7
```

When Forge routes a simple query from a large model (e.g., GPT-4o) to a smaller one (e.g., GPT-4o-mini), the smaller model uses at least roughly 70% less energy per token, so we apply a 70% savings factor. Per the table above the gap is often larger (GPT-4o-mini uses 85% less than GPT-4o), which makes 0.7 a conservative choice.
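A sketch of the routing formula; the function name and parameter names are ours. Note the rate passed in is the *large* model's, since that is what the tokens would otherwise have cost:

```python
ROUTING_SAVINGS_FACTOR = 0.7  # assume ~70% less energy on the smaller model

def routing_kwh_saved(tokens_routed: int, large_model_kwh_per_1k: float) -> float:
    """Energy avoided by routing tokens away from the large model:
    70% of the large model's per-token energy is assumed saved."""
    return tokens_routed * (large_model_kwh_per_1k / 1000) * ROUTING_SAVINGS_FACTOR
```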
The three mechanisms are then summed and converted to CO2e:

```
// Total savings
total_kwh = ice_kwh + cache_kwh + routing_kwh
total_co2_g = total_kwh * co2_g_per_kwh
total_co2_kg = total_co2_g / 1000
```
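The summation and unit conversion can be sketched as one function; the name and the returned dictionary shape are ours, and the 386 default is the conservative US grid average from the table above:

```python
def total_savings(ice_kwh: float, cache_kwh: float, routing_kwh: float,
                  co2_g_per_kwh: float = 386.0) -> dict:
    """Sum the three mechanisms and convert kWh to grams/kilograms of CO2e.
    co2_g_per_kwh defaults to the US grid average (conservative)."""
    total_kwh = ice_kwh + cache_kwh + routing_kwh
    total_co2_g = total_kwh * co2_g_per_kwh
    return {"kwh": total_kwh, "co2_g": total_co2_g, "co2_kg": total_co2_g / 1000}
```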
## Human-Readable Equivalents
| Equivalent | Formula | Source |
|---|---|---|
| Car miles avoided | co2_kg / 0.404 | EPA: 404 g CO2/mile |
| Tree-days equivalent | (co2_kg / 21) * 365 | 21 kg CO2/tree/year |
| Smartphone charges | kwh / 0.01 | ~0.01 kWh per charge |
| LED bulb hours | kwh / 0.01 | 10W LED = 0.01 kWh/hour |
| Home hours equivalent | (kwh / 28.77) * 24 | US avg 28.77 kWh/day |
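The equivalents table translates directly into code. A sketch; the function name and dictionary keys are ours:

```python
def equivalents(kwh: float, co2_kg: float) -> dict:
    """Translate raw savings into the human-readable equivalents tabulated above."""
    return {
        "car_miles_avoided": co2_kg / 0.404,   # EPA: 404 g CO2 per mile
        "tree_days": (co2_kg / 21) * 365,      # 21 kg CO2 per tree per year
        "smartphone_charges": kwh / 0.01,      # ~0.01 kWh per full charge
        "led_bulb_hours": kwh / 0.01,          # 10 W LED draws 0.01 kWh per hour
        "home_hours": (kwh / 28.77) * 24,      # US average home: 28.77 kWh per day
    }
```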
## Sources
- Patterson et al. (2021), *Carbon Emissions and Large Neural Network Training*. Comprehensive analysis of training and inference energy for large language models.
- Luccioni et al. (2022), *Estimating the Carbon Footprint of BLOOM*. Detailed breakdown of energy per token for a 176B-parameter model across training and inference.
- Strubell et al. (2019), *Energy and Policy Considerations for Deep Learning in NLP*. Foundational paper establishing per-token energy estimates for transformer models.
- US EPA (2024), *eGRID — Emissions & Generation Resource Integrated Database*. US grid average carbon intensity: 386 g CO2e/kWh, used as our conservative default.
- IEA (2024), *Global Energy & CO2 Status Report*. International grid carbon intensities by region.
- Google (2024), *Environmental Report*. Google Cloud carbon intensity: 50 g CO2e/kWh (heavily renewable-powered).
Questions about our methodology? We welcome scrutiny. Contact us.