
Calculation Methodology

Every number on Forge Planet is calculated using conservative lower-bound estimates from peer-reviewed research. This page documents every assumption, source, and formula.

Guiding Principles

  1. Conservative by default. When published estimates vary, we use the lower bound.
  2. Citation-backed. Every constant cites a specific paper, report, or data source.
  3. Transparent assumptions. Where data is unavailable, we state the assumption and explain why.
  4. Round down. When in doubt, we round savings down, never up.

Energy per 1,000 Tokens (kWh)

Conservative estimates based on Patterson et al. (2021) and Luccioni et al. (2022). Larger models consume more energy per token because they have more parameters and require more compute per forward pass.

Model                  kWh / 1,000 tokens
---------------------  ------------------
GPT-4o                 0.002
GPT-4o-mini            0.0003
Claude 3.5 Sonnet      0.002
Claude 3.5 Haiku       0.0003
Claude 3 Opus          0.008
Gemini 1.5 Pro         0.002
Gemini 1.5 Flash       0.0003
Gemini 1.5 Flash 8B    0.0001
DeepSeek Chat          0.001
Llama 3.3 70B          0.001
Mixtral 8x7B           0.0008
Default (unknown)      0.001
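The table above can be expressed as a simple lookup. This is an illustrative sketch, not Forge's actual code; the dict, key spellings, and helper name are our own:

```python
# Energy per 1,000 tokens (kWh), from the table above.
KWH_PER_1K_TOKENS = {
    "gpt-4o": 0.002,
    "gpt-4o-mini": 0.0003,
    "claude-3.5-sonnet": 0.002,
    "claude-3.5-haiku": 0.0003,
    "claude-3-opus": 0.008,
    "gemini-1.5-pro": 0.002,
    "gemini-1.5-flash": 0.0003,
    "gemini-1.5-flash-8b": 0.0001,
    "deepseek-chat": 0.001,
    "llama-3.3-70b": 0.001,
    "mixtral-8x7b": 0.0008,
}
DEFAULT_KWH_PER_1K = 0.001  # conservative default for unknown models

def kwh_for_tokens(model: str, tokens: int) -> float:
    """Energy in kWh to process `tokens` tokens on `model`."""
    per_1k = KWH_PER_1K_TOKENS.get(model.lower(), DEFAULT_KWH_PER_1K)
    return tokens * per_1k / 1000
```

Unknown models fall back to the 0.001 kWh default, consistent with the "round down" principle above.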

Carbon Intensity by Region (g CO2e/kWh)

Different cloud providers and regions have different carbon intensities depending on their energy mix.

Region             g CO2e/kWh   Source
-----------------  ----------   --------------------------------
US Grid Average    386          US EPA eGRID 2024
EU Grid Average    230          IEA 2024
Google Cloud       50           Google 2024 Environmental Report
AWS US East        130          AWS estimate
Oracle Cloud       180          Oracle estimate
Default            386          Conservative: US grid average
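A minimal sketch of the region lookup, assuming the table above; the region keys are illustrative, not an API:

```python
# Carbon intensity (g CO2e/kWh) by region, from the table above.
CO2_G_PER_KWH = {
    "us": 386,            # US EPA eGRID 2024
    "eu": 230,            # IEA 2024
    "gcp": 50,            # Google 2024 Environmental Report
    "aws-us-east": 130,   # AWS estimate
    "oracle": 180,        # Oracle estimate
}
DEFAULT_CO2_G_PER_KWH = 386  # conservative: US grid average

def co2_grams(kwh: float, region: str = "us") -> float:
    """Convert energy (kWh) to grams of CO2e for a given region."""
    return kwh * CO2_G_PER_KWH.get(region, DEFAULT_CO2_G_PER_KWH)
```

Unrecognized regions fall back to the US grid average, the most carbon-intensive default in the table.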

Three Savings Mechanisms

1. ICE Compression

kWh_saved = tokens_saved * (kWh_per_1000_tokens / 1000)

Forge's Infinite Context Engine compresses prompts before they reach the LLM. Fewer tokens = proportionally less inference energy.
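The formula above can be sketched as a one-line helper (the function name and example figures are ours, for illustration):

```python
def ice_kwh_saved(tokens_saved: int, kwh_per_1k: float) -> float:
    """Energy avoided by compressing prompts: fewer tokens,
    proportionally less inference energy."""
    return tokens_saved * kwh_per_1k / 1000

# Example: 50,000 tokens compressed away on GPT-4o (0.002 kWh / 1k tokens)
# saves 50,000 * 0.002 / 1000 = 0.1 kWh.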

2. Semantic Cache

kWh_saved = cache_hits * 1000 * (kWh_per_1000_tokens / 1000)

When a query matches a cached response (above a semantic similarity threshold), zero inference energy is consumed. We assume an average of 1,000 tokens per cached call, a conservative figure.
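As a sketch (function name ours), the cache formula with the assumed 1,000-token average per call:

```python
AVG_TOKENS_PER_CACHED_CALL = 1000  # conservative assumption stated above

def cache_kwh_saved(cache_hits: int, kwh_per_1k: float) -> float:
    """Each cache hit avoids a full inference call of ~1,000 tokens."""
    return cache_hits * AVG_TOKENS_PER_CACHED_CALL * kwh_per_1k / 1000
```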

3. Smart Routing

kWh_saved = tokens_routed * (kWh_per_1000_tokens / 1000) * 0.7

When Forge routes a simple query from a large model (e.g. GPT-4o) to a smaller one (e.g. GPT-4o-mini), the smaller model uses far less energy per token (0.0003 vs 0.002 kWh per 1,000 tokens in the table above, roughly an 85% reduction). We conservatively apply a 70% savings factor.
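A sketch of the routing formula (names ours, for illustration):

```python
ROUTING_SAVINGS_FACTOR = 0.7  # conservative: smaller model assumed 70% cheaper

def routing_kwh_saved(tokens_routed: int, large_model_kwh_per_1k: float) -> float:
    """Energy avoided by serving tokens on a smaller model,
    counted at a conservative 70% of the large model's energy."""
    return tokens_routed * large_model_kwh_per_1k / 1000 * ROUTING_SAVINGS_FACTOR
```

Note that only 70% of the large model's energy is counted as saved; the remaining 30% over-budgets for the smaller model's actual consumption.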

Total Savings

total_kwh = ice_kwh + cache_kwh + routing_kwh

total_co2_g = total_kwh * co2_g_per_kwh

total_co2_kg = total_co2_g / 1000
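The three mechanisms combine as above; a minimal sketch (function name ours):

```python
def total_savings(ice_kwh: float, cache_kwh: float, routing_kwh: float,
                  co2_g_per_kwh: float = 386) -> tuple[float, float]:
    """Return (total_kwh, total_co2_kg), defaulting to the
    conservative US grid average of 386 g CO2e/kWh."""
    total_kwh = ice_kwh + cache_kwh + routing_kwh
    total_co2_kg = total_kwh * co2_g_per_kwh / 1000  # g -> kg
    return total_kwh, total_co2_kg
```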

Human-Readable Equivalents

Equivalent              Formula               Source
----------------------  --------------------  ------------------------
Car miles avoided       co2_kg / 0.404        EPA: 404 g CO2/mile
Tree-days equivalent    (co2_kg / 21) * 365   21 kg CO2/tree/year
Smartphone charges      kwh / 0.01            ~0.01 kWh per charge
LED bulb hours          kwh / 0.01            10 W LED = 0.01 kWh/hour
Home hours equivalent   (kwh / 28.77) * 24    US avg 28.77 kWh/day
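The equivalents table translates directly into code. A hedged sketch (function and key names ours) using the constants above:

```python
def equivalents(total_kwh: float, total_co2_kg: float) -> dict:
    """Human-readable equivalents from the conversion table above."""
    return {
        "car_miles": total_co2_kg / 0.404,      # EPA: 404 g CO2/mile
        "tree_days": total_co2_kg / 21 * 365,   # 21 kg CO2/tree/year
        "phone_charges": total_kwh / 0.01,      # ~0.01 kWh per charge
        "led_bulb_hours": total_kwh / 0.01,     # 10 W LED = 0.01 kWh/hour
        "home_hours": total_kwh / 28.77 * 24,   # US avg 28.77 kWh/day
    }
```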

Sources

Patterson et al. (2021)

Carbon Emissions and Large Neural Network Training

Comprehensive analysis of training and inference energy for large language models.

Luccioni et al. (2022)

Estimating the Carbon Footprint of BLOOM

Detailed breakdown of energy per token for a 176B parameter model across training and inference.

Strubell et al. (2019)

Energy and Policy Considerations for Deep Learning in NLP

Foundational paper establishing per-token energy estimates for transformer models.

US EPA (2024)

eGRID — Emissions & Generation Resource Integrated Database

US grid average carbon intensity: 386 g CO2e/kWh. Used as conservative default.

IEA (2024)

Global Energy & CO2 Status Report

International grid carbon intensities by region.

Google (2024)

Environmental Report

Google Cloud carbon intensity: 50 g CO2e/kWh (heavily renewable-powered).

Questions about our methodology? We welcome scrutiny. Contact us.