Not all AI queries cost the same.

The architecture that makes Transparent Lab trustworthy also makes it one of the most efficient AI systems you can use.

  • 0.02–0.08 Wh per query: energy per typical query, 4–15× less than ChatGPT and 3–12× less than Gemini.
  • ~0 ml direct cooling: zero-water inference; primary hardware is air-cooled by design.
  • 0.008–0.03 g CO₂ per query: minimal carbon footprint, based on a US grid average of ~0.4 kg CO₂/kWh.
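The CO₂ card is just the energy card multiplied by grid carbon intensity; a quick sanity check of the arithmetic, using only the figures stated above:

```python
# Carbon per query = energy per query x grid carbon intensity.
# Both inputs are the figures stated above (0.02-0.08 Wh, ~0.4 kg CO2/kWh).
GRID_KG_CO2_PER_KWH = 0.4  # US grid average

def grams_co2_per_query(wh_per_query: float) -> float:
    """Convert per-query energy (Wh) to grams of CO2 via the grid average."""
    kwh = wh_per_query / 1000.0                 # Wh -> kWh
    return kwh * GRID_KG_CO2_PER_KWH * 1000.0   # kg -> g

low, high = grams_co2_per_query(0.02), grams_co2_per_query(0.08)
print(f"{low:.3f}-{high:.3f} g CO2 per query")  # 0.008-0.032 g
```

The upper bound, 0.032 g, is what rounds to the 0.03 g shown in the card.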

Why architecture matters more than intentions

The environmental cost of AI is determined by three design decisions made long before anyone types a question.

Inference hardware

Purpose-built inference accelerators are 3–10× more energy-efficient than GPUs — and some are air-cooled, eliminating direct water use entirely.

Model size

An 8B-parameter model uses a fraction of the energy of a 1.7T-parameter model. Mixture-of-experts architectures compound the savings.

Retrieve vs. generate

RAG systems fetch pre-indexed evidence and feed focused context to the model — shorter prompts, shorter outputs, fewer wasted tokens.
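The model-size argument can be made concrete with a first-order approximation (a deliberate simplification that ignores batching, memory traffic, and hardware differences): energy per token scales roughly with the number of parameters activated per token.

```python
# First-order sketch: energy per token ~ parameters activated per token.
# Sizes are the ones mentioned above; the scaling law itself is a simplification.
frontier_params = 1.7e12   # ~1.7T-parameter dense model
focused_params = 8e9       # ~8B-parameter specialized model

ratio = frontier_params / focused_params
print(f"~{ratio:.0f}x more parameters activated per token")  # ~212x
```

A mixture-of-experts frontier model activates only a subset of its parameters per token, which shrinks this ratio; the direction of the comparison still holds.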

Generic chatbot:

  User query → entire model activates (~400B+ params) → generate answer from memory → answer (no sources). ~0.3 Wh.

Transparent Lab:

  User query → embed query (tiny operation) → retrieve relevant chunks → focused model (~8–20B params) → answer with citations. ~0.045 Wh.

Same question. Different architecture. 7× less energy.
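The two per-query figures above give the headline ratio directly:

```python
generic_wh = 0.3    # generic chatbot flow, per query (figure above)
rag_wh = 0.045      # Transparent Lab flow, per query (figure above)

ratio = generic_wh / rag_wh
print(f"~{ratio:.1f}x less energy per query")  # ~6.7x, i.e. roughly 7x
```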

Purpose-built inference hardware

Air-cooled by design, with no evaporative cooling towers. Execution is deterministic and 3–10× more energy-efficient per token than equivalent GPU-based inference.

Efficient embedding & feed

Hybrid retrieval (pgvector + BM25) with one-time embedding cost at upload. The personalized feed uses pre-generated queries — zero LLM calls on refresh.
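The page doesn't say how the pgvector and BM25 result lists are merged; reciprocal rank fusion is a common way to combine the two without comparing their incompatible raw scores. A sketch with hypothetical document IDs (the merge strategy itself is an assumption, not a documented detail of the system):

```python
# Reciprocal rank fusion (RRF): each list contributes 1/(k + rank) per document,
# so documents ranked well by both retrievers rise to the top.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["paper_a", "paper_b", "paper_c"]   # hypothetical pgvector ranking
bm25_hits = ["paper_b", "paper_d", "paper_a"]     # hypothetical BM25 ranking

print(rrf([vector_hits, bm25_hits]))  # paper_b first: ranked highly by both
```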

The numbers

All estimates are derived from published, peer-reviewed research and publicly disclosed figures.

Energy per query (Wh)

[Bar chart: per-query energy for Transparent Lab, Google Search, Gemini, ChatGPT, and Claude. Claude Opus (~4.05 Wh/query) excluded for readability; see the full comparison table below. All figures are estimates based on published data.]

Water per query (ml)

[Bar chart: estimated total water per query (direct cooling + electricity generation). Transparent Lab ~1–3 ml; Google Search and Gemini not publicly disclosed; ChatGPT ~16.9 ml (conservative) to ~519 ml (high estimate).]

Energy

  TL (typical): 0.015–0.08 Wh
  Google Search: ~0.3 Wh
  ChatGPT (GPT-4o): ~0.3 Wh
  Claude Sonnet: ~0.5 Wh
  Claude Opus: ~4.05 Wh

Water

  TL (total): ~1–3 ml
  Google / Gemini: not disclosed
  ChatGPT (conservative): ~16.9 ml
  ChatGPT (high estimate): ~519 ml

CO₂

  TL (typical): ~0.008–0.03 g
  Google Search: ~0.12 g
  ChatGPT (GPT-4o): ~0.12–0.14 g
  Claude Sonnet: ~0.20 g
  Claude Opus: ~1.62–1.80 g

What this means at scale

For ~100 active users averaging 50 queries/day (150,000 queries/month):

  • Monthly water (TL): 0.2–0.5 L, vs. ~78 L for ChatGPT (high estimate), roughly a bathtub.
  • Annual CO₂ (TL, 10 researchers): 0.4–1.5 kg, vs. ~81–90 kg for Claude Opus, equivalent to 200+ driving miles.
  • Annual energy (TL, 10 researchers): 2.3–12 kWh, vs. ~15–25 kWh for ChatGPT at the same volume.
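The monthly query volume parameterizes easily; a small helper (the name and defaults are just for illustration) scales any per-query figure from the tables above:

```python
def monthly_total(per_query: float, users: int = 100,
                  queries_per_day: int = 50, days: int = 30) -> float:
    """Scale a per-query figure (Wh, ml, g, ...) to a monthly total."""
    return per_query * users * queries_per_day * days

print(monthly_total(1.0))           # 150000.0 queries/month
print(monthly_total(0.05) / 1000)   # 0.05 Wh/query -> ~7.5 kWh/month
```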

Methodology and sources

For ChatGPT (GPT-4o):

Epoch AI (February 2025) estimated ~0.3 Wh per typical query. Sam Altman confirmed ~0.34 Wh in his June 2025 essay "The Gentle Singularity." For long-context queries, Epoch AI estimates 2.5 Wh for ~10,000 input tokens.

For Google Gemini:

Google disclosed in August 2025 that its median Gemini text query uses ~0.24 Wh, with a 33× efficiency improvement over the prior 12 months.

For Claude (Anthropic):

Anthropic does not publicly disclose per-query energy figures. External benchmarks estimate Claude 3 Opus at ~4.05 Wh per ~400-token exchange, and Claude 3 Haiku at ~0.22 Wh. The "How Hungry is AI?" benchmark (Jegham et al., 2025) found Claude 3.7 Sonnet scored highest on eco-efficiency.

For Transparent Lab:

Our estimates are based on: (a) published energy-per-token benchmarks for models in our size class running on our type of inference hardware, (b) measured token counts from our query pipeline, and (c) the hardware manufacturer's published efficiency claims (3–10× vs GPU), conservatively applied at 3–5×.
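As a sketch of how (a)–(c) combine: multiply a per-token GPU baseline by measured token counts, then divide by the conservatively applied efficiency factor. The joules-per-token baseline and token count below are illustrative placeholders, not our measured values:

```python
# All three inputs are illustrative placeholders, not measured values.
gpu_joules_per_token = 0.5   # hypothetical GPU baseline for this model class (a)
tokens_per_query = 800       # hypothetical prompt + output token count (b)
efficiency_factor = 4        # manufacturer claims 3-10x, applied conservatively (c)

joules = tokens_per_query * gpu_joules_per_token / efficiency_factor
wh_per_query = joules / 3600  # 1 Wh = 3600 J
print(f"~{wh_per_query:.3f} Wh/query")  # ~0.028 Wh, inside the 0.02-0.08 range
```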

ChatGPT water estimates — important context:

The commonly cited "one bottle of water per prompt" figure (~519 ml from UC Riverside) includes the full lifecycle: direct cooling, electricity generation, and hardware manufacturing water. The more conservative IEEE Spectrum figure (~16.9 ml) counts only direct cooling and electricity generation. Sam Altman has stated ~0.32 ml direct water per query. The true figure depends on what you include in the boundary — researchers disagree on where that boundary should be.

Honesty & perspective

We show our work — including the uncertainties.

Our estimates

  • Energy figures are estimates, not measurements. Derived from published benchmarks combined with hardware manufacturer claims.
  • Hardware efficiency claims come from the manufacturer. Third-party comparisons support 3–10×, but they have not been independently audited under our workload.
  • Fallback path varies. During outages, queries route to GPU-based infrastructure with a different efficiency profile.

Competitor estimates

  • ChatGPT's 0.3 Wh is for GPT-4o. Reasoning models (o1, o3) may use 30–50× more energy.
  • Gemini's 0.24 Wh is a median. The distribution is not published, so like-for-like query types cannot be compared.
  • Claude figures are third-party estimates. Anthropic does not disclose per-query energy.
  • Water estimates are the most uncertain. ChatGPT range: 16.9 ml to 519 ml depending on methodology.

What we are confident about

The directional conclusion is robust: a RAG system using small specialized models on purpose-built, air-cooled hardware is meaningfully more efficient — in energy, water, and carbon — than frontier consumer AI on GPU infrastructure. The magnitude (4–15× on energy, near-zero direct water) is large enough to survive significant estimation error.

Your personal AI usage is a tiny fraction of your overall footprint — a heavy user might consume 1–5 kWh/month vs. ~875 kWh/month for the average US household. But collectively, with the industry projected to reach 347 TWh by 2030, the difference between 0.05 Wh and 0.3 Wh per query multiplied by billions of daily queries becomes the difference between manageable and catastrophic demand growth.
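To make the collective arithmetic concrete, assume a hypothetical one billion queries per day (the actual industry volume is not cited here):

```python
queries_per_day = 1e9    # hypothetical industry-wide volume, for illustration
gap_wh = 0.3 - 0.05      # per-query difference discussed above

gwh_per_year = gap_wh * queries_per_day * 365 / 1e9   # Wh -> GWh
print(f"~{gwh_per_year:.0f} GWh/year from a 0.25 Wh/query gap")  # ~91 GWh/year
```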

Transparent Lab wasn't built to be "eco-friendly AI." It was built for scientific rigor — and the same choices happen to be the most efficient ones available.

Built for scientific rigor. Efficient by design.

References
  1. You, J. (2025). "How much energy does ChatGPT use?" Epoch AI Gradient Updates. epoch.ai
  2. Altman, S. (2025). "The Gentle Singularity." OpenAI Blog.
  3. Google (2025). Environmental Report 2025. Reported in Ritchie, H. (2025). Sustainability by Numbers
  4. CarbonCredits.com (2026). "ChatGPT vs Claude AI: Carbon Footprints." carboncredits.com
  5. Jegham et al. (2025). "How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference." arXiv:2505.09598
  6. Samsi et al. (2023). "From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference." IEEE HPEC.
  7. Lin (2025). Reported in llm-tracker.info
  8. EESI (2025). "Data Centers and Water Consumption." eesi.org
  9. IEEE Spectrum (2025). "The Real Story on AI Water Usage at Data Centers." spectrum.ieee.org
  10. Li et al. (2023). "Making AI Less Thirsty: Uncovering and Addressing the Secret Water Footprint of AI Models." UC Riverside / UT Arlington.
  11. IEEE Spectrum (2025). "AI Energy Use: The Hidden Cost of ChatGPT Queries." spectrum.ieee.org
  12. Towards Data Science (2025). Jegham et al. towardsdatascience.com
  13. Brookings Institution (2025). "AI, data centers, and water." brookings.edu
  14. Environmental Law Institute (2025). "AI's Cooling Problem." eli.org
  15. US EPA. Greenhouse Gas Equivalencies Calculator. Average passenger vehicle emits ~0.4 kg CO₂ per mile.
  16. US Energy Information Administration. Average US household electricity consumption: ~10,500 kWh/year.