Engineering Guide: Quantifying AI Workload Energy and Water Footprints

These articles are AI-generated summaries. Please check the original sources for full details.

How to Actually Measure Your AI Workload’s Water and Energy Footprint

Engineers often have no visibility into the physical resources consumed by their cloud-abstracted AI infrastructure. As a rough estimate, 100 GPU-hours on an A100 (300 W TDP) at a facility with a PUE of 1.1 and a WUE of 1.8 L/kWh works out to about 60 liters of water, roughly one load of laundry.

Why This Matters

The technical challenge is the measurement gap created by cloud abstraction: industry-wide estimates often ignore local variables such as facility efficiency and climate. While data centers consume a modest share of resources compared to agriculture, engineering teams need facility-specific Power Usage Effectiveness (PUE, total facility energy divided by IT equipment energy) and Water Usage Effectiveness (WUE, liters of water per kWh of IT equipment energy) data to move beyond ‘headline anxiety’ and give stakeholders actionable sustainability metrics.

Key Insights

  • Modern hyperscale data centers target a PUE of 1.1-1.2, whereas older facilities often range between 1.5 and 2.0.
  • Analysis published on the California Water Blog by UC Davis researchers indicates AI’s water footprint is a small fraction of agricultural consumption.
  • Cloud Carbon Footprint is an open-source tool used by engineering teams to estimate energy consumption by pulling billing data from AWS, GCP, and Azure.
  • Model distillation can reduce compute requirements by roughly 10-50x for specific tasks, for example when a 70B-parameter model is replaced with a 7B-parameter one.
  • WUE values vary significantly by location; a facility in Phoenix using evaporative cooling typically has a higher WUE than an air-cooled facility in Northern Europe.
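
The PUE ranges in the first insight translate directly into overhead energy. The following is a minimal sketch, assuming the standard definition (PUE = total facility energy / IT equipment energy) and an illustrative 30 kWh IT load (100 GPU-hours at an assumed 300 W). The function name and the facility labels are hypothetical, not taken from any tool.

```python
def facility_energy_kwh(it_energy_kwh: float, pue: float) -> float:
    """Total facility energy implied by a PUE value.

    PUE is total facility energy divided by IT equipment energy,
    so it can never be below 1.0.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_energy_kwh * pue


# Assumed IT load: 100 GPU-hours at 300 W = 30 kWh
IT_ENERGY_KWH = 30.0

# PUE values taken from the ranges quoted in the list above
facilities = [
    ("modern, low", 1.1),
    ("modern, high", 1.2),
    ("older, low", 1.5),
    ("older, high", 2.0),
]

for label, pue in facilities:
    total = facility_energy_kwh(IT_ENERGY_KWH, pue)
    overhead = total - IT_ENERGY_KWH  # energy spent on cooling, power delivery, etc.
    print(f"{label:<13} PUE {pue:.1f}: {total:5.1f} kWh total, {overhead:4.1f} kWh overhead")
```

Under these assumptions the same workload carries 3 kWh of overhead at PUE 1.1 and 30 kWh at PUE 2.0, which is why facility choice matters as much as model choice.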

Working Examples

Function to estimate water and energy usage from GPU Thermal Design Power (TDP) and facility efficiency metrics. TDP is a nameplate figure, so treat the result as a rough estimate rather than a measurement.

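LITERS_TO_US_GALLONS = 0.264172

def estimate_workload_water(gpu_hours, tdp_watts, pue, wue_liters_per_kwh):
    """Rough estimate of energy and water consumption for a GPU workload.

    Assumes the GPU draws its full TDP for the entire run and ignores
    CPU, memory, and network power.
    """
    # IT energy: GPU-hours times TDP, converted from Wh to kWh
    it_energy_kwh = gpu_hours * tdp_watts / 1000
    # Total energy including facility overhead
    energy_kwh = it_energy_kwh * pue
    # Water used for cooling. Note: WUE is formally defined per kWh of IT
    # energy; applying it to total facility energy, as here, raises the
    # estimate by a factor of PUE.
    water_liters = energy_kwh * wue_liters_per_kwh
    return {
        "energy_kwh": round(energy_kwh, 2),
        "water_liters": round(water_liters, 2),
        "water_gallons": round(water_liters * LITERS_TO_US_GALLONS, 2)
    }

# Example: 100 GPU-hours on an A100 (300W TDP)
# at a modern facility (PUE 1.1, WUE 1.8 L/kWh)
result = estimate_workload_water(gpu_hours=100, tdp_watts=300, pue=1.1, wue_liters_per_kwh=1.8)
print(result)  # {'energy_kwh': 33.0, 'water_liters': 59.4, 'water_gallons': 15.69}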
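
A TDP-based figure assumes the GPU runs at its rated power for the whole job, and real draw is usually lower when utilization is partial. If sampled power readings are available (for example from `nvidia-smi --query-gpu=power.draw --format=csv`), one option is to integrate them over time instead. This is a sketch only: the function name and the sample values below are hypothetical and do not come from a real run.

```python
def energy_kwh_from_samples(samples):
    """Integrate (seconds_since_start, watts) samples with the trapezoidal rule.

    `samples` must be sorted by time. Returns GPU energy in kWh,
    before any facility overhead (PUE) is applied.
    """
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 < t0:
            raise ValueError("samples must be sorted by time")
        # Average power over the interval times its duration (W * s = J)
        joules += (p0 + p1) / 2 * (t1 - t0)
    return joules / 3_600_000  # 1 kWh = 3.6 million joules


# Hypothetical one-hour trace: ramp up, steady load, ramp down
samples = [(0, 80.0), (600, 280.0), (1800, 290.0), (3000, 270.0), (3600, 90.0)]

measured_kwh = energy_kwh_from_samples(samples)
tdp_kwh = 300 * 1 / 1000  # 300 W TDP for one hour

print(f"measured: {measured_kwh:.3f} kWh, TDP-based: {tdp_kwh:.3f} kWh")
```

For this made-up trace the integrated energy is about 0.248 kWh against 0.300 kWh from TDP, so the nameplate estimate runs roughly 20% high.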

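Commands to query Google Cloud billing data and start the open-source Cloud Carbon Footprint dashboard. The carbonInformation field in the first command is reproduced from the original summary and may not be present in the billing account output; Google Cloud's Carbon Footprint data is normally viewed in the console or exported to BigQuery. Cloud Carbon Footprint also needs cloud credentials configured before it shows real data.

gcloud beta billing accounts describe "$BILLING_ACCOUNT_ID" --format="json" | jq '.carbonInformation'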

git clone https://github.com/cloud-carbon-footprint/cloud-carbon-footprint.git
cd cloud-carbon-footprint
yarn install
yarn start

Practical Applications

  • Model Distillation: Replacing massive models with fine-tuned 7B-parameter versions for specific tasks, which in the source's example retains about 90% of the accuracy at roughly 5% of the compute cost.
  • Geographic Optimization: Moving non-latency-sensitive workloads to regions such as europe-north1, where cooler climates allow a much lower WUE.
  • Batching Inference: Grouping requests to reduce per-query energy consumption and minimize GPU idle time power draw.
  • Metric Pitfall: Tracking only total water usage cannot distinguish business growth from technical inefficiency; track water per request alongside the total.
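
The metric pitfall in the last item can be made concrete with a normalized figure. The following is a minimal sketch, assuming you already have a water estimate and a request count per reporting period. The `Period` type, the function name, and the quarterly numbers are hypothetical, chosen only to illustrate the calculation.

```python
from dataclasses import dataclass


@dataclass
class Period:
    label: str
    water_liters: float  # estimated water for the period
    requests: int        # requests served in the same period


def liters_per_1k_requests(period: Period) -> float:
    """Normalize water use by traffic so growth and efficiency can be separated."""
    if period.requests <= 0:
        raise ValueError("requests must be positive")
    return period.water_liters / period.requests * 1000


# Hypothetical quarters: total water rises 50% while traffic doubles
q1 = Period("Q1", water_liters=600.0, requests=2_000_000)
q2 = Period("Q2", water_liters=900.0, requests=4_000_000)

for p in (q1, q2):
    print(f"{p.label}: {p.water_liters:.0f} L total, "
          f"{liters_per_1k_requests(p):.3f} L per 1,000 requests")
```

In this invented example the total climbs from 600 L to 900 L, yet water per 1,000 requests falls from 0.300 L to 0.225 L, so the increase reflects growth rather than lost efficiency.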
