I Was Wrong About Data Center Water Consumption
But there are also some issues with how the Berkeley Lab report is estimating it.
(In lieu of a reading list this week, please enjoy this bonus essay about data center water consumption.)
In my essay on water usage in the US, I noted that “you need to be careful when talking about water use: it’s very easy to take figures out of context, or make misleading comparisons.”
I should have taken my own advice!
In the portion of the essay about data center water consumption, I stated that the indirect water usage of data centers (from their share of power plant water usage) was around 579 million gallons a day, a figure I pulled from a Lawrence Berkeley Lab report on data centers. I assumed that this figure was total water use, and that because most power plant water use is non-consumptive, the actual water consumption was a small fraction of this (~19 million gallons a day). This, combined with the report’s estimate for water consumed by data centers directly, yielded a water consumption rate of around 66 million gallons a day.
But I didn’t read the report closely enough, and apparently this 579 million gallons a day was in fact consumptive use. This would increase the total consumptive water use of data centers from 66 million to 628 million gallons a day, a huge difference.
However, an estimate of 579 million gallons a day of consumptive water use seems extremely high. Per the Berkeley Lab report, in 2023 all US data centers were estimated to consume 176 terawatt hours (TWh) of electricity, or around 4.4% of all electricity generated in the US. Per the USGS, the amount of water consumed (not merely used) by thermoelectric power plants in the US is around 1.2 million gallons a day per TWh for once-through cooling, and 1.4 million gallons a day per TWh for recirculating cooling.
Even if we assume all data centers are powered by thermoelectric power plants, this implies a total indirect consumptive use for data centers of around 234 million gallons per day, substantially less than the 579 million gallons given by the LBL report. (The LBL report actually states “nearly 800 billion liters” in 2023, which works out to 579 million gallons daily use.)
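The arithmetic in the last two paragraphs can be checked with a short script. This is a minimal sketch: the function names are made up for illustration, the year is assumed to be 365 days, and the only inputs are the figures quoted above (800 billion liters, 176 TWh, and the 1.2 to 1.4 million gallons a day per TWh range).

```python
LITERS_PER_US_GALLON = 3.785411784
DAYS_PER_YEAR = 365  # assumption: no leap-year adjustment


def liters_per_year_to_mgd(liters_per_year: float) -> float:
    """Convert an annual volume in liters to million US gallons per day."""
    gallons_per_day = liters_per_year / LITERS_PER_US_GALLON / DAYS_PER_YEAR
    return gallons_per_day / 1e6


def thermoelectric_bound_mgd(twh: float, mgd_per_twh: float) -> float:
    """Indirect consumption if every TWh came from thermoelectric plants."""
    return twh * mgd_per_twh


# "Nearly 800 billion liters" per year, as stated in the LBL report.
lbl_indirect_mgd = liters_per_year_to_mgd(800e9)

# 176 TWh at the two consumption rates quoted above.
low_mgd = thermoelectric_bound_mgd(176, 1.2)
high_mgd = thermoelectric_bound_mgd(176, 1.4)

print(f"LBL indirect figure: {lbl_indirect_mgd:.0f} million gallons/day")
print(f"Thermoelectric-only range: {low_mgd:.0f} to {high_mgd:.0f} million gallons/day")
```

The two quoted rates bracket the roughly 234 million gallons a day cited above, and even the top of the range is well under half of the LBL figure.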
It doesn’t seem to be a typo: later in the report they note that indirect water consumption from data center power use is around 4.5 liters per kilowatt hour (which works out to roughly 579 million gallons a day), slightly higher than overall US electric power water consumption at 4.35 liters per kilowatt hour. 4.35 liters per kilowatt hour works out to 3.15 million gallons per day per TWh, more than twice the USGS value for thermoelectric power plant consumption (which as a reminder is 1.2-1.4 million gallons per day per TWh).
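The unit conversion here is easy to get wrong, so here is a small sketch of it. The helper name is hypothetical, and it assumes a 365-day year and US liquid gallons.

```python
LITERS_PER_US_GALLON = 3.785411784
KWH_PER_TWH = 1e9
DAYS_PER_YEAR = 365  # assumption: no leap-year adjustment


def l_per_kwh_to_mgd_per_twh(liters_per_kwh: float) -> float:
    """Convert liters/kWh into million gallons per day for each TWh generated per year."""
    gallons_per_twh = liters_per_kwh * KWH_PER_TWH / LITERS_PER_US_GALLON
    return gallons_per_twh / DAYS_PER_YEAR / 1e6


us_average = l_per_kwh_to_mgd_per_twh(4.35)

# Data center intensity of 4.5 L/kWh applied to the 176 TWh consumed in 2023.
data_center_mgd = l_per_kwh_to_mgd_per_twh(4.5) * 176

print(f"US average: {us_average:.2f} million gallons/day per TWh")
print(f"Data centers at 4.5 L/kWh: {data_center_mgd:.0f} million gallons/day")
```

Because 4.5 liters per kilowatt hour is itself a rounded number, the second result lands within about 1% of the 579 million gallon figure rather than matching it exactly.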
Why is the Berkeley Lab estimate for data center water consumption, and electric power water consumption more generally, so high?
A clue comes from a map the Berkeley Lab report includes on the water intensity of power generation. When we looked at water consumption of thermoelectric power plants, we noted that it’s concentrated in the eastern half of the country.
However, per the Berkeley Lab, the most intensive water use (in terms of liters per kilowatt hour) is found in the western half of the US (and to a lesser extent parts of the southeast). The areas of the country with really high amounts of thermoelectric power water consumption (Texas, Florida, parts of the northeast), by contrast, actually have some of the lowest water consumption on a kilowatt hour basis.
The answer appears to be that the Berkeley Lab report includes the effects of water evaporation from hydroelectric dam reservoirs in their water use calculations. The areas of the country with very intensive electric power water consumption (the northwest, the southwest, and the southeast) are areas with large amounts of hydropower (the Bonneville Power Administration, dams on the Colorado River, and the Tennessee Valley Authority).
Per this 2003 report from the National Renewable Energy Laboratory, thermoelectric power plants consume about 0.47 gallons of water per kilowatt hour of electricity generated, a figure that aligns with USGS values for thermal power plant water consumption. Hydroelectric power plants, on the other hand, “consume” an astounding 18.27 gallons per kilowatt hour via evaporation, almost 40 times as much! This value is so high that it drives up the average level of electric power water consumption for the entire US.
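To see how a modest amount of hydropower can pull up an average, here is a rough sketch using the two NREL per-kilowatt-hour figures quoted above. The hydro shares in the loop are hypothetical inputs chosen for illustration, not values from the NREL or Berkeley Lab reports, and the blend makes the simplifying assumption that everything that isn't hydro is thermoelectric.

```python
THERMO_GAL_PER_KWH = 0.47   # NREL figure quoted above
HYDRO_GAL_PER_KWH = 18.27   # NREL figure quoted above
KWH_PER_TWH = 1e9
DAYS_PER_YEAR = 365         # assumption: no leap-year adjustment


def gal_per_kwh_to_mgd_per_twh(gal_per_kwh: float) -> float:
    """Convert gallons/kWh into million gallons per day for each TWh generated per year."""
    return gal_per_kwh * KWH_PER_TWH / DAYS_PER_YEAR / 1e6


def blended_gal_per_kwh(hydro_share: float) -> float:
    """Average intensity of a grid that is hydro_share hydro and the rest thermoelectric."""
    if not 0.0 <= hydro_share <= 1.0:
        raise ValueError("hydro_share must be between 0 and 1")
    return hydro_share * HYDRO_GAL_PER_KWH + (1.0 - hydro_share) * THERMO_GAL_PER_KWH


thermo_mgd_per_twh = gal_per_kwh_to_mgd_per_twh(THERMO_GAL_PER_KWH)
hydro_to_thermo_ratio = HYDRO_GAL_PER_KWH / THERMO_GAL_PER_KWH

print(f"Thermoelectric: {thermo_mgd_per_twh:.2f} million gallons/day per TWh")
print(f"Hydro vs thermoelectric: {hydro_to_thermo_ratio:.1f}x")

# Hypothetical hydro shares, for illustration only.
for share in (0.0, 0.02, 0.05, 0.10):
    print(f"hydro share {share:.0%}: {blended_gal_per_kwh(share):.2f} gal/kWh")
```

The thermoelectric figure converts to a value inside the 1.2 to 1.4 range from the USGS, and in this toy blend a 5% hydro share is already enough to nearly triple the average.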
Does it make sense to include this water evaporation in the share of water consumed by data centers? I think it’s debatable. On the one hand, a huge dam reservoir does increase the level of water evaporation relative to an undammed river by increasing the amount of water surface area. On the other hand, some of this loss will be offset by the fact that dams make more fresh water available for use by storing excess in rainy seasons for use in drier seasons (this was part of the rationale for constructing several of the huge dams on the Colorado River, such as the Hoover and Glen Canyon Dams). And a dam that creates a reservoir as a supply of fresh water will have evaporation whether or not it generates electric power.
The NREL report notes these complexities:
There are substantial regional differences in the use of hydroelectric power, and therefore a thorough understanding of local conditions is necessary to properly interpret these data. There are river basins where evaporation is a substantial percentage of the total river flow, and this evaporation reduces the available supply both for downstream human consumption as well as having environmental consequences for coastal ecosystems that depend on fresh water supply. On the other hand, consider the case of a hydroelectric project on a relatively small river, which provides the fresh water supply to a major metropolitan area. In this case, the reservoir may be a valuable fresh water resource, especially if evaporation as a percentage of the river flow rate is low. If the downstream consequences for human consumption and coastal ecosystems are low, then the water consumption from hydroelectric projects would be irrelevant—whether or not electric generation occurs, the evaporation will still happen as a necessary consequence of providing fresh water supply to the region. These issues are beyond the scope of this paper, but must be considered when interpreting these results.
It’s not clear to me if the Berkeley Report tried to take this into account, but I suspect it didn’t, and merely applied estimates of regional hydroelectric evaporation without doing a dam-by-dam counterfactual of whether that evaporation would occur in the absence of electric power generation. (It’s worth noting here that the USGS doesn’t include hydroelectric evaporation when calculating US water use.)
Another issue I came across in the Berkeley Report is that it specifically excludes any sort of power purchase agreement, and instead estimates water consumption based on regional patterns of electricity generation.
It is important to note that the methodology used here to calculate indirect water and emission impacts does not incorporate any power purchase agreements between individual data center facilities and their electricity providers or on-site “behind the meter” generation, which could significantly affect water consumption and emissions estimates, depending on the electricity source. Nevertheless, due to the unavailability of facility-level data, we are constrained to assume the same electricity grid mix as that provided by the local balancing authority for all data centers within its jurisdiction.
This is a serious issue, because the hyperscalers (which currently are responsible for around a third of all data center indirect water consumption) are very large purchasers of renewable energy via power purchase agreements (PPAs). Amazon, Meta, Google, and Microsoft all report that 100% of their electricity comes from renewable sources. And outside of the hyperscalers, some large data center colocation companies are also very large buyers of renewable energy through PPAs. Equinix, one of the largest data center leasing companies in the US, reported 96% renewable use in 2023. Digital Realty, another large data center operator, also uses PPAs to achieve 100% renewable use in North America. (This renewable use isn’t necessarily direct consumption: often it takes the form of companies buying certificates for power generated elsewhere, though many companies are also attempting to achieve direct renewable energy purchases.)
The effect of PPAs on water consumption depends on the exact type of energy purchased (and on how the renewable energy accounting is done). Solar and wind projects don’t use water during their operation, while nuclear power plants and (arguably) hydroelectric power plants do. In practice, most renewable PPAs by hyperscalers seem to be for wind or solar electricity.
Conclusion
So to wrap up, I misread the Berkeley Report and significantly underestimated US data center water consumption. If you simply take the Berkeley estimates directly, you get around 628 million gallons of water consumption per day for data centers, much higher than the 66-67 million gallons per day I originally stated.
However, the methods used to produce these estimates are debatable, and seem to have been chosen to give the maximum possible value for data center water consumption. If you exclude the water “consumed” by hydroelectric plants via reservoir evaporation, you get something closer to perhaps 275 million gallons per day.[1] And if you take into account the fact that lots of data center operators use renewable energy PPAs (mostly from wind and solar sources), my guess is that you get something closer to 200-250 million gallons per day (though I haven’t run a detailed calculation here).
[1] The 275 million figure assumes that all power would come from thermoelectric plants, so the actual value would be somewhat less than this.
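For readers who want to see how a number near 275 million can fall out of the figures in this essay, here is one possible reconstruction. It is a sketch, not necessarily the exact calculation behind the estimate: direct consumption is backed out as the difference between the 628 and 579 million gallon figures, and all 176 TWh is assumed to come from thermoelectric plants at the NREL rate of 0.47 gallons per kilowatt hour.

```python
KWH_PER_TWH = 1e9
DAYS_PER_YEAR = 365  # assumption: no leap-year adjustment

TOTAL_PER_LBL_MGD = 628      # total consumption taking the LBL figures directly
INDIRECT_PER_LBL_MGD = 579   # indirect portion, including hydro evaporation
DATA_CENTER_TWH = 176        # 2023 data center electricity use
THERMO_GAL_PER_KWH = 0.47    # NREL thermoelectric consumption rate

# Assumption: direct (on-site) consumption is whatever is left of the LBL total.
direct_mgd = TOTAL_PER_LBL_MGD - INDIRECT_PER_LBL_MGD

# Indirect consumption if every kWh came from a thermoelectric plant.
thermo_indirect_mgd = (
    DATA_CENTER_TWH * KWH_PER_TWH * THERMO_GAL_PER_KWH / DAYS_PER_YEAR / 1e6
)

total_without_hydro_mgd = direct_mgd + thermo_indirect_mgd

print(f"Direct: {direct_mgd} million gallons/day")
print(f"Thermoelectric-only indirect: {thermo_indirect_mgd:.1f} million gallons/day")
print(f"Total excluding hydro evaporation: {total_without_hydro_mgd:.1f} million gallons/day")
```

Any share of wind or solar in the mix would reduce the indirect term further, which is why the all-thermoelectric case works as an upper bound for this scenario.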
The key issue to always remember with water is that averages and totals are almost meaningless in assessing problems. Some watersheds have lots of water, others very little; some watersheds have lots of water users, some very few. You can transfer water from one watershed to another, but it's an expensive process. So even if water use by data centers were a small part of the overall total, it may very well be critical in specific watersheds, and contrariwise there may be watersheds where data centers account for most water use and yet it does not matter because there are few competing uses in that watershed. (Likewise, the often cited numbers showing that agriculture is the largest water user are less meaningful than they appear, as a non-negligible portion of that use is in watersheds where it does not compete with other uses, and could not easily be transferred to other watersheds where it would be more valuable.)
Interesting! Thanks for clarifying. The absence of a singular method to incorporate evaporation seems a poor excuse to simply include it all... For example, a multiplication of the % of downstream use during the driest season, or some other measured annual maximum, would be a good step while remaining very conservative. With the current waiting line for grid connections, not acknowledging PPAs is more understandable