This piece is the first in a new series from the Institute for Progress (IFP), called Compute in America: Building the Next Generation of AI Infrastructure at Home. In this series, we examine the challenges of accelerating the American AI data center buildout. Future pieces will be published at this link.
We often think of software as having an entirely digital existence, a world of “bits” that’s wholly separate from the world of “atoms.” We can download endless amounts of data onto our phones without them getting the least bit heavier; we can watch hundreds of movies without once touching a physical disk; we can collect hundreds of books without owning a single scrap of paper.
But digital infrastructure ultimately requires physical infrastructure. All that software requires some sort of computer to run it. The more computing that is needed, the more physical infrastructure is required. We saw that a few weeks ago when we looked at the enormous $20 billion facilities required to manufacture modern semiconductors. And we also see it with state-of-the-art AI software. Creating a cutting-edge large language model requires a vast amount of computation, both to train the models and to run them once they’re complete. Training OpenAI’s GPT-4 required an estimated 21 billion petaFLOP (a petaFLOP is 10^15 floating point operations).1 For comparison, an iPhone 12 is capable of roughly 11 trillion floating point operations per second (0.01 petaFLOP per second), which means that if you were somehow able to train GPT-4 on an iPhone 12, it would take you more than 60,000 years to finish. On a 100 MHz Pentium processor from 1997, capable of a mere 9.2 million floating point operations per second, training would theoretically take more than 66 billion years. And GPT-4 wasn’t an outlier, but part of a long trend of AI models getting ever larger and requiring more computation to create.
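For readers who want to check these comparisons, the arithmetic is simple enough to reproduce. The short sketch below uses only the figures quoted above (estimated training compute and rough chip speeds), so the outputs are order-of-magnitude estimates, not precise values.

```python
# Back-of-envelope check of the training-time comparisons above.
# All inputs are the estimates quoted in the text, not measured values.

SECONDS_PER_YEAR = 60 * 60 * 24 * 365

gpt4_training_flop = 21e9 * 1e15   # ~21 billion petaFLOP, expressed in raw FLOP

iphone12_flops = 11e12             # ~11 trillion floating point operations per second
pentium_flops = 9.2e6              # ~9.2 million floating point operations per second

iphone_years = gpt4_training_flop / iphone12_flops / SECONDS_PER_YEAR
pentium_years = gpt4_training_flop / pentium_flops / SECONDS_PER_YEAR

print(f"iPhone 12: ~{iphone_years:,.0f} years")   # roughly 60,000 years
print(f"Pentium:   ~{pentium_years:,.0f} years")  # tens of billions of years
```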
But, of course, GPT-4 wasn’t trained on an iPhone. It was trained in a data center: tens of thousands of computers and their supporting infrastructure housed in a specially-designed building. As companies race to create their own AI models, they are building enormous compute capacity to train and run them. Amazon plans on spending $150 billion on data centers over the next 15 years in anticipation of increased demand from AI. Meta plans on spending $37 billion on infrastructure and data centers, largely AI-related, in 2024 alone. CoreWeave, a startup that provides cloud and computing services for AI companies, has raised billions of dollars in funding to build out its infrastructure and is building 28 data centers in 2024. The so-called “hyperscalers,” technology companies like Meta, Amazon, and Google with massive computing needs, have enough data centers planned or under development to double their existing capacity, by some estimates. In cities around the country, data center construction is skyrocketing.
But even as demand for capacity skyrockets, building more data centers is likely to become increasingly difficult. In particular, operating a data center requires large amounts of electricity, and available power is fast becoming the binding constraint on data center construction. Nine of the top ten utilities in the U.S. have named data centers as their main source of customer growth, and a survey of data center professionals ranked availability and price of power as the top two factors driving data center site selection. With record levels of data centers in the pipeline to be built, the problem is only likely to get worse.
The downstream effects of losing the race to lead AI are worth considering. If the rapid progress seen over the last few years continues, advanced AI systems could massively accelerate scientific and technological progress and economic growth. Powerful AI systems could also be highly important to national security, enabling new kinds of offensive and defensive technologies. Losing the bleeding edge on AI progress would seriously weaken our national security capabilities, and our ability to shape the future more broadly. And another transformative technology largely invented and developed in America would be lost to foreign competitors.
AI relies on the availability of firm power. American leadership in innovating new sources of clean, firm power can and should be leveraged to ensure the AI data center buildout of the future happens here.
Intro to data centers
A data center is a fundamentally simple structure: a space that contains computers or other IT equipment. It can range from a small closet with a server in it, to a few rooms in an office building, to a large, stand-alone structure built specifically to house computers.
Large-scale computing equipment has always required designing a dedicated space to accommodate it. When IBM came out with its System/360 in 1964, it provided a 200-page physical planning manual that gave information on space and power needs, operating temperature ranges, air filtration recommendations, and everything else needed for the computers to operate properly. But historically, even large computing operations could be done within a building mostly devoted to other uses. Even today, most “data centers” are just rooms or floors in multi-use buildings. According to the EIA, there were data centers in 97,000 buildings around the country as of 2012, including offices, schools, labs, and warehouses. These data centers, typically about 2,000 square feet in size, occupy just 2% of the building they’re in, on average.
What we think of as modern data centers, specially-built massive buildings that house tens of thousands of computers, are largely an artifact of the post-internet era. Google’s first “data center” was 30 servers in a 28 square-foot cage, in a space shared by AltaVista, eBay, and Inktomi. Today, Google operates millions of servers in 37 purpose-built data centers around the world, some of them nearly one million square feet in size. These, along with thousands of other data centers around the world, are what power internet services like web apps, streaming video, cloud storage, and AI tools.
A large, modern data center contains tens of thousands of individual computers, specially designed to be stacked vertically in large racks. Racks hold several dozen computers at a time, along with other equipment needed to operate them, like network switches, power supplies, and backup batteries. Inside the data center are corridors containing dozens or hundreds of racks.
The amount of computer equipment they house means that data centers consume large amounts of power. A single computer isn’t particularly power hungry: A rack-mounted server might use a few hundred watts, or about 1/5th the power of a hair dryer. But tens of thousands of them together create substantial demand. Today, large data centers can require 100 megawatts (100 million watts) of power or more. That’s roughly the power required by 75,000 homes, or enough to melt 150 tons of steel per hour in an electric arc furnace.2 Power demand is so central, in fact, that data centers are typically measured by how much power they consume rather than by square feet (this CBRE report estimates that there are 3,077.8 megawatts of data center capacity under construction in the US, though exact numbers are unknown). Their power demand means that data centers require large transformers, high-capacity electrical equipment like switchgear, and in some cases even a new substation to connect them to transmission lines.
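To see where those comparisons come from, here is a rough back-of-envelope sketch; the average household draw of roughly 1.3 kilowatts is an assumed approximation, and the steel figures come from footnote 2.

```python
# Rough scale comparisons for a 100 MW data center. The household figure is an
# assumed average continuous draw; the furnace figures come from footnote 2.

data_center_kw = 100_000            # 100 megawatts

avg_home_kw = 1.33                  # assumed average continuous draw of a U.S. home
homes_equivalent = data_center_kw / avg_home_kw
print(f"~{homes_equivalent:,.0f} homes")                       # on the order of 75,000

eaf_kwh_per_ton = 650               # electricity needed to melt a ton of steel
tons_per_hour = data_center_kw / eaf_kwh_per_ton
print(f"~{tons_per_hour:,.0f} tons of steel melted per hour")  # roughly 150
```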
All that power eventually gets turned into heat inside the data center, which means equally robust equipment is needed to move that heat out as fast as the power comes in. Racks sit on raised floors, and are kept cool by large volumes of air pulled up from below and through the equipment. Racks are typically arranged to have alternating “hot aisles” (where hot air is exhausted) and “cold aisles” (where cool air is pulled in). The hot exhaust is removed by the data center’s cooling systems, chilled, and then recirculated. These cooling systems might be complex, with multiple “cooling loops” of heat exchange fluids, though nearly all data centers use air to cool the IT equipment itself.
Unsurprisingly, these cooling systems are large. The minimum amount of air needed to remove a kilowatt of power is roughly 120 cubic feet per minute; for 100 megawatts, that means 12 million cubic feet per minute. Data center chillers can have thousands of times the capacity of a typical home air conditioner. Even relatively small data centers will have enormous air ducts, high-capacity chilling equipment, and large cooling towers. This video shows a data center with a one million gallon “cold battery” water tank: Water is cooled down during the night, when power is cheaper, and used to reduce the burden on the cooling systems during the day.
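The airflow numbers follow directly from that rule of thumb, as the short sketch below shows; real designs depend on the allowable temperature rise across the equipment, so treat these as rough figures.

```python
# Airflow needed to remove heat, using the ~120 CFM-per-kilowatt rule of
# thumb quoted above. Actual requirements depend on the temperature rise.

CFM_PER_KW = 120

for it_load_kw in (10, 1_000, 100_000):   # a single rack, a small facility, a 100 MW site
    airflow_cfm = it_load_kw * CFM_PER_KW
    print(f"{it_load_kw:>7,} kW -> {airflow_cfm:>12,} cubic feet per minute")

# The 100,000 kW (100 MW) case works out to 12,000,000 CFM, as noted above.
```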
Because of the amount of power they consume, substantial effort has gone into making data centers more energy efficient. A common data center performance metric is power usage effectiveness (PUE), the ratio of the total power consumed by a data center to the amount of power consumed by its IT equipment. The lower the ratio, the less power is used on things other than running computers, and the more efficient the data center.
Data center PUE has steadily fallen over time. In 2007, the average PUE for large data centers was around 2.5: For every watt used to power a computer, 1.5 watts were used on cooling systems, backup power, or other equipment. Today, the average PUE has fallen to a little over 1.5. And the hyperscalers do even better: Meta’s average data center PUE is just 1.09, and Google’s is 1.1. These improvements have come from things like more efficient components (such as uninterruptible power supply systems with lower conversion losses), better data center architecture (changing to a hot-aisle, cold-aisle arrangement), and operating the data center at a higher temperature so that less cooling is required.
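To make those ratios concrete, the sketch below works out how much overhead power each PUE implies for a fixed amount of IT load; the 10 MW load is an arbitrary illustrative figure.

```python
# Power usage effectiveness (PUE): total facility power divided by IT power.
# For an illustrative 10 MW of IT load, the overhead implied by each PUE above.

it_load_mw = 10  # hypothetical IT load, chosen only for illustration

for label, pue in [("2007 average", 2.5), ("current average", 1.5),
                   ("Google", 1.10), ("Meta", 1.09)]:
    total_mw = it_load_mw * pue
    overhead_mw = total_mw - it_load_mw
    print(f"{label:>15}: {total_mw:.1f} MW total, "
          f"{overhead_mw:.1f} MW on cooling, power conversion, and other overhead")
```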
There have also been efficiency improvements after the power reaches the computers. Computers must convert AC power from the grid into DC power; on older computers, this conversion was only 60-70% efficient, but modern components can achieve conversion efficiencies of up to 95%. Older computers would also use almost the same amount of power whether they were doing useful work or not. But modern computers are more capable of ramping their power usage down when they’re idle, reducing electricity consumption. And the energy efficiency of computation itself has improved over time due to Moore’s Law: Smaller and smaller transistors mean less electricity is required to run them, which means less power is required for a given amount of computation. From 1970 to 2020, the energy efficiency of computation has doubled roughly once every 1.5 years.
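Compounded over that 50-year span, a doubling every 1.5 years adds up to an enormous improvement; here is a rough sketch of the implied factor.

```python
# Cumulative effect of computational energy efficiency doubling roughly
# every 1.5 years between 1970 and 2020 (the trend cited above).

years = 2020 - 1970
doublings = years / 1.5
improvement = 2 ** doublings

print(f"~{doublings:.0f} doublings -> roughly a {improvement:.0e}x improvement "
      "in operations per unit of energy")   # on the order of 10^10
```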
Because of these steady efficiency gains, overall data center power consumption has been surprisingly flat even as individual data centers have grown larger and more power-intensive. In the U.S., data center energy consumption doubled between 2000 and 2007 but was then flat for the next 10 years, even as worldwide internet traffic increased by more than a factor of 20. Between 2015 and 2022, worldwide data center energy consumption rose an estimated 20 to 70%, but data center workloads rose by 340%, and internet traffic increased by 600%.
Beyond power consumption, reliability is another critical factor in data center design. A data center may serve millions of customers, and service interruptions can easily cost tens of thousands of dollars per minute. Data centers are therefore designed to minimize the risk of downtime. Data center reliability is graded on a tiered system, ranging from Tier I to Tier IV, with higher tiers more reliable than lower tiers.3
Most large data centers in the U.S. fall somewhere between Tier III and Tier IV. They have backup diesel generators, redundant components to prevent single points of failure, multiple independent paths for power and cooling, and so on. A Tier IV data center will theoretically achieve 99.995% uptime, though in practice human error tends to reduce this level of reliability.
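To translate those uptime percentages into something more tangible, the sketch below converts them into allowed downtime per year; the tier definitions involve more than uptime targets, and the extra percentages are included only for comparison.

```python
# Annual downtime implied by a given uptime percentage. The 99.995% figure is
# the theoretical Tier IV target mentioned above; the others are for comparison.

MINUTES_PER_YEAR = 365 * 24 * 60

for uptime_pct in (99.9, 99.99, 99.995):
    downtime_minutes = MINUTES_PER_YEAR * (1 - uptime_pct / 100)
    print(f"{uptime_pct}% uptime -> ~{downtime_minutes:,.0f} minutes of downtime per year")

# 99.995% works out to roughly 26 minutes of downtime per year.
```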
Data center trends
Over time, the trend has been for data centers to grow larger and consume greater amounts of power. In the early 2000s, a single rack in a data center might use one kilowatt of power. Today, typical racks in an enterprise data center use 10 kilowatts or less, and in a hyperscaler data center, that might reach 20 kilowatts or more. Similarly, 10 years ago, nearly all data centers used less than 10 megawatts, but a large data center today will use 100 megawatts or more. And companies are building large campuses with multiple individual data centers, pushing total power demand into the gigawatt range. Amazon’s much-reported purchase of a nuclear-powered data center was one such campus; it included an existing 48 MW data center and enough room for expansion to reach 960 MW in total capacity. As hyperscalers occupy a larger fraction of total data center capacity, large data centers and campuses will only become more common.
Today data centers are still a small fraction of overall electricity demand. The IEA estimates that worldwide data centers consume 1 to 1.3% of electricity as of 2022 (with another 0.4% of electricity devoted to crypto mining). But this is expected to grow over time. SemiAnalysis predicts that data center electricity consumption could triple by 2030, reaching 3 to 4.5% of global electricity consumption. And because data center construction tends to be highly concentrated, data centers are already some of the largest consumers of electricity in some markets. In Ireland, for example, data centers use almost 18% of electricity, which could increase to 30% by 2028. In Virginia, the largest market for data centers in the world, 24% of the power sold by Virginia Power goes to data centers.
Power availability has already become a key bottleneck to building new data centers. Some jurisdictions, including ones where data centers have historically been a major business, are curtailing construction. Singapore is one of the largest data center hubs in the world, but paused construction of them between 2019 and 2022, and instituted strict efficiency requirements after the pause was lifted. In Ireland, a moratorium has been placed on new data centers in the Dublin area until 2028. Northern Virginia is the largest data center market in the world, but one county recently rejected a data center application for the first time in the county’s history due to power availability concerns.
In the U.S., the problem is made worse by difficulties in building new electrical infrastructure. Utilities are building historically low amounts of transmission lines, and long interconnection queues are delaying new sources of generation. Data centers can be especially challenging from a utility perspective because their demand is more or less constant, providing fewer opportunities for load shifting and creating more demand for firm power. One data center company owner claimed that the U.S. was nearly “out of power” available for data centers, primarily due to insufficient transmission capacity. Meta CEO Mark Zuckerberg has made similar claims, noting that “we would probably build out bigger clusters than we currently can if we could get the energy to do it." One energy consultant pithily summed up the problem as “data centers are on a one to two-year build cycle, but energy availability is three years to none."
Part of the electrical infrastructure problem is a timing mismatch. Utility companies see major electrical infrastructure as a long-term investment to be built in response to sustained demand growth. Any new piece of electrical infrastructure will likely be used far longer than a data center might be around, and utilities can be reluctant to build new infrastructure purely to accommodate them. In some cases, long-term agreements between data centers and utilities have been required to get new infrastructure built. An Ohio power company recently filed a proposal that would require data centers to buy 90% of the electricity they request from the utility, regardless of how much they use. Duke Energy has similarly introduced minimum take requirements that commit data centers to buying a minimum amount of power.
Data center builders are responding to limited power availability by exploring alternative locations and energy sources. Historically, data centers were built near major sources of demand (such as large metro areas) or major internet infrastructure to reduce latency.4 But lack of power and rising NIMBYism in these jurisdictions may shift their construction to smaller cities, where power is more easily available. Builders are also experimenting with alternatives to utility power, such as local solar and wind generation connected to microgrids, natural gas-powered fuel cells, and small modular reactors.
Influence of AI
What impact will AI have on data center construction? Some have projected that AI models will become so large, and training them so computationally intensive, that within a few years data centers might be using 20% of all electricity. Skeptics point out that, historically, rising data center demand has been almost entirely offset by improved data center efficiency. They point to things like Nvidia's new, more efficient AI supercomputer (the GB200 NVL72), more computationally efficient AI models, and future potential ultra-efficient chip technologies like photonics or superconducting chips as evidence that this trend will continue.
We can divide the likely impact of AI on data centers into two separate questions: first, the impact on individual data centers and the regions where they're built; and second, the impact of data centers overall on aggregate power consumption.
For individual data centers, AI will likely continue driving them to be larger and more power-intensive. As we noted earlier, training and running AI models requires an enormous amount of computation, and the specialized computers designed for AI consume enormous amounts of power. While a rack in a typical data center will consume on the order of 5 to 10 kilowatts of power, a rack in an Nvidia SuperPOD data center containing 32 H100s (specialized graphics processing units, or GPUs, designed for AI workloads that Nvidia is selling by the millions) can consume more than 40 kilowatts. And while Nvidia’s new GB200 NVL72 can train and run AI models more efficiently, it consumes much more power in an absolute sense, using an astonishing 120 kilowatts per rack. Future AI-specific chips may have even higher power consumption. Even if future chips are more computationally efficient (and they likely will be), they will still consume much larger amounts of power.
Not only is this amount of power far more than what most existing data centers were designed to deliver, but the amount of exhaust heat begins to bump against the boundaries of what traditional, air-based cooling systems can effectively remove. Conventional air cooling is likely limited to around 20 to 30 kilowatt racks, perhaps 50 kilowatts if rear heat exchangers are used. One data center design guide notes that AI demands might require such large amounts of airflow that equipment will need to be spaced out, with such large airflow corridors that IT equipment occupies just 10% of the floor space of the data center. For its H100 SuperPOD, Nvidia suggests either using fewer computers per rack, or spacing out the racks to spread out power demand and cooling requirements.
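Combining the rack power figures above with the 120-cubic-feet-per-minute-per-kilowatt rule of thumb from earlier shows why air cooling runs out of headroom; the sketch below is rough, since real limits depend on aisle design and allowable temperatures.

```python
# Cooling airflow a single rack would need at the power densities discussed
# above, using the ~120 CFM-per-kilowatt rule of thumb from earlier.

CFM_PER_KW = 120

racks = {
    "typical enterprise rack": 10,   # kW
    "H100 SuperPOD rack": 40,        # kW, per the figure above
    "GB200 NVL72 rack": 120,         # kW, per the figure above
}

for name, kw in racks.items():
    print(f"{name:>25}: {kw:>4} kW -> ~{kw * CFM_PER_KW:,} CFM of cooling air")

# With air cooling practical up to roughly 20-30 kW per rack, the denser AI
# racks push well past what air alone can comfortably remove.
```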
Because current data centers aren’t necessarily well-suited for AI workloads, AI demand will likely result in data centers designed specifically for AI. SemiAnalysis projects that by 2028, more than half of data centers will be devoted to AI. Meta recently canceled several data center projects so they could be redesigned to handle AI workloads. AI data centers will need to be capable of supplying larger amounts of power to individual racks, and of removing that power when it turns into waste heat. This will likely mean a shift from air cooling to liquid cooling, which uses water or another heat-conducting fluid to remove heat from computers and IT equipment. In the immediate future, this probably means direct-to-chip cooling, where fluid is piped directly around a computer chip. This strategy is already used for Google’s tensor processing units (TPUs) designed for AI work and for Nvidia’s GB200 NVL72. In the long term, we may see immersion cooling, where the entire computer is immersed in a heat-conducting fluid.
Regardless of the cooling technology used, the enormous power consumption of these AI-specific data centers will require constructing large amounts of new electrical infrastructure: transmission lines, substations, and, if tech companies’ climate goals are to be met, firm sources of low-carbon power. Unblocking the construction of this infrastructure will be critical for the U.S. to keep up in the AI race.
Our second question is what AI’s impact will be on the aggregate power consumption of data centers. Will AI drive data centers to consume an increasingly large fraction of electricity in the US, imperiling climate goals? Or will increasing efficiency mean a minimal increase in data center power consumption in aggregate, even as individual AI data centers grow monstrous?
This is more difficult to predict, but the outcome is likely somewhere in between. Skeptics are correct to note that historically data center power consumption rose far less than demand, that chips and AI models will likely get more efficient, and that naive extrapolation of current power requirements is likely to be inaccurate. But there's also reason to believe that data center power consumption will nevertheless rise substantially. In some cases, efficiency improvements are being exaggerated: the efficiency improvement of Nvidia's NVL72 is likely to be far less in practice than the 25x number used by Nvidia for marketing purposes. Many projections of power demand, such as those used internally by hyperscalers, already take future efficiency improvements into account. And while novel, ultra-low-power chip technologies like superconducting chips or photonics might be plausible options in the future, these are far-off technologies that will do nothing to address power concerns over the next several years.
In some ways, there are far fewer opportunities for data center energy reductions than there used to be. Historically, data center electricity consumption stayed flat largely thanks to improvements in PUE (less electricity spent on cooling, UPS systems, etc.). But many of these gains have already been achieved: the best data centers already use just 10% of their electricity for cooling and other non-IT equipment.
Skeptics also fail to appreciate how enormous AI models are likely to become, and how easily increased chip efficiency might get eaten by demands for more computation. Internet traffic took roughly 10 years to increase by a factor of 20, but cutting-edge AI models are getting four to seven times as computationally intensive every year. Data center projections by SemiAnalysis, which take into account factors such as current and projected AI chip orders, tech company capital expenditure plans, and existing data center power consumption and PUE, suggest that global data center power consumption will more than triple by 2030, reaching as much as 4.5% of global electricity demand. Regardless of aggregate trends, rising power demands for individual data centers will still create infrastructure and siting challenges that will need to be addressed.
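The gap between those two growth rates is worth making explicit; the sketch below simply compounds the rates quoted above.

```python
# Comparing the growth rates cited above: internet traffic grew ~20x over
# roughly a decade, while frontier AI training compute grows ~4-7x per year.

traffic_growth_10yr = 20
traffic_growth_per_year = traffic_growth_10yr ** (1 / 10)   # ~1.35x per year

for ai_growth_per_year in (4, 7):
    ai_growth_10yr = ai_growth_per_year ** 10
    print(f"AI compute at {ai_growth_per_year}x/year -> ~{ai_growth_10yr:,.0f}x over a decade, "
          f"versus ~{traffic_growth_10yr}x (~{traffic_growth_per_year:.2f}x/year) for internet traffic")
```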
Conclusion
The rise of the internet and its digital infrastructure has required the construction of vast amounts of physical infrastructure to support it: data centers that hold tens of thousands of computers and other IT equipment. And as demands on this infrastructure rose, data centers became ever larger and more power-intensive. Modern data centers demand as much power as a small city, and campuses of multiple data centers can use as much power as a large nuclear reactor.
The rise of AI will accelerate this trend, requiring even more data centers that are increasingly power-intensive. Finding enough power for them will become increasingly challenging. This is already starting to push data center construction to areas with available power, and as demand continues to increase from data center construction and broader electrification, the constraint is only likely to get more binding.
1. A floating-point operation is a mathematical operation on decimal numbers, like 11.2 + 3.44 or 99.8 / 6.223.
2. Per the steel presentation, a typical electric arc furnace makes between 130 and 180 tons per hour and requires 650 kilowatt-hours of power per ton. Taking 150 tons per hour, that yields 97,500 kilowatts, or 97.5 megawatts.
3. Other countries sometimes have their own data center grading systems that broadly correspond to this tiered system. Some providers claim they have even more reliable Tier V data centers, an unofficial tier that doesn’t seem to be endorsed by the Uptime Institute, a data center trade organization.
4. Being near major internet infrastructure is part of the reason why Northern Virginia became a data center hotspot.