Could liquid cooling save the planet?
• Two-phase immersion-based liquid cooling could be a solution to data center escalation.
• By using less energy, less water and less space, it could revolutionize the footprint of modern data centers.
• The numbers in terms of air cooling cannot climb much higher.
Liquid cooling is a technology that might yet go a long way towards saving the planet from the toddler-with-an-AR15-style destructive glee of the human race.
Thank you for coming to our TED Talk.
Let’s take things back a few steps.
Some genies are never going to go back in their bottles. Electricity. The personal computer. The internet – yes, even with all the bad bits, whatever you think they are.
Generative AI is certainly a genie with a long-term leave to remain – it’s too appallingly, wonderfully, what-is-this-mystic-sorcery useful to put back in its bottle, despite all the prophecies of doom, and the much more realistic prophecies of tunnel vision over time and human civilization disappearing up its own algorithms.
But that leaves our societies with a big problem, because to support the kind of usefulness embodied by generative AI, at least right now, you need data centers.
And data centers, cooled as they’ve traditionally been by air-moving fans, are potentially more of a planet-killer than any factory or any bomb you care to name.
Why? Because you need truly colossal amounts of energy to power the fans that cool the stacks that run the models that power your business insights in Wisconsin or Beijing.
There’s also an associated water-thirst with traditional data centers, which haven’t historically been able to do a lot with the water they suck in once it’s been used in the cooling process – though that is changing in some places today.
That means traditional data centers represent a leech in any area where they’re sited – a sucker-up of power and resources that tends to be too powerful a symbol of aggressive late-stage capitalism for the locals to miss.
Data centers have seldom been friends to the local community.
That all being the case, and the need for more data centers, processing more data, needing a larger footprint and more resources, being obvious as generative AI evolves, we sat down with Kar-Wing Lau, CTO and co-founder of LiquidStack – a company that creates and sells innovative two-phase immersion-based liquid cooling systems for data centers, for a taste of how to get ourselves out of our data center double-bind.
An ineffective leech.
What’s the real issue with air-cooled data centers? They’ve worked pretty well up to now, haven’t they?
The main issue is that air cooling is a very ineffective way to transport heat away compared to liquid cooling. It requires much more surface area to be effective. And as you know, most high-powered chips today, like AI accelerators, need large heatsinks. On some of the latest Nvidia chips, they can be as tall as four inches (or about 100 millimeters).
With those kinds of AI baseboards, a lot of the industry is running into problems in terms of how they deal with this kind of new hardware for generative AI.
The great advantage of liquid cooling is that it has a much higher heat conductivity. We’re using fluids, specifically dielectric fluids, meaning that they’re not conducting any electricity.
The entire electronics build is actually completely immersed in that liquid, and that provides some cooling all around the thermal surface, not just in the focus areas, where traditionally you’d have a heatsink. That liquid cooling is much more effective in transporting the heat away than air cooling would be.
So, for example, with the dielectric liquid being pumped across the electronics, we have built systems which are able to save as much as 90% in space versus air cooling.
Imagine what happens if you extrapolate that across a whole data center. How much space could you save on a data center scale?
So, liquid cooling is more energy efficient, more space efficient, and lets data centers tackle more and more complex AI workloads, without drawing as much power for the cooling process. Feels like there should be a catch, somewhere.
Liquid cooling – show us the money.
Yes, as far as we’re concerned, two-phase immersion-based liquid cooling solves a lot of the issues that traditional data centers find themselves dealing with today in terms of AI workloads – and might well be able to help with more general sustainability as the industry goes forward.
For example, compare a hypothetical traditional 36-megawatt data center with our two-phase immersion-based liquid cooling technology. The Google API, for example, saves around 61% of the whitespace of the data center when it uses liquid cooling (that’s the inside area within the data center where you have it located). And that, of course, resulted in a lot of savings, not only on our main facility, but also on cooling electrical equipment.
Give it to us in dollars – what would we be looking to save in that scenario?
Something like $123 million. So it works out at just around one-third of the cost of air cooling.
Has anyone ever told you you have quite the way with a statistic?
Ha. Yeah, it’s a really profound difference, obviously, not just a little bit better. If you have a power budget and power cap of a certain number of megawatts, and you spend a lot of those megawatts on cooling in your air-cooled data center, then by switching to liquid cooling you can use that energy to run more AI accelerator chips instead.
Another thing that happens is when you have traditional data centers, if you have either high operating costs, or a more expensive facility to cope with higher power density and more compute applications like generative AI, you don’t generally have long-term amortization – which you can with liquid cooling. So right away, you save a cost on the initial setup of your data center.
Upgrading to liquid cooling.
Forgive us, but this is worth nailing down. Right now, most data centers are air-cooled. With the liquid cooling approach, do you have to build your data centers from scratch? Because that’s going to take a generation or more to address the issues the modern data center is facing. Or is this a technology that can be retrofitted, like an upgrade?
Oh, yes, it can. I mean, obviously when you’re building new data centers, it helps to have the design freedom to go with liquid cooling from the start, but absolutely some existing data centers can be retrofitted.
Our data center products can definitely be placed into traditional data centers. The key requirements are that we sample the maximum flow loading, and can connect to various types of facility water connections.
Making more data centers more viable.
We’re already in an environment where everybody from independents to hyperscalers is looking for more and more data centers, but running into the environmental, resource, space, and cost problems of what is a fundamentally unsustainable model.
So we’re saying if they were to switch to liquid cooling now, they could reap the benefits almost immediately, and down the line, rather than pursuing traditional air cooling to the point where the model collapses?
Absolutely. There’s a strong trend right now towards much more powerful AI accelerators, right? That’s especially an issue with gaming provision. Not everyone has immediate access to the new Nvidia chips, which cost $30,000 to $40,000 each, but the world is starting to look at these new GPUs. And the processing power can only go up.
Because the heatsink on those chips has become so big, they need much more space. So we definitely see a continuing trend towards needing more and more space and more and more expensive cooling to deliver the generative AI and gaming technologies that are out there – as you say, the genie is out of that bottle.
There was a study done quite some time ago, where researchers used a 20-kilowatt array as a template, and they calculated what it would require in terms of air, to cool down 20 kilowatts. They found out that it was around 2000 cubic feet of air per minute to cool those 20 kilowatt arrays.
So if you funnel that 2000 cubic feet of air per minute through a duct around one foot wide, you generate wind speeds of 56 kilometers per hour – or 35 miles per hour.
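For readers who want to check the wind-tunnel arithmetic, here is a rough sketch in Python. The interview only describes the duct as "around one foot wide", so the two cross-sections below (a one-foot square and a one-foot-diameter circle) are our assumptions, as are the function names. Both land in the same tens-of-miles-per-hour range as the figure Lau quotes, and a slightly narrower duct would push the speed higher still.

```python
import math

FT_TO_M = 0.3048          # metres per foot (exact by definition)
KM_PER_MILE = 1.609344    # kilometres per mile (exact by definition)


def duct_air_speed_kmh(flow_cfm: float, duct_area_ft2: float) -> float:
    """Mean air speed in km/h for a volumetric flow (cubic feet per minute)
    pushed through a duct of the given cross-sectional area (square feet)."""
    speed_ft_per_min = flow_cfm / duct_area_ft2    # velocity = flow / area
    speed_m_per_s = speed_ft_per_min * FT_TO_M / 60.0
    return speed_m_per_s * 3.6                     # m/s -> km/h


def kmh_to_mph(kmh: float) -> float:
    return kmh / KM_PER_MILE


FLOW_CFM = 2000.0                      # airflow figure quoted in the interview

# Assumed duct shapes (not stated in the source):
square_area_ft2 = 1.0 * 1.0            # 1 ft x 1 ft square duct
round_area_ft2 = math.pi * 0.5 ** 2    # circular duct, 1 ft diameter

square_kmh = duct_air_speed_kmh(FLOW_CFM, square_area_ft2)
round_kmh = duct_air_speed_kmh(FLOW_CFM, round_area_ft2)

print(f"square duct: {square_kmh:.1f} km/h ({kmh_to_mph(square_kmh):.1f} mph)")
print(f"round duct:  {round_kmh:.1f} km/h ({kmh_to_mph(round_kmh):.1f} mph)")
```

The calculation is simply flow divided by area, so the result is very sensitive to the duct size: shave an inch off the width and the air has to move noticeably faster.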
Data centers built in wind tunnels…
You did say I had a way with statistics.
This was a 20-kilowatt array, don’t forget. Some companies are designing clusters at 35 kilowatts. So now you can upgrade that statistic by a factor of 1.75 times on top of what I’ve just mentioned.
*Does rough math in head.*
61? 61 miles per hour? That’s like turning data centers into giant industrial hand dryers! Which is to say, significantly sub-optimal environments for humans to work in. And, as we’ve covered, the numbers are just going to keep going up.
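That head math can be written down as a small Python sketch. It assumes the required air speed grows in direct proportion to the heat load (same duct, same air temperature rise), which is the simplification the conversation itself makes; the function name is ours.

```python
def scale_air_speed(base_speed: float, base_load_kw: float, new_load_kw: float) -> float:
    """Scale an air speed linearly with heat load.

    Assumption: the duct and the air temperature rise stay the same,
    so the airflow (and hence the speed) is proportional to the load.
    """
    return base_speed * (new_load_kw / base_load_kw)


# Figures quoted in the interview for a 20-kilowatt array.
BASE_LOAD_KW = 20.0
BASE_SPEED_MPH = 35.0
BASE_SPEED_KMH = 56.0

NEW_LOAD_KW = 35.0  # cluster size mentioned in the interview

factor = NEW_LOAD_KW / BASE_LOAD_KW
mph_at_35kw = scale_air_speed(BASE_SPEED_MPH, BASE_LOAD_KW, NEW_LOAD_KW)
kmh_at_35kw = scale_air_speed(BASE_SPEED_KMH, BASE_LOAD_KW, NEW_LOAD_KW)

print(f"scale factor: {factor:.2f}")
print(f"35 kW cluster: about {mph_at_35kw:.0f} mph ({kmh_at_35kw:.0f} km/h)")
```

Under that assumption, the 1.75 factor takes the quoted 35 miles per hour to roughly 61, matching the rough figure above.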
Get the message?
In Part 2 of this article, we’ll explore how liquid cooling could be applied in the here and now to avoid turning modern data centers into intolerable environments for human beings to live in.
6 December 2023