Microsoft is considering using liquid cooling for artificial intelligence chips

Using liquids to cool computers in data centers is more efficient, but it is not easy

Semiconductors, whose components are measured in nanometers, are among the wonders of modern data centers built for artificial intelligence systems. Yet fans remain some of the most important machines in these facilities, because without a continuous flow of cold air between the rows of computers, advanced chips can overheat. As the cost of running enough fans and air conditioning equipment rises, chip makers and data center operators are looking for new and very different ways to address the problem.


These ambitions became evident on November 15, when Microsoft announced its first major initiative in making chips for artificial intelligence systems. Its new chip, Maia 100, which aims to compete with Nvidia's best products, is designed to attach to a so-called cold plate, a metal device kept cool by liquid pumped beneath its surface. This technology could be an intermediate step towards full immersion cooling, in which entire server racks operate inside tanks filled with a special liquid.


Cooling experiments

People whose job is to cool computer servers have known about the benefits of liquid cooling for years: water is about four times better at absorbing heat than air. Some cryptocurrency miners have tested the technology, and some data centers have adopted cold plates, using them with chips originally designed to run in normal air-cooled conditions. Some passionate video gamers, seeking to get the most out of their computers and to reduce the annoying noise of powerful fans, have also shown off home-built cooling systems that use illuminated water pipes.

However, liquid cooling has drawbacks. Water conducts electricity and can damage expensive equipment, so alternative liquids are required wherever the coolant comes into contact with the computers. For many large data centers, implementing a completely new cooling strategy is a huge infrastructure project. For example, operators must work out how to prevent floors from collapsing under the weight of all the liquid needed to submerge racks of computers that rise up to seven metres. This has led major data center operators to keep using fans, leaving liquid cooling to the experimentalists.

However, the huge computational requirements of artificial intelligence systems have changed the equation. Advances that increased chips' capacity also doubled their need for electricity, and the more electricity a chip consumes, the more heat it generates. Each Nvidia H100 artificial intelligence accelerator, regarded as the standard for developing artificial intelligence systems, consumes at least 300 watts of electricity, about three times the amount consumed by a 65-inch flat-screen TV. A data center can use hundreds or even thousands of these processors, each of which costs more than a family car.
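To give a sense of scale, the arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the 300-watt value is the article's lower bound per accelerator, the accelerator counts are hypothetical, and the sketch assumes that essentially all electricity drawn by a chip ends up as heat that must be removed.

```python
# Illustrative arithmetic only. 300 W is the article's lower bound per
# Nvidia H100 accelerator; the accelerator counts below are hypothetical.
WATTS_PER_ACCELERATOR = 300


def heat_load_kw(accelerator_count: int,
                 watts_each: float = WATTS_PER_ACCELERATOR) -> float:
    """Approximate heat to remove, in kilowatts.

    Assumes all electrical power drawn by the chips becomes heat.
    """
    return accelerator_count * watts_each / 1000


if __name__ == "__main__":
    for count in (100, 1000, 5000):
        print(f"{count} accelerators: at least {heat_load_kw(count):.0f} kW of heat")
```

The figures cover the accelerators alone; the rest of each server would add to the load.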



Microsoft chips

Cooling is the fastest growing infrastructure cost for data centers, at a compound annual growth rate of 16%, according to a November 2023 report from Omdia Research. Up to 40% of the total electricity consumed by a data center goes into cooling, according to Jennifer Hufstetler, product sustainability executive at Intel.
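As a rough illustration of what these two figures imply, the short Python sketch below projects a cost growing at 16% a year and splits a facility's electricity use at the 40% upper bound. The starting cost of 100 units and the 10-megawatt facility are hypothetical values chosen only for the example; neither comes from the Omdia report or from Intel.

```python
import math

CAGR = 0.16           # compound annual growth rate cited from Omdia
COOLING_SHARE = 0.40  # upper bound on cooling's share of electricity


def projected_cost(initial_cost: float, years: int, rate: float = CAGR) -> float:
    """Cost after `years` of compound growth at `rate` per year."""
    return initial_cost * (1 + rate) ** years


def doubling_time_years(rate: float = CAGR) -> float:
    """Years needed for a cost to double at a constant compound rate."""
    return math.log(2) / math.log(1 + rate)


def cooling_power_mw(total_power_mw: float, share: float = COOLING_SHARE) -> float:
    """Electricity used for cooling, given the facility's total draw."""
    return total_power_mw * share


if __name__ == "__main__":
    # Hypothetical inputs: a cooling budget of 100 units, a 10 MW facility.
    print(f"Cost after 5 years: {projected_cost(100, 5):.1f} units")
    print(f"Doubling time: {doubling_time_years():.1f} years")
    print(f"Cooling draw: up to {cooling_power_mw(10):.1f} MW of 10 MW")
```

At a constant 16% rate, a cost doubles in a little under five years, which helps explain why operators are weighing alternatives to fans.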


“Electricity is the primary barrier to data centers,” she said. Cooling challenges have even forced some centers to limit some types of components, leave space between racks, or slow down some expensive chips to prevent them from overheating.


Microsoft's Maia chips are designed to work alongside massive cooling units that pass liquid through cold plates attached directly to the chips. This allows the chips to operate in regular data centers. Microsoft says it will begin installing them in 2024.

Mark Russinovich, chief technology officer of Azure, Microsoft's cloud services division, said the company hopes liquid cooling will play a growing role across all operations in its data centers. “This is a technology that has proven successful, and it is in the production stage,” he said in an interview from his home, adding that this technology “has been in production for a long time, including working on it here under my desk on the computer that I use for games.”


Over the next few years, Microsoft also intends to develop data centers that can accommodate immersion cooling, in which computer racks operate inside tanks of coolant. While this method will be more effective than cold plates, it requires extensive verification of equipment at every level.

Heating pools

The thorny question in adopting immersion cooling is what type of liquid to use. Earlier experiments used so-called forever chemicals, namely polyfluoroalkyl substances, which do not break down naturally. However, safety concerns and environmental regulations have reduced the use of these materials, and 3M, a major producer of them, announced at the end of 2022 that it would stop making them.


Microsoft has not revealed what type of liquid its systems will use. The energy company Shell has developed a process to convert natural gas into industrial liquids, and Intel has announced that it is testing them.

Meanwhile, other chip companies' plans for liquid cooling are not yet clear. Hufstetler said that Intel recently amended its policies to allow customers to build their own liquid cooling systems for some of its products without voiding the warranty on those products.


Substantial modernization may be necessary for data centers to keep up with the demands that come with the development of artificial intelligence systems. But finding suitable sites for these centers has become a challenge, as some local communities refuse to host facilities that consume huge amounts of electricity and provide few jobs.

However, liquid cooling could make AI facilities better neighbors if these facilities become a source of hot water.


John Lin of Equinix, the largest data center outsourcing vendor, is among the operators who have begun implementing cold plate cooling. He said the company will use water from one of its facilities in Paris to heat swimming pools during the 2024 Olympic Games.
