Microsoft recently announced that it is building a series of transcontinental data center superclusters to meet the training demands of future artificial intelligence models. The new facilities will link multiple data centers over high-speed networks for efficient data transfer, with the goal of supporting AI models with up to hundreds of trillions of parameters.
In October, Microsoft brought online the first node at its Mount Pleasant data center campus in Wisconsin, linking it to facilities in Atlanta, Georgia. These are no ordinary data centers: Microsoft calls them the "Fairwater" cluster. The two-story buildings use direct-to-chip liquid cooling and consume almost no water. Going forward, Microsoft plans to expand these clusters to tens of thousands of GPUs of different types to serve varied workload requirements.
By interconnecting data centers, Microsoft can train larger models while siting new facilities where land is cheap, the climate is favorable, and power is plentiful. Microsoft has not yet disclosed the specific technology used to link the two sites, but the industry offers several options, including Cisco's 51.2 Tbps router and Broadcom's new Jericho4 hardware, which can connect data centers up to 1,000 kilometers apart.
Nvidia, meanwhile, is also actively advancing networking technology to meet the demands of AI training. Microsoft makes extensive use of Nvidia's InfiniBand interconnect in high-performance computing environments, underscoring its commitment to efficient data transmission. Mitigating bandwidth and latency bottlenecks in AI workloads remains a key focus for researchers.
Research is advancing on this front as well. Google's DeepMind team earlier published a report showing that many of these challenges can be overcome by compressing models during training and carefully scheduling communication between data centers.
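To make the idea concrete, here is a minimal, hypothetical sketch of the general technique the report alludes to (not Microsoft's or DeepMind's actual system): two "sites" train local model replicas and exchange only top-k compressed updates at infrequent sync points, sharply reducing cross-site traffic compared with synchronizing every step. All names, dimensions, and hyperparameters below are illustrative assumptions.

```python
import numpy as np

def top_k_compress(vec, k):
    """Keep the k largest-magnitude entries of vec, zero the rest."""
    out = np.zeros_like(vec)
    idx = np.argsort(np.abs(vec))[-k:]
    out[idx] = vec[idx]
    return out

def train_two_sites(target, dim=8, steps=100, sync_every=10, k=2, lr=0.1):
    """Toy distributed training: minimize ||w - target||^2 at two sites,
    syncing compressed deltas only every `sync_every` local steps."""
    rng = np.random.default_rng(0)
    w_ref = rng.normal(size=dim)           # globally shared weights
    w = [w_ref.copy(), w_ref.copy()]       # per-site replicas
    comm_floats = 0                        # cross-site floats actually sent
    for step in range(1, steps + 1):
        for s in range(2):
            grad = 2 * (w[s] - target)     # gradient of ||w - target||^2
            w[s] -= lr * grad              # cheap local step, no network traffic
        if step % sync_every == 0:
            # Each site sends only a top-k compressed delta since last sync
            # (index overhead and the w_ref broadcast are ignored here).
            deltas = [top_k_compress(w[s] - w_ref, k) for s in range(2)]
            comm_floats += 2 * k
            w_ref = w_ref + (deltas[0] + deltas[1]) / 2
            w = [w_ref.copy(), w_ref.copy()]
    return w_ref, comm_floats
```

With the defaults above, each site transmits 2 floats per sync instead of the full 8-dimensional weight vector every step, while the shared weights still drift toward the optimum; real systems apply the same principle at the scale of billions of parameters.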
Key points:
🌐 Microsoft is building transcontinental data center super clusters to support the training of large-scale AI models in the future.
💧 New facilities use efficient liquid cooling technology, consuming almost no water resources.
🚀 Various advanced network technologies will connect these data centers to improve the efficiency of AI training.
