Microsoft is rolling out its second-generation artificial intelligence chip, the Maia 200. The new AI chip is positioned as a potential alternative to leading processors from Nvidia and to the custom silicon offered by cloud rivals Amazon and Google. The Maia 200 comes two years after Microsoft said it had developed its first AI chip, the Maia 100, which was never made available for cloud clients to rent.

The chips are built on Taiwan Semiconductor Manufacturing Co’s (TSMC) 3-nanometer process, with four connected together inside each server. They rely on Ethernet cables rather than the InfiniBand standard; Nvidia has sold InfiniBand switches since its 2020 Mellanox acquisition.

This time, availability will be different. Scott Guthrie, Microsoft’s executive vice president for cloud and AI, said in a blog post that, for the new chip, there will be “wider customer availability in the future.” In a post on X, formerly Twitter, Microsoft CEO Satya Nadella said, “Our newest AI accelerator Maia 200 is now online in Azure. Designed for industry-leading inference efficiency, it delivers 30% better performance per dollar than current systems. And with 10+ PFLOPS FP4 throughput, ~5 PFLOPS FP8, and 216GB HBM3e with 7TB/s of memory bandwidth it’s optimized for large-scale AI workloads. It joins our broader portfolio of CPUs, GPUs, and custom accelerators, giving customers more options to run advanced AI workloads faster and more cost-effectively on Azure.”
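Taken together, the headline figures in Nadella’s post sketch the chip’s design point. Below is a minimal back-of-envelope calculation in Python, assuming only the quoted numbers (10+ PFLOPS FP4, ~5 PFLOPS FP8, 216GB HBM3e, 7TB/s); the roofline break-even and model-sizing figures it prints are illustrative, not Microsoft’s.

```python
# Back-of-envelope check on the Maia 200 figures quoted in Nadella's post.
# Assumed inputs (from the post): 10 PFLOPS FP4, 5 PFLOPS FP8, 216 GB HBM3e, 7 TB/s.

fp4_flops = 10e15           # peak FP4 throughput, FLOP/s ("10+ PFLOPS")
fp8_flops = 5e15            # peak FP8 throughput, FLOP/s ("~5 PFLOPS")
hbm_capacity_bytes = 216e9  # HBM3e capacity, bytes
hbm_bandwidth = 7e12        # memory bandwidth, bytes/s

# Arithmetic intensity (FLOPs per byte moved) needed to keep the compute units
# busy rather than waiting on memory -- the classic roofline break-even point.
print(f"FP4 break-even intensity: {fp4_flops / hbm_bandwidth:,.0f} FLOPs/byte")
print(f"FP8 break-even intensity: {fp8_flops / hbm_bandwidth:,.0f} FLOPs/byte")

# Rough model sizing: how many parameters fit in HBM with 4-bit and 8-bit weights
# (ignoring KV cache and activations, so usable capacity is lower in practice).
print(f"Max params at FP4 (0.5 B/param): ~{hbm_capacity_bytes / 0.5 / 1e9:,.0f}B")
print(f"Max params at FP8 (1 B/param):   ~{hbm_capacity_bytes / 1.0 / 1e9:,.0f}B")
```

Ratios like these are why the announcement stresses inference: serving large models tends to be memory-bound, so the balance of FP4/FP8 throughput against HBM bandwidth and capacity matters more than raw peak compute. Actual efficiency will also depend on batch size, KV-cache traffic and the four-chip Ethernet fabric, none of which the post quantifies.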
Guthrie called the Maia 200 “the most efficient inference system Microsoft has ever deployed.” Developers, academics, AI labs and people contributing to open-source AI models can apply for a preview of a software development kit.
Microsoft’s Superintelligence team to use Maia 200 AI chip
Some of the first units will go to Microsoft’s Superintelligence team, led by Microsoft AI CEO Mustafa Suleyman. The chips will also power the Copilot assistant for businesses and the AI models, including OpenAI’s latest, that Microsoft rents to cloud customers. “It’s a big day. Our Superintelligence team will be the first to use Maia 200 as we develop our frontier AI models,” said Suleyman in a post on X.
Microsoft’s ‘message’ to Amazon and Google on Maia 200
Microsoft said each Maia 200 packs more high-bandwidth memory than Amazon Web Services’ third-generation Trainium AI chip or Google’s seventh-generation tensor processing unit (TPU). “Maia 200 [is] the most performant, first-party silicon from any hyperscaler, with three times the FP4 performance of the third generation Amazon Trainium, and FP8 performance above Google’s seventh generation TPU. Maia 200 is also the most efficient inference system Microsoft has ever deployed, with 30% better performance per dollar than the latest generation hardware in our fleet today,” wrote Guthrie.

For those unaware, a hyperscaler is a massive-scale cloud service provider (such as AWS, Microsoft Azure, or Google Cloud) that offers immense, on-demand computing power, storage and networking, designed for extreme scalability. The big five hyperscalers are Amazon’s AWS, Google Cloud, Microsoft’s Azure, IBM Cloud and Oracle Cloud.

Microsoft’s chip push started years after Amazon and Google-parent Alphabet began designing their own chips. As a report in Bloomberg notes, all three tech giants have a similar aim: cost-effective machines that can be seamlessly plugged into data centers and offer savings and other efficiencies to cloud customers. The high costs and short supply of the latest industry-leading chips from Nvidia have fueled a scramble to find alternative sources of computing power.
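The comparative claims can also be read back against Microsoft’s own headline numbers. A hedged sketch, using only the figures quoted above; the implied Trainium figure is an inference from Microsoft’s “three times” claim rather than a published AWS spec, and the cost baseline is a placeholder.

```python
# What Microsoft's comparison claims imply, using only its own quoted numbers.
maia_fp4_pflops = 10.0            # "10+ PFLOPS FP4" from Nadella's post

# "Three times the FP4 performance of the third generation Amazon Trainium"
# would imply roughly this figure for Trainium 3 -- an inference from the claim,
# not a published AWS specification.
implied_trainium3_fp4 = maia_fp4_pflops / 3
print(f"Implied Trainium 3 FP4: ~{implied_trainium3_fp4:.1f} PFLOPS")

# "30% better performance per dollar than the latest generation hardware in our
# fleet" is a ratio claim; normalized to a placeholder fleet baseline of 1.0:
baseline_perf_per_dollar = 1.0
maia_perf_per_dollar = 1.3 * baseline_perf_per_dollar
print(f"Maia 200 perf/$ vs. fleet baseline: {maia_perf_per_dollar:.1f}x")
```

Because Guthrie frames the 30% figure relative to Microsoft’s own fleet, it says nothing directly about how Maia 200 compares on price against Nvidia, AWS or Google hardware.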





