Microsoft shows how it combines Azure with NVIDIA chips to make AI supercomputers


    Karlston

Microsoft is promoting its efforts to build supercomputers on its Azure cloud computing platform to help OpenAI run its ChatGPT chatbot. At the same time, it announced a new AI virtual machine that uses upgraded GPUs from NVIDIA.

The new ND H100 v5 VM from Microsoft uses NVIDIA's H100 GPUs, an upgrade from the previous A100 GPUs. Companies that need to add AI features can access this virtual machine service, which offers the following:

• 8x NVIDIA H100 Tensor Core GPUs interconnected via next-gen NVSwitch and NVLink 4.0
• 400 Gb/s NVIDIA Quantum-2 CX7 InfiniBand per GPU, with 3.2 Tb/s per VM in a non-blocking fat-tree network
• NVSwitch and NVLink 4.0 with 3.6 TB/s bisectional bandwidth between the 8 local GPUs within each VM
• 4th Gen Intel Xeon Scalable processors
• PCIe Gen5 host-to-GPU interconnect with 64 GB/s bandwidth per GPU
• 16 channels of 4800 MHz DDR5 DIMMs
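The per-VM InfiniBand figure in the list above follows directly from the per-GPU links. A quick sanity check (illustrative arithmetic only, not an official Azure sizing formula):

```python
# Aggregate InfiniBand bandwidth per VM: 8 GPUs, each with its own
# 400 Gb/s NVIDIA Quantum-2 CX7 link (figures from the spec list above).
gpus_per_vm = 8
gbps_per_gpu = 400

total_gbps = gpus_per_vm * gbps_per_gpu   # 3200 Gb/s
total_tbps = total_gbps / 1000            # 3.2 Tb/s, matching the spec

print(f"{total_gbps} Gb/s per VM = {total_tbps} Tb/s")
```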


This is in addition to Microsoft's previously announced ChatGPT in Azure OpenAI Service, which lets third parties access the chatbot technology through Azure.

In a separate blog post, Microsoft describes how it first started working with OpenAI to build the supercomputers needed for ChatGPT's large language model (and for Microsoft's own Bing Chat). That meant linking thousands of GPUs together in an all-new way. The blog offered an explanation from Nidhi Chappell, Microsoft's head of product for Azure high-performance computing and AI:

    To train a large language model, she explained, the computation workload is partitioned across thousands of GPUs in a cluster. At certain phases in this computation – called allreduce – the GPUs exchange information on the work they’ve done. An InfiniBand network accelerates this phase, which must finish before the GPUs can start the next chunk of computation.
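The allreduce step Chappell describes can be sketched in a few lines of Python. This toy version only illustrates what the operation computes (every worker ends up with the sum of all partial results), not how NCCL or the InfiniBand fabric actually implement it:

```python
def allreduce(partials):
    """Toy allreduce: sum each worker's partial gradients element-wise
    and give every worker a copy of the total. Real collectives (e.g.
    NCCL ring/tree algorithms) produce the same result across GPUs."""
    totals = [sum(vals) for vals in zip(*partials)]
    return [list(totals) for _ in partials]

# Four "GPUs", each holding partial gradients for two parameters.
gradients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
reduced = allreduce(gradients)
print(reduced[0])  # every worker now holds [16.0, 20.0]
```

In real training frameworks this is a single collective call (e.g. `torch.distributed.all_reduce`), and the InfiniBand network exists precisely to make this exchange fast enough that GPUs are not left idle between chunks of computation.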


This hardware is paired with software that optimizes the use of both the NVIDIA GPUs and the network that keeps them working together. Microsoft says it is continuing to add GPUs and expand its network, while also working to keep everything running 24/7 via cooling systems, backup generators, and uninterruptible power supply systems.
