Microsoft supercharges Azure ML with NVIDIA H200 VMs

Faster AI training, larger models, and high-scale inference


Rishaj Upadhyay

News Editor


Published on August 25, 2025

Microsoft is taking Azure Machine Learning up a notch with its latest addition, the ND H200 v5 virtual machines.

As Microsoft notes, these VMs are powered by NVIDIA’s H200 Tensor Core GPUs and are designed to handle the heaviest AI workloads, from training massive language models to serving high-throughput inference at scale.

The ND H200 v5 packs eight H200 GPUs, offering a combined 1,128 GB of HBM3e high-bandwidth memory. That’s a 76% jump over the previous H100 generation.
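The memory figures are easy to sanity-check. The short sketch below assumes the previous generation means an eight-GPU VM with 80 GB per H100; that baseline is an assumption for illustration, not a number from Microsoft's announcement.

```python
# Back-of-the-envelope check of the memory figures quoted in the article.
GPUS_PER_VM = 8
H200_TOTAL_GB = 1128       # combined HBM3e per ND H200 v5 VM (from the article)
H100_GB_PER_GPU = 80       # assumption: 80 GB H100 as the previous-gen baseline

h200_gb_per_gpu = H200_TOTAL_GB / GPUS_PER_VM       # memory per H200 GPU
h100_total_gb = H100_GB_PER_GPU * GPUS_PER_VM       # memory per eight-GPU H100 VM
increase_pct = (H200_TOTAL_GB / h100_total_gb - 1) * 100

print(f"H200 memory per GPU: {h200_gb_per_gpu:.0f} GB")
print(f"H100 VM total:       {h100_total_gb} GB")
print(f"Increase:            {increase_pct:.0f}%")
```

Under that assumption, each H200 carries 141 GB and the VM-level increase works out to roughly 76%, matching the quoted figure.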

In other words, the bigger memory pool means larger models, longer context windows, and bigger batch sizes can now run with fewer compromises. Microsoft says this setup also reduces cross-GPU communication, cutting training overhead and boosting efficiency.

Inside a VM, NVIDIA NVLink delivers 900 GB/s per GPU, enabling fast parallel training across all eight GPUs. Between VMs, each node is equipped with 3.2 Tb/s of InfiniBand bandwidth, complemented by GPUDirect RDMA for low-latency GPU-to-GPU communication.
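Note that the two figures use different units: NVLink is quoted in gigabytes per second, InfiniBand in terabits per second. A rough conversion puts them on the same scale. The sketch below assumes the InfiniBand bandwidth is shared evenly across the eight GPUs, which is an illustrative simplification rather than something stated in the announcement.

```python
# Put the two interconnect figures into the same unit (GB/s per GPU).
NVLINK_GBYTES_PER_GPU = 900     # GB/s per GPU inside a VM (from the article)
INFINIBAND_GBITS_PER_VM = 3200  # 3.2 Tb/s per VM between nodes (from the article)
GPUS_PER_VM = 8
BITS_PER_BYTE = 8

# Convert gigabits to gigabytes for the whole VM.
ib_gbytes_per_vm = INFINIBAND_GBITS_PER_VM / BITS_PER_BYTE
# Assumption: the InfiniBand bandwidth is split evenly across the GPUs.
ib_gbytes_per_gpu = ib_gbytes_per_vm / GPUS_PER_VM
ratio = NVLINK_GBYTES_PER_GPU / ib_gbytes_per_gpu

print(f"InfiniBand per VM:  {ib_gbytes_per_vm:.0f} GB/s")
print(f"InfiniBand per GPU: {ib_gbytes_per_gpu:.0f} GB/s")
print(f"NVLink is about {ratio:.0f}x faster per GPU")
```

On these numbers, traffic that stays inside a VM has far more bandwidth available than traffic that crosses nodes, which is why keeping communication within the eight-GPU group tends to pay off.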

This design makes scaling across hundreds of nodes smoother and more predictable, ultimately helping teams move from experiments to production with fewer roadblocks.

On the software side, ND H200 v5 slots right into existing Azure ML workflows, supporting frameworks like PyTorch, TensorFlow, and JAX. Optimized containers, distributed training via NCCL, and direct CLI provisioning ensure that data science teams can get started quickly.

Early benchmarks suggest up to 35% better throughput for large model inference compared to previous-gen setups, especially for models like Llama 3.1 405B. Microsoft notes that high-performance simulations and scientific workloads also stand to benefit from the combination of memory bandwidth and compute density.
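A quick estimate shows why a model of that size is sensitive to memory capacity. The sketch below counts weights only, assumes 16-bit weights (2 bytes per parameter), and compares against an assumed 640 GB for an eight-GPU H100 VM; the helper name weights_gb is hypothetical, and real deployments also need room for the KV cache and activations.

```python
def weights_gb(params_billions: float, bytes_per_param: int) -> float:
    """Approximate weight footprint in GB, taking 1 GB as 1e9 bytes."""
    # Billions of parameters times bytes per parameter gives gigabytes directly.
    return params_billions * bytes_per_param

H200_VM_GB = 1128   # combined memory per ND H200 v5 VM (from the article)
H100_VM_GB = 640    # assumption: eight H100 GPUs at 80 GB each

# Assumption: 405 billion parameters stored as 16-bit values.
needed_gb = weights_gb(405, 2)
fits_h200 = needed_gb <= H200_VM_GB
fits_h100 = needed_gb <= H100_VM_GB

print(f"Weights alone: {needed_gb:.0f} GB")
print(f"Fits in one H200 VM: {fits_h200}")
print(f"Fits in one H100 VM: {fits_h100}")
```

Under these assumptions, the 16-bit weights come to about 810 GB, which fits within a single ND H200 v5 VM but not within 640 GB without quantization or spreading the model across nodes.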

With support for auto-scaling clusters, Azure ML users can spin up anything from a single ND H200 VM to hundreds of nodes, only paying for what they use. In short, this is not just a hardware bump, but a full-stack upgrade aimed at fueling the next wave of AI innovation.


