Microsoft yesterday launched Windows ML, following months in public preview, for developers building AI-powered apps on Windows 11. For the uninitiated, it is a built-in AI inferencing runtime designed to run models locally on CPUs, GPUs, and NPUs. Microsoft suggests that Windows ML reduces the need to rely on cloud-only workloads.

Microsoft introduced Windows ML at its Build 2025 event. It is now available as a production-ready framework to help teams deliver real-time, private, and efficient AI features across the Windows ecosystem.

Unlike previous approaches, Windows ML works as a hardware abstraction layer. Developers are free to bring their own ONNX models or convert PyTorch ones with the AI Toolkit for Visual Studio Code.

The runtime also manages execution providers from AMD, Intel, NVIDIA, and Qualcomm, automatically matching models to the best silicon available on a PC. This reduces app size, since developers no longer need to bundle runtimes or drivers.

Image: Microsoft
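
To make the idea of matching a model to the best available silicon more concrete, here is a minimal Python sketch. It is not the Windows ML API. The `Device` enum, the `DEFAULT_ORDER` ranking, and the `pick_device` function are hypothetical names invented for illustration, and the ranking itself (NPU, then GPU, then CPU) is an assumption rather than Microsoft's documented behavior.

```python
from enum import Enum


class Device(Enum):
    """Hypothetical device classes a runtime might target."""
    NPU = "npu"
    GPU = "gpu"
    CPU = "cpu"


# Hypothetical default ranking, chosen only for illustration.
# A real runtime would also weigh the model, drivers, and current load.
DEFAULT_ORDER = (Device.NPU, Device.GPU, Device.CPU)


def pick_device(available, order=DEFAULT_ORDER):
    """Return the first device in `order` that is present in `available`."""
    present = set(available)
    for device in order:
        if device in present:
            return device
    # Assumption: a CPU path always exists as the last resort.
    return Device.CPU


if __name__ == "__main__":
    # A PC with a GPU but no NPU falls through to the GPU.
    print(pick_device([Device.CPU, Device.GPU]).value)
```

Passing a different `order` tuple changes which device wins, which is roughly the kind of knob an app would need if it wanted to override the automatic choice.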

At the same time, it offers fine control, allowing teams to optimize workloads for low power on NPUs or high performance on GPUs. Companies like Adobe, McAfee, Wondershare, and Topaz Labs are already preparing apps with Windows ML integration.

For example, Adobe plans to accelerate semantic search and scene detection in Premiere Pro and After Effects, while McAfee is exploring deepfake detection that runs entirely on-device. It’s also worth noting that Windows ML is part of the Windows App SDK 1.8.1 and supports all devices running Windows 11 24H2 or later.
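
For developers who want to gate a feature on that OS requirement, a rough Python sketch of the check might look like the following. It assumes Windows 11 24H2 corresponds to OS build 26100. The `MIN_BUILD_24H2` constant and both helper functions are hypothetical names, and the build number Python reports can depend on how the interpreter was packaged, so treat this as a starting point rather than a definitive test.

```python
import sys

# Assumption: Windows 11 24H2 corresponds to OS build 26100.
MIN_BUILD_24H2 = 26100


def meets_minimum_build(build, minimum=MIN_BUILD_24H2):
    """Return True if the given OS build number is at or above the minimum."""
    return build >= minimum


def current_windows_build():
    """Return the OS build number on Windows, or None on other platforms."""
    if sys.platform != "win32":
        return None
    # sys.getwindowsversion() is only available on Windows.
    return sys.getwindowsversion().build


if __name__ == "__main__":
    build = current_windows_build()
    if build is None:
        print("Not running on Windows")
    else:
        print("Meets 24H2 minimum:", meets_minimum_build(build))
```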

Microsoft describes Windows ML as a major step toward making hybrid AI real by combining the cloud with local intelligence directly on every Windows PC. For more technical details, head to this page.