Microsoft Expands Windows Local AI With New Aion Models and On-Device APIs



Microsoft has announced a major expansion of local AI capabilities on Windows, introducing new on-device AI models, expanded Windows AI APIs, and broader support for local agentic workflows.

The announcement comes as AI workloads continue to grow across Windows applications, especially for assistants, productivity tools, accessibility features, and developer workflows.

Microsoft wants more AI to run directly on Windows PCs

Microsoft says modern AI experiences often rely too heavily on cloud infrastructure. Agentic workflows, which involve constant reasoning, tool usage, and task coordination, can become expensive when every interaction requires cloud compute.

Instead, Microsoft wants large frontier models in the cloud to handle advanced reasoning tasks while smaller AI models handle everyday operations locally on Windows devices.

This hybrid approach could reduce latency, lower operating costs, and improve privacy for users and developers.
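The split described above can be pictured as a small routing decision made before each request. The following is a minimal sketch, not Microsoft's implementation: the task names, the `Request` type, and `choose_backend` are all hypothetical and chosen only to illustrate the idea of keeping everyday work local and sending heavier reasoning to the cloud.

```python
from dataclasses import dataclass

# Hypothetical task labels. Which tasks count as "everyday" is an assumption
# made for illustration, not an official Microsoft list.
LOCAL_TASKS = {"summarize", "rewrite", "detect_intent"}


@dataclass
class Request:
    task: str
    prompt: str


def choose_backend(request: Request, local_available: bool = True) -> str:
    """Return "local" for everyday tasks when a local model is usable, else "cloud"."""
    if local_available and request.task in LOCAL_TASKS:
        return "local"
    # Advanced reasoning, unknown tasks, and devices without a local model
    # all fall back to the cloud.
    return "cloud"
```

In this sketch an unrecognized task defaults to the cloud, so adding a new local capability is an explicit opt-in rather than an accident.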

Aion 1.0 Instruct brings smaller local AI models to Windows

Microsoft introduced Aion 1.0 Instruct, a compact on-device language model designed for lightweight AI tasks, in preview.

Microsoft says the model is designed to handle a range of everyday text-based AI tasks locally on Windows devices. These include summarizing content, rewriting text, detecting user intent, supporting accessibility features, and powering routine assistant-style interactions without relying heavily on cloud processing.

Microsoft plans to make Aion 1.0 Instruct available first through Edge Insider channels. Open weights for the model are expected to arrive on Hugging Face in July.

The model prioritizes efficiency over raw scale, which is what allows Windows devices to handle these smaller tasks without depending on cloud inference.

Microsoft also introduces Aion 1.0 Plan for agentic workflows

Alongside the lightweight model, Microsoft announced Aion 1.0 Plan, a larger 14-billion-parameter reasoning and tool-calling model.

The model supports a 32K context window and is designed specifically for local agentic workflows on Windows.

Microsoft says Aion 1.0 Plan can help developers understand user intent, invoke local tools, manage files, coordinate sub-agents, and run local reasoning workflows directly on-device.

Unlike traditional cloud assistants, the model ships in-box with Windows on supported hardware, so developers can build these workflows without routing them through a cloud service.

New Speech Recognition API adds local speech-to-text to Windows

Microsoft also announced a new Speech Recognition API as part of the Windows AI APIs.

The API supports both real-time and batch speech-to-text processing directly on-device. Developers can use it for live transcription, captions, dictation, audio-video applications, and accessibility tools. The API works with microphone input, streamed audio, and audio files.

Because the feature runs locally, apps can continue transcribing audio without internet access. Microsoft says this can improve privacy, reduce latency, and lower cloud AI costs.

The Speech Recognition API is entering public preview with initial English support. Microsoft says more languages and regional support will roll out gradually.

Windows AI APIs now support CPUs and GPUs beyond NPUs

Microsoft is also expanding Windows AI APIs beyond dedicated NPUs.

The company says local AI features will now run on CPUs and GPUs as well, allowing more Windows 11 devices to handle AI workloads even without high-end AI accelerators.

Microsoft says several features are already entering public preview with broader hardware support. The Windows inbox small language model will run on capable GPUs, while video super resolution and the Speech Recognition API will run on CPUs.
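For an app, per-feature hardware support of this kind usually turns into a fallback check at startup. The sketch below is only illustrative: the feature keys, the `SUPPORTED` table, and `pick_device` are hypothetical names, the NPU entries assume NPUs remain the existing baseline, and the NPU-then-GPU-then-CPU preference order is an assumption rather than anything Microsoft has stated.

```python
from typing import Optional, Set

# Assumed preference order: dedicated accelerator first, CPU last.
PREFERENCE = ("npu", "gpu", "cpu")

# One reading of the article, not an official support matrix.
SUPPORTED = {
    "inbox_slm": {"npu", "gpu"},
    "video_super_resolution": {"npu", "cpu"},
    "speech_recognition": {"npu", "cpu"},
}


def pick_device(feature: str, available: Set[str]) -> Optional[str]:
    """Return the most preferred device that the feature supports and the PC has."""
    for device in PREFERENCE:
        if device in SUPPORTED[feature] and device in available:
            return device
    return None  # The feature cannot run locally on this machine.
```

A `None` result is where an app would either disable the feature or fall back to a cloud service.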

This broader hardware support could significantly expand the number of PCs capable of running local AI experiences.

AI models will download only when needed

Microsoft says Windows inbox AI models will not automatically download to every system.

Instead, models are downloaded only when an application requests them. The company says this approach helps reduce unnecessary storage usage and bandwidth consumption for users who do not actively use local AI features.
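This download-on-request behavior is essentially lazy loading with a cache. A minimal sketch of the pattern follows, assuming a hypothetical `ModelStore` class and a caller-supplied `fetch` function; it does not reflect how Windows actually stores or retrieves models.

```python
from typing import Any, Callable, Dict, List


class ModelStore:
    """Fetches a model the first time an app asks for it, then reuses it."""

    def __init__(self, fetch: Callable[[str], Any]) -> None:
        self._fetch = fetch                # hypothetical downloader: name -> model
        self._cache: Dict[str, Any] = {}   # models already on the device
        self.downloads: List[str] = []     # log of downloads actually performed

    def get(self, name: str) -> Any:
        if name not in self._cache:
            # Nothing is downloaded until some application requests the model.
            self.downloads.append(name)
            self._cache[name] = self._fetch(name)
        return self._cache[name]
```

Creating the store costs nothing in bandwidth or storage. Only the first `get` for a given model triggers a download, and later calls reuse the cached copy.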

Microsoft positions Foundry on Windows as the local AI platform

Microsoft also highlighted growing developer adoption of Microsoft Foundry on Windows.

The platform is designed to help developers integrate local AI models, agentic reasoning systems, and coding workflows directly into Windows applications.

The company appears to be positioning Foundry as a core part of its broader Windows AI ecosystem strategy.

Microsoft continues broader Windows developer push

The local AI announcements were part of a wider set of updates in Microsoft’s latest Windows developer platform announcement.

Those updates also include Coreutils for Windows, built-in WSL containers, and the new GitHub Copilot Desktop app.

The company also recently introduced AI-focused hardware initiatives, including the Surface RTX Spark Dev Box, aimed at developers building local and hybrid AI applications on Windows.

More about the topics: AI, Microsoft, Microsoft Build 2026
