Microsoft AI Foundry adds GPT-5.4 mini and nano for developers

News

Akshay Kumar

Published on March 18, 2026

Microsoft has updated its AI Foundry platform with OpenAI’s newest compact models. Developers can now access GPT-5.4 mini and GPT-5.4 nano directly through the service. These smaller versions of the flagship model are built specifically for speed and efficiency.

Instead of spending massive computing power on every task, companies can use these lightweight options to run basic background processes faster and more cheaply.

Why developers care about smaller and faster AI models

Big AI models are great for complex reasoning, but they take time to generate answers and cost a lot to run. The tech industry is shifting toward smaller tools that get specific jobs done without the heavy computing overhead. GPT-5.4 mini handles moderate tasks like drafting text or summarizing long reports quickly.

The nano version is even smaller. It is designed for situations where near-instant responses are critical, such as basic customer service bots that need to reply in milliseconds or simple text-sorting tools on smartphones.

By putting these models right into the Foundry ecosystem, Microsoft lets developers swap between heavy and light models depending on exactly what their application needs at any given moment.

What this integration means for everyday software costs

Running AI features constantly can drain a company’s budget quickly. Every time a user interacts with a smart feature, the company pays for the computing power to generate the response. The mini and nano options drastically lower that base cost.

Developers using Microsoft Foundry can now route simple user requests to the cheap nano model and call the larger, more expensive models only when a user asks a highly complicated question. This routing strategy makes it economically viable to put smart features into free apps or standard software without losing money.
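As a rough illustration of that routing idea, here is a minimal Python sketch. It is not Microsoft's or OpenAI's routing logic: the deployment names, the keyword list, and the word-count thresholds are all hypothetical, and a real application would likely use a better complexity signal than prompt length.

```python
# Hypothetical deployment names; the names in a real Foundry project may differ.
NANO = "gpt-5.4-nano"
MINI = "gpt-5.4-mini"
FULL = "gpt-5.4"

# Illustrative phrases that hint at a request needing deeper reasoning (assumption).
COMPLEX_HINTS = ("explain why", "step by step", "prove", "analyze", "compare")


def choose_model(prompt: str) -> str:
    """Pick a model tier from two crude signals: keyword hints and word count."""
    text = prompt.lower()
    words = len(text.split())
    # Send long or reasoning-heavy prompts to the large model.
    if any(hint in text for hint in COMPLEX_HINTS) or words > 400:
        return FULL
    # Medium-length work such as summarizing or drafting goes to mini.
    if words > 40:
        return MINI
    # Everything else is treated as a simple, latency-sensitive request.
    return NANO


if __name__ == "__main__":
    samples = (
        "Is my order shipped?",
        "Summarize this report: " + "data " * 100,
        "Compare these two contracts and explain why they differ.",
    )
    for prompt in samples:
        print(choose_model(prompt), "<-", prompt[:40])
```

The selected name would then be passed as the model or deployment parameter of whatever client the application already uses, so only the complicated requests incur the higher rate.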

Smaller businesses that previously could not afford the cloud bills for top-tier artificial intelligence now have a realistic entry point. Ultimately, this means everyday users will start seeing quick text analysis and helpful tools pop up in more standard applications, simply because the developers can finally afford to run them continuously.

For context, running the standard GPT-5.4 model currently costs $2.50 per million input tokens and $15.00 per million output tokens. The new mini and nano options cost a fraction of these rates, starting from $0.20 per million tokens.
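To make the gap concrete, the short Python sketch below compares input-token costs at the quoted rates. The workload (one million requests of 500 input tokens each) is invented for illustration, and treating the $0.20 starting figure as an input-token rate is an assumption, since the article does not break it down by model or token type.

```python
# Rates in USD per million tokens. The GPT-5.4 figures are the ones quoted above.
GPT_54_INPUT_RATE = 2.50
GPT_54_OUTPUT_RATE = 15.00
# Assumption: the "$0.20/million" starting figure is treated as an input-token rate.
SMALL_MODEL_INPUT_RATE = 0.20


def token_cost(tokens: int, rate_per_million: float) -> float:
    """Return the USD cost of `tokens` at a per-million-token rate."""
    return tokens / 1_000_000 * rate_per_million


# Hypothetical workload: 1,000,000 requests with 500 input tokens each.
requests = 1_000_000
input_tokens = requests * 500

full_cost = token_cost(input_tokens, GPT_54_INPUT_RATE)
small_cost = token_cost(input_tokens, SMALL_MODEL_INPUT_RATE)

if __name__ == "__main__":
    print(f"GPT-5.4 input cost:     ${full_cost:,.2f}")
    print(f"Small-model input cost: ${small_cost:,.2f}")
    print(f"Ratio: {full_cost / small_cost:.1f}x")
```

Under these assumptions, the same 500 million input tokens cost about $1,250 on the standard model and about $100 at the starting small-model rate, a 12.5x difference before output tokens are counted.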

