Microsoft’s training Small Language Models to outperform ChatGPT

Published on June 27, 2023

by Kareem Anderson

Microsoft may have a $10 billion investment in and partnership with OpenAI for its ChatGPT large language model, but it looks like the company could be hedging its bets with its own smaller transformer technology.

Microsoft researchers recently revealed that their Phi-1 1.3B transformer-based language model beat much larger models on the HumanEval and MBPP coding benchmarks, and even outscored partner OpenAI's ChatGPT when tasked with coding.
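For readers unfamiliar with how coding benchmarks such as HumanEval are scored, results are commonly reported as pass@k: the chance that at least one of k generated solutions passes a problem's unit tests. The snippet below is a minimal sketch of the standard unbiased pass@k estimator, not the evaluation code used for Phi-1. The function names and the sample counts in the usage note are illustrative assumptions.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of solutions sampled from the model
    c: number of those solutions that passed the unit tests
    k: number of attempts the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples failed, every draw of k contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: Iterable[Tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)
```

With hypothetical counts, `benchmark_score([(10, 5), (10, 10)], k=1)` averages a problem solved in half of its samples with one solved in every sample.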

Phi-1 1.3B was trained for four days on eight NVIDIA A100 GPUs. Its training data combined about 6 billion tokens of “textbook quality” code filtered from The Stack and StackOverflow datasets using a GPT-4-based classifier with about 1 billion tokens of textbooks and exercises generated with GPT-3.5, for roughly 7 billion tokens in total.
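The idea of keeping only “textbook quality” samples can be sketched as a scoring function plus a threshold. The example below is only an illustration of that filtering pattern. The scorer `toy_score`, the function `filter_textbook_quality`, and the 0.5 threshold are hypothetical stand-ins, and the real classifier described by the researchers is not reproduced here.

```python
from typing import Callable, Iterable, List


def filter_textbook_quality(
    snippets: Iterable[str],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep snippets whose quality score meets the threshold."""
    return [s for s in snippets if score(s) >= threshold]


def toy_score(snippet: str) -> float:
    """Hypothetical scorer: reward code that explains itself.

    Returns 1.0 when the snippet has a comment or docstring, else 0.0.
    A real pipeline would call a trained classifier here instead.
    """
    documented = "#" in snippet or '"""' in snippet
    return 1.0 if documented else 0.0


corpus = [
    "def add(a, b):\n    # Return the sum of two numbers.\n    return a + b",
    "x=1;y=2;z=x+y",
]
kept = filter_textbook_quality(corpus, toy_score)
```

Swapping `toy_score` for a learned model would leave the filtering step itself unchanged, which is why the scorer is passed in as a parameter.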

Not only did Phi-1 1.3B outperform some of its larger language model counterparts, it did so using fewer parameters.

While the researchers may be popping bottles of champagne in excitement, Phi-1 1.3B's achievements are tempered by its comparatively limited versatility. Unlike larger models, Phi-1 1.3B gains ground through its specialized training in Python programming. As a result, it is weaker at programming against specific APIs and has less domain-specific knowledge than larger models tend to have.

At the end of the day, Phi-1 1.3B's success highlights the need for higher-quality data to flow through these language models to optimize their output.

Microsoft’s other SLM, Orca, has also been shown to outperform ChatGPT in similar testing, lending further credence to the idea that high-quality data can reduce the resource demands of LLMs.

Microsoft is planning to open-source Phi-1 1.3B through HuggingFace, but as of now, there is no official date for the release.

Kareem Anderson

Networking & Security Specialist

Kareem is a journalist from the Bay Area, now living in Florida. His passion for technology and content creation is unmatched, driving him to create well-researched articles and incredible YouTube videos. He is always on the lookout for everything new about Microsoft, focusing on making easy-to-understand content and breaking down complex topics related to networking, Azure, cloud computing, and security.
