Microsoft's latest Phi-4 is plagued by at least 4 bugs that dramatically slow it down
The bugs were discovered by Unsloth.
Microsoft released Phi-4, the latest addition to its Phi family, in December. The model has 14 billion parameters, and the Redmond-based tech giant claims it was designed to deliver the same high-quality results as much larger models at a fraction of the size.
One of the key advantages of Phi-4 is its ability to perform complex reasoning, specifically in subjects such as mathematics, making it a versatile tool for a range of tasks beyond conventional language processing.
Microsoft has made Phi-4 available on its recently launched Azure AI Foundry platform and on Hugging Face, a popular hub for sharing and deploying machine learning models. The company has also published a technical paper with benchmarks showing Phi-4 outperforming much larger models on math competition problems.
Because Phi-4 is open source, developers have been able to test it themselves. In a Reddit post, the Unsloth team said it had found and fixed four bugs that caused a drop in accuracy of roughly 5% to 10% and broke fine-tuning runs.
The fixes Unsloth described include:
- Tokenizer Fix: Phi-4 incorrectly uses <|endoftext|> as its end-of-sequence (EOS) token instead of <|im_end|>.
- Fine-tuning Fix: Use a proper padding token (e.g., <|dummy_87|>).
- Chat Template Fix: To prevent serving issues, add an assistant prompt only when one is explicitly requested.
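As a rough illustration of what these fixes amount to, here is a minimal Python sketch. It is not Unsloth's code. The helper names `apply_token_fixes` and `render_chat` are hypothetical, the `<|im_start|>` and `<|im_sep|>` layout is an assumption about Phi-4's chat format, and a plain stand-in object is used in place of a real tokenizer. Hugging Face tokenizers do expose `eos_token` and `pad_token` as settable attributes, so the first helper should carry over, but treat the whole thing as a sketch.

```python
from types import SimpleNamespace

EOS_TOKEN = "<|im_end|>"    # from the article: the correct end-of-sequence token
PAD_TOKEN = "<|dummy_87|>"  # from the article: an example dedicated padding token


def apply_token_fixes(tokenizer):
    """Set EOS and padding tokens on any object exposing those attributes.

    Hypothetical helper: works on a Hugging Face tokenizer or a stand-in.
    """
    tokenizer.eos_token = EOS_TOKEN
    # Keeping padding distinct from EOS means that masking out padding
    # during fine-tuning does not also hide the real end-of-turn marker.
    tokenizer.pad_token = PAD_TOKEN
    return tokenizer


def render_chat(messages, add_generation_prompt=False):
    """Hypothetical renderer; the <|im_start|>/<|im_sep|> layout is assumed."""
    text = ""
    for message in messages:
        text += (
            f"<|im_start|>{message['role']}<|im_sep|>"
            f"{message['content']}{EOS_TOKEN}"
        )
    # Chat template fix: open an assistant turn only when explicitly asked.
    if add_generation_prompt:
        text += "<|im_start|>assistant<|im_sep|>"
    return text


# Stand-in for a real tokenizer, starting from the buggy EOS setting.
tok = apply_token_fixes(
    SimpleNamespace(eos_token="<|endoftext|>", pad_token=None)
)
conversation = [{"role": "user", "content": "Hi"}]
print(render_chat(conversation))
print(render_chat(conversation, add_generation_prompt=True))
```

The `add_generation_prompt` flag mirrors the parameter of the same name in Hugging Face's chat templating, where it defaults to off; the second call shows the only case in which the assistant prompt is appended.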
The group managed to fix these bugs and improve Phi-4's accuracy. According to its blog post, other users then tested the fixes and confirmed that they worked.
If you're using Phi-4, it is worth applying these fixes to get the best out of the model.
Unsloth has also made its GitHub repository public.