Microsoft unveils ResLoRA, a method that makes advanced AI training faster than ever

And it can run on decent systems as well.


Microsoft has been at the forefront of AI research ever since the concept gained popularity, and the Redmond-based tech giant has been investing in new AI models and new methods to train them.

The results keep coming: the latest AI innovation comes in the form of ResLoRA, developed by a group of AI researchers at the School of Computer Science and Engineering at Beihang University in Beijing, China, in collaboration with Microsoft.

The method makes advanced AI training much faster, and it’s suited to training AI models for specific tasks, including natural language generation (NLG), natural language understanding (NLU), and text-to-image generation.

ResLoRA is an improved version of LoRA, an existing parameter-efficient fine-tuning (PEFT) method used to fine-tune large language models (LLMs). Because the original LoRA blocks can be slow to update during training, the researchers added extra paths (residual paths) to them.

Then, when the model is actually used, those extra paths are merged away so inference stays unchanged. The approach paid off: ResLoRA achieved better performance with fewer training steps and without any extra parameters or inference costs compared to LoRA.
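
To make the idea more concrete, here is a minimal, hypothetical sketch of what a LoRA layer with an added residual path could look like in PyTorch. The class name ResLoRALinear, the prev_lora_out argument, and every hyperparameter are illustrative assumptions, not the actual ResLoRA implementation; the real code on GitHub covers several residual structures and the merging step described in the paper.

```python
import torch
import torch.nn as nn

class ResLoRALinear(nn.Module):
    """Hypothetical sketch: a frozen linear layer with a LoRA update plus a
    residual path carrying the previous block's low-rank output forward.
    Illustrative only; not the actual ResLoRA code."""

    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        # Frozen pretrained weight (stands in for the original model weight).
        self.weight = nn.Parameter(
            torch.empty(out_features, in_features), requires_grad=False
        )
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)
        # Trainable low-rank factors, as in standard LoRA.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x, prev_lora_out=None):
        base = x @ self.weight.T
        lora = (x @ self.lora_A.T) @ self.lora_B.T * self.scaling
        # Residual path (the ResLoRA-style addition): during training, the
        # previous block's low-rank output is added, giving gradients a
        # shorter route back through the stack of adapters.
        if prev_lora_out is not None:
            lora = lora + prev_lora_out
        # Also return this block's low-rank output so the next block can reuse it.
        return base + lora, lora
```

The key point of the design, as the paper describes it, is that these residual contributions can be merged into the adjacent LoRA weights before deployment, which is why ResLoRA ends up with no extra parameters or inference cost compared to plain LoRA.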

The paper, which can be read in its entirety here, explains that ResLoRA works well for tasks such as natural language generation (NLG), natural language understanding (NLU), and text-to-image generation, and that it can train AI models for these tasks much faster.

Microsoft made the code for ResLoRA available on GitHub, and AI enthusiasts can use it to train their own AI models with advanced generative capabilities. Plus, as the team behind it says, the hardware requirements are not out of reach.

Our experiments run on 8 NVIDIA Tesla V100 GPU. The results may vary due to different GPU models, drivers, CUDA SDK versions, floating-point precisions, and random seeds.

AI modeling and training are known to consume a lot of power and need advanced computing to work, but ResLoRA doesn’t ask for much.

The method seems to be light years away from models such as Project Rumi, Orca 13B, or many others that Microsoft helped research and develop. But with AI technology growing at an incredible rate, this is expected.

What do you think of the method?
