MIT researchers make AI image generation 30x faster by simplifying the process

The new technology is capable of real-time image generation without quality loss




MIT Distribution Matching Distillation makes image generation 30x faster

A team of researchers from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) has found a way to make image-generation models such as DALL-E 3 and Stable Diffusion dramatically faster.

They managed to simplify the process into a single step without compromising image quality.

Right now, AI image generators rely on so-called diffusion models: they start from an image of pure noise and progressively give it structure until a clear picture emerges. It sounds simple, but it takes many iterative refinement steps to get from fuzzy nonsense to a clear, crisp image.
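To make that contrast concrete, here is a minimal, purely illustrative Python loop showing the shape of the multi-step process. The `denoise_step` function, the image size, and the 50-step count are stand-ins for illustration only, not the actual Stable Diffusion pipeline:

```python
# Toy sketch of the iterative denoising loop a diffusion model performs.
# "denoise_step" is a stand-in for a real trained network (e.g. a U-Net);
# the step count and image shape are illustrative assumptions.
import numpy as np

def denoise_step(noisy_image, step, total_steps):
    # Placeholder: a real model would predict the noise to remove at this step.
    predicted_noise = noisy_image * (1.0 / total_steps)
    return noisy_image - predicted_noise

def generate(shape=(64, 64, 3), total_steps=50):
    image = np.random.randn(*shape)        # start from pure noise
    for step in range(total_steps):        # dozens of refinement passes
        image = denoise_step(image, step, total_steps)
    return image

sample = generate()
print(sample.shape)
```

The point of the sketch is simply that every generated image pays the cost of the full loop, which is why cutting the process down to one step matters so much for speed.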

The CSAIL researchers have developed a new framework that collapses that multi-step process into a single step.

Researchers use a new method called Distribution Matching Distillation

According to the press release, MIT's approach, called Distribution Matching Distillation (DMD), combines the principles of generative adversarial networks (GANs) with those of diffusion models to achieve unprecedented image-generation speed.

Our work is a novel method that accelerates current diffusion models such as Stable Diffusion and DALLE-3 by 30 times. This advancement not only significantly reduces computational time but also retains, if not surpasses, the quality of the generated visual content. Theoretically, the approach marries the principles of generative adversarial networks (GANs) with those of diffusion models, achieving visual content generation in a single step — a stark contrast to the hundred steps of iterative refinement required by current diffusion models. It could potentially be a new generative modeling method that excels in speed and quality.

Tianwei Yin, lead researcher on the DMD framework

The idea behind the new DMD framework is to use two diffusion models to guide the training of the one-step generator. This lets the researchers overcome the instability and mode-collapse issues that plague GAN models.
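For readers curious what "distilling many steps into one" looks like in code, here is a heavily simplified PyTorch sketch: a one-step student network is trained to reproduce, in a single forward pass, what a frozen multi-step teacher would produce. The module sizes, names, and the plain regression loss are assumptions for illustration; the actual DMD training signal comes from two diffusion models that match the distributions of real and generated images, not from this simple loss:

```python
# Highly simplified sketch of one-step distillation in the spirit of DMD.
# Real DMD estimates a distribution-matching gradient with two diffusion
# models (one for real images, one for the student's outputs); here a plain
# regression loss against a toy "teacher" stands in for that signal.
import torch
import torch.nn as nn

dim = 32
teacher = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))  # frozen multi-step model (stand-in)
student = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))  # one-step generator being trained
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)

for _ in range(100):
    noise = torch.randn(16, dim)              # same starting noise for both models
    with torch.no_grad():
        target = teacher(noise)               # what the slow model would produce
    prediction = student(noise)               # a single forward pass
    loss = torch.mean((prediction - target) ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Once trained, the student never runs the expensive iterative loop at inference time, which is where the reported speedup comes from.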

The results are impressive: in the clip above, DMD generates roughly 20 images per second, whereas Stable Diffusion 1.5 needs about 1.4 seconds to generate a single image, roughly a 28-fold speedup and in line with the claimed 30x.

According to TechSpot, the MIT researchers are not the only ones pursuing single-step image generation. Stability AI's method, called Adversarial Diffusion Distillation (ADD), can generate an image in just 207 ms on a single Nvidia A100 GPU accelerator.

Image generation is getting faster every day, and we hope the same will soon apply to video-generation models like OpenAI's Sora.

What do you think about the new technology? Share your thoughts in the comments section below.
