Imagine being able to generate true-to-life images from scratch, using simple text instructions to combine and manipulate multiple images into the perfect photo – in just seconds. Imagine no more.

Vidu, the flagship product of ShengShu Technology, has just launched a new ‘Reference-to-Image’ feature that it claims has reinvented photography. It’s a bold claim, but one that could soon have a massive impact on how businesses, brands and pretty much everybody else approach photography and image creation.

AI-driven image creation technologies have been around for a while now. A simplified example looks like this: we start with a mountainside scene, then add an image of an old man with a beard holding a water flask – a few hours’ work for someone using Photoshop, but a task that takes seconds with an AI at the helm.

The problem, however, is that AI-produced images typically lack consistency. The flask might appear in the image, but it might be added in a way that seems unnatural – it could even be held upside down. How easy is it to remove the beard, or add trees to the scene? Can the AI help with that? Until now, probably not.

Vidu’s approach to solving these issues is twofold. Firstly, it allows up to seven images to be composited, a considerable leap from most image creation tools, which max out at three. Being able to add more images opens the door to more complex photorealistic images that blend several objects, settings and people in one scene. So yes, you can add trees, birds, clouds, or any other object.

The second innovation is semantic understanding. Vidu has trained the AI to better interpret the relationship between images so that objects, scenery, furniture and people are added to the image with greater consistency, an innovation that Vidu claims as a world first. Impressively, it all happens with simple text prompts. Describe which images to combine, remove and adjust, and you’re seconds away from having an image that maintains realism and consistency.

Vidu’s creator, ShengShu Technology, is one of a handful of companies pushing the boundaries of AI-driven image and video creation. Arguably, the biggest rival to Vidu’s Reference-to-Image tech is Google’s recently launched Nano Banana.

However, Vidu is confident that it currently outperforms Nano Banana when it comes to character and image consistency, natural image blending and realism. The company also claims that it deals with embedded text better than its rivals.

How does this impact photographers and creatives around the world? It means studio-grade photography is set to come within reach of all of us.

Professional photography can be a costly overhead for many businesses. It requires substantial investment in cameras, lenses, tripods, lighting and other equipment.

Plus, studio time becomes even more expensive when you factor in models, makeup, and additional post-processing. New AI image creation tools like Vidu are challenging all of these assumptions, massively reducing time and costs, and making professional-grade photography more accessible to all.

One thing is for sure: the world of photography is about to change with the arrival of AI-driven tools like Vidu and its new Reference-to-Image tech. It’s going to be very interesting to see how this space develops in the months to come.