SnapFusion creates unique images from text on mobile

3 Jul 2023

So you want a new creative, but you’re not a graphic designer and you don’t have access to a powerful bank of computers running the latest AI-trained imaging software? Well, don’t worry, because Snap Inc, the company behind Snapchat, has you covered with SnapFusion.

SnapFusion has done what everyone said was impossible: it can create images from text inputs in under two seconds on mobile devices. Break that down. Snap has not only shortened the image creation process, it has used so little processing power that the whole thing can be performed on a mobile device, meaning you don’t need access to multiple processors and cloud computing. SnapFusion brings advanced AI image generation into the hands of normal people like you and me.

We’re excited to share that Snap Research has developed a new model called SnapFusion that shortens the model runtime from text input to image generation on mobile to under two seconds–the fastest time published to date by the academic community.

Snap Research achieved this breakthrough by optimizing the network architecture and denoising process, making it incredibly efficient, while maintaining image quality. So, now it’s possible to run the model to generate images based on text prompts, and get back crisp clear images in mere seconds on mobile rather than minutes or hours, as other research presents.

While it is still early days for this model, this work has the potential to supercharge high quality generative AI experiences on mobile in the future.
– Snap

Want a picture of Charles III in the style of Picasso? Not a problem, even though Picasso died before Charles was crowned. Are you a bog roll company whose puppy has grown up into a full-size Labrador, with no replacement pups trained up? Again, not a problem: SnapFusion has you covered. (The images above are all examples of what SnapFusion can create.)

This really is a remarkable development and, if you’re interested in the tech side of how it’s been done, Snap have discovered they can cut the process down to just 8 denoising steps and get better results than much more powerful systems can get in 50 steps. You can read the tech blurb here… It makes little sense to normal people, but you don’t need to understand how it works. You just need the vision to realise what text input to image creation in under two seconds can do for your business.
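For the technically curious, here is a rough sketch of why the step count matters. Each denoising step is one pass through the image model, so fewer steps means less work for the phone. This is not Snap's code: the function names, the 1,000-timestep training range and the evenly spaced schedule are all assumptions made for illustration, and it pretends every step costs the same.

```python
def timestep_schedule(num_steps, train_timesteps=1000):
    """Evenly spaced timesteps, noisiest first (hypothetical schedule, not Snap's)."""
    stride = train_timesteps // num_steps
    return list(range(train_timesteps - stride, -1, -stride))[:num_steps]


def relative_model_passes(few_steps, many_steps):
    """How many times more model passes the longer schedule needs."""
    return many_steps / few_steps


if __name__ == "__main__":
    short = timestep_schedule(8)    # the 8-step setting mentioned above
    long = timestep_schedule(50)    # the 50-step setting mentioned above
    print("8-step schedule:", short)
    print("model passes:", len(short), "vs", len(long))
    print("ratio:", relative_model_passes(len(short), len(long)))
```

Under those assumptions the 50-step schedule needs 6.25 times as many model passes as the 8-step one. The under-two-seconds figure Snap quotes also depends on its optimized network architecture, which this sketch does not attempt to model.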

Obvious examples would be taking a plush toy and getting SnapFusion to create an image of it in a certain setting. Or maybe you’re a fashion retailer who isn’t at Paris Fashion Week but still wants your garments pictured on models with the Eiffel Tower in the background?

We’ll be watching with interest how SnapFusion starts to be used in advertising and ecommerce. If you’re one of the first to get to grips with it, we’d love you to share your results.