Alibaba Cloud Generative AI Text-to-Image

7 Jul 2023

Alibaba Cloud has opened a generative AI text-to-image model called Tongyi Wanxiang for testing by corporate customers in China, part of its growing suite of artificial intelligence-based applications. Tongyi Wanxiang means “tens of thousands of images” in Mandarin.

Tongyi Wanxiang can generate images from natural-language prompts, ranging from watercolors and oil paintings to animation and 3D cartoons. Besides the simple generation of images, it can also apply the style of one image to another, or create variations of images similar in content and style to the original.

Tongyi Wanxiang is powered by Alibaba Cloud’s proprietary large model Composer, a text-to-image diffusion model capable of generating photo-realistic images given any text input.

Alibaba cites use cases ranging from AI art to creative work for businesses in industries such as ecommerce, gaming, design and advertising.

This sounds similar to SnapFusion, although the version from Snapchat runs on mobile devices and generates an image in under 2 seconds. Alibaba notes that other image creation platforms, such as Midjourney and Stable Diffusion, have been released globally, but says Tongyi Wanxiang will be particularly adaptable to companies’ needs in the world’s second-largest economy. Tongyi Wanxiang understands prompts in both Mandarin and English.

Also on Friday, Alibaba Cloud launched ModelScopeGPT. This framework uses large language models in Alibaba’s open-source tech community, ModelScope, as an interface to link together various AI models to perform tasks more efficiently. By combining the power of various AI models on ModelScope, ModelScopeGPT can produce output as text, images, audio and video. Alibaba Cloud said the framework can help enterprises and developers perform sophisticated AI tasks across language, vision and speech.