Alibaba has launched limited beta access to HappyHorse 1.0, a video generation model designed to help creators produce high-quality, cinematic-style video content.
HappyHorse 1.0 is now accessible to creators and enterprise customers globally via the official HappyHorse website and through the API service on Alibaba Cloud Model Studio, while individual users can try the model through Alibaba’s consumer-facing AI application, the Qwen App.
And it’s pretty amazing!
Advanced Video Generation and Editing with Exceptional Aesthetic Expression
HappyHorse 1.0 supports Text-to-Video (T2V), Image-to-Video (I2V), and Subject-to-Video (S2V) generation — enabling users to create video from a text prompt, animate a still image into a video clip, or insert a specific subject from a reference image into a generated video while preserving their appearance and identity.
The model generates up to 15 seconds of 1080p video with multiple shots and delivers synchronized audio-visual output, including lip-synced dialogue, ambient soundscapes, and emotionally expressive vocal performances, for a fully immersive viewing experience. It excels at cinematic framing with wide apertures and can readily convey a strong atmospheric mood, delivering refined texture and detail as well as rich spatial depth and visual layering.
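For developers planning to call the model through an API, the three generation modes and the stated limits can be pictured as a small request check. The sketch below is illustrative only: GenerationRequest, pick_mode, validate, and every field name are hypothetical and are not taken from the HappyHorse or Alibaba Cloud Model Studio API. Only the 15-second and 1080p ceilings come from the announcement.

```python
from dataclasses import dataclass
from typing import Optional

MAX_DURATION_SECONDS = 15  # "up to 15 seconds" per the announcement
MAX_HEIGHT_PIXELS = 1080   # "1080p" per the announcement


@dataclass
class GenerationRequest:
    """Hypothetical request shape, not the real API schema."""
    prompt: str
    duration_seconds: int = 5                # assumed default
    height_pixels: int = 1080
    first_frame_image: Optional[str] = None  # still image to animate (I2V)
    subject_image: Optional[str] = None      # reference subject to insert (S2V)


def pick_mode(request: GenerationRequest) -> str:
    """Return 'T2V', 'I2V', or 'S2V' depending on which inputs are present."""
    # Assumption: a request carries at most one kind of image input.
    if request.first_frame_image and request.subject_image:
        raise ValueError("supply a first-frame image or a subject image, not both")
    if request.subject_image:
        return "S2V"
    if request.first_frame_image:
        return "I2V"
    return "T2V"


def validate(request: GenerationRequest) -> None:
    """Reject requests that exceed the limits stated in the announcement."""
    if not 1 <= request.duration_seconds <= MAX_DURATION_SECONDS:
        raise ValueError(f"duration must be 1 to {MAX_DURATION_SECONDS} seconds")
    if not 0 < request.height_pixels <= MAX_HEIGHT_PIXELS:
        raise ValueError(f"height must be at most {MAX_HEIGHT_PIXELS} pixels")
```

A text-only request such as GenerationRequest(prompt="a café scene in Paris") resolves to T2V here, and adding a subject_image switches it to S2V. The real service may organize its modes and parameters quite differently.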
Example of a HappyHorse AI-Generated Video
Text-to-Video Prompt
A cinematic script scene set in a sun-drenched Parisian café, golden afternoon light spilling through arched windows. A sharp-dressed man in a tailored navy suit sits across from an elegant woman in a flowing crimson dress, half-empty coffee cups between them. The air is thick with unspoken tension. He leans forward, voice low and steady: “You knew from the beginning, didn’t you? That none of this was real.” She holds his gaze without flinching, a ghost of a smile on her lips, slowly stirring her coffee: “Everything was real. That’s exactly what makes it so dangerous.” Cinematic wide-angle composition, warm golden hour lighting, shallow depth of field, film grain texture, muted vintage color palette with deep crimson accents, highly detailed wardrobe and facial expressions, noir romantic aesthetic, emotionally charged atmosphere, European street photography style, dramatic storytelling, 35mm film look.
HappyHorse AI-generated video result
Video Editing
Beyond generation, HappyHorse 1.0 offers powerful video editing functions. The Video-to-Video (V2V) function allows users to modify an existing video while preserving its original structure and motion. The Subject-and-Video-to-Video (SV2V) function enables users to seamlessly replace or insert a specific subject from a reference image, while preserving the original video’s motion, composition, and unaffected regions.
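The difference between the two editing functions comes down to whether a reference subject accompanies the source video. A minimal sketch of that distinction follows, assuming a hypothetical helper named pick_edit_mode with made-up parameter names. It does not reflect the actual HappyHorse API.

```python
from typing import Optional


def pick_edit_mode(source_video: str, subject_image: Optional[str] = None) -> str:
    """Return 'V2V' or 'SV2V' for an editing job (hypothetical helper)."""
    # Both editing functions start from an existing video.
    if not source_video:
        raise ValueError("editing requires an existing source video")
    # A reference image of a subject selects SV2V; otherwise plain V2V.
    return "SV2V" if subject_image else "V2V"
```

Calling pick_edit_mode("clip.mp4") yields V2V, while pick_edit_mode("clip.mp4", "actor.png") yields SV2V.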