Stability AI has unveiled Stable Video Diffusion, a generative model that creates short video clips from text prompts or still images. The service is currently in beta testing, and users can join a waiting list for access.
The service is built on Stable Diffusion, the company's model for generating static images from text prompts. The developer has published the model's source code on GitHub, and the weights needed to run it locally are available on Hugging Face. The company has also released a research paper detailing the model's technical capabilities.
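For readers who want to experiment with the published weights locally, a minimal sketch using Hugging Face's diffusers library might look like the following. The pipeline class and checkpoint name reflect the diffusers integration of these weights; the input file name and hardware settings are illustrative assumptions, not an official workflow:

```python
# Minimal sketch: running Stable Video Diffusion locally with Hugging Face's
# diffusers library. The checkpoint name matches the weights published on
# Hugging Face; the input file name and GPU settings are illustrative.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Download the weights from Hugging Face and move the pipeline to the GPU.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe = pipe.to("cuda")

# Condition generation on a single still image, resized to the model's
# native 1024x576 resolution.
image = load_image("input.jpg").resize((1024, 576))

# Generate the frames; decode_chunk_size trades speed for lower VRAM use.
frames = pipe(image, decode_chunk_size=8).frames[0]

# Write the generated frames out as a short MP4 clip.
export_to_video(frames, "generated.mp4", fps=7)
```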
Stable Video Diffusion is designed to be adaptable to a range of video tasks. It can be configured to generate short videos from a single image, which opens it up to a variety of applications, and Stability AI envisions building an entire ecosystem of generative models on this foundation.
The current release of Stable Video Diffusion includes two image-to-video models, generating 14 or 25 frames respectively, with frame rates adjustable between 3 and 30 frames per second. Despite their strong performance, the models are released for research purposes only and are not licensed for commercial use at this stage.
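A sketch of how the two variants and the frame-rate setting surface in practice is shown below. The checkpoint IDs and the fps argument follow the diffusers integration; the specific values chosen are illustrative:

```python
# Sketch: choosing between the two released checkpoints and conditioning on
# frame rate. Checkpoint IDs and the fps argument follow the diffusers
# integration; the specific values below are illustrative.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# The base checkpoint yields 14 frames; the "xt" variant yields 25 frames.
MODEL_14_FRAMES = "stabilityai/stable-video-diffusion-img2vid"
MODEL_25_FRAMES = "stabilityai/stable-video-diffusion-img2vid-xt"

pipe = StableVideoDiffusionPipeline.from_pretrained(
    MODEL_25_FRAMES, torch_dtype=torch.float16, variant="fp16"
).to("cuda")

image = load_image("input.jpg").resize((1024, 576))

# fps is a conditioning signal the model was trained with (roughly the
# 3-30 fps range mentioned above); motion_bucket_id steers motion strength.
frames = pipe(image, fps=24, motion_bucket_id=127, decode_chunk_size=8).frames[0]

# Export the 25 frames at the requested rate, about one second of video.
export_to_video(frames, "clip_24fps.mp4", fps=24)
```

Note that the frame count is fixed by the chosen checkpoint, while the playback rate is controlled both through the fps conditioning and at the export step.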
Looking ahead, Stability AI plans to introduce a web interface for generating videos directly from text descriptions. This text-to-video tool is aimed at practical applications of Stable Video Diffusion in sectors such as advertising, education, and entertainment.