OpenAI, a leading research organization in artificial intelligence, has announced a new breakthrough in video generation. Sora, a generative video model that can create realistic, high-definition videos from text descriptions, is the latest innovation from the San Francisco-based firm. Sora can produce videos up to a minute long, with fine detail and a convincing sense of depth and motion.
How Sora works
Sora is based on the technology behind DALL-E 3, the third version of OpenAI’s flagship text-to-image model. DALL-E 3 uses a diffusion model, which is trained to turn a fuzz of random pixels into a picture. Sora takes this approach and applies it to videos rather than still images.
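To illustrate the diffusion idea in the most general terms, here is a minimal, purely conceptual Python sketch: it starts from random noise and repeatedly applies a denoising step until the noise resolves into a coherent result. The blend-toward-a-target step is a hypothetical stand-in for the trained, text-conditioned neural network a real model would use; nothing here reflects OpenAI’s actual implementation.

```python
import numpy as np

# Toy illustration of reverse diffusion: begin with a "fuzz of random pixels"
# and denoise it step by step. In a real model such as DALL-E 3 or Sora, the
# denoising step is a large neural network conditioned on the text prompt;
# here it is a simple stand-in that blends toward a fixed target so the loop
# runs end to end.

rng = np.random.default_rng(0)

HEIGHT, WIDTH = 8, 8                           # tiny "image" for demonstration
target = rng.random((HEIGHT, WIDTH))           # stands in for what the prompt describes
frame = rng.standard_normal((HEIGHT, WIDTH))   # pure random noise to start from

NUM_STEPS = 50
for step in range(NUM_STEPS):
    # Hypothetical denoising step: nudge the frame toward the target and
    # re-inject a shrinking amount of noise, mimicking how diffusion samplers
    # remove noise gradually rather than all at once.
    noise_level = 1.0 - (step + 1) / NUM_STEPS
    frame = (0.9 * frame
             + 0.1 * target
             + 0.05 * noise_level * rng.standard_normal((HEIGHT, WIDTH)))

print("remaining error:", float(np.abs(frame - target).mean()))
```

For video, the same kind of loop would operate over a stack of frames rather than a single still image, which is the leap Sora makes over its text-to-image predecessors.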
Sora can handle complex scenes and interactions, such as occlusion, motion, and perspective. It can also generate videos in different styles and genres, such as animation, documentary, or advertisement. Sora can even create original characters and scenarios based on the text input.
What Sora can do
OpenAI has shared four sample videos that demonstrate Sora’s capabilities. The videos are based on the following text prompts:
- animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. the art style is 3d and realistic, with a focus on lighting and texture. the mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with wide eyes and open mouth. its pose and expression convey a sense of innocence and playfulness, as if it is exploring the world around it for the first time. the use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image.
- a gorgeously rendered papercraft world of a coral reef, rife with colorful fish and sea creatures
- a documentary-style video of a Tokyo street scene at night, with a voice-over narration describing the culture and history of the city
- a short film inspired by Westworld, featuring a futuristic theme park where androids are indistinguishable from humans
The videos are impressive and captivating, showing Sora’s ability to generate realistic and diverse content from text. They also showcase the model’s attention to detail, such as the texture of the monster’s fur, the movement of the fish, the reflections of the lights, and the expressions of the androids.
Why Sora matters
Sora is a significant milestone for the field of video generation, which is a challenging and emerging area of artificial intelligence. Video generation has many potential applications, such as entertainment, education, advertising, and journalism. Sora could enable anyone to create high-quality videos with minimal effort and cost.
However, Sora also raises ethical and social issues, such as the misuse of video generation for deception, manipulation, or harm. OpenAI is aware of these risks and has decided to limit access to Sora for now. The firm is sharing the model only with a small group of safety testers, who will provide feedback and suggestions on how to ensure its responsible use. OpenAI has not released a technical report or demonstrated the model actually working, and it says it won’t be releasing Sora anytime soon.
“We think building models that can understand video, and understand all these very complex interactions of our world, is an important step for all future AI systems,” says Tim Brooks, a scientist at OpenAI. “But we also want to make sure that we are doing this in a way that is aligned with our vision of creating beneficial and trustworthy AI for everyone.”