OpenAI unveils Sora, the AI text-to-video generator

The company that created ChatGPT is now rolling out a new model that creates videos.

OpenAI, the company that brought ChatGPT to the world, has now unveiled a new AI text-to-video model that can create videos up to a minute long from text input alone.

The new model has been named Sora and is currently in its early testing phase.

The Sora AI text-to-video generator is being tested by a select group of users and artists and has not yet been rolled out to the broader public. The announcement places the product in direct competition with Make-A-Video from Meta, Facebook's parent company.

Meta’s product was first unveiled in October 2022 but has not yet had a public release. Other AI text-to-video products do exist, but only on a much smaller scale. If Sora launches widely any time soon, it would become the first major player in this particular consumer market.

The AI text-to-video generator is still in a learning and testing process and isn’t ready for launch.

“We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction,” read a statement from OpenAI published on the Sora website. The company also published a white paper on the development of this new model.

As part of the current testing phase, OpenAI is requesting feedback from the select designers, visual artists, and filmmakers testing the model to determine how Sora can better assist and support creative artists.

From there, the company will use what it learns from that feedback to identify the specific strengths of Sora’s text-to-video capabilities and which artificial intelligence capabilities are on the way.

“Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” said the company. “The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”
