OpenAI’s new Sora text-to-video model can make shockingly realistic content


OpenAI has broken new ground by revealing Sora, its first text-to-video model, which is capable of creating shockingly realistic content.

We’ve been wondering when the company was finally going to release its own video engine, as so many of its rivals, from Stability AI to Google, have beaten it to the punch. Perhaps OpenAI wanted to get things just right before a proper launch. At this rate, the quality of its outputs could eclipse that of its contemporaries.

According to the official page, OpenAI Sora can generate “realistic and imaginative scenes” from a single text prompt, much like other text-to-video AI models. The difference with this engine is the technology behind it.

Lifelike content

OpenAI claims its artificial intelligence can understand how people and objects “exist in the physical world”. This gives Sora the ability to create scenes featuring multiple people, varying types of movement, facial expressions, textures, and objects with a high level of detail. For the most part, generated videos lack the plastic look or the nightmarish forms seen in other AI content, but more on that later.

Sora is also multimodal. Users will reportedly be able to upload a still image to serve as the basis of a video. The content inside the picture will be animated, with a lot of attention paid to the small details. It can even take a pre-existing video “and extend it or fill in missing frames.”

Prompt: A litter of golden retriever puppies playing in the snow. Their heads pop out of the snow, covered in. pic.twitter.com/G1qhJRV9tg (February 15, 2024)

You can find sample clips on OpenAI’s website and on X (the platform formerly known as Twitter). One of our favorites features a group of puppies playing in the snow. If you look closely, you can see that their fur and the snow on their snouts have a strikingly lifelike quality. Another great clip shows a Victoria crowned pigeon bobbing around like an actual bird.

A work in progress

As impressive as these two videos may be, Sora is not perfect. OpenAI admits its “model has weaknesses.” It can have a hard time simulating the physics of an object, confuse left and right, and misunderstand “instances of cause and effect.” For example, an AI character can bite into a cookie, yet the cookie shows no bite mark afterward.

It makes a lot of weird errors too. One of the funnier mishaps involves a group of archeologists unearthing a large piece of paper which then transforms into a chair before ending up as a crumpled piece of plastic. The AI also seems to have trouble with words. “Otter” is misspelled as “Oter” and “Land Rover” is now “Danover”.

even the sora mistakes are mesmerizing pic.twitter.com/OvPSbaa0L9 (February 15, 2024)

Moving forward, the company will be working with its “red teamers,” a group of industry experts, “to assess critical areas for harms or risks.” The aim is to make sure Sora doesn’t generate false information or hateful content, or exhibit bias. Additionally, OpenAI is going to implement a text classifier to reject prompts that violate its policy. These include inputs requesting sexual content, violent videos, and celebrity likenesses, among other things.

There is no word yet on when Sora will officially launch. We reached out to OpenAI for information on the release, and this story will be updated at a later time. In the meantime, check out TechRadar's list of the best AI video editors for 2024.

Cesar Cadenas has been writing about the tech industry for several years now, specializing in consumer electronics, entertainment devices, Windows, and the gaming industry. He’s also passionate about smartphones, GPUs, and cybersecurity.
