TL;DR
- OpenAI announced a new AI model called Sora.
- The text-to-video generative AI model can create up to 60 seconds of video content.
- The company says red teamers are currently conducting adversarial testing of the model.
Earlier today, Google announced the release of Gemini 1.5 for developers and enterprise users. Not to be outdone, OpenAI, one of Google’s biggest competitors, made a big AI announcement of its own today. This one, however, is a new text-to-video AI model.
OpenAI announced a new text-to-video generative AI model called Sora in a blog post and subsequent social media posts. The announcement was accompanied by clips created by the model, ranging from Chinese New Year celebrations to animated monsters playing with red candles.
Introducing Sora, our text-to-video model.
Sora can create videos up to 60 seconds long featuring highly detailed scenes, complex camera movements, and multiple characters with vivid emotions. https://t.co/7j2JN27M3W
OpenAI says Sora is currently available to red teamers to “assess critical areas for harms or risks.” These red teamers include experts in areas such as misinformation, hateful content, and bias. Beyond this testing, Sora will also reportedly rely on the safety measures built for DALL·E 3. The company added that it is working on a tool to detect whether a video was generated by Sora.
While other companies like Pika and Stability AI beat OpenAI to the punch on AI video generation, a few things make Sora stand out. First, Sora can create videos up to 60 seconds long, while its competitors top out at around four seconds. Then there’s the clarity, resolution, and fidelity of the generated world.
You can check out over 35 examples on OpenAI’s website. Although the results are impressive, the model is far from perfect. As the company admits:
The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.
The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory.
An example of this can be seen in the first video posted on the blog, which features a woman walking around Tokyo. If you watch closely, you’ll notice that the woman’s legs sometimes switch and stutter, her feet slide along the ground, and her clothing and hair change near the end.
Even though Sora is not available to the general public, CEO Sam Altman is accepting prompts from X (formerly Twitter) users.