Meta unveils Movie Gen: AI video creator to rival OpenAI's Sora

AI-generated video of Moo Deng, the pygmy hippo, swimming, created using Meta's Movie Gen AI model

Meta is set to take on OpenAI’s Sora model with its own AI video generation system: Movie Gen.

Meta Movie Gen is billed as a generative AI video tool to help content creators generate custom videos and edit existing videos — but it won’t be released any time soon.

Meta Movie Gen can turn text prompts into short videos. The 30 billion parameter model can output video content up to 16 seconds long at 16 frames per second.

The video generation model understands the concepts and themes in text inputs, allowing it to create outputs with appropriate camera movements and object motions. This ensures that the narrative remains coherent and results in visually engaging content that aligns with the user’s request.

Movie Gen can also take images of people and transform them into short videos containing rich visual details informed by the text prompt.

Meta claimed the model achieved state-of-the-art results for creating personalised videos that preserve human identity and motion.

Example of Meta's Movie Gen AI model creating personalised videos from text prompts

Movie Gen can also edit existing videos, taking video and text prompts as input, and returning content in completely different styles or with elements removed.

For example, users can ask Movie Gen to change a video to a different art style or add objects like a hat to a dog.

Traditionally, such edits required proficiency in applications like Premiere Pro and Lightroom. Meta wants to democratise video editing, providing a tool that anyone can use even if they lack creative skills.

Audio generation: Making music


In addition to Movie Gen’s video capabilities, the multimodal model can also produce audio outputs, like ambient sound, sound effects, and instrumental background music — something OpenAI’s Sora can’t do.

A 13 billion parameter audio generation model can take video and text prompts and generate high-quality and high-fidelity audio for up to 45 seconds that’s synced to video content produced by Movie Gen.

The audio generation capabilities enable Movie Gen to produce short content with accurate sounds, like rainforest noises in a video about the jungle.

Movie Gen makes use of an audio extension technique that enables it to generate audio for videos of arbitrary lengths.
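Meta has not detailed how that extension works here, but one generic way to stretch a fixed-length audio generator over a longer video is to produce overlapping chunks and crossfade them together. The Python sketch below illustrates that general idea only; the generate_audio_chunk stand-in, the sample rate, and the overlap length are hypothetical placeholders, not anything Meta has published.

```python
import numpy as np

SAMPLE_RATE = 48_000     # hypothetical sample rate, not a published Movie Gen figure
CHUNK_SECONDS = 45       # the audio model generates clips of up to roughly 45 seconds
OVERLAP_SECONDS = 2      # hypothetical overlap used to blend neighbouring chunks


def generate_audio_chunk(seconds: int) -> np.ndarray:
    """Hypothetical stand-in for a fixed-length audio generator (returns noise here)."""
    return np.random.randn(seconds * SAMPLE_RATE).astype(np.float32)


def extend_audio(total_seconds: int) -> np.ndarray:
    """Cover a video of arbitrary length by generating overlapping audio chunks
    and crossfading them together (a generic approach, not Meta's published method)."""
    covered = min(CHUNK_SECONDS, total_seconds)
    out = generate_audio_chunk(covered)
    fade_len = OVERLAP_SECONDS * SAMPLE_RATE
    fade = np.linspace(0.0, 1.0, fade_len, dtype=np.float32)
    while covered < total_seconds:
        # Generate the next chunk with a small overlap against what already exists.
        nxt_seconds = min(CHUNK_SECONDS, total_seconds - covered + OVERLAP_SECONDS)
        nxt = generate_audio_chunk(nxt_seconds)
        # Crossfade the overlapping region, then append the remainder of the chunk.
        out[-fade_len:] = out[-fade_len:] * (1.0 - fade) + nxt[:fade_len] * fade
        out = np.concatenate([out, nxt[fade_len:]])
        covered += nxt_seconds - OVERLAP_SECONDS
    return out


if __name__ == "__main__":
    audio = extend_audio(120)  # e.g. the soundtrack for a two-minute video
    print(f"{audio.size / SAMPLE_RATE:.0f} seconds of audio generated")
```

In a real system, each chunk would presumably also be conditioned on the matching stretch of video so the sound stays synced to the picture, which this toy sketch does not attempt.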

How to access Meta Movie Gen


Like Sora, Meta’s Movie Gen won’t be available anytime soon. The team behind it are keeping it from the public for now, iterating on it and gathering feedback ahead of a potential future release.

Chris Cox, chief product officer at Meta, confirmed in a post on Threads that Movie Gen is not ready for release as a product “anytime soon.”

“It’s still expensive and generation time is too long — but we wanted to share where we are since the results are getting quite impressive,” Cox said.




Meta has been working on AI video generation systems for years, from early work like its Make-A-Scene system, which turns images and freeform sketches into short GIFs, to Emu Video, which can generate videos from natural language text, images, or both.

In a blog post unveiling its latest AI video generation system, Meta said it is working closely with filmmakers and creators to gather their thoughts on Movie Gen and ensure it is designed to enhance creativity.

“While there are many exciting use cases for these foundation models, it’s important to note that generative AI isn’t a replacement for the work of artists and animators,” Meta said. “We’re sharing this research because we believe in the power of this technology to help people express themselves in new ways and to provide opportunities to people who might not otherwise have them.”

While Movie Gen and OpenAI’s Sora have not been made available to the public, a wide variety of AI video generation tools are already accessible for creating custom content.

Canva offers a text-to-video AI generator tool, while DeepAI offers both text-to-video and image-to-video.

Runway, meanwhile, touts its video generation solutions as offering high-end fidelity, consistency, and motion in content. The startup even launched its own AI Film Festival, showcasing AI-generated video content.

For video content featuring AI avatars, HeyGen and Synthesia offer services that let users create content that features digital spokespeople with expressive speech for use cases like marketing videos or staff training videos.

RELATED STORIES

Meta unveils Llama 3.2: Smaller AI models for edge and mobile devices

OpenAI launches text-to-video tool, Sora

Capacity Reacts to... Sora, OpenAI's text-to-video platform
