Meta has released a new update, adding a new AI tool to its growing arsenal. This time, the company is bringing Hollywood vibes to AI with Movie Gen: a new AI model that turns your text prompts into full HD videos. While the visuals are pretty impressive, what got our attention was the perfectly synchronized audio and a whole set of personalization tools. Here's a deep dive into what Movie Gen is about and its current status.
What Is Movie Gen?
Movie Gen is Meta's new video-generating AI tool. What sets it apart from similar text-to-video AI tools is its use of 30 billion parameters, the highest in the industry for now. Think of parameters as the brain cells the AI uses to learn from its training data: the more parameters a model has, the more fine detail it can capture from the same data. For comparison, OpenAI's Sora has around 20 billion parameters (according to unofficial sources), making Movie Gen a significant step forward.
Currently, the model can create 16-second videos at 16 frames per second, with synchronized 48 kHz audio. It also offers many other capabilities, which we explore below.
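To put those numbers in perspective, a quick back-of-the-envelope calculation (our own, based on the specs above) shows how many video frames and audio samples a single maximum-length clip contains:

```python
# Rough arithmetic for Movie Gen's stated output specs:
# 16-second clips, 16 frames per second, 48 kHz audio.
DURATION_S = 16
FPS = 16
AUDIO_SAMPLE_RATE_HZ = 48_000

total_frames = DURATION_S * FPS                          # video frames per clip
total_audio_samples = DURATION_S * AUDIO_SAMPLE_RATE_HZ  # audio samples per channel

print(total_frames)         # 256
print(total_audio_samples)  # 768000
```

In other words, each clip is only 256 frames long, which also hints at why current limits sit at short durations.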
Here's What Meta's Movie Gen Can Do
Text-to-Video Generation
You can generate a video by simply typing a text prompt. The AI model processes the text and creates a fully rendered video with high-quality visuals, including sound. The system supports different aspect ratios like 1:1, 9:16, and 16:9, and can generate videos in resolutions up to 1080p.
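To make those aspect ratios concrete, here is a small illustration of our own (not Meta's code), assuming "1080p" here means the shorter side of the frame is 1080 pixels:

```python
def frame_dimensions(ratio_w: int, ratio_h: int, short_side: int = 1080) -> tuple[int, int]:
    """Scale an aspect ratio so its shorter side equals `short_side` pixels."""
    scale = short_side / min(ratio_w, ratio_h)
    return round(ratio_w * scale), round(ratio_h * scale)

print(frame_dimensions(16, 9))  # landscape: (1920, 1080)
print(frame_dimensions(9, 16))  # portrait:  (1080, 1920)
print(frame_dimensions(1, 1))   # square:    (1080, 1080)
```

So a 16:9 clip at 1080p is standard full-HD landscape video, while 9:16 yields the vertical format used by Reels and Stories.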
Produce Personalized Videos
You can upload an image of yourself or someone else and then type a prompt to generate a personalized video. This means you could, for example, place yourself in a scenic landscape or create a video where you appear to be interacting with animated elements. You can also change aspect ratios and use resolutions up to 1080p.
Edit Videos with Text
The Movie Gen AI tool makes editing existing videos as easy as typing text. You can modify existing videos by providing text instructions, including adding or changing objects, altering the background, or adjusting other visual elements.
For example, if you have a video of a beach scene, you could provide an instruction like “add a palm tree to the left side” or “change the sky to sunset.”
Video-to-Audio Generation
This is the coolest part. Movie Gen creates short videos with matching audio that syncs with the frames. Moreover, if you upload your own videos, Movie Gen can generate background music, sound effects, and ambient noise that align with the video content.
For example, if the uploaded video is set in a forest, Movie Gen will add the sounds of rustling leaves, chirping birds, and other natural noises. Besides sound effects, you can also add music by mentioning something like “rock guitar music,” which will play in sync with the visuals so it makes sense to the viewer.
Text-to-Audio
This feature allows you to create realistic soundtracks or sound effects from text prompts, even if you don't have a video. For example, you could type “city street during the evening” and Movie Gen AI will generate the appropriate ambient sounds, like traffic noise, people talking, and distant honking.
Limitations of Movie Gen
While Movie Gen is impressive, it's not without its limitations. Here is what we found:
Until Movie Gen becomes publicly available, its performance in real-world scenarios remains uncertain. However, given Meta's track record with open-source AI models, it's likely that Movie Gen will also become open source and free to use, making its advantages more accessible. This open-source approach could also foster the development of more advanced models with higher frame rates and longer videos.
Can Movie Gen Change the Game?
Currently, we can only rely on Meta's claims, and based on those, this tool appears to be the best video-generating tool on paper. It can generate videos from scratch or from your images, edit videos with just text prompts, and even produce audio. If it becomes open source like Meta's Llama models, it could significantly impact content creation for everyone. However, its real-world performance is still yet to be tested.