Google VideoPoet: a zero-shot video generation LLM with impressive results

December 21, 2023

Today, Google officially unveiled VideoPoet, a large language model (LLM) for zero-shot video generation, and the demo results look impressive.

VideoPoet has the following seven video generation capabilities:

  1. Text-to-video: Converts simple text descriptions into vivid video content.
  2. Image-to-video: Creates dynamic videos from static images.
  3. Video stylization: Applies different visual styles to videos.
  4. Video editing: Performs advanced video editing and modifications.
  5. Video outpainting: Adds content beyond the edges of a video.
  6. Video inpainting: Adds content within regions of a video.
  7. Video-to-audio: Automatically generates matching audio, such as music, for videos.

VideoPoet detailed paper: https://storage.googleapis.com/videopoet/paper.pdf

VideoPoet's working principle is simple. It uses a pre-trained MAGVIT-v2 video tokenizer and a SoundStream audio tokenizer to convert images, video, and audio clips into sequences of discrete codes that are compatible with text-based language models. An autoregressive language model then learns across the video, image, audio, and text modalities to predict the next video or audio token in the sequence.
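To make the idea of "discrete codes in one sequence" more concrete, here is a minimal toy sketch in Python. It is not VideoPoet's code: the vocabulary sizes, the offset scheme, and the function names (`to_shared_ids`, `toy_next_token`, `generate`) are all assumptions made up for illustration, and a random sampler stands in for the trained model. It only shows how codes from separate tokenizers could share one vocabulary and how an autoregressive loop feeds each predicted token back in.

```python
import random

# Hypothetical codebook sizes; the real tokenizers' sizes may differ.
TEXT_VOCAB = 100
VIDEO_VOCAB = 50
AUDIO_VOCAB = 30

# Each modality gets its own ID range inside one shared vocabulary.
SIZES = {"text": TEXT_VOCAB, "video": VIDEO_VOCAB, "audio": AUDIO_VOCAB}
OFFSETS = {"text": 0, "video": TEXT_VOCAB, "audio": TEXT_VOCAB + VIDEO_VOCAB}


def to_shared_ids(modality, codes):
    """Map per-modality codes into the shared vocabulary."""
    size = SIZES[modality]
    if any(not 0 <= c < size for c in codes):
        raise ValueError(f"code out of range for {modality}")
    return [OFFSETS[modality] + c for c in codes]


def modality_of(token_id):
    """Recover which modality a shared ID belongs to."""
    for name in ("audio", "video", "text"):  # highest offset first
        if token_id >= OFFSETS[name]:
            return name
    raise ValueError("negative token id")


def toy_next_token(sequence, rng):
    """Stand-in for a trained model: ignores context, samples a video token."""
    return OFFSETS["video"] + rng.randrange(VIDEO_VOCAB)


def generate(prefix, steps, rng):
    """Autoregressive loop: each new token is appended and fed back in."""
    sequence = list(prefix)
    for _ in range(steps):
        sequence.append(toy_next_token(sequence, rng))
    return sequence


rng = random.Random(0)
prompt = to_shared_ids("text", [5, 17, 42])   # pretend text prompt
output = generate(prompt, steps=4, rng=rng)   # prompt + 4 video tokens
```

In a real system the generated video tokens would be passed back through the video tokenizer's decoder to produce frames; that step is left out here.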

Moreover, VideoPoet introduces multimodal generative learning objectives such as text-to-video, image-to-video, video frame continuation, video inpainting and outpainting, video stylization, and video-to-audio. All these tasks can be combined to achieve additional zero-shot capabilities.
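One way to picture how several tasks can share a single model is that each task is just a different layout of "conditioning tokens, then target tokens". The sketch below illustrates that framing only. The task table, the special-token strings such as `<task:...>` and `<bos_video>`, and the choice to compute the loss only on target positions are assumptions for illustration, not details confirmed by the article.

```python
from dataclasses import dataclass

# Hypothetical task table: task name -> (conditioning modalities, target modality).
TASKS = {
    "text_to_video": (("text",), "video"),
    "image_to_video": (("image",), "video"),
    "video_to_audio": (("video",), "audio"),
    "video_stylization": (("text", "video"), "video"),
}


@dataclass
class Example:
    prefix: list  # tokens the model is conditioned on
    target: list  # tokens the model learns to predict


def build_example(task, inputs, target_tokens):
    """Lay out one example as prefix + target for the given task."""
    conditions, target_modality = TASKS[task]
    prefix = [f"<task:{task}>"]
    for modality in conditions:
        prefix.append(f"<bos_{modality}>")
        prefix.extend(inputs[modality])
        prefix.append(f"<eos_{modality}>")
    prefix.append(f"<bos_{target_modality}>")
    target = list(target_tokens) + [f"<eos_{target_modality}>"]
    return Example(prefix=prefix, target=target)


def loss_mask(example):
    """1 where the next-token loss applies (assumed: target positions only)."""
    return [0] * len(example.prefix) + [1] * len(example.target)


# Text-to-video example with placeholder tokens.
ex = build_example("text_to_video", {"text": ["a", "cat"]}, ["v1", "v2", "v3"])

# Chaining tasks: the generated video tokens become the input for video-to-audio.
chained = build_example("video_to_audio", {"video": ex.target[:-1]}, ["a1", "a2"])
```

The chained example hints at how tasks can be combined: the output of one task becomes the conditioning input of the next, without training a separate model for the combination.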

VideoPoet's architecture supports high-resolution video generation through a video model that uses multi-axis attention and is conditioned on low-resolution tokens and text embeddings. This simple approach shows that language models can synthesize and edit videos with high temporal consistency. According to Google, VideoPoet achieves state-of-the-art performance in video generation and is particularly good at producing large, interesting, and high-fidelity motions.

VideoPoet is not yet publicly available to try, but the demo videos are impressive. We look forward to its release.

ABOUT THE AUTHOR

Renee's Entrepreneurial Journey
Essay Editor

This is my little corner of the internet where I share thoughts, ideas, and interesting stuff I come across in the world of AI. Things in this field move fast, and I use this space to slow down a bit—to reflect, explore, and hopefully spark some good conversations.
