# Generative Video

The following architectures are commonly cited for generative video modeling from text and video inputs:

**Text-to-Video:**

* Video Transformer - A transformer architecture, sometimes combined with 3D convolutional layers, that generates video from text. Examples include CogVideo and Phenaki.
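* VQ-VAE-2 - A hierarchical VQ-VAE originally proposed for image generation. Video models such as VideoGPT apply the same idea: a VQ-VAE compresses video into discrete latent codes and a transformer predicts those codes, which text-to-video variants condition on the caption.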
* CTRL-V - Combines transformers, object detection and retrieval networks to synthesize video from text captions.
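* DALL-E - A text-to-image model that uses a transformer over discrete image tokens. It generates still images rather than video, but token-based text-to-video models follow a similar design.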
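
The VQ-VAE entry above depends on turning continuous encoder outputs into discrete latent codes that a transformer can then predict. The following is a minimal sketch of that quantization step only, assuming a tiny hand-written codebook; the names `quantize`, `dequantize`, `codebook`, and `encoder_outputs` are illustrative and not taken from any specific model.

```python
def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codebook entry (squared L2 distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
        for v in vectors
    ]


def dequantize(indices, codebook):
    """Look up the codebook vector for each discrete code."""
    return [codebook[i] for i in indices]


# Hypothetical toy data: a 3-entry codebook of 2-D vectors and three encoder outputs.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
encoder_outputs = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8]]

codes = quantize(encoder_outputs, codebook)   # discrete codes a prior would predict
reconstructed = dequantize(codes, codebook)   # vectors handed to the decoder
print(codes)
```

In a real model the codebook is learned and the codes form a grid over space and time; the transformer is trained to predict that grid, optionally conditioned on text.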

**Video-to-Video:**

* VideoGAN - Uses a GAN with spatio-temporal convolutions and separate foreground and background streams to generate short videos. It can also be conditioned on a static image to predict plausible future frames.
* Vid2Vid - A conditional GAN that translates an input video, such as a sequence of semantic segmentation maps, edge maps, or poses, into a photorealistic and temporally coherent output video.
* MoCoGAN - Decomposes video into content and motion using RNNs and GANs. A content code is held fixed for the whole clip while a recurrent network produces a per-frame motion code.
* Recycle-GAN - Performs unpaired video-to-video retargeting by combining adversarial and cycle-consistency losses with a temporal predictor, so that no aligned input and output videos are needed.
* SlowFast Networks - A two-pathway 3D convolutional network that processes video at a low and a high frame rate. It was designed for video recognition rather than generation.
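
MoCoGAN's split between content and motion can be illustrated with a small latent-sampling sketch. This is a simplified illustration under stated assumptions, not the published implementation: the function name `sample_video_latents` is hypothetical, and a plain `tanh` recurrence stands in for the recurrent network that produces the motion codes.

```python
import math
import random


def sample_video_latents(num_frames, content_dim, motion_dim, seed=0):
    """Return one latent vector per frame: a shared content code plus a per-frame motion code."""
    rng = random.Random(seed)

    # Content code: sampled once and reused for every frame of the clip.
    content = [rng.gauss(0.0, 1.0) for _ in range(content_dim)]

    # Motion code: evolves over time. The recurrence below is an assumed
    # stand-in for an RNN cell, driven by fresh noise at each step.
    hidden = [0.0] * motion_dim
    latents = []
    for _ in range(num_frames):
        noise = [rng.gauss(0.0, 1.0) for _ in range(motion_dim)]
        hidden = [math.tanh(0.5 * h + 0.5 * e) for h, e in zip(hidden, noise)]
        latents.append(content + hidden)  # concatenate content and motion parts
    return latents


latents = sample_video_latents(num_frames=4, content_dim=3, motion_dim=2)
for frame_latent in latents:
    print([round(x, 3) for x in frame_latent])
```

Each printed row shares its first three values (content) and differs in the last two (motion). A generator network would map each row to one frame, which is why the subject stays the same while the motion changes.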

In summary, the text-to-video approaches above center on transformers that predict discrete latent codes, while the video-to-video approaches center on GANs with spatio-temporal convolutions and recurrent components.

