
Video-to-video synthesis method: A GAN by NVIDIA & MIT CSAIL is now open source

  • 2 min read
  • 23 Aug 2018


NVIDIA and the MIT Computer Science & Artificial Intelligence Laboratory (CSAIL) have open-sourced their video-to-video synthesis model. The model uses a generative adversarial learning framework to generate high-resolution, photorealistic and temporally coherent video from a variety of input formats, including segmentation masks, sketches and poses.

Video-to-video synthesis has received less research attention than image-to-image translation. It aims to solve the low visual quality and temporal incoherence that result when existing image synthesis approaches are applied to video. The research group proposed a novel video-to-video synthesis approach capable of synthesizing 2K-resolution videos of street scenes up to 30 seconds long.



The authors validated the model extensively on various datasets, and it outperformed existing approaches in both quantitative and qualitative comparisons. When the method was extended to multimodal video synthesis, it produced results with new visual properties from identical input data while keeping high resolution and coherency.

The researchers suggested three ways the model may be improved in the future: adding 3D cues such as depth maps to better synthesize turning cars, using object tracking to ensure an object maintains its colour and appearance throughout the video, and training with coarser semantic labels to solve issues in semantic manipulation.

The Video-to-Video Synthesis paper is on arXiv, and the team’s model and data can be found on the GitHub page.


