Video generation and synthesis network for long-term video interpolation

Published in 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2018

[pdf] [bibtex]

In this paper, we propose a deep-learning-based bidirectional synthesis technique for video interpolation that combines a forward video generation network, a backward video generation network, and a synthesis network. The forward generation network extrapolates a video sequence from the past frames, and the backward generation network generates the same sequence from the future frames. A synthesis network then fuses the outputs of the two generation networks into the intermediate video sequence. To jointly train the generation and synthesis networks, we define a cost function that drives the visual quality and motion of the interpolated video as close as possible to those of the original video. Experimental results show that the proposed technique outperforms the state-of-the-art deep-learning-based model for long-term video interpolation.
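The data flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the learned generation networks are replaced by a hypothetical linear extrapolator (`extrapolate_linear`), the synthesis network by a hypothetical position-weighted blend, and `joint_loss` is only an assumed stand-in (pixel error plus frame-difference error) for the cost function, whose exact form the abstract does not give. Frames are represented as flat lists of pixel intensities to keep the example dependency-free.

```python
from typing import Callable, List

Frame = List[float]  # flattened grayscale frame; a stand-in for an image tensor


def extrapolate_linear(context: List[Frame], steps: int) -> List[Frame]:
    """Hypothetical generator: continue the per-pixel trend of the last two frames."""
    prev, last = context[-2], context[-1]
    return [
        [l + k * (l - p) for p, l in zip(prev, last)]
        for k in range(1, steps + 1)
    ]


def interpolate_bidirectional(
    past: List[Frame],
    future: List[Frame],
    steps: int,
    generator: Callable[[List[Frame], int], List[Frame]] = extrapolate_linear,
) -> List[Frame]:
    """Generate `steps` intermediate frames between `past` and `future`."""
    # Forward pass: extrapolate from the past frames.
    forward = generator(past, steps)
    # Backward pass: extrapolate from the time-reversed future frames,
    # then flip the result back into chronological order.
    backward = generator(future[::-1], steps)[::-1]
    # Assumed synthesis step: trust the forward prediction near the past
    # and the backward prediction near the future.
    fused = []
    for i, (f, b) in enumerate(zip(forward, backward)):
        w = (i + 1) / (steps + 1)  # weight given to the backward prediction
        fused.append([(1 - w) * fv + w * bv for fv, bv in zip(f, b)])
    return fused


def frame_diffs(seq: List[Frame]) -> List[Frame]:
    """Frame-to-frame differences, used here as a crude proxy for motion."""
    return [[b - a for a, b in zip(f0, f1)] for f0, f1 in zip(seq, seq[1:])]


def mean_squared_error(a: List[Frame], b: List[Frame]) -> float:
    count = sum(len(f) for f in a)
    if count == 0:
        return 0.0
    total = sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb))
    return total / count


def joint_loss(pred: List[Frame], target: List[Frame], motion_weight: float = 1.0) -> float:
    """Assumed cost: pixel error plus weighted error on frame differences."""
    pixel_term = mean_squared_error(pred, target)
    motion_term = mean_squared_error(frame_diffs(pred), frame_diffs(target))
    return pixel_term + motion_weight * motion_term


if __name__ == "__main__":
    past = [[0.0, 0.0], [0.0, 0.0]]
    future = [[4.0, 4.0], [4.0, 4.0]]
    middle = interpolate_bidirectional(past, future, steps=3)
    print(middle)
    print(joint_loss(middle, middle))
```

In a learned system the `generator` argument would be a trained network and the blend would be replaced by the synthesis network, with all of them optimized together against a loss of this general shape.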