This project presents a method for video generation that integrates parts of Seg2Vid into DYAN. Pan et al. introduce the task of video generation from a single image in their paper Video Generation from Single Semantic Label Map [1]. Liu et al. present DYAN, a neural network framework for video generation that takes an optical flow sequence as input [2]. DYAN has been shown to produce good video generation results, but it is limited by its dependence on a computationally expensive optical flow generator, such as PyFlow, to produce its optical flow inputs.
In this project, I present a new method for obtaining optical flow inputs for DYAN, using portions of the Seg2Vid network as a neural optical flow generator. By merging these two networks, I obtain better video generation than the Seg2Vid network on its own for the PlayingViolin portion of the UCF-101 dataset. Results can be explored in the ablation study.
Seg2Vid is used to generate optical flow inputs for DYAN. These optical flows are then manually scaled up, because the flows generated by Seg2Vid are much smaller than the optical flow inputs DYAN expects.
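As a rough illustration of that scaling step, the sketch below rescales a batch of Seg2Vid flow maps before passing them to DYAN. The function name, the use of PyTorch tensors, and the assumption that the mismatch is mainly a constant magnitude factor (with an optional spatial resize) are illustrative assumptions, not code from either project.

```python
import torch
import torch.nn.functional as F

def rescale_flow(flow, magnitude_scale, target_size=None):
    """Rescale Seg2Vid-style optical flow before passing it to DYAN.

    flow: tensor of shape (N, 2, H, W) with per-pixel (u, v) displacements.
    magnitude_scale: scalar factor bringing Seg2Vid's small flow values up to
        the range DYAN was trained on (a value chosen empirically).
    target_size: optional (H_out, W_out) if DYAN expects a different spatial
        resolution than Seg2Vid produces.
    """
    out = flow * magnitude_scale

    if target_size is not None:
        _, _, h, w = out.shape
        h_out, w_out = target_size
        # Bilinear resize of the flow field; displacements are measured in
        # pixels, so each component is scaled by its resize ratio as well.
        out = F.interpolate(out, size=target_size, mode='bilinear',
                            align_corners=False)
        out[:, 0] *= w_out / w   # horizontal component u
        out[:, 1] *= h_out / h   # vertical component v

    return out


# Example: scale a batch of Seg2Vid flows by a hypothetical factor of 20.
flows = torch.randn(4, 2, 64, 64)            # stand-in for Seg2Vid output
dyan_input = rescale_flow(flows, magnitude_scale=20.0)
```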
Many thanks to Wen Liu and Professor Octavia Camps of Northeastern University for their guidance on this project.
[1] Video Generation from Single Semantic Label Map, Pan et al.
[2] DYAN: A Dynamical Atoms-Based Network for Video Prediction, Liu et al.
[3] Learned Perceptual Image Patch Similarity (LPIPS) metric, Zhang et al.