Similarity-DT: Kernel Similarity Embedding for Dynamic Texture Synthesis



Shiming Chen1, Peng Zhang1, Xinge You1, Qinmu Peng1, Xin Liu2, Zehong Cao3, and Dacheng Tao4

1Huazhong University of Science and Technology (HUST), China     2Huaqiao University, China

3University of Tasmania (UTAS), Australia     4University of Sydney (USYD), Australia

{shimingchen, zp_zhg, youxg, pengqinmu}@hust.edu.cn   xliu@hqu.edu.cn   zehong.cao@utas.edu.au   dacheng.tao@sydney.edu.au


Abstract

Dynamic texture (DT) exhibits statistical stationarity in the spatial domain and stochastic repetitiveness in the temporal dimension, indicating that different frames of a DT possess high similarity correlation. However, existing DT synthesis methods do not consider this similarity prior when representing DT, even though it can explicitly capture the homogeneous and heterogeneous correlation between different frames of a DT. In this paper, we propose a novel DT synthesis method, named Similarity-DT, which embeds the similarity prior into the representation of DT. Specifically, we first raise two hypotheses: the content of texture video frames varies over time, while closer frames should be more similar; and the transition from frame to frame can be modeled as a linear or nonlinear function that captures the similarity correlation. Our proposed Similarity-DT then integrates kernel learning and the extreme learning machine (ELM) into a unified synthesis model that learns a kernel similarity embedding to represent the spatial-temporal transition between frames of a DT. Extensive experiments on DT videos collected from the Internet and on two benchmark datasets, i.e., Gatech Graphcut Textures and Dyntex, demonstrate that the learned kernel similarity embedding provides a discriminative representation for DTs. Hence our method is capable of preserving long-term temporal continuity of the synthesized DT sequences with excellent sustainability and generalization. We also show that our method generates realistic DT videos faster and with lower computational cost than state-of-the-art approaches.
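
The core idea can be illustrated with a minimal sketch: frames are flattened to vectors and a kernel Gram matrix records the pairwise similarity between them, which is what the kernel similarity embedding captures. The snippet below is only an illustration under a Gaussian (RBF) kernel; the function names are ours and it is not taken from the released code.

import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel between two flattened frames (illustrative).
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def frame_similarity_matrix(frames, sigma=1.0):
    # Pairwise kernel similarity (Gram) matrix for a DT sequence.
    # frames: array of shape (T, H, W, C); each frame is flattened to a vector.
    # Temporally close frames of a dynamic texture should yield larger entries.
    flat = frames.reshape(frames.shape[0], -1).astype(np.float64)
    T = flat.shape[0]
    K = np.empty((T, T))
    for i in range(T):
        for j in range(T):
            K[i, j] = gaussian_kernel(flat[i], flat[j], sigma)
    return K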

Material

If you wish to use our code, please cite the following paper:

Similarity-DT: Kernel Similarity Embedding for Dynamic Texture Synthesis
Shiming Chen, Peng Zhang, Xinge You, Qinmu Peng, Xin Liu, Zehong Cao, Dacheng Tao
arXiv preprint arXiv:1911.04254, 2019

Evaluation

Two benchmark datasets are used for evaluating our method: the Gatech Graphcut Textures dataset and the Dyntex dataset. Some of the generated dynamic texture videos are used in the paper; the versions shown here offer better visual quality. We also evaluate our method on other DTs in the wild. For the quantitative evaluation results (PSNR, SSIM), please refer to our paper.
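
For reference, PSNR and SSIM compare each synthesized frame against the corresponding observed frame and are averaged over the sequence. The short script below is only an illustrative way to do this with scikit-image, not the exact evaluation code used for the paper; it assumes a recent scikit-image version where channel_axis replaced the older multichannel flag.

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def average_psnr_ssim(observed, synthesized):
    # Mean PSNR/SSIM over corresponding frames of two uint8 videos,
    # both of shape (T, H, W, C) with values in [0, 255].
    psnr_vals, ssim_vals = [], []
    for ref, gen in zip(observed, synthesized):
        psnr_vals.append(peak_signal_noise_ratio(ref, gen, data_range=255))
        ssim_vals.append(structural_similarity(ref, gen, data_range=255, channel_axis=-1))
    return float(np.mean(psnr_vals)), float(np.mean(ssim_vals))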

Experiment 1: Synthesizing DTs using various kernel functions

In each example, the first video is the observed video; the others are generated by Similarity-DT using different kernel functions (left-to-right: Linear kernel, Rational Quadratic kernel, Polynomial kernel, Multiquadric kernel, Sigmoid kernel, Gaussian kernel). Illustrative definitions of these kernels are sketched after the examples below.

Rotating wind ornament
Windmill
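
For clarity, the six kernels compared above can be written as follows; the hyper-parameter values (gamma, c, d) are illustrative placeholders, not necessarily the settings used in the paper.

import numpy as np

def linear(x, y):
    return np.dot(x, y)

def polynomial(x, y, gamma=1.0, c=1.0, d=3):
    return (gamma * np.dot(x, y) + c) ** d

def gaussian(x, y, gamma=0.5):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def sigmoid(x, y, gamma=0.01, c=0.0):
    return np.tanh(gamma * np.dot(x, y) + c)

def multiquadric(x, y, c=1.0):
    return np.sqrt(np.sum((x - y) ** 2) + c ** 2)

def rational_quadratic(x, y, c=1.0):
    # 1 - d^2 / (d^2 + c), a common form of the rational quadratic kernel
    d2 = np.sum((x - y) ** 2)
    return 1.0 - d2 / (d2 + c)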

Experiment 2: Synthesizing high-fidelity, long-term DTs (Sustainability Analysis)

In each row, the first three videos are the observed videos (200 frames each); the other three are videos synthesized by Similarity-DT (1000 frames each). Here we display 18 dynamic texture sequences from 6 classes (top-to-bottom: bulb, elevator, flowers swaying in the current, rotating wind ornament, water wave, windmill).
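
Long sequences are obtained by rolling the learned frame-to-frame transition forward autoregressively, so the synthesized video can run well past the length of the observed one. The sketch below only illustrates this rollout; transition and decode are placeholders standing in for the learned kernel-ELM transition model and the frame decoder, which are not reproduced here.

def synthesize(initial_state, transition, decode, num_frames=1000):
    # Autoregressive rollout: repeatedly apply the learned transition
    # to generate a sequence longer than the observed one.
    # initial_state: state inferred from the observed video (placeholder).
    # transition:    learned state -> next state mapping (placeholder).
    # decode:        learned state -> image frame mapping (placeholder).
    state = initial_state
    frames = []
    for _ in range(num_frames):
        state = transition(state)      # predict the next state
        frames.append(decode(state))   # render it back to an image frame
    return frames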

Experiment 3: Synthesizing DTs using a transferred model (Generalization Analysis)

In each group, the first row displays the observed videos (used for testing). In each of the other rows, the first video is the observed video (used for training), and the others are the synthesized videos corresponding to the videos in the first row.

Cows
Tigers and Llamas

Experiment 4: Vision quality comparison with baseline methods

The first group displays the observed DT videos; the others show the videos generated by different methods (top-to-bottom: Two-Stream [4], STGCN [3], DG [2], ours (Similarity-DT)).

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (61571205 and 61772220) and the Key Program for International S&T Cooperation Projects of China (2016YFE0121200). We thank Dr. Jianwen Xie for his suggestions.

References

[1] Xinge You et al. "Kernel Learning for Dynamic Texture Synthesis." IEEE Transactions on Image Processing (TIP), 2016.

[2] Jianwen Xie et al. "Learning Dynamic Generator Model by Alternating Back-Propagation through Time." In AAAI, 2019.

[3] Jianwen Xie et al. "Energy-based spatial-temporal generative ConvNet." IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019.

[4] Matthew Tesfaldet et al. "Two-stream convolutional networks for dynamic texture synthesis." In CVPR, 2018.

[5] Leon A. Gatys et al. "Texture synthesis using convolutional neural networks." In NeurIPS, 2015.

[6] Gianfranco Doretto et al. "Dynamic textures." International Journal of Computer Vision (IJCV), 2003.
