Synchformer: Efficient Synchronization from Sparse Cues

Synchformer: Efficient Synchronization from Sparse Cues

29 January 2024

Vladimir E. Iashin

Esa Rahtu

Andrew Zisserman

Papers citing "Synchformer: Efficient Synchronization from Sparse Cues"

9 / 9 papers shown

Title
DeepSound-V1: Start to Think Step-by-Step in the Audio Generation from Videos Yunming Liang Zihao Chen Chaofan Ding Xinhan Di DiffM VGen 55 0 0 28 Mar 2025
ReelWave: A Multi-Agent Framework Toward Professional Movie Sound Generation Zixuan Wang Chi-Keung Tang Yu-Wing Tai DiffM VGen 58 0 0 10 Mar 2025
Synchronized Video-to-Audio Generation via Mel Quantization-Continuum Decomposition Juncheng Wang Chao Xu Cheng Yu Lei Shang Zhe Hu Shujun Wang Liefeng Bo DiffM VGen 43 0 0 10 Mar 2025
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Ho Kei Cheng Masato Ishii Akio Hayakawa Takashi Shibuya A. Schwing Yuki Mitsufuji VGen 126 12 0 19 Dec 2024
Gotta Hear Them All: Sound Source Aware Vision to Audio Generation Wei Guo Heng Wang Jianbo Ma Weidong Cai DiffM 90 3 0 23 Nov 2024
Temporally Aligned Audio for Video with Autoregression Ilpo Viertola Vladimir E. Iashin Esa Rahtu VGen 35 10 0 20 Sep 2024
Read, Watch and Scream! Sound Generation from Text and Video Yujin Jeong Yunji Kim Sanghyuk Chun Jiyoung Lee VGen DiffM 29 11 0 08 Jul 2024
Images that Sound: Composing Images and Sounds on a Single Canvas Ziyang Chen Daniel Geng Andrew Owens DiffM 50 9 0 20 May 2024
Is Space-Time Attention All You Need for Video Understanding? Gedas Bertasius Heng Wang Lorenzo Torresani ViT 280 1,982 0 09 Feb 2021