arXiv:2112.04432
Audio-Visual Synchronisation in the wild
8 December 2021
Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
Papers citing "Audio-Visual Synchronisation in the wild"
16 / 16 papers shown
Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation
Akam Rahimi
Triantafyllos Afouras
Andrew Zisserman
40
28
0
02 Jan 2025
Audio-Agent: Leveraging LLMs For Audio Generation, Editing and Composition
Zixuan Wang
Chi-Keung Tang
DiffM
VGen
LLMAG
49
4
0
04 Oct 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
50
9
0
20 May 2024
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
VLM
CLIP
33
2
0
09 Apr 2024
Synchformer: Efficient Synchronization from Sparse Cues
Vladimir E. Iashin
Weidi Xie
Esa Rahtu
Andrew Zisserman
24
11
0
29 Jan 2024
Cross-modal Cognitive Consensus guided Audio-Visual Segmentation
Zhaofeng Shi
Qingbo Wu
Fanman Meng
Linfeng Xu
Hongliang Li
VOS
33
3
0
10 Oct 2023
Learning to Dub Movies via Hierarchical Prosody Models
Gaoxiang Cong
Liang Li
Yuankai Qi
Zhengjun Zha
Qi Wu
Wen-yu Wang
Bin Jiang
Ming Yang
Qin Huang
75
25
0
08 Dec 2022
Multimodal Transformer Distillation for Audio-Visual Synchronization
Xuan-Bo Chen
Haibin Wu
Chung-Che Wang
Hung-yi Lee
J. Jang
26
3
0
27 Oct 2022
Rethinking Audio-visual Synchronization for Active Speaker Detection
Abudukelimu Wuerkaixi
You Zhang
Z. Duan
Changshui Zhang
18
10
0
21 Jun 2022
VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
V. S. Kadandale
Juan F. Montesinos
G. Haro
24
23
0
05 Apr 2022
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
Juan F. Montesinos
V. S. Kadandale
G. Haro
ViT
23
19
0
08 Mar 2022
V-SlowFast Network for Efficient Visual Sound Separation
Lingyu Zhu
Esa Rahtu
52
10
0
18 Sep 2021
Detection of Audio-Video Synchronization Errors Via Event Detection
Joshua Peter Ebenezer
Yongjun Wu
Hai Wei
S. Sethuraman
Z. Liu
37
12
0
20 Apr 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
280
1,982
0
09 Feb 2021
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
197
206
0
23 Jan 2020
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
167
784
0
16 Nov 2016