2108.02607
UniCon: Unified Context Network for Robust Active Speaker Detection
ACM Multimedia (ACM MM), 2021
5 August 2021
Yuanhang Zhang
Susan Liang
Shuang Yang
Xiao-Chang Liu
Zhongqin Wu
Shiguang Shan
Xilin Chen
CVBM
Papers citing
"UniCon: Unified Context Network for Robust Active Speaker Detection"
22 / 22 papers shown
BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models
Susan Liang
Dejan Marković
I. D. Gebru
Steven Krenn
Todd Keebler
Jacob Sandakly
Frank Yu
Samuel Hassel
Chenliang Xu
Alexander Richard
348
7
0
28 May 2025
ASDnB: Merging Face with Body Cues For Robust Active Speaker Detection
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
CVBM
243
5
0
11 Dec 2024
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?
Computer Vision and Pattern Recognition (CVPR), 2024
Yunlong Tang
Junjia Guo
Hang Hua
Susan Liang
Mingqian Feng
...
Chao Huang
Jing Bi
Zeliang Zhang
Pooyan Fazli
Chenliang Xu
CoGe
499
19
0
17 Nov 2024
CLIP-VAD: Exploiting Vision-Language Models for Voice Activity Detection
Andrea Appiani
Cigdem Beyan
CLIP
VLM
381
2
0
18 Oct 2024
Audio-Visual Talker Localization in Video for Spatial Sound Reproduction
Davide Berghi
Philip J. B. Jackson
281
1
0
01 Jun 2024
Robust Active Speaker Detection in Noisy Environments
Siva Sai Nagender Vasireddy
Chenxu Zhang
Xiaohu Guo
Yapeng Tian
446
1
0
27 Mar 2024
Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization
Davide Berghi
Philip J. B. Jackson
253
6
0
21 Dec 2023
A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
I. Gurvich
Ido Leichter
Dharmendar Reddy Palle
Yossi Asher
Alon Vinnikov
Igor Abramovski
Vishak Gopal
Ross Cutler
Eyal Krupka
235
4
0
15 Sep 2023
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos
Computer Vision and Pattern Recognition (CVPR), 2023
Sagnik Majumder
Ziad Al-Halah
Kristen Grauman
SSL
EgoV
443
9
0
10 Jul 2023
Target Active Speaker Detection with Audio-visual Cues
Interspeech, 2023
Yiding Jiang
Ruijie Tao
Zexu Pan
Haizhou Li
414
28
0
22 May 2023
WASD: A Wilder Active Speaker Detection Dataset
IEEE Transactions on Biometrics Behavior and Identity Science (TBBIS), 2023
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
212
7
0
09 Mar 2023
A Light Weight Model for Active Speaker Detection
Computer Vision and Pattern Recognition (CVPR), 2023
Junhua Liao
Haihan Duan
Kanghui Feng
Wanbing Zhao
Yanbing Yang
Liangyin Chen
258
68
0
08 Mar 2023
AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis
Neural Information Processing Systems (NeurIPS), 2023
Susan Liang
Chao Huang
Yapeng Tian
Anurag Kumar
Chenliang Xu
VGen
437
63
0
04 Feb 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Computer Vision and Pattern Recognition (CVPR), 2023
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
277
31
0
19 Jan 2023
Whose Emotion Matters? Speaking Activity Localisation without Prior Knowledge
Hugo C. C. Carneiro
C. Weber
S. Wermter
643
7
0
23 Nov 2022
Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection
Spoken Language Technology Workshop (SLT), 2022
Xuan-Bo Chen
Haibin Wu
Helen Meng
Hung-yi Lee
J. Jang
AAML
297
5
0
03 Oct 2022
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
European Conference on Computer Vision (ECCV), 2022
Kyle Min
Sourya Roy
Subarna Tripathi
T. Guha
Somdeb Majumdar
316
59
0
15 Jul 2022
UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022
Yuanhang Zhang
Susan Liang
Shuang Yang
Shiguang Shan
274
4
0
22 Jun 2022
Rethinking Audio-visual Synchronization for Active Speaker Detection
International Workshop on Machine Learning for Signal Processing (MLSP), 2022
Abudukelimu Wuerkaixi
You Zhang
Z. Duan
Changshui Zhang
238
21
0
21 Jun 2022
End-to-End Active Speaker Detection
European Conference on Computer Vision (ECCV), 2022
Juan Carlos León Alcázar
M. Cordes
Chen Zhao
Guohao Li
335
39
0
27 Mar 2022
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement
IEEE Transactions on Multimedia (IEEE TMM), 2022
Jun Xiong
Can Ma
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
222
38
0
04 Mar 2022
Learning Spatial-Temporal Graphs for Active Speaker Detection
Sourya Roy
Kyle Min
Subarna Tripathi
T. Guha
Somdeb Majumdar
244
3
0
02 Dec 2021