UniCon: Unified Context Network for Robust Active Speaker Detection

ACM Multimedia (ACM MM), 2021
5 August 2021
Yuanhang Zhang
Susan Liang
Shuang Yang
Xiao-Chang Liu
Zhongqin Wu
Shiguang Shan
Xilin Chen
    CVBM
arXiv: 2108.02607

Papers citing "UniCon: Unified Context Network for Robust Active Speaker Detection"

22 / 22 papers shown
BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models
Susan Liang
Dejan Marković
I. D. Gebru
Steven Krenn
Todd Keebler
Jacob Sandakly
Frank Yu
Samuel Hassel
Chenliang Xu
Alexander Richard
348
7
0
28 May 2025
ASDnB: Merging Face with Body Cues For Robust Active Speaker Detection
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
CVBM
243
5
0
11 Dec 2024
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?
Computer Vision and Pattern Recognition (CVPR), 2024
Yunlong Tang
Junjia Guo
Hang Hua
Susan Liang
Mingqian Feng
...
Chao Huang
Jing Bi
Zeliang Zhang
Pooyan Fazli
Chenliang Xu
CoGe
499
19
0
17 Nov 2024
CLIP-VAD: Exploiting Vision-Language Models for Voice Activity Detection
Andrea Appiani
Cigdem Beyan
CLIP, VLM
381
2
0
18 Oct 2024
Audio-Visual Talker Localization in Video for Spatial Sound Reproduction
Davide Berghi
Philip J. B. Jackson
281
1
0
01 Jun 2024
Robust Active Speaker Detection in Noisy Environments
Siva Sai Nagender Vasireddy
Chenxu Zhang
Xiaohu Guo
Yapeng Tian
446
1
0
27 Mar 2024
Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization
Davide Berghi
Philip J. B. Jackson
253
6
0
21 Dec 2023
A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
I. Gurvich
Ido Leichter
Dharmendar Reddy Palle
Yossi Asher
Alon Vinnikov
Igor Abramovski
Vishak Gopal
Ross Cutler
Eyal Krupka
235
4
0
15 Sep 2023
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos
Computer Vision and Pattern Recognition (CVPR), 2023
Sagnik Majumder
Ziad Al-Halah
Kristen Grauman
SSL, EgoV
443
9
0
10 Jul 2023
Target Active Speaker Detection with Audio-visual Cues
Interspeech (Interspeech), 2023
Yiding Jiang
Ruijie Tao
Zexu Pan
Haizhou Li
414
28
0
22 May 2023
WASD: A Wilder Active Speaker Detection Dataset
IEEE Transactions on Biometrics, Behavior, and Identity Science (TBBIS), 2023
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
212
7
0
09 Mar 2023
A Light Weight Model for Active Speaker Detection
Computer Vision and Pattern Recognition (CVPR), 2023
Junhua Liao
Haihan Duan
Kanghui Feng
Wanbing Zhao
Yanbing Yang
Liangyin Chen
258
68
0
08 Mar 2023
AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis
Neural Information Processing Systems (NeurIPS), 2023
Susan Liang
Chao Huang
Yapeng Tian
Anurag Kumar
Chenliang Xu
VGen
437
63
0
04 Feb 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Computer Vision and Pattern Recognition (CVPR), 2023
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
277
31
0
19 Jan 2023
Whose Emotion Matters? Speaking Activity Localisation without Prior Knowledge
Hugo C. C. Carneiro
C. Weber
S. Wermter
643
7
0
23 Nov 2022
Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection
Spoken Language Technology Workshop (SLT), 2022
Xuan-Bo Chen
Haibin Wu
Helen Meng
Hung-yi Lee
J. Jang
AAML
297
5
0
03 Oct 2022
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
European Conference on Computer Vision (ECCV), 2022
Kyle Min
Sourya Roy
Subarna Tripathi
T. Guha
Somdeb Majumdar
316
59
0
15 Jul 2022
UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022
Yuanhang Zhang
Susan Liang
Shuang Yang
Shiguang Shan
274
4
0
22 Jun 2022
Rethinking Audio-visual Synchronization for Active Speaker Detection
International Workshop on Machine Learning for Signal Processing (MLSP), 2022
Abudukelimu Wuerkaixi
You Zhang
Z. Duan
Changshui Zhang
238
21
0
21 Jun 2022
End-to-End Active Speaker Detection
European Conference on Computer Vision (ECCV), 2022
Juan Carlos León Alcázar
M. Cordes
Chen Zhao
Guohao Li
335
39
0
27 Mar 2022
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement
IEEE Transactions on Multimedia (IEEE TMM), 2022
Jun Xiong
Can Ma
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
222
38
0
04 Mar 2022
Learning Spatial-Temporal Graphs for Active Speaker Detection
Sourya Roy
Kyle Min
Subarna Tripathi
T. Guha
Somdeb Majumdar
244
3
0
02 Dec 2021