UniCon: Unified Context Network for Robust Active Speaker Detection

ACM Multimedia (ACM MM), 2021

5 August 2021

Papers citing "UniCon: Unified Context Network for Robust Active Speaker Detection"

22 / 22 papers shown

BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models

348

28 May 2025

ASDnB: Merging Face with Body Cues For Robust Active Speaker Detection

243

11 Dec 2024

VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?

Computer Vision and Pattern Recognition (CVPR), 2024

...

499

17 Nov 2024

CLIP-VAD: Exploiting Vision-Language Models for Voice Activity Detection

Andrea Appiani

Cigdem Beyan

CLIP VLM

381

18 Oct 2024

Audio-Visual Talker Localization in Video for Spatial Sound Reproduction

Davide Berghi

Philip J. B. Jackson

281

01 Jun 2024

Robust Active Speaker Detection in Noisy Environments

Siva Sai Nagender Vasireddy

Chenxu Zhang

Xiaohu Guo

Yapeng Tian

446

27 Mar 2024

Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization

Davide Berghi

Philip J. B. Jackson

253

21 Dec 2023

A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

I. Gurvich

Ido Leichter

Dharmendar Reddy Palle

235

15 Sep 2023

Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos

Computer Vision and Pattern Recognition (CVPR), 2023

443

10 Jul 2023

Target Active Speaker Detection with Audio-visual Cues

Interspeech (Interspeech), 2023

Yiding Jiang

Ruijie Tao

Zexu Pan

Haizhou Li

414

22 May 2023

WASD: A Wilder Active Speaker Detection Dataset

IEEE Transactions on Biometrics, Behavior, and Identity Science (TBBIS), 2023

212

09 Mar 2023

A Light Weight Model for Active Speaker Detection

Computer Vision and Pattern Recognition (CVPR), 2023

258

08 Mar 2023

AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis

Neural Information Processing Systems (NeurIPS), 2023

437

04 Feb 2023

LoCoNet: Long-Short Context Network for Active Speaker Detection

Computer Vision and Pattern Recognition (CVPR), 2023

Xizi Wang

Feng Cheng

Gedas Bertasius

David J. Crandall

277

19 Jan 2023

Whose Emotion Matters? Speaking Activity Localisation without Prior Knowledge

Hugo C. C. Carneiro

C. Weber

S. Wermter

643

23 Nov 2022

Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection

Spoken Language Technology Workshop (SLT), 2022

Haibin Wu

297

03 Oct 2022

Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection

European Conference on Computer Vision (ECCV), 2022

316

15 Jul 2022

UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022

274

22 Jun 2022

Rethinking Audio-visual Synchronization for Active Speaker Detection

International Workshop on Machine Learning for Signal Processing (MLSP), 2022

Abudukelimu Wuerkaixi

You Zhang

Z. Duan

Changshui Zhang

238

21 Jun 2022

End-to-End Active Speaker Detection

European Conference on Computer Vision (ECCV), 2022

Juan Carlos León Alcázar

M. Cordes

Chen Zhao

Guohao Li

335

27 Mar 2022

Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement

IEEE Transactions on Multimedia (IEEE TMM), 2022

Jun Xiong

Can Ma

Peng Zhang

Lei Xie

Wei Huang

Yufei Zha

222

04 Mar 2022

Learning Spatial-Temporal Graphs for Active Speaker Detection

244

02 Dec 2021