Active Speakers in Context

20 May 2020

Juan Carlos León Alcázar

Papers citing "Active Speakers in Context"

41 / 41 papers shown

Title
LASER: Lip Landmark Assisted Speaker Detection for Robustness Le Thien Phuc Nguyen Zhuliang Yu Yong Jae Lee 37 1 0 21 Jan 2025
An Efficient and Streaming Audio Visual Active Speaker Detection System Arnav Kundu Yanzi Jin Mohammad Hossein Sekhavat Max Horton Danny Tormoen Devang Naik 16 0 0 13 Sep 2024
Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges Victoria Mingote Alfonso Ortega A. Miguel Eduardo Lleida 30 0 0 09 Sep 2024
Audio-Visual Talker Localization in Video for Spatial Sound Reproduction Davide Berghi Philip J. B. Jackson 42 0 0 01 Jun 2024
Robust Active Speaker Detection in Noisy Environments Siva Sai Nagender Vasireddy Chenxu Zhang Xiaohu Guo Yapeng Tian 40 0 0 27 Mar 2024
Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization Davide Berghi Philip J. B. Jackson 48 5 0 21 Dec 2023
PodReels: Human-AI Co-Creation of Video Podcast Teasers Sitong Wang Zheng Ning Anh Truong Mira Dontcheva Dingzeyu Li Lydia B. Chilton 29 15 0 10 Nov 2023
TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning Chaeyoung Jung Suyeon Lee Kihyun Nam Kyeongha Rho You Jin Kim Youngjoon Jang Joon Son Chung 20 9 0 21 Sep 2023
AdVerb: Visually Guided Audio Dereverberation Sanjoy Chowdhury Sreyan Ghosh Subhrajyoti Dasgupta Anton Ratnarajah Utkarsh Tyagi Tianyi Zhou 30 11 0 23 Aug 2023
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos Sagnik Majumder Ziad Al-Halah Kristen Grauman SSL EgoV 36 4 0 10 Jul 2023
Target Active Speaker Detection with Audio-visual Cues Yiding Jiang Ruijie Tao Zexu Pan Haizhou Li 28 16 0 22 May 2023
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation Bolin Lai Fiona Ryan Wenqi Jia Miao Liu James M. Rehg EgoV 32 8 0 06 May 2023
Egocentric Auditory Attention Localization in Conversations Fiona Ryan Hao Jiang Abhinav Shukla James M. Rehg V. Ithapu EgoV 29 16 0 28 Mar 2023
WASD: A Wilder Active Speaker Detection Dataset Tiago Roxo Joana Cabral Costa Pedro R. M. Inácio Hugo Manuel Proença 19 3 0 09 Mar 2023
A Light Weight Model for Active Speaker Detection Junhua Liao Haihan Duan Kanghui Feng Wanbing Zhao Yanbing Yang Liangyin Chen 35 36 0 08 Mar 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection Xizi Wang Feng Cheng Gedas Bertasius David J. Crandall 26 15 0 19 Jan 2023
Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection Rahul Sharma Shrikanth Narayanan 37 8 0 01 Dec 2022
Whose Emotion Matters? Speaking Activity Localisation without Prior Knowledge Hugo C. C. Carneiro C. Weber S. Wermter 25 5 0 23 Nov 2022
Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection Xuan-Bo Chen Haibin Wu Helen Meng Hung-yi Lee J. Jang AAML 20 3 0 03 Oct 2022
Unsupervised active speaker detection in media content using cross-modal information Rahul Sharma Shrikanth Narayanan 21 3 0 24 Sep 2022
End-To-End Audiovisual Feature Fusion for Active Speaker Detection Fiseha B. Tesema Zheyuan Lin Shiqiang Zhu Wei Song J. Gu Hong-Chuan Wu 4 4 0 27 Jul 2022
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection Kyle Min Sourya Roy Subarna Tripathi T. Guha Somdeb Majumdar 26 36 0 15 Jul 2022
UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022 Yuanhang Zhang Susan Liang Shuang Yang Shiguang Shan 10 4 0 22 Jun 2022
Using Active Speaker Faces for Diarization in TV shows Rahul Sharma Shrikanth Narayanan CVBM 30 8 0 30 Mar 2022
End-to-End Active Speaker Detection Juan Carlos León Alcázar M. Cordes Chen Zhao Guohao Li 24 27 0 27 Mar 2022
Visually Supervised Speaker Detection and Localization via Microphone Array Davide Berghi A. Hilton Philip J. B. Jackson 13 11 0 07 Mar 2022
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement Jun Xiong Yu Zhou Peng Zhang Lei Xie Wei Huang Yufei Zha 28 20 0 04 Mar 2022
Learning Spatial-Temporal Graphs for Active Speaker Detection Sourya Roy Kyle Min Subarna Tripathi T. Guha Somdeb Majumdar 35 3 0 02 Dec 2021
Joint Learning of Visual-Audio Saliency Prediction and Sound Source Localization on Multi-face Videos Minglang Qiao Yufan Liu Mai Xu Xin Deng Bing Li Weiming Hu Ali Borji CVBM 29 5 0 05 Nov 2021
Sub-word Level Lip Reading With Visual Attention Prajwal K R Triantafyllos Afouras Andrew Zisserman 12 92 0 14 Oct 2021
Pairwise Emotional Relationship Recognition in Drama Videos: Dataset and Benchmark Xun Gao Yin Zhao Jie Zhang Longjun Cai 27 6 0 23 Sep 2021
FaVoA: Face-Voice Association Favours Ambiguous Speaker Detection Hugo C. C. Carneiro C. Weber S. Wermter CVBM 31 7 0 01 Sep 2021
Learning to Cut by Watching Movies Alejandro Pardo Fabian Caba Heilbron Juan Carlos León Alcázar Ali K. Thabet Guohao Li VGen 58 20 0 09 Aug 2021
UniCon: Unified Context Network for Robust Active Speaker Detection Yuanhang Zhang Susan Liang Shuang Yang Xiao-Chang Liu Zhongqin Wu Shiguang Shan Xilin Chen CVBM 29 36 0 05 Aug 2021
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection Ruijie Tao Zexu Pan Rohan Kumar Das Xinyuan Qian Mike Zheng Shou Haizhou Li 22 173 0 14 Jul 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild Okan Kopuklu Maja Taseska Gerhard Rigoll 3DV 26 45 0 07 Jun 2021
Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-based Multimodal Fusion Baptiste Pouthier L. Pilati Leela K. Gudupudi C. Bouveyron F. Precioso 12 11 0 07 Jun 2021
MAAS: Multi-modal Assignation for Active Speaker Detection Juan Carlos León Alcázar Fabian Caba Heilbron Ali K. Thabet Guohao Li 65 51 0 11 Jan 2021
Cross modal video representations for weakly supervised active speaker localization Rahul Sharma Krishna Somandepalli Shrikanth Narayanan 9 8 0 09 Mar 2020
VoxCeleb2: Deep Speaker Recognition Joon Son Chung Arsha Nagrani Andrew Zisserman 251 2,233 0 14 Jun 2018
Lip Reading Sentences in the Wild Joon Son Chung A. Senior Oriol Vinyals Andrew Zisserman 167 784 0 16 Nov 2016