Look Who's Talking: Active Speaker Detection in the Wild

17 August 2021

Papers citing "Look Who's Talking: Active Speaker Detection in the Wild"

| Title | Authors | | | Date |
|---|---|---|---|---|
| LASER: Lip Landmark Assisted Speaker Detection for Robustness | Le Thien Phuc Nguyen, Z. Yu, Yong Jae Lee | | 34 1 0 | 21 Jan 2025 |
| Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges | Victoria Mingote, Alfonso Ortega, A. Miguel, Eduardo Lleida | | 30 0 0 | 09 Sep 2024 |
| Spherical World-Locking for Audio-Visual Localization in Egocentric Videos | Heeseung Yun, Ruohan Gao, Ishwarya Ananthabhotla, Anurag Kumar, Jacob Donley, Chao Li, Gunhee Kim, V. Ithapu, Calvin Murdock | | 45 1 0 | 09 Aug 2024 |
| Comparison of Conventional Hybrid and CTC/Attention Decoders for Continuous Visual Speech Recognition | David Gimeno-Gómez, Carlos David Martínez Hinarejos | | 32 1 0 | 20 Feb 2024 |
| TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning | Chaeyoung Jung, Suyeon Lee, Kihyun Nam, Kyeongha Rho, You Jin Kim, Youngjoon Jang, Joon Son Chung | | 17 9 0 | 21 Sep 2023 |
| PEANUT: A Human-AI Collaborative Tool for Annotating Audio-Visual Data | Zheng Zhang, Zheng Ning, Chenliang Xu, Yapeng Tian, Toby Jia-Jun Li | | 59 6 0 | 27 Jul 2023 |
| Target Active Speaker Detection with Audio-visual Cues | Yiding Jiang, Ruijie Tao, Zexu Pan, Haizhou Li | | 28 16 0 | 22 May 2023 |
| WASD: A Wilder Active Speaker Detection Dataset | Tiago Roxo, Joana Cabral Costa, Pedro R. M. Inácio, Hugo Manuel Proença | | 19 3 0 | 09 Mar 2023 |
| A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset | J. Peymanfard, Samin Heydarian, Ali Lashini, Hossein Zeinali, Mohammad Reza Mohammadi, N. Mozayani | | 23 10 0 | 21 Jan 2023 |
| LoCoNet: Long-Short Context Network for Active Speaker Detection | Xizi Wang, Feng Cheng, Gedas Bertasius, David J. Crandall | | 26 15 0 | 19 Jan 2023 |
| Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection | Rahul Sharma, Shrikanth Narayanan | | 37 8 0 | 01 Dec 2022 |
| Unsupervised active speaker detection in media content using cross-modal information | Rahul Sharma, Shrikanth Narayanan | | 14 3 0 | 24 Sep 2022 |
| Rethinking Audio-visual Synchronization for Active Speaker Detection | Abudukelimu Wuerkaixi, You Zhang, Z. Duan, Changshui Zhang | | 18 10 0 | 21 Jun 2022 |
| Using Active Speaker Faces for Diarization in TV shows | Rahul Sharma, Shrikanth Narayanan | CVBM | 25 8 0 | 30 Mar 2022 |
| Visual Speech Recognition for Multiple Languages in the Wild | Pingchuan Ma, Stavros Petridis, M. Pantic | VLM | 122 144 0 | 26 Feb 2022 |
| Data standardization for robust lip sync | C. Wang | | 38 0 0 | 13 Feb 2022 |
| Self-supervised learning for audio-visual speaker diarization | Yifan Ding, Yong-mei Xu, Shi-Xiong Zhang, Yahuan Cong, Liqiang Wang | VLM | 39 29 0 | 13 Feb 2020 |
| VoxCeleb2: Deep Speaker Recognition | Joon Son Chung, Arsha Nagrani, Andrew Zisserman | | 227 2,233 0 | 14 Jun 2018 |
| Lip Reading Sentences in the Wild | Joon Son Chung, A. Senior, Oriol Vinyals, Andrew Zisserman | | 164 784 0 | 16 Nov 2016 |