arXiv: 2101.03682
MAAS: Multi-modal Assignation for Active Speaker Detection
11 January 2021
Juan Carlos León Alcázar
Fabian Caba Heilbron
Ali K. Thabet
Guohao Li
Papers citing "MAAS: Multi-modal Assignation for Active Speaker Detection"
18 / 18 papers shown
CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization
Detao Bai
Zhiheng Ma
Xihan Wei
Liefeng Bo
120
0
0
06 May 2025
Robust Active Speaker Detection in Noisy Environments
Siva Sai Nagender Vasireddy
Chenxu Zhang
Xiaohu Guo
Yapeng Tian
40
0
0
27 Mar 2024
Target Active Speaker Detection with Audio-visual Cues
Yiding Jiang
Ruijie Tao
Zexu Pan
Haizhou Li
28
16
0
22 May 2023
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
29
16
0
28 Mar 2023
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
19
3
0
09 Mar 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
26
15
0
19 Jan 2023
Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection
Rahul Sharma
Shrikanth Narayanan
37
8
0
01 Dec 2022
Unsupervised active speaker detection in media content using cross-modal information
Rahul Sharma
Shrikanth Narayanan
14
3
0
24 Sep 2022
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
Kyle Min
Sourya Roy
Subarna Tripathi
T. Guha
Somdeb Majumdar
24
36
0
15 Jul 2022
UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022
Yuanhang Zhang
Susan Liang
Shuang Yang
Shiguang Shan
8
4
0
22 Jun 2022
End-to-End Active Speaker Detection
Juan Carlos León Alcázar
M. Cordes
Chen Zhao
Guohao Li
24
27
0
27 Mar 2022
Learning Spatial-Temporal Graphs for Active Speaker Detection
Sourya Roy
Kyle Min
Subarna Tripathi
T. Guha
Somdeb Majumdar
35
3
0
02 Dec 2021
Sub-word Level Lip Reading With Visual Attention
Prajwal K R
Triantafyllos Afouras
Andrew Zisserman
12
92
0
14 Oct 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
Okan Kopuklu
Maja Taseska
Gerhard Rigoll
3DV
19
45
0
07 Jun 2021
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
227
2,233
0
14 Jun 2018
Image Generation from Scene Graphs
Justin Johnson
Agrim Gupta
Li Fei-Fei
GNN
223
815
0
04 Apr 2018
Simple Online and Realtime Tracking with a Deep Association Metric
N. Wojke
Alex Bewley
Dietrich Paulus
VOT
228
3,465
0
21 Mar 2017
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
164
784
0
16 Nov 2016