2203.14250
End-to-End Active Speaker Detection
27 March 2022
Juan Carlos León Alcázar
M. Cordes
Chen Zhao
Bernard Ghanem
Papers citing
"End-to-End Active Speaker Detection"
37 / 37 papers shown
Title
Multimodal Deep Learning
Cem Akkus
Jiquan Ngiam
Vladana Djakovic
Steffen Jauch-Walser
A. Khosla
...
Jann Goschenhofer
Honglak Lee
A. Ng
Daniel Schalk
Matthias Aßenmacher
106
3,169
0
12 Jan 2023
Fusion-GCN: Multimodal Action Recognition using Graph Convolutional Networks
Michael Duhme
Raphael Memmesheimer
Dietrich Paulus
71
24
0
27 Sep 2021
FaVoA: Face-Voice Association Favours Ambiguous Speaker Detection
Hugo C. C. Carneiro
C. Weber
S. Wermter
CVBM
45
7
0
01 Sep 2021
UniCon: Unified Context Network for Robust Active Speaker Detection
Yuanhang Zhang
Susan Liang
Shuang Yang
Xiao-Chang Liu
Zhongqin Wu
Shiguang Shan
Xilin Chen
CVBM
59
37
0
05 Aug 2021
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection
Ruijie Tao
Zexu Pan
Rohan Kumar Das
Xinyuan Qian
Mike Zheng Shou
Haizhou Li
56
180
0
14 Jul 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
Okan Kopuklu
Maja Taseska
Gerhard Rigoll
3DV
48
45
0
07 Jun 2021
MAAS: Multi-modal Assignation for Active Speaker Detection
Juan Carlos León Alcázar
Fabian Caba Heilbron
Ali K. Thabet
Bernard Ghanem
80
51
0
11 Jan 2021
JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition
Jinmiao Cai
Nianjuan Jiang
Xiaoguang Han
Kui Jia
Jiangbo Lu
37
85
0
16 Nov 2020
Active Speakers in Context
Juan Carlos León Alcázar
Fabian Caba Heilbron
Long Mai
Federico Perazzi
Joon-Young Lee
Pablo Arbelaez
Bernard Ghanem
42
61
0
20 May 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
339
18,721
0
13 Feb 2020
SGAS: Sequential Greedy Architecture Search
Guohao Li
Guocheng Qian
Itzel C. Delgadillo
Matthias Müller
Ali K. Thabet
Bernard Ghanem
3DPC
51
186
0
30 Nov 2019
G-TAD: Sub-Graph Localization for Temporal Action Detection
Mengmeng Xu
Chen Zhao
D. Rojas
Ali K. Thabet
Bernard Ghanem
108
435
0
26 Nov 2019
Personal VAD: Speaker-Conditioned Voice Activity Detection
Shaojin Ding
Quan Wang
Shuo-yiin Chang
Li Wan
Ignacio López Moreno
37
75
0
12 Aug 2019
Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA)
Joon Son Chung
40
54
0
25 Jun 2019
Point Clouds Learning with Attention-based Graph Convolution Networks
Zhuyang Xie
Junzhou Chen
B. Peng
3DPC
91
54
0
31 May 2019
Fast Graph Representation Learning with PyTorch Geometric
Matthias Fey
J. E. Lenssen
3DH
GNN
3DPC
214
4,334
0
06 Mar 2019
Simplifying Graph Convolutional Networks
Felix Wu
Tianyi Zhang
Amauri Souza
Christopher Fifty
Tao Yu
Kilian Q. Weinberger
GNN
220
3,172
0
19 Feb 2019
AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection
Joseph Roth
Sourish Chaudhuri
Ondřej Klejch
Radhika Marvin
Andrew C. Gallagher
...
S. Ramaswamy
Arkadiusz Stopczynski
Cordelia Schmid
Zhonghua Xi
C. Pantofaru
55
144
0
05 Jan 2019
Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation
Yikang Li
Wanli Ouyang
Bolei Zhou
Jianping Shi
Yawen Cui
Xiaogang Wang
GNN
69
274
0
29 Jun 2018
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
348
2,274
0
14 Jun 2018
Rethinking Knowledge Graph Propagation for Zero-Shot Learning
Michael C. Kampffmeyer
Yinbo Chen
Xiaodan Liang
Hao Wang
Yujia Zhang
Eric Xing
156
305
0
29 May 2018
Image Generation from Scene Graphs
Justin Johnson
Agrim Gupta
Li Fei-Fei
GNN
293
820
0
04 Apr 2018
Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs
Xiaolong Wang
Yufei Ye
Abhinav Gupta
145
589
0
21 Mar 2018
Dynamic Graph CNN for Learning on Point Clouds
Yue Wang
Yongbin Sun
Ziwei Liu
Sanjay E. Sarma
M. Bronstein
Justin Solomon
GNN
3DPC
255
6,132
0
24 Jan 2018
Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition
Sijie Yan
Yuanjun Xiong
Dahua Lin
GNN
228
4,161
0
23 Jan 2018
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
Kensho Hara
Hirokatsu Kataoka
Y. Satoh
3DPC
118
1,934
0
27 Nov 2017
Non-local Neural Networks
Xiaolong Wang
Ross B. Girshick
Abhinav Gupta
Kaiming He
OffRL
277
8,902
0
21 Nov 2017
VoxCeleb: a large-scale speaker identification dataset
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
117
2,273
0
26 Jun 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
656
131,414
0
12 Jun 2017
The Kinetics Human Action Video Dataset
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
...
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
231
3,796
0
19 May 2017
Semi-Supervised Classification with Graph Convolutional Networks
Thomas Kipf
Max Welling
GNN
SSL
591
28,999
0
09 Sep 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.1K
193,814
0
10 Dec 2015
Structural-RNN: Deep Learning on Spatio-Temporal Graphs
Ashesh Jain
Amir Zamir
Silvio Savarese
Ashutosh Saxena
GNN
128
1,093
0
17 Nov 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
439
43,277
0
11 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.6K
150,006
0
22 Dec 2014
On the Properties of Neural Machine Translation: Encoder-Decoder Approaches
Kyunghyun Cho
B. V. Merrienboer
Dzmitry Bahdanau
Yoshua Bengio
AI4CE
AIMat
231
6,772
0
03 Sep 2014
Two-Stream Convolutional Networks for Action Recognition in Videos
Karen Simonyan
Andrew Zisserman
237
7,526
0
09 Jun 2014