End-to-End Active Speaker Detection

27 March 2022
Juan Carlos León Alcázar
M. Cordes
Chen Zhao
Bernard Ghanem

Papers citing "End-to-End Active Speaker Detection"

37 / 37 papers shown
Multimodal Deep Learning
Cem Akkus
Jiquan Ngiam
Vladana Djakovic
Steffen Jauch-Walser
A. Khosla
...
Jann Goschenhofer
Honglak Lee
A. Ng
Daniel Schalk
Matthias Aßenmacher
106
3,169
0
12 Jan 2023
Fusion-GCN: Multimodal Action Recognition using Graph Convolutional Networks
Michael Duhme
Raphael Memmesheimer
Dietrich Paulus
71
24
0
27 Sep 2021
FaVoA: Face-Voice Association Favours Ambiguous Speaker Detection
Hugo C. C. Carneiro
C. Weber
S. Wermter
CVBM
45
7
0
01 Sep 2021
UniCon: Unified Context Network for Robust Active Speaker Detection
Yuanhang Zhang
Susan Liang
Shuang Yang
Xiao-Chang Liu
Zhongqin Wu
Shiguang Shan
Xilin Chen
CVBM
59
37
0
05 Aug 2021
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection
Ruijie Tao
Zexu Pan
Rohan Kumar Das
Xinyuan Qian
Mike Zheng Shou
Haizhou Li
56
180
0
14 Jul 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
Okan Kopuklu
Maja Taseska
Gerhard Rigoll
3DV
48
45
0
07 Jun 2021
MAAS: Multi-modal Assignation for Active Speaker Detection
Juan Carlos León Alcázar
Fabian Caba Heilbron
Ali K. Thabet
Bernard Ghanem
80
51
0
11 Jan 2021
JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition
Jinmiao Cai
Nianjuan Jiang
Xiaoguang Han
Kui Jia
Jiangbo Lu
37
85
0
16 Nov 2020
Active Speakers in Context
Juan Carlos León Alcázar
Fabian Caba Heilbron
Long Mai
Federico Perazzi
Joon-Young Lee
Pablo Arbelaez
Bernard Ghanem
42
61
0
20 May 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
339
18,721
0
13 Feb 2020
SGAS: Sequential Greedy Architecture Search
Guohao Li
Guocheng Qian
Itzel C. Delgadillo
Matthias Müller
Ali K. Thabet
Bernard Ghanem
3DPC
51
186
0
30 Nov 2019
G-TAD: Sub-Graph Localization for Temporal Action Detection
Mengmeng Xu
Chen Zhao
D. Rojas
Ali K. Thabet
Bernard Ghanem
108
435
0
26 Nov 2019
Personal VAD: Speaker-Conditioned Voice Activity Detection
Shaojin Ding
Quan Wang
Shuo-yiin Chang
Li Wan
Ignacio López Moreno
37
75
0
12 Aug 2019
Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA)
Joon Son Chung
40
54
0
25 Jun 2019
Point Clouds Learning with Attention-based Graph Convolution Networks
Zhuyang Xie
Junzhou Chen
B. Peng
3DPC
91
54
0
31 May 2019
Fast Graph Representation Learning with PyTorch Geometric
Matthias Fey
J. E. Lenssen
3DH
GNN
3DPC
214
4,334
0
06 Mar 2019
Simplifying Graph Convolutional Networks
Felix Wu
Tianyi Zhang
Amauri Souza
Christopher Fifty
Tao Yu
Kilian Q. Weinberger
GNN
220
3,172
0
19 Feb 2019
AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection
Joseph Roth
Sourish Chaudhuri
Ondřej Klejch
Radhika Marvin
Andrew C. Gallagher
...
S. Ramaswamy
Arkadiusz Stopczynski
Cordelia Schmid
Zhonghua Xi
C. Pantofaru
55
144
0
05 Jan 2019
Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation
Yikang Li
Wanli Ouyang
Bolei Zhou
Jianping Shi
Yawen Cui
Xiaogang Wang
GNN
69
274
0
29 Jun 2018
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
348
2,274
0
14 Jun 2018
Rethinking Knowledge Graph Propagation for Zero-Shot Learning
Michael C. Kampffmeyer
Yinbo Chen
Xiaodan Liang
Hao Wang
Yujia Zhang
Eric Xing
156
305
0
29 May 2018
Image Generation from Scene Graphs
Justin Johnson
Agrim Gupta
Li Fei-Fei
GNN
293
820
0
04 Apr 2018
Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs
Xiaolong Wang
Yufei Ye
Abhinav Gupta
145
589
0
21 Mar 2018
Dynamic Graph CNN for Learning on Point Clouds
Yue Wang
Yongbin Sun
Ziwei Liu
Sanjay E. Sarma
M. Bronstein
Justin Solomon
GNN
3DPC
255
6,132
0
24 Jan 2018
Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition
Sijie Yan
Yuanjun Xiong
Dahua Lin
GNN
228
4,161
0
23 Jan 2018
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
Kensho Hara
Hirokatsu Kataoka
Y. Satoh
3DPC
118
1,934
0
27 Nov 2017
Non-local Neural Networks
Xiaolong Wang
Ross B. Girshick
Abhinav Gupta
Kaiming He
OffRL
277
8,902
0
21 Nov 2017
VoxCeleb: a large-scale speaker identification dataset
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
117
2,273
0
26 Jun 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
656
131,414
0
12 Jun 2017
The Kinetics Human Action Video Dataset
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
...
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
231
3,796
0
19 May 2017
Semi-Supervised Classification with Graph Convolutional Networks
Thomas Kipf
Max Welling
GNN
SSL
591
28,999
0
09 Sep 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.1K
193,814
0
10 Dec 2015
Structural-RNN: Deep Learning on Spatio-Temporal Graphs
Ashesh Jain
Amir Zamir
Silvio Savarese
Ashutosh Saxena
GNN
128
1,093
0
17 Nov 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
439
43,277
0
11 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.6K
150,006
0
22 Dec 2014
On the Properties of Neural Machine Translation: Encoder-Decoder Approaches
Kyunghyun Cho
B. V. Merrienboer
Dzmitry Bahdanau
Yoshua Bengio
AI4CE
AIMat
231
6,772
0
03 Sep 2014
Two-Stream Convolutional Networks for Action Recognition in Videos
Karen Simonyan
Andrew Zisserman
237
7,526
0
09 Jun 2014