ResearchTrend.AI
Cross-modal Supervision for Learning Active Speaker Detection in Video


29 March 2016
Punarjay Chakravarty
Tinne Tuytelaars

Papers citing "Cross-modal Supervision for Learning Active Speaker Detection in Video"

13 / 13 papers shown
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
24
3
0
09 Mar 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
26
15
0
19 Jan 2023
Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection
Rahul Sharma
Shrikanth Narayanan
42
8
0
01 Dec 2022
Unsupervised active speaker detection in media content using cross-modal information
Rahul Sharma
Shrikanth Narayanan
32
3
0
24 Sep 2022
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
Kyle Min
Sourya Roy
Subarna Tripathi
T. Guha
Somdeb Majumdar
26
36
0
15 Jul 2022
Learning Spatial-Temporal Graphs for Active Speaker Detection
Sourya Roy
Kyle Min
Subarna Tripathi
T. Guha
Somdeb Majumdar
43
3
0
02 Dec 2021
The Right to Talk: An Audio-Visual Transformer Approach
Thanh-Dat Truong
C. Duong
T. D. Vu
H. Pham
Bhiksha Raj
Ngan Le
Khoa Luu
63
36
0
06 Aug 2021
UniCon: Unified Context Network for Robust Active Speaker Detection
Yuanhang Zhang
Susan Liang
Shuang Yang
Xiao-Chang Liu
Zhongqin Wu
Shiguang Shan
Xilin Chen
CVBM
29
36
0
05 Aug 2021
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection
Ruijie Tao
Zexu Pan
Rohan Kumar Das
Xinyuan Qian
Mike Zheng Shou
Haizhou Li
27
176
0
14 Jul 2021
Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-based Multimodal Fusion
Baptiste Pouthier
L. Pilati
Leela K. Gudupudi
C. Bouveyron
F. Precioso
30
11
0
07 Jun 2021
Self-Supervised Learning of Audio-Visual Objects from Video
Triantafyllos Afouras
Andrew Owens
Joon Son Chung
Andrew Zisserman
SSL
19
253
0
10 Aug 2020
Multimodal active speaker detection and virtual cinematography for video conferencing
Ross Cutler
Ramin Mehran
Sam Johnson
Cha Zhang
Adam G. Kirk
Oliver Whyte
Adarsh Kowdle
26
7
0
10 Feb 2020
VoxCeleb: a large-scale speaker identification dataset
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
49
2,252
0
26 Jun 2017