ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.10042
  4. Cited By
Who said that?: Audio-visual speaker diarisation of real-world meetings

Who said that?: Audio-visual speaker diarisation of real-world meetings

24 June 2019
Joon Son Chung
Bong-Jin Lee
Icksang Han
ArXivPDFHTML

Papers citing "Who said that?: Audio-visual speaker diarisation of real-world meetings"

12 / 12 papers shown
Title
CAD -- Contextual Multi-modal Alignment for Dynamic AVQA
CAD -- Contextual Multi-modal Alignment for Dynamic AVQA
Asmar Nadeem
Adrian Hilton
R. Dawes
Graham A. Thomas
A. Mustafa
35
9
0
25 Oct 2023
STHG: Spatial-Temporal Heterogeneous Graph Learning for Advanced
  Audio-Visual Diarization
STHG: Spatial-Temporal Heterogeneous Graph Learning for Advanced Audio-Visual Diarization
Kyle Min
42
5
0
18 Jun 2023
Target Active Speaker Detection with Audio-visual Cues
Target Active Speaker Detection with Audio-visual Cues
Yiding Jiang
Ruijie Tao
Zexu Pan
Haizhou Li
33
16
0
22 May 2023
WASD: A Wilder Active Speaker Detection Dataset
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
24
3
0
09 Mar 2023
Learning in Audio-visual Context: A Review, Analysis, and New
  Perspective
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
46
55
0
20 Aug 2022
Look Who's Talking: Active Speaker Detection in the Wild
Look Who's Talking: Active Speaker Detection in the Wild
You Jin Kim
Hee-Soo Heo
Soyeon Choe
Soo-Whan Chung
Yoohwan Kwon
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
52
20
0
17 Aug 2021
The Right to Talk: An Audio-Visual Transformer Approach
The Right to Talk: An Audio-Visual Transformer Approach
Thanh-Dat Truong
C. Duong
T. D. Vu
H. Pham
Bhiksha Raj
Ngan Le
Khoa Luu
63
36
0
06 Aug 2021
Is Someone Speaking? Exploring Long-term Temporal Features for
  Audio-visual Active Speaker Detection
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection
Ruijie Tao
Zexu Pan
Rohan Kumar Das
Xinyuan Qian
Mike Zheng Shou
Haizhou Li
29
176
0
14 Jul 2021
Self-Supervised Learning of Audio-Visual Objects from Video
Self-Supervised Learning of Audio-Visual Objects from Video
Triantafyllos Afouras
Andrew Owens
Joon Son Chung
Andrew Zisserman
SSL
19
253
0
10 Aug 2020
On the Role of Visual Cues in Audiovisual Speech Enhancement
On the Role of Visual Cues in Audiovisual Speech Enhancement
Zakaria Aldeneh
Anushree Prasanna Kumar
B. Theobald
Erik Marchi
S. Kajarekar
Devang Naik
Ahmed Hussen Abdelaziz
28
6
0
25 Apr 2020
VoxCeleb2: Deep Speaker Recognition
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
266
2,242
0
14 Jun 2018
MatConvNet - Convolutional Neural Networks for MATLAB
MatConvNet - Convolutional Neural Networks for MATLAB
Andrea Vedaldi
Karel Lenc
192
2,946
0
15 Dec 2014
1