Spot the conversation: speaker diarisation in the wild

2 July 2020
Joon Son Chung
Jaesung Huh
Arsha Nagrani
Triantafyllos Afouras
Andrew Zisserman
    VGen

Papers citing "Spot the conversation: speaker diarisation in the wild"

50 / 53 papers shown
Guided Speaker Embedding
Shota Horiguchi
Takafumi Moriya
Atsushi Ando
Takanori Ashihara
Hiroshi Sato
Naohiro Tawara
Marc Delcroix
49
0
0
03 Jan 2025
StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification
Yichen He
Yuan Lin
Jianchao Wu
Hanchong Zhang
Yuchen Zhang
Ruicheng Le
VGen
VLM
201
2
0
11 Nov 2024
A Benchmark for Multi-speaker Anonymization
Xiaoxiao Miao
Ruijie Tao
Chang Zeng
Xin Wang
49
1
0
08 Jul 2024
Systematic Evaluation of Online Speaker Diarization Systems Regarding their Latency
Roman Aperdannier
Sigurd Schacht
Alexander Piazza
44
0
0
05 Jul 2024
Summary of the DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel
Shreyas Ramoji
Somil Jain
Pratik Roy Chowdhuri
Prachi Singh
Deepu Vijayasenan
Sriram Ganapathy
30
6
0
21 Nov 2023
Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification
Hee-Soo Heo
Ki-hyun Nam
Bong-Jin Lee
Youngki Kwon
Min-Ji Lee
You Jin Kim
Joon Son Chung
32
1
0
26 Sep 2023
Large-Scale Learning on Overlapped Speech Detection: New Benchmark and New General System
Zhao-Yu Yin
Jingguang Tian
Xinhui Hu
Xinkang Xu
Yang Xiang
25
1
0
11 Aug 2023
Target Active Speaker Detection with Audio-visual Cues
Yiding Jiang
Ruijie Tao
Zexu Pan
Haizhou Li
30
16
0
22 May 2023
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
31
16
0
28 Mar 2023
Neural Diarization with Non-autoregressive Intermediate Attractors
Yusuke Fujita
Tatsuya Komatsu
Robin Scheibler
Yusuke Kida
Tetsuji Ogawa
42
11
0
13 Mar 2023
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
24
3
0
09 Mar 2023
Supervised Hierarchical Clustering using Graph Neural Networks for Speaker Diarization
Prachi Singh
Amrit Kaul
Sriram Ganapathy
BDL
38
8
0
24 Feb 2023
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
23
26
0
20 Feb 2023
Towards Measuring and Scoring Speaker Diarization Fairness
Yannis Tevissen
Jérôme Boudy
Gérard Chollet
Frédéric Petitpont
23
2
0
20 Feb 2023
Probabilistic Back-ends for Online Speaker Recognition and Clustering
A. Sholokhov
Nikita Kuzmin
Kong Aik Lee
Chng Eng Siong
30
1
0
19 Feb 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
26
15
0
19 Jan 2023
Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization
Federico Landini
Mireia Díez
Alicia Lozano-Diez
L. Burget
39
15
0
12 Nov 2022
BER: Balanced Error Rate For Speaker Diarization
Tao Liu
K. Yu
20
4
0
08 Nov 2022
Wespeaker: A Research and Production oriented Speaker Embedding Learning Toolkit
Hongji Wang
Che-Yuan Liang
Shuai Wang
Zhengyang Chen
Binbin Zhang
Xu Xiang
Yan Deng
Y. Qian
35
118
0
31 Oct 2022
Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction
Ming Cheng
Weiqing Wang
Yucong Zhang
Xiaoyi Qin
Ming Li
VLM
56
33
0
28 Oct 2022
In search of strong embedding extractors for speaker diarisation
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesung Huh
A. Brown
Youngki Kwon
Shinji Watanabe
Joon Son Chung
44
16
0
26 Oct 2022
Joint Speech Activity and Overlap Detection with Multi-Exit Architecture
Ziqing Du
Kai Liu
Xucheng Wan
Huan Zhou
25
0
0
24 Sep 2022
Unsupervised active speaker detection in media content using cross-modal information
Rahul Sharma
Shrikanth Narayanan
32
3
0
24 Sep 2022
The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022
Qutang Cai
Guoqiang Hong
Zhijian Ye
Ximin Li
Haizhou Li
43
7
0
23 Sep 2022
The BUCEA Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2022
R. Zhou
Yu Du
Che-Ming Hu
22
0
0
20 Sep 2022
Multi-channel target speech enhancement based on ERB-scaled spatial coherence features
Yicheng Hsu
Yonghan Lee
M. Bai
31
1
0
17 Jul 2022
Rethinking Audio-visual Synchronization for Active Speaker Detection
Abudukelimu Wuerkaixi
You Zhang
Z. Duan
Changshui Zhang
18
10
0
21 Jun 2022
Magnitude-aware Probabilistic Speaker Embeddings
Nikita Kuzmin
Igor Fedorov
A. Sholokhov
29
7
0
28 Feb 2022
Visual Speech Recognition for Multiple Languages in the Wild
Pingchuan Ma
Stavros Petridis
Maja Pantic
VLM
130
145
0
26 Feb 2022
VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge
A. Brown
Jaesung Huh
Joon Son Chung
Arsha Nagrani
Daniel Garcia-Romero
Andrew Zisserman
31
40
0
12 Jan 2022
Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization
Hao Jiang
Calvin Murdock
V. Ithapu
EgoV
34
41
0
06 Jan 2022
End-to-end speaker diarization with transformer
Yongquan Lai
Xin Tang
Yuanyuan Fu
Rui Fang
31
1
0
14 Dec 2021
Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features
Yicheng Hsu
Yonghan Lee
M. Bai
32
10
0
10 Dec 2021
Low-Latency Online Speaker Diarization with Graph-Based Label Generation
Yucong Zhang
Qinjian Lin
Weiqing Wang
Lin Yang
Xuyang Wang
Junjie Wang
Ming Li
22
10
0
27 Nov 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
278
1,026
0
13 Oct 2021
Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity
You Jin Kim
Hee-Soo Heo
Jee-weon Jung
Youngki Kwon
Bong-Jin Lee
Joon Son Chung
32
3
0
07 Oct 2021
Multi-scale speaker embedding-based graph attention networks for speaker diarisation
Youngki Kwon
Hee-Soo Heo
Jee-weon Jung
You Jin Kim
Bong-Jin Lee
Joon Son Chung
43
18
0
07 Oct 2021
XMUSPEECH System for VoxCeleb Speaker Recognition Challenge 2021
Jie Wang
Fuchuan Tong
Zhi-Cong Chen
Lin Li
Q. Hong
Haodong Zhou
34
1
0
06 Sep 2021
The DKU-DukeECE-Lenovo System for the Diarization Task of the 2021 VoxCeleb Speaker Recognition Challenge
Weiqing Wang
Danwei Cai
Qingjian Lin
Lin Yang
Junjie Wang
Jin Wang
Ming Li
27
26
0
05 Sep 2021
FaVoA: Face-Voice Association Favours Ambiguous Speaker Detection
Hugo C. C. Carneiro
C. Weber
S. Wermter
CVBM
31
7
0
01 Sep 2021
Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development
Mingkuan Liu
Chi Zhang
Hua Xing
C. Feng
Mon-Chu Chen
Judith Bishop
Grace Ngapo
30
3
0
01 Sep 2021
Look Who's Talking: Active Speaker Detection in the Wild
You Jin Kim
Hee-Soo Heo
Soyeon Choe
Soo-Whan Chung
Yoohwan Kwon
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
52
20
0
17 Aug 2021
UniCon: Unified Context Network for Robust Active Speaker Detection
Yuanhang Zhang
Susan Liang
Shuang Yang
Xiao-Chang Liu
Zhongqin Wu
Shiguang Shan
Xilin Chen
CVBM
29
36
0
05 Aug 2021
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection
Ruijie Tao
Zexu Pan
Rohan Kumar Das
Xinyuan Qian
Mike Zheng Shou
Haizhou Li
27
176
0
14 Jul 2021
Three-class Overlapped Speech Detection using a Convolutional Recurrent Neural Network
Jee-weon Jung
Hee-Soo Heo
Youngki Kwon
Joon Son Chung
Bong-Jin Lee
37
18
0
07 Apr 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning
Tae Jin Park
Naoyuki Kanda
Dimitrios Dimitriadis
Kyu Jeong Han
Shinji Watanabe
Shrikanth Narayanan
VLM
274
328
0
24 Jan 2021
MAAS: Multi-modal Assignation for Active Speaker Detection
Juan Carlos León Alcázar
Fabian Caba Heilbron
Ali K. Thabet
Guohao Li
65
51
0
11 Jan 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
196
199
0
08 Jan 2021
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks
Federico Landini
Jan Profant
Mireia Díez
L. Burget
216
200
0
29 Dec 2020
VoxSRC 2020: The Second VoxCeleb Speaker Recognition Challenge
Arsha Nagrani
Joon Son Chung
Jaesung Huh
Andrew Brown
Ernesto Coto
Weidi Xie
Mitchell McLaren
D. Reynolds
Andrew Zisserman
21
74
0
12 Dec 2020