The Second DIHARD Diarization Challenge: Dataset, task, and baselines

18 June 2019

Sriram Ganapathy

Papers citing "The Second DIHARD Diarization Challenge: Dataset, task, and baselines"


StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification Yichen He Yuan Lin Jianchao Wu Hanchong Zhang Yuchen Zhang Ruicheng Le VGen VLM 195 2 0 11 Nov 2024
InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation D. Doukhan Christine Maertens William Le Personnic Ludovic Speroni Reda Dehak 38 2 0 06 Jun 2024
Summary of the DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments Shikha Baghel Shreyas Ramoji Somil Jain Pratik Roy Chowdhuri Prachi Singh Deepu Vijayasenan Sriram Ganapathy 30 6 0 21 Nov 2023
Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer Zhengyang Chen Bing Han Shuai Wang Yan-min Qian 28 18 0 13 Sep 2023
Large-Scale Learning on Overlapped Speech Detection: New Benchmark and New General System Zhao-Yu Yin Jingguang Tian Xinhui Hu Xinkang Xu Yang Xiang 25 1 0 11 Aug 2023
Joint speech and overlap detection: a benchmark over multiple audio setup and speech domains Martin Lebourdais Théo Mariotte Marie Tahon Anthony Larcher Antoine Laurent Silvio Montrésor S. Meignier Jean-Hugh Thomas VLM 33 5 0 24 Jul 2023
Neural Diarization with Non-autoregressive Intermediate Attractors Yusuke Fujita Tatsuya Komatsu Robin Scheibler Yusuke Kida Tetsuji Ogawa 40 11 0 13 Mar 2023
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge Jaesung Huh A. Brown Jee-weon Jung Joon Son Chung Arsha Nagrani D. Garcia-Romero Andrew Zisserman 23 26 0 20 Feb 2023
Probabilistic Back-ends for Online Speaker Recognition and Clustering A. Sholokhov Nikita Kuzmin Kong Aik Lee Chng Eng Siong 30 1 0 19 Feb 2023
Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis Zhihao Du Shiliang Zhang Siqi Zheng Zhijie Yan 24 14 0 18 Nov 2022
BER: Balanced Error Rate For Speaker Diarization Tao Liu K. Yu 20 4 0 08 Nov 2022
No-audio speaking status detection in crowded settings via visual pose-based filtering and wearable acceleration Jose Vargas-Quiros Laura Cabrera-Quiros Hayley Hung 29 1 0 01 Nov 2022
Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0 Marie Kunesova Zbynek Zajíc SSL VLM 18 15 0 26 Oct 2022
Robust Acoustic Domain Identification with its Application to Speaker Diarization Kishore Kumar A Shefali Waldekar Md. Sahidullah G. Saha 24 0 0 05 Aug 2022
Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors Shota Horiguchi Shinji Watanabe Leibny Paola García-Perera Yuki Takashima Y. Kawaguchi 39 23 0 06 Jun 2022
Baselines and Protocols for Household Speaker Recognition A. Sholokhov Xuechen Liu Md. Sahidullah Tomi Kinnunen 25 4 0 30 Apr 2022
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly Yu-Huai Peng Hung-Shin Lee Pin-Tuan Huang Hsin-Min Wang 21 0 0 30 Mar 2022
Magnitude-aware Probabilistic Speaker Embeddings Nikita Kuzmin Igor Fedorov A. Sholokhov 27 7 0 28 Feb 2022
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge Fan Yu Shiliang Zhang Pengcheng Guo Yihui Fu Zhihao Du ... Kong Aik Lee Zhijie Yan B. Ma Xin Xu Hui Bu 18 28 0 08 Feb 2022
VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge A. Brown Jaesung Huh Joon Son Chung Arsha Nagrani Daniel Garcia-Romero Andrew Zisserman 31 40 0 12 Jan 2022
Shennong: a Python toolbox for audio speech features extraction Mathieu Bernard Maxime Poli Julien Karadayi Emmanuel Dupoux 26 5 0 10 Dec 2021
Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information Zhihao Du Shiliang Zhang Siqi Zheng Weilong Huang Ming Lei BDL 21 1 0 28 Nov 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video Kristen Grauman Andrew Westbury Eugene Byrne Zachary Chavis Antonino Furnari ... Mike Zheng Shou Antonio Torralba Lorenzo Torresani Mingfei Yan Jitendra Malik EgoV 275 1,026 0 13 Oct 2021
Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity You Jin Kim Hee-Soo Heo Jee-weon Jung Youngki Kwon Bong-Jin Lee Joon Son Chung 32 3 0 07 Oct 2021
Multi-scale speaker embedding-based graph attention networks for speaker diarisation Youngki Kwon Hee-Soo Heo Jee-weon Jung You Jin Kim Bong-Jin Lee Joon Son Chung 41 18 0 07 Oct 2021
Self-Supervised Metric Learning With Graph Clustering For Speaker Diarization Prachi Singh Sriram Ganapathy SSL 31 7 0 14 Sep 2021
The DKU-DukeECE-Lenovo System for the Diarization Task of the 2021 VoxCeleb Speaker Recognition Challenge Weiqing Wang Danwei Cai Qingjian Lin Lin Yang Junjie Wang Jin Wang Ming Li 27 26 0 05 Sep 2021
A Longitudinal Multi-modal Dataset for Dementia Monitoring and Diagnosis Dimitris Gkoumas Bo Wang Adam Tsakalidis M. Wolters A. Zubiaga Matthew Purver M. Liakata 21 8 0 03 Sep 2021
ESPnet-ST IWSLT 2021 Offline Speech Translation System Hirofumi Inaguma Shun Kiyono Nelson Enrique Yalta Soplin Pengcheng Guo Jun Suzuki Kevin Duh Shinji Watanabe 3DV 40 2 0 01 Jul 2021
Encoder-Decoder Based Attractors for End-to-End Neural Diarization Shota Horiguchi Yusuke Fujita Shinji Watanabe Yawen Xue Leibny Paola García-Perera 37 64 0 20 Jun 2021
End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings Soumi Maiti Hakan Erdogan K. Wilson Scott Wisdom Shinji Watanabe J. Hershey 27 21 0 05 May 2021
Three-class Overlapped Speech Detection using a Convolutional Recurrent Neural Network Jee-weon Jung Hee-Soo Heo Youngki Kwon Joon Son Chung Bong-Jin Lee 37 18 0 07 Apr 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning Tae Jin Park Naoyuki Kanda Dimitrios Dimitriadis Kyu Jeong Han Shinji Watanabe Shrikanth Narayanan VLM 274 327 0 24 Jan 2021
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks Federico Landini Jan Profant Mireia Díez L. Burget 216 199 0 29 Dec 2020
VoxSRC 2020: The Second VoxCeleb Speaker Recognition Challenge Arsha Nagrani Joon Son Chung Jaesung Huh Andrew Brown Ernesto Coto Weidi Xie Mitchell McLaren D. Reynolds Andrew Zisserman 21 74 0 12 Dec 2020
The HUAWEI Speaker Diarisation System for the VoxCeleb Speaker Diarisation Challenge Renyu Wang Ruilin Tong Y. Yeung Xiao Chen 6 1 0 22 Oct 2020
Microsoft Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2020 Xiong Xiao Naoyuki Kanda Zhuo Chen Tianyan Zhou Takuya Yoshioka ... Yu-Huan Wu Jian Wu Shujie Liu Jinyu Li Jiawei Liu 27 62 0 22 Oct 2020
Learning to Detect Bipolar Disorder and Borderline Personality Disorder with Language and Speech in Non-Clinical Interviews Bo Wang Yue Wu Niall Taylor Terry Lyons M. Liakata A. Nevado-Holgado K. Saunders 16 13 0 08 Aug 2020
DNN Speaker Tracking with Embeddings C. Castillo-Sanchez Leibny Paola García-Perera A. Martín-González 16 0 0 13 Jul 2020
End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors Shota Horiguchi Yusuke Fujita Shinji Watanabe Yawen Xue Kenji Nagamatsu 37 186 0 20 May 2020
Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario Ivan Medennikov M. Korenevsky Tatiana Prisyach Yuri Y. Khokhlov Mariya Korenevskaya ... Anton Mitrofanov A. Andrusenko Ivan Podluzhny A. Laptev A. Romanenko 13 195 0 14 May 2020
CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings Shinji Watanabe Michael I. Mandel Jon Barker Emmanuel Vincent Ashish Arora ... Emmanuel Vincent Shota Horiguchi Naoyuki Kanda Takuya Yoshioka Neville Ryant 20 297 0 20 Apr 2020
pyannote.audio: neural building blocks for speaker diarization H. Bredin Ruiqing Yin Juan Manuel Coria G. Gelly Pavel Korshunov Marvin Lavechin D. Fustes Hadrien Titeux Wassim Bouaziz Marie-Philippe Gill 202 313 0 04 Nov 2019
Overlap-aware diarization: resegmentation using neural end-to-end overlapped speech detection Latané Bullock H. Bredin Leibny Paola García-Perera 27 94 0 25 Oct 2019
VoxCeleb2: Deep Speaker Recognition Joon Son Chung Arsha Nagrani Andrew Zisserman 266 2,242 0 14 Jun 2018