arXiv:2206.11332
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon
22 June 2022
Robin Algayres
Tristan Ricoul
Julien Karadayi
Hugo Laurençon
Salah Zaiem
Abdel-rahman Mohamed
Benoît Sagot
Emmanuel Dupoux
Papers citing
"DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon"
28 / 28 papers shown
Towards Unsupervised Speech Recognition Without Pronunciation Models
Junrui Ni
Liming Wang
Yang Zhang
Kaizhi Qian
Heting Gao
Mark Hasegawa-Johnson
Chang D. Yoo
SSL
OffRL
129
0
0
10 Jan 2025
Unsupervised Word Discovery: Boundary Detection with Clustering vs. Dynamic Programming
Simon Malan
Benjamin van Niekerk
Herman Kamper
71
0
0
22 Sep 2024
Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning
Robin Algayres
Adel Nabli
Benoît Sagot
Emmanuel Dupoux
SSL
50
8
0
11 Apr 2022
Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring
Herman Kamper
70
26
0
24 Feb 2022
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
180
2,966
0
14 Jun 2021
Segmental Contrastive Predictive Coding for Unsupervised Word Segmentation
Saurabhchand Bhati
Jesús Villalba
Piotr Żelasko
Laureano Moro-Velazquez
Najim Dehak
SSL
70
37
0
03 Jun 2021
Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation
C. Jacobs
Yevgen Matusevych
Herman Kamper
50
21
0
19 Mar 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
251
360
0
01 Feb 2021
Towards unsupervised phone and word segmentation using self-supervised vector-quantized neural networks
Herman Kamper
Benjamin van Niekerk
SSL
MQ
71
36
0
14 Dec 2020
Acoustic span embeddings for multilingual query-by-example search
Yushi Hu
Shane Settle
Karen Livescu
RALM
64
8
0
24 Nov 2020
The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling
Tu Nguyen
Maureen de Seyssel
Patricia Roze
M. Rivière
Evgeny Kharitonov
Alexei Baevski
Ewan Dunbar
Emmanuel Dupoux
SSL
126
107
0
23 Nov 2020
The Zero Resource Speech Challenge 2020: Discovering discrete subword and word units
Ewan Dunbar
Julien Karadayi
Mathieu Bernard
Xuan-Nga Cao
Robin Algayres
Lucas Ondel
Laurent Besacier
S. Sakti
Emmanuel Dupoux
SSL
106
61
0
12 Oct 2020
Evaluating the reliability of acoustic speech embeddings
Robin Algayres
Mohamed Salah Zaiem
Benoît Sagot
Emmanuel Dupoux
67
29
0
27 Jul 2020
Self-Expressing Autoencoders for Unsupervised Spoken Term Discovery
Saurabhchand Bhati
Jesús Villalba
Piotr Żelasko
Najim Dehak
SSL
58
16
0
26 Jul 2020
Multilingual Jointly Trained Acoustic and Written Word Embeddings
Yushi Hu
Shane Settle
Karen Livescu
44
23
0
24 Jun 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
285
5,801
0
20 Jun 2020
pyannote.audio: neural building blocks for speaker diarization
H. Bredin
Ruiqing Yin
Juan Manuel Coria
G. Gelly
Pavel Korshunov
Marvin Lavechin
D. Fustes
Hadrien Titeux
Wassim Bouaziz
Marie-Philippe Gill
229
325
0
04 Nov 2019
Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models
Herman Kamper
SSL
71
68
0
01 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
94,891
0
11 Oct 2018
Representation Learning with Contrastive Predictive Coding
Aaron van den Oord
Yazhe Li
Oriol Vinyals
DRL
SSL
327
10,302
0
10 Jul 2018
Sampling strategies in Siamese Networks for unsupervised speech representation learning
Rachid Riad
Corentin Dancette
Julien Karadayi
Neil Zeghidour
Thomas Schatz
Emmanuel Dupoux
SSL
49
28
0
30 Apr 2018
Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech
Yu-An Chung
James R. Glass
3DV
71
184
0
23 Mar 2018
The Zero Resource Speech Challenge 2017
Maarten Versteegh
Xuan-Nga Cao
Roland Thiollière
Thomas Schatz
Mathieu Bernard
A. Jansen
Xavier Anguera Miró
Emmanuel Dupoux
70
204
0
12 Dec 2017
An embedded segmental K-means model for unsupervised segmentation and clustering of speech
Herman Kamper
Karen Livescu
Sharon Goldwater
54
96
0
23 Mar 2017
Discriminative Acoustic Word Embeddings: Recurrent Neural Network-Based Approaches
Shane Settle
Karen Livescu
61
87
0
08 Nov 2016
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
406
7,399
0
12 Sep 2016
A segmental framework for fully-unsupervised large-vocabulary speech recognition
Herman Kamper
A. Jansen
Sharon Goldwater
68
104
0
22 Jun 2016
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
680
31,512
0
16 Jan 2013