ResearchTrend.AI
Unsupervised Speech Recognition


24 May 2021
Alexei Baevski
Wei-Ning Hsu
Alexis Conneau
Michael Auli
    SSL

Papers citing "Unsupervised Speech Recognition"

50 / 63 papers shown
Title
Deep Learning-based Intrusion Detection Systems: A Survey
Zhiwei Xu
Yujuan Wu
Shiheng Wang
Jiabao Gao
Tian Qiu
Ziqi Wang
Hai Wan
Xibin Zhao
28
1
0
10 Apr 2025
Towards Unsupervised Speech Recognition Without Pronunciation Models
Junrui Ni
Liming Wang
Yang Zhang
Kaizhi Qian
Heting Gao
Mark Hasegawa-Johnson
Chang D. Yoo
SSL
OffRL
94
0
0
10 Jan 2025
Sylber: Syllabic Embedding Representation of Speech from Raw Audio
Cheol Jun Cho
Nicholas Lee
Akshat Gupta
Dhruv Agarwal
Ethan Chen
Alan W Black
Gopala K. Anumanchipalli
34
0
0
09 Oct 2024
Self-supervised Speech Models for Word-Level Stuttered Speech Detection
Yi-Jen Shih
Zoi Gkalitsiou
A. Dimakis
David Harwath
45
1
0
16 Sep 2024
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision
Saierdaer Yusuyin
Te Ma
Hao Huang
Wenbo Zhao
Zhijian Ou
52
2
0
04 Jun 2024
Combining X-Vectors and Bayesian Batch Active Learning: Two-Stage Active Learning Pipeline for Speech Recognition
O. Kundacina
V. Vincan
D. Mišković
BDL
104
0
0
03 May 2024
Preuve de concept d'un bot vocal dialoguant en wolof
E. Gauthier
Papa Séga Wade
Thierry Moudenc
Patrice Collen
Emilie Guimier De Neef
Oumar Ba
Ndeye Khoyane Cama
Ahmadou Bamba Kebe
Ndeye Aissatou Gningue
Thomas Mendo'O Aristide
31
3
0
02 Apr 2024
Do You Trust Your Model? Emerging Malware Threats in the Deep Learning Ecosystem
Dorjan Hitaj
Giulio Pagnotta
Fabio De Gaspari
Sediola Ruko
Briland Hitaj
Luigi V. Mancini
Fernando Perez-Cruz
42
4
0
06 Mar 2024
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data
Wei-Yao Wang
Wei-Wei Du
Derek Xu
Wei Wang
Wenjie Peng
LMTD
50
7
0
02 Feb 2024
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition
Zhisheng Zheng
Ziyang Ma
Yu Wang
Xie Chen
36
2
0
28 Aug 2023
Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target
Guanyong Wu
Guan-Ting Lin
Shang-Wen Li
Hung-yi Lee
31
5
0
29 May 2023
Evaluating OpenAI's Whisper ASR for Punctuation Prediction and Topic Modeling of life histories of the Museum of the Person
L. Gris
R. Marcacini
Arnaldo Cândido Júnior
Edresson Casanova
A. S. Soares
S. Aluísio
21
7
0
23 May 2023
Enhancing Unsupervised Speech Recognition with Diffusion GANs
Xianchao Wu
DiffM
13
2
0
23 Mar 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages
Sreepratha Ram
Hanan Aldarmaki
SSL
24
3
0
03 Jan 2023
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
Alexei Baevski
Arun Babu
Wei-Ning Hsu
Michael Auli
VLM
SSL
36
92
0
14 Dec 2022
Learning the joint distribution of two sequences using little or no paired data
Soroosh Mariooryad
Matt Shannon
Siyuan Ma
Tom Bagby
David Kao
Daisy Stanton
Eric Battenberg
RJ Skerry-Ryan
30
2
0
06 Dec 2022
EURO: ESPnet Unsupervised ASR Open-source Toolkit
Dongji Gao
Jiatong Shi
Shun-Po Chuang
Leibny Paola García-Perera
Hung-yi Lee
Shinji Watanabe
Sanjeev Khudanpur
27
8
0
30 Nov 2022
Handling and extracting key entities from customer conversations using Speech recognition and Named Entity recognition
Sharvi Endait
Ruturaj Ghatage
DD Kadam
16
2
0
28 Nov 2022
TESSP: Text-Enhanced Self-Supervised Speech Pre-training
Zhuoyuan Yao
Shuo Ren
Sanyuan Chen
Ziyang Ma
Pengcheng Guo
Linfu Xie
29
5
0
24 Nov 2022
Self-Transcriber: Few-shot Lyrics Transcription with Self-training
Xiaoxue Gao
Xianghu Yue
Haizhou Li
30
7
0
18 Nov 2022
Introducing Semantics into Speech Encoders
Derek Xu
Shuyan Dong
Changhan Wang
Suyoun Kim
Zhaojiang Lin
...
Alexei Baevski
Guan-Ting Lin
Hung-yi Lee
Yizhou Sun
Wei Wang
SSL
36
3
0
15 Nov 2022
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR
Jiatong Shi
Chan-Jan Hsu
Ho-Lam Chung
Dongji Gao
Leibny Paola García-Perera
Shinji Watanabe
Ann Lee
Hung-yi Lee
34
12
0
06 Nov 2022
T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
Chan-Jan Hsu
Ho-Lam Chung
Hung-yi Lee
Yu Tsao
29
6
0
01 Nov 2022
Waveform Boundary Detection for Partially Spoofed Audio
Zexin Cai
Weiqing Wang
Ming Li
24
25
0
01 Nov 2022
Analyzing Acoustic Word Embeddings from Pre-trained Self-supervised Speech Models
Ramon Sanabria
Hao Tang
Sharon Goldwater
SSL
40
19
0
28 Oct 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
Jian Zhu
Zuoyu Tian
Yadong Liu
Cong Zhang
Chia-wen Lo
SSL
37
2
0
23 Oct 2022
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR
Zhehuai Chen
Ankur Bapna
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Pedro J. Moreno
Nanxin Chen
51
17
0
18 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
41
33
0
16 Oct 2022
Learning Invariant Representation and Risk Minimized for Unsupervised Accent Domain Adaptation
Chendong Zhao
Jianzong Wang
Xiaoyang Qu
Haoqian Wang
Jing Xiao
SSL
43
1
0
15 Oct 2022
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Evan Crothers
Nathalie Japkowicz
H. Viktor
DeLMO
52
107
0
13 Oct 2022
On Compressing Sequences for Self-Supervised Speech Models
Yen Meng
Hsuan-Jui Chen
Jiatong Shi
Shinji Watanabe
Paola García
Hung-yi Lee
Hao Tang
SSL
21
15
0
13 Oct 2022
A Comparison of Transformer, Convolutional, and Recurrent Neural Networks on Phoneme Recognition
Kyuhong Shim
Wonyong Sung
27
2
0
01 Oct 2022
Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network
Da-Rong Liu
Po-Chun Hsu
Yi-Chen Chen
Sung-Feng Huang
Shun-Po Chuang
Da-Yi Wu
Hung-yi Lee
GAN
31
7
0
29 Jul 2022
Unsupervised data selection for Speech Recognition with contrastive loss ratios
Chanho Park
Rehan Ahmad
Thomas Hain
10
12
0
25 Jul 2022
Distilling a Pretrained Language Model to a Multilingual ASR Model
Kwanghee Choi
Hyung-Min Park
VLM
31
11
0
25 Jun 2022
Speaker Identification using Speech Recognition
Syeda Rabia Arshad
Syed Mujtaba Haider
Abdul Basit Mughal
28
1
0
29 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
137
354
0
21 May 2022
Hearing voices at the National Library -- a speech corpus and acoustic model for the Swedish language
Martin Malmsten
Chris Haffenden
Love Borjeson
21
9
0
06 May 2022
Empirical Evaluation and Theoretical Analysis for Representation Learning: A Survey
Kento Nozawa
Issei Sato
AI4TS
29
4
0
18 Apr 2022
Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning
Algayres Robin
Adel Nabli
Benoît Sagot
Emmanuel Dupoux
SSL
23
8
0
11 Apr 2022
A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition
Rishabh Jain
Andrei Barcovschi
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
28
31
0
06 Apr 2022
How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications
Juan Pablo Zuluaga
Amrutha Prasad
Iuliia Nigmatulina
Seyyed Saeed Sarfjoo
P. Motlícek
Matthias Kleinert
H. Helmke
Oliver Ohneiser
Qingran Zhan
29
44
0
31 Mar 2022
SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks
Kai-Wei Chang
Wei-Cheng Tseng
Shang-Wen Li
Hung-yi Lee
30
22
0
31 Mar 2022
KSoF: The Kassel State of Fluency Dataset -- A Therapy Centered Dataset of Stuttering
Sebastian P. Bayerl
A. Gudenberg
Florian Honig
Elmar Nöth
Korbinian Riedhammer
35
35
0
10 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
45
106
0
02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
19
11
0
01 Mar 2022
mSLAM: Massively multilingual joint pre-training for speech and text
Ankur Bapna
Colin Cherry
Yu Zhang
Ye Jia
Melvin Johnson
Yong Cheng
Simran Khanuja
Jason Riesa
Alexis Conneau
VLM
30
111
0
03 Feb 2022
Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition
Piotr Żelasko
Siyuan Feng
Laureano Moro-Velazquez
A. Abavisani
Saurabhchand Bhati
O. Scharenborg
M. Hasegawa-Johnson
Najim Dehak
33
15
0
26 Jan 2022
Unsupervised Multimodal Word Discovery based on Double Articulation Analysis with Co-occurrence cues
Akira Taniguchi
Hiroaki Murakami
Ryo Ozaki
T. Taniguchi
23
2
0
18 Jan 2022
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
26
28
0
16 Dec 2021