Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1904.03240
Cited By
An Unsupervised Autoregressive Model for Speech Representation Learning
5 April 2019
Yu-An Chung
Wei-Ning Hsu
Hao Tang
James R. Glass
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"An Unsupervised Autoregressive Model for Speech Representation Learning"
50 / 146 papers shown
Title
How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations
Hyunji Lee
Danni Liu
Supriti Sinhamahapatra
Jan Niehues
113
0
0
21 Feb 2025
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget
Andy T. Liu
Yi-Cheng Lin
Haibin Wu
Stefan Winkler
Hung-yi Lee
31
2
0
09 Sep 2024
Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
51
0
0
20 Aug 2024
MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
59
1
0
09 Jun 2024
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models
T. Lin
Hung-yi Lee
Hao Tang
53
1
0
08 Jun 2024
SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
Siavash Shams
Sukru Samet Dindar
Xilin Jiang
N. Mesgarani
Mamba
74
19
0
20 May 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
40
20
0
15 Apr 2024
Mai Hoómāuna i ka Ái: Language Models Improve Automatic Speech Recognition in Hawaiian
Kaavya Chaparala
Guido Zarrella
Bruce Torres Fischer
Larry Kimura
Oiwi Parker Jones
AuLLM
15
0
0
03 Apr 2024
What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis
Takanori Ashihara
Marc Delcroix
Takafumi Moriya
Kohei Matsuura
Taichi Asami
Yusuke Ijima
SSL
24
7
0
31 Jan 2024
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
Zhichao Wang
Yuan-Jui Chen
Xinsheng Wang
Lei Xie
Yuping Wang
34
6
0
19 Jan 2024
Self-Supervised Learning for Audio-Based Emotion Recognition
Peranut Nimitsurachat
Peter Washington
30
3
0
23 Jul 2023
On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
Gene-Ping Yang
Yue Gu
Qingming Tang
Dongsu Du
Yuzong Liu
22
5
0
06 Jul 2023
Can Self-Supervised Neural Representations Pre-Trained on Human Speech distinguish Animal Callers?
Eklavya Sarkar
Mathew Magimai.-Doss
27
11
0
23 May 2023
Self-supervised Predictive Coding Models Encode Speaker and Phonetic Information in Orthogonal Subspaces
Oli Danyi Liu
Hao Tang
Sharon Goldwater
SSL
33
12
0
21 May 2023
DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding
Ziqian Ning
Yuepeng Jiang
Pengcheng Zhu
Jixun Yao
Shuai Wang
Linfu Xie
Mengxiao Bi
34
10
0
21 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
27
25
0
17 May 2023
Accommodating Audio Modality in CLIP for Multimodal Processing
Ludan Ruan
Anwen Hu
Yuqing Song
Liang Zhang
S. Zheng
Qin Jin
VLM
29
10
0
12 Mar 2023
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
SSL
42
3
0
07 Mar 2023
BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition
Will Rieger
BDL
UQCV
21
0
0
16 Jan 2023
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
29
7
0
16 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
35
21
0
01 Dec 2022
CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models
Zih-Ching Chen
Yu-Shun Sung
Hung-yi Lee
29
16
0
01 Dec 2022
Compressing Transformer-based self-supervised models for speech processing
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
42
6
0
17 Nov 2022
MelHuBERT: A simplified HuBERT on Mel spectrograms
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
SSL
32
13
0
17 Nov 2022
Introducing Semantics into Speech Encoders
Derek Xu
Shuyan Dong
Changhan Wang
Suyoun Kim
Zhaojiang Lin
...
Alexei Baevski
Guan-Ting Lin
Hung-yi Lee
Yizhou Sun
Wei Wang
SSL
36
3
0
15 Nov 2022
Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations
Renée Lu
M. Shahin
Beena Ahmed
35
4
0
14 Nov 2022
Investigating Enhancements to Contrastive Predictive Coding for Human Activity Recognition
H. Haresamudram
Irfan Essa
Thomas Ploetz
AI4TS
30
15
0
11 Nov 2022
Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition
Zili Huang
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yiming Wang
Jinyu Li
Takuya Yoshioka
Xiaofei Wang
Peidong Wang
28
3
0
10 Nov 2022
Biased Self-supervised learning for ASR
Florian Kreyssig
Yangyang Shi
Jinxi Guo
Leda Sari
Abdel-rahman Mohamed
P. Woodland
SSL
30
2
0
04 Nov 2022
Improved acoustic-to-articulatory inversion using representations from pretrained self-supervised learning models
Sathvik Udupa
Siddarth C
P. Ghosh
27
7
0
30 Oct 2022
Learning Dependencies of Discrete Speech Representations with Neural Hidden Markov Models
Sung-Lin Yeh
Hao Tang
SSL
BDL
35
1
0
29 Oct 2022
FedAudio: A Federated Learning Benchmark for Audio Tasks
Tuo Zhang
Tiantian Feng
Samiul Alam
Sunwoo Lee
Mi Zhang
Shrikanth S. Narayanan
Salman Avestimehr
FedML
27
23
0
27 Oct 2022
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach
Xulong Zhang
Jianzong Wang
Ning Cheng
Kexin Zhu
Jing Xiao
21
0
0
25 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
36
33
0
16 Oct 2022
CTCBERT: Advancing Hidden-unit BERT with CTC Objectives
Ruchao Fan
Yiming Wang
Yashesh Gaur
Jinyu Li
41
7
0
16 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen-An Li
Hung-yi Lee
Nigel G. Ward
23
48
0
13 Oct 2022
The Efficacy of Self-Supervised Speech Models for Audio Representations
Tung-Yu Wu
Chen-An Li
Tzu-Han Lin
Tsung-Yuan Hsu
Hung-yi Lee
32
5
0
26 Sep 2022
Non-Contrastive Self-supervised Learning for Utterance-Level Information Extraction from Speech
Jaejin Cho
Jesús Villalba
Laureano Moro-Velazquez
Najim Dehak
SSL
39
18
0
10 Aug 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
Longshen Ou
Xiangming Gu
Ye Wang
30
21
0
20 Jul 2022
FeaRLESS: Feature Refinement Loss for Ensembling Self-Supervised Learning Features in Robust End-to-end Speech Recognition
Szu-Jui Chen
Jiamin Xie
John H. L. Hansen
45
8
0
30 Jun 2022
Wav2Vec-Aug: Improved self-supervised training with limited data
Anuroop Sriram
Michael Auli
Alexei Baevski
SSL
VLM
22
15
0
27 Jun 2022
Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models
Han Ji
T. Patel
O. Scharenborg
44
7
0
24 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
19
13
0
20 Jun 2022
Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition
Anjana Arunkumar
Vrunda N. Sukhadia
S. Umesh
33
10
0
11 Jun 2022
Speak Like a Dog: Human to Non-human creature Voice Conversion
Kohei Suzuki
Shoki Sakamoto
T. Taniguchi
Hirokazu Kameoka
27
2
0
09 Jun 2022
Joint Encoder-Decoder Self-Supervised Pre-training for ASR
Arunkumar A
S. Umesh
SSL
34
8
0
09 Jun 2022
Self-supervised models of audio effectively explain human cortical responses to speech
Aditya R. Vaidya
Shailee Jain
Alexander G. Huth
33
42
0
27 May 2022
Contrastive Siamese Network for Semi-supervised Speech Recognition
S. Khorram
Jaeyoung Kim
Anshuman Tripathi
Han Lu
Qian Zhang
Hasim Sak
SSL
31
11
0
27 May 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR
Qiu-shi Zhu
Jie Zhang
Zitian Zhang
Lirong Dai
43
15
0
26 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
137
354
0
21 May 2022
1
2
3
Next