1910.12607
Cited By
Generative Pre-Training for Speech with Autoregressive Predictive Coding
23 October 2019
Yu-An Chung
James R. Glass
SSL
Papers citing
"Generative Pre-Training for Speech with Autoregressive Predictive Coding"
50 / 115 papers shown
Title
Causal Self-supervised Pretrained Frontend with Predictive Code for Speech Separation
Wupeng Wang
Zexu Pan
Xianrui Li
Shuai Wang
Haizhou Li
AI4TS
39
0
0
03 Apr 2025
Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining
H. S. Bovbjerg
Jan Østergaard
Jesper Jensen
Zheng-Hua Tan
38
0
0
06 Jan 2025
Speech Separation with Pretrained Frontend to Minimize Domain Mismatch
Wupeng Wang
Zexu Pan
Xianrui Li
Shuai Wang
Yiming Li
34
4
0
05 Nov 2024
STAB: Speech Tokenizer Assessment Benchmark
Shikhar Vashishth
Harman Singh
Shikhar Bharadwaj
Sriram Ganapathy
Chulayuth Asawaroengchai
Kartik Audhkhasi
Andrew Rosenberg
Ankur Bapna
Bhuvana Ramabhadran
54
0
0
04 Sep 2024
SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
Kai-Wei Chang
Haibin Wu
Yu-Kai Wang
Yuan-Kuei Wu
Hua Shen
Wei-Cheng Tseng
Iu-thing Kang
Shang-Wen Li
Hung-yi Lee
53
3
0
23 Aug 2024
Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
49
0
0
20 Aug 2024
Mechanics of Next Token Prediction with Self-Attention
Yingcong Li
Yixiao Huang
M. E. Ildiz
A. S. Rawat
Samet Oymak
37
26
0
12 Mar 2024
UniEnc-CASSNAT: An Encoder-only Non-autoregressive ASR for Speech SSL Models
Ruchao Fan
Natarajan Balaji Shankar
Abeer Alwan
26
0
0
14 Feb 2024
Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective
Alexander H. Liu
Sung-Lin Yeh
James R. Glass
SSL
24
3
0
16 Jan 2024
Self-supervised Pretraining for Robust Personalized Voice Activity Detection in Adverse Conditions
H. S. Bovbjerg
Jesper Jensen
Jan Østergaard
Zheng-Hua Tan
VLM
19
3
0
27 Dec 2023
Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer
Bing Yang
Xiaofei Li
SSL
25
3
0
01 Dec 2023
Hypergraph Node Representation Learning with One-Stage Message Passing
Shilin Qu
Weiqing Wang
Yuanxin Li
Xin Zhou
Fajie Yuan
43
1
0
01 Dec 2023
Generative Pre-training for Speech with Flow Matching
Alexander H. Liu
Matt Le
Apoorv Vyas
Bowen Shi
Andros Tjandra
Wei-Ning Hsu
24
31
0
25 Oct 2023
Reduce, Reuse, Recycle: Is Perturbed Data better than Other Language augmentation for Low Resource Self-Supervised Speech Models
Asad Ullah
Alessandro Ragano
Andrew Hines
44
1
0
22 Sep 2023
RepCodec: A Speech Representation Codec for Speech Tokenization
Zhichao Huang
Chutong Meng
Tom Ko
22
23
0
31 Aug 2023
Knowledge Distillation from Non-streaming to Streaming ASR Encoder using Auxiliary Non-streaming Layer
Kyuhong Shim
Jinkyu Lee
Simyoung Chang
Kyuwoong Hwang
40
2
0
31 Aug 2023
MASR: Multi-label Aware Speech Representation
Anjali Raj
Shikhar Bharadwaj
Sriram Ganapathy
Min Ma
Shikhar Vashishth
SSL
18
0
0
20 Jul 2023
Probing self-supervised speech models for phonetic and phonemic information: a case study in aspiration
Kinan Martin
Jon Gauthier
Canaan Breiss
R. Levy
SSL
24
14
0
09 Jun 2023
Improved Cross-Lingual Transfer Learning For Automatic Speech Translation
Sameer Khurana
Nauman Dawalatabad
Antoine Laurent
Luis Vicente
Pablo Gimeno
Victoria Mingote
James R. Glass
VLM
20
1
0
01 Jun 2023
Masked Autoencoders with Multi-Window Local-Global Attention Are Better Audio Learners
Sarthak Yadav
Sergios Theodoridis
Lars Kai Hansen
Zheng-Hua Tan
28
7
0
01 Jun 2023
ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs
Xingcheng Song
Di Wu
Binbin Zhang
Zhendong Peng
Bo Dang
Fuping Pan
Zhiyong Wu
40
20
0
18 May 2023
Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations
Wei-wei Lin
Chenhang He
Man-Wai Mak
Youzhi Tu
27
5
0
14 May 2023
Self-supervised Learning with Speech Modulation Dropout
Samik Sadhu
H. Hermansky
SSL
18
0
0
22 Mar 2023
ADCNet: Learning from Raw Radar Data via Distillation
Bo Yang
Ishan Khatri
Michael Happold
Chulong Chen
36
3
0
21 Mar 2023
Self-supervised speech representation learning for keyword-spotting with light-weight transformers
Chenyang Gao
Yue Gu
Francesco Calivá
Yuzong Liu
OffRL
26
4
0
07 Mar 2023
AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations
Jiachen Lian
Alexei Baevski
Wei-Ning Hsu
Michael Auli
SSL
37
34
0
10 Feb 2023
Investigating Enhancements to Contrastive Predictive Coding for Human Activity Recognition
H. Haresamudram
Irfan Essa
Thomas Ploetz
AI4TS
30
15
0
11 Nov 2022
Resource-Efficient Transfer Learning From Speech Foundation Model Using Hierarchical Feature Fusion
Zhouyuan Huo
K. Sim
Bo-wen Li
DongSeon Hwang
Tara N. Sainath
Trevor Strohman
22
5
0
04 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
30
8
0
02 Nov 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
31
33
0
16 Oct 2022
CTCBERT: Advancing Hidden-unit BERT with CTC Objectives
Ruchao Fan
Yiming Wang
Yashesh Gaur
Jinyu Li
41
7
0
16 Oct 2022
JOIST: A Joint Speech and Text Streaming Model For ASR
Tara N. Sainath
Rohit Prabhavalkar
Ankur Bapna
Yu Zhang
Zhouyuan Huo
Zhehuai Chen
Bo-wen Li
Weiran Wang
Trevor Strohman
RALM
AuLLM
53
35
0
13 Oct 2022
CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning
Chutong Meng
Junyi Ao
Tom Ko
Mingxuan Wang
Haizhou Li
SSL
47
6
0
08 Oct 2022
ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale
Gopinath Chennupati
Milind Rao
Gurpreet Chadha
Aaron Eakin
A. Raju
...
Andrew Oberlin
Buddha Nandanoor
Prahalad Venkataramanan
Zheng Wu
Pankaj Sitpure
CLL
27
8
0
19 Jul 2022
MM-ALT: A Multimodal Automatic Lyric Transcription System
Xiangming Gu
Longshen Ou
Danielle Ong
Ye Wang
16
13
0
13 Jul 2022
Towards Proper Contrastive Self-supervised Learning Strategies For Music Audio Representation
Jeong-Eun Choi
Seongwon Jang
Hyunsouk Cho
Sehee Chung
SSL
16
6
0
10 Jul 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
T. Toda
21
15
0
10 Jul 2022
Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training
Chengyi Wang
Yiming Wang
Yu Wu
Sanyuan Chen
Jinyu Li
Shujie Liu
Furu Wei
SSL
27
18
0
21 Jun 2022
DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR
Ruchao Fan
Abeer Alwan
27
30
0
16 Jun 2022
Self-supervised models of audio effectively explain human cortical responses to speech
Aditya R. Vaidya
Shailee Jain
Alexander G. Huth
30
42
0
27 May 2022
Deep Learning for Visual Speech Analysis: A Survey
Changchong Sheng
Gangyao Kuang
L. Bai
Chen Hou
Y. Guo
Xin Xu
M. Pietikäinen
Li Liu
VLM
29
33
0
22 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
134
350
0
21 May 2022
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation
Sameer Khurana
Antoine Laurent
James R. Glass
25
36
0
17 May 2022
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis
Zhenzi Weng
Zhijin Qin
Xiaoming Tao
Chengkang Pan
Guangyi Liu
Geoffrey Ye Li
38
132
0
09 May 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
30
110
0
20 Apr 2022
Speech Pre-training with Acoustic Piece
Shuo Ren
Shujie Liu
Yu Wu
Long Zhou
Furu Wei
SSL
19
16
0
07 Apr 2022
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data
Junyi Ao
Zi-Hua Zhang
Long Zhou
Shujie Liu
Haizhou Li
Tom Ko
Lirong Dai
Jinyu Li
Yao Qian
Furu Wei
SSL
22
19
0
31 Mar 2022
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR
Keyu An
Huahuan Zheng
Zhijian Ou
Hongyu Xiang
Ke Ding
Guanglu Wan
AI4TS
25
17
0
31 Mar 2022
Generative Spoken Dialogue Language Modeling
Tu Nguyen
Eugene Kharitonov
Jade Copet
Yossi Adi
Wei-Ning Hsu
...
Paden Tomasello
Robin Algayres
Benoît Sagot
Abdel-rahman Mohamed
Emmanuel Dupoux
AuLLM
38
80
0
30 Mar 2022
Autoregressive Co-Training for Learning Discrete Speech Representations
Sung-Lin Yeh
Hao Tang
SSL
19
6
0
29 Mar 2022