ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1904.03240
  4. Cited By
An Unsupervised Autoregressive Model for Speech Representation Learning

An Unsupervised Autoregressive Model for Speech Representation Learning

5 April 2019
Yu-An Chung
Wei-Ning Hsu
Hao Tang
James R. Glass
    SSL
ArXivPDFHTML

Papers citing "An Unsupervised Autoregressive Model for Speech Representation Learning"

50 / 148 papers shown
Title
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to
  Store Speaker Information
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng
Po-Chun Hsu
Hung-yi Lee
SSL
36
8
0
08 May 2022
Sound Localization by Self-Supervised Time Delay Estimation
Sound Localization by Self-Supervised Time Delay Estimation
Ziyang Chen
David Fouhey
Andrew Owens
SSL
32
19
0
26 Apr 2022
On-demand compute reduction with stochastic wav2vec 2.0
On-demand compute reduction with stochastic wav2vec 2.0
Apoorv Vyas
Wei-Ning Hsu
Michael Auli
Alexei Baevski
32
13
0
25 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by
  Disentangling Speakers
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
30
110
0
20 Apr 2022
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training
  and Distribution of Opinion Scores
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores
Wei-Cheng Tseng
Wei-Tsung Kao
Hung-yi Lee
24
21
0
07 Apr 2022
User-Level Differential Privacy against Attribute Inference Attack of
  Speech Emotion Recognition in Federated Learning
User-Level Differential Privacy against Attribute Inference Attack of Speech Emotion Recognition in Federated Learning
Tiantian Feng
Raghuveer Peri
Shrikanth Narayanan
FedML
20
28
0
05 Apr 2022
Repeat after me: Self-supervised learning of acoustic-to-articulatory
  mapping by vocal imitation
Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation
Marc-Antoine Georges
Julien Diard
Laurent Girin
J. Schwartz
Thomas Hueber
25
7
0
05 Apr 2022
Autoregressive Co-Training for Learning Discrete Speech Representations
Autoregressive Co-Training for Learning Discrete Speech Representations
Sung-Lin Yeh
Hao Tang
SSL
29
6
0
29 Mar 2022
Federated Self-Supervised Learning for Acoustic Event Classification
Federated Self-Supervised Learning for Acoustic Event Classification
Meng Feng
Chieh-Chi Kao
Qingming Tang
Ming Sun
Viktor Rozgic
Spyros Matsoukas
Chao Wang
44
11
0
22 Mar 2022
Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On
  Federated Learning using Multiview Pseudo-Labeling
Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling
Tiantian Feng
Shrikanth Narayanan
38
17
0
15 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark
  for Semantic and Generative Capabilities
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
31
109
0
14 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with
  Sparse Sharing Sub-networks
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
32
19
0
09 Mar 2022
Audio Self-supervised Learning: A Survey
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
45
106
0
02 Mar 2022
Measuring the Impact of Individual Domain Factors in Self-Supervised
  Pre-Training
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Ramon Sanabria
Wei-Ning Hsu
Alexei Baevski
Michael Auli
27
7
0
01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
21
11
0
01 Mar 2022
Word Segmentation on Discovered Phone Units with Dynamic Programming and
  Self-Supervised Scoring
Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring
Herman Kamper
34
25
0
24 Feb 2022
Assessing the State of Self-Supervised Human Activity Recognition using
  Wearables
Assessing the State of Self-Supervised Human Activity Recognition using Wearables
H. Haresamudram
Irfan Essa
Thomas Plötz
SSL
47
86
0
22 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with
  Transfer Learning and Language Model Decoding
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding
Peter Sullivan
Toshiko Shibano
Muhammad Abdul-Mageed
49
11
0
10 Feb 2022
Self-Supervised Representation Learning for Speech Using Visual
  Grounding and Masked Language Modeling
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling
Puyuan Peng
David Harwath
SSL
43
26
0
07 Feb 2022
Efficient Adapter Transfer of Self-Supervised Speech Models for
  Automatic Speech Recognition
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition
Bethan Thomas
Samuel Kessler
S. Karout
28
71
0
07 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
Speaker Normalization for Self-supervised Speech Emotion Recognition
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
47
51
0
02 Feb 2022
Supervised and Self-supervised Pretraining Based COVID-19 Detection
  Using Acoustic Breathing/Cough/Speech Signals
Supervised and Self-supervised Pretraining Based COVID-19 Detection Using Acoustic Breathing/Cough/Speech Signals
Xing-Yu Chen
Qiu-shi Zhu
Jie Zhang
Lirong Dai
31
14
0
22 Jan 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster
  Prediction
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Bowen Shi
Wei-Ning Hsu
Kushal Lakhotia
Abdel-rahman Mohamed
SSL
60
306
0
05 Jan 2022
Discrete and continuous representations and processing in deep learning:
  Looking forward
Discrete and continuous representations and processing in deep learning: Looking forward
Ruben Cartuyvels
Graham Spinks
Marie-Francine Moens
OCL
38
20
0
04 Jan 2022
Attribute Inference Attack of Speech Emotion Recognition in Federated
  Learning Settings
Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings
Tiantian Feng
H. Hashemi
Rajat Hebbar
M. Annavaram
Shrikanth S. Narayanan
26
25
0
26 Dec 2021
Self-Supervised Learning for speech recognition with Intermediate layer
  supervision
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
26
28
0
16 Dec 2021
Recent Advances in End-to-End Automatic Speech Recognition
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
40
363
0
02 Nov 2021
Improving Noise Robustness of Contrastive Speech Representation Learning
  with Speech Reconstruction
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction
Heming Wang
Yao Qian
Xiaofei Wang
Yiming Wang
Chengyi Wang
Shujie Liu
Takuya Yoshioka
Jinyu Li
DeLiang Wang
26
29
0
28 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech
  Processing
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Micheal Zeng
Xiangzhan Yu
Furu Wei
SSL
141
1,721
0
26 Oct 2021
SSAST: Self-Supervised Audio Spectrogram Transformer
SSAST: Self-Supervised Audio Spectrogram Transformer
Yuan Gong
Cheng-I Jeff Lai
Yu-An Chung
James R. Glass
ViT
38
268
0
19 Oct 2021
Self-Supervised Representation Learning: Introduction, Advances and
  Challenges
Self-Supervised Representation Learning: Introduction, Advances and Challenges
Linus Ericsson
Henry Gouk
Chen Change Loy
Timothy M. Hospedales
SSL
OOD
AI4TS
37
274
0
18 Oct 2021
Speech Representation Learning Through Self-supervised Pretraining And
  Multi-task Finetuning
Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning
Yi-Chen Chen
Shu-Wen Yang
Cheng-Kuang Lee
Simon See
Hung-yi Lee
SSL
19
12
0
18 Oct 2021
DECAR: Deep Clustering for learning general-purpose Audio
  Representations
DECAR: Deep Clustering for learning general-purpose Audio Representations
Sreyan Ghosh
Sandesh V Katta
Ashish Seth
S. Umesh
SSL
36
12
0
17 Oct 2021
Word Order Does Not Matter For Speech Recognition
Word Order Does Not Matter For Speech Recognition
Vineel Pratap
Qiantong Xu
Tatiana Likhomanenko
Gabriel Synnaeve
R. Collobert
43
4
0
12 Oct 2021
UniSpeech-SAT: Universal Speech Representation Learning with Speaker
  Aware Pre-Training
UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training
Sanyuan Chen
Yu Wu
Chengyi Wang
Zhengyang Chen
Zhuo Chen
...
Jian Wu
Yao Qian
Furu Wei
Jinyu Li
Xiangzhan Yu
SSL
35
85
0
12 Oct 2021
Injecting Text and Cross-lingual Supervision in Few-shot Learning from
  Self-Supervised Models
Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models
Sanjeev Khudanpur
Desh Raj
Sanjeev Khudanpur
61
6
0
10 Oct 2021
Universal Paralinguistic Speech Representations Using Self-Supervised
  Conformers
Universal Paralinguistic Speech Representations Using Self-Supervised Conformers
Joel Shor
A. Jansen
Wei Han
Daniel S. Park
Yu Zhang
SSL
AI4TS
48
54
0
09 Oct 2021
An Exploration of Self-Supervised Pretrained Representations for
  End-to-End Speech Recognition
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition
Xuankai Chang
Takashi Maekaku
Pengcheng Guo
Jing Shi
Yen-Ju Lu
...
Tianzi Wang
Shu-Wen Yang
Yu Tsao
Hung-yi Lee
Shinji Watanabe
SSL
AI4TS
24
81
0
09 Oct 2021
Mandarin-English Code-switching Speech Recognition with Self-supervised
  Speech Representation Models
Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models
Liang-Hsuan Tseng
Yu-Kuan Fu
Heng-Jui Chang
Hung-yi Lee
SSL
28
14
0
07 Oct 2021
DistilHuBERT: Speech Representation Learning by Layer-wise Distillation
  of Hidden-unit BERT
DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT
Heng-Jui Chang
Shu-Wen Yang
Hung-yi Lee
SSL
43
165
0
05 Oct 2021
Comparison of Self-Supervised Speech Pre-Training Methods on Flemish
  Dutch
Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch
Jakob Poncelet
Hugo Van hamme
SSL
28
1
0
29 Sep 2021
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning
  for Automatic Speech Recognition
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
Daniel S. Park
Wei Han
James Qin
Anmol Gulati
...
Zhifeng Chen
Quoc V. Le
Chung-Cheng Chiu
Ruoming Pang
Yonghui Wu
SSL
34
175
0
27 Sep 2021
Simple and Effective Zero-shot Cross-lingual Phoneme Recognition
Simple and Effective Zero-shot Cross-lingual Phoneme Recognition
Qiantong Xu
Alexei Baevski
Michael Auli
VLM
34
78
0
23 Sep 2021
Self-supervised Contrastive Cross-Modality Representation Learning for
  Spoken Question Answering
Self-supervised Contrastive Cross-Modality Representation Learning for Spoken Question Answering
Chenyu You
Nuo Chen
Yuexian Zou
SSL
27
63
0
08 Sep 2021
Text-Free Prosody-Aware Generative Spoken Language Modeling
Text-Free Prosody-Aware Generative Spoken Language Modeling
Eugene Kharitonov
Ann Lee
Adam Polyak
Yossi Adi
Jade Copet
...
Tu Nguyen
M. Rivière
Abdel-rahman Mohamed
Emmanuel Dupoux
Wei-Ning Hsu
35
117
0
07 Sep 2021
Speech Representations and Phoneme Classification for Preserving the
  Endangered Language of Ladin
Speech Representations and Phoneme Classification for Preserving the Endangered Language of Ladin
Zane Durante
Leena Mathur
Eric Ye
Sichong Zhao
Tejas Ramdas
Khalil Iskarous
29
0
0
27 Aug 2021
Using Large Pre-Trained Models with Cross-Modal Attention for
  Multi-Modal Emotion Recognition
Using Large Pre-Trained Models with Cross-Modal Attention for Multi-Modal Emotion Recognition
Krishna D N Freshworks
32
11
0
22 Aug 2021
Analyzing Speaker Information in Self-Supervised Models to Improve
  Zero-Resource Speech Processing
Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing
Benjamin van Niekerk
Leanne Nortje
Matthew Baas
Herman Kamper
SSL
38
31
0
02 Aug 2021
An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised
  Speech Representation Learning
An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised Speech Representation Learning
Samuel Kessler
Bethan Thomas
S. Karout
SSL
27
29
0
26 Jul 2021
Unsupervised Automatic Speech Recognition: A Review
Unsupervised Automatic Speech Recognition: A Review
Hanan Aldarmaki
Asad Ullah
Nazar Zaki
VLM
SSL
39
57
0
09 Jun 2021
Previous
123
Next