arXiv:2211.03929
Comparative layer-wise analysis of self-supervised speech models
8 November 2022
Ankita Pasad
Bowen Shi
Karen Livescu
SSL
Papers citing
"Comparative layer-wise analysis of self-supervised speech models"
50 / 87 papers shown
Title
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
52
2
0
11 Apr 2025
From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech
Ji-Hoon Kim
Jeongsoo Choi
Jaehun Kim
Chaeyoung Jung
Joon Son Chung
CVBM
50
1
0
21 Mar 2025
Context-Aware Two-Step Training Scheme for Domain Invariant Speech Separation
Wupeng Wang
Zexu Pan
Jingru Lin
Shuai Wang
Haizhou Li
53
0
0
16 Mar 2025
From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM
Kshitij Ambilduke
Ben Peters
Sonal Sannigrahi
Anil Keshwani
Tsz Kin Lam
Bruno Martins
Marcely Zanon Boito
André F. T. Martins
52
0
0
13 Mar 2025
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens
Xinbing Wang
Mingqi Jiang
Z. Ma
Ziyu Zhang
S. Liu
...
Zhifei Li
Xie Chen
Lei Xie
Y. Guo
Wei Xue
81
12
0
03 Mar 2025
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Alexander H. Liu
Sang-gil Lee
Chao-Han Huck Yang
Yuan Gong
Yu-Chun Wang
James Glass
Rafael Valle
Bryan Catanzaro
SSL
52
0
0
02 Mar 2025
Why disentanglement-based speaker anonymization systems fail at preserving emotions?
Ünal Ege Gaznepoglu
Nils Peters
83
0
0
22 Jan 2025
How Redundant Is the Transformer Stack in Speech Representation Models?
Teresa Dorszewski
Albert Kjøller Jacobsen
Lenka Tětková
Lars Kai Hansen
107
0
0
20 Jan 2025
Discrete Speech Unit Extraction via Independent Component Analysis
Tomohiko Nakamura
Kwanghee Choi
Keigo Hojo
Yoshiaki Bando
Satoru Fukayama
Shinji Watanabe
43
0
0
11 Jan 2025
Towards Unsupervised Speech Recognition Without Pronunciation Models
Junrui Ni
Liming Wang
Yang Zhang
Kaizhi Qian
Heting Gao
Mark Hasegawa-Johnson
Chang D. Yoo
SSL
OffRL
88
0
0
10 Jan 2025
An Empirical Analysis of Speech Self-Supervised Learning at Multiple Resolutions
Theo Clark
Benedetta Cevoli
Eloy de Jong
Timofey Abramski
Jamie Dougherty
SSL
38
0
0
31 Oct 2024
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation
K R Prajwal
Bowen Shi
Matthew Lee
Apoorv Vyas
Andros Tjandra
...
Baishan Guo
Huiyu Wang
Triantafyllos Afouras
David Kant
Wei-Ning Hsu
43
5
0
27 Oct 2024
JOOCI: a Framework for Learning Comprehensive Speech Representations
Hemant Yadav
R. Shah
Sunayana Sitaram
28
0
0
14 Oct 2024
Music Genre Classification using Large Language Models
Mohamed El Amine Meguenani
Alceu de Souza Britto Jr.
A. L. Koerich
31
0
0
10 Oct 2024
Exploring ASR-Based Wav2Vec2 for Automated Speech Disorder Assessment: Insights and Analysis
Tuan Nguyen
C. Fredouille
A. Ghio
M. Balaguer
Virginie Woisard
16
0
0
10 Oct 2024
Learn from Real: Reality Defender's Submission to ASVspoof5 Challenge
Yi Zhu
C. Goel
Surya Koppisetti
Trang Tran
Ankur Kumar
Gaurav Bharaj
AAML
28
0
0
09 Oct 2024
Mitigation of gender bias in automatic facial non-verbal behaviors generation
Alice Delbosc
M. Ochs
Nicolas Sabouret
Brian Ravenet
Stéphane Ayache
29
0
0
09 Oct 2024
SyllableLM: Learning Coarse Semantic Units for Speech Language Models
Alan Baade
Puyuan Peng
David Harwath
50
3
0
05 Oct 2024
Adaptive Large Language Models By Layerwise Attention Shortcuts
Prateek Verma
Mert Pilanci
KELM
OffRL
52
0
0
17 Sep 2024
Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models
Li-Wei Chen
Takuya Higuchi
He Bai
Ahmed Hussen Abdelaziz
Alexander Rudnicky
Shinji Watanabe
Tatiana Likhomanenko
B. Theobald
Zakaria Aldeneh
49
0
0
16 Sep 2024
Connecting Concept Convexity and Human-Machine Alignment in Deep Neural Networks
Teresa Dorszewski
Lenka Tětková
Lorenz Linhardt
Lars Kai Hansen
HAI
36
0
0
10 Sep 2024
Property Neurons in Self-Supervised Speech Transformers
T. Lin
Guan-Ting Lin
Hung-yi Lee
Hao Tang
MILM
27
0
0
07 Sep 2024
Probing self-attention in self-supervised speech models for cross-linguistic differences
Sai Gopinath
Joselyn Rodriguez
MILM
56
0
0
04 Sep 2024
Convexity-based Pruning of Speech Representation Models
Teresa Dorszewski
Lenka Tětková
Lars Kai Hansen
25
2
0
16 Aug 2024
SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection
Yi Zhu
Surya Koppisetti
Trang Tran
Gaurav Bharaj
52
9
0
26 Jul 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
39
4
0
21 Jul 2024
Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation
J. Duret
Yannick Esteve
Titouan Parcollet
41
0
0
08 Jul 2024
Improving Self-supervised Pre-training using Accent-Specific Codebooks
Darshan Prabhu
Abhishek Gupta
Omkar Nitsure
P. Jyothi
Sriram Ganapathy
SSL
44
0
0
04 Jul 2024
Cross-Lingual Transfer Learning for Speech Translation
Rao Ma
Yassir Fathullah
Mengjie Qian
Siyuan Tang
Mark J. F. Gales
Kate Knill
20
1
0
01 Jul 2024
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study
Peikun Chen
Sining Sun
Changhao Shan
Qing Yang
Lei Xie
42
2
0
27 Jun 2024
WavRx: a Disease-Agnostic, Generalizable, and Privacy-Preserving Speech Health Diagnostic Model
Yi Zhu
Tiago H. Falk
MedIm
41
0
0
26 Jun 2024
AND: Audio Network Dissection for Interpreting Deep Acoustic Models
Tung-Yu Wu
Yu-Xiang Lin
Tsui-Wei Weng
52
1
0
24 Jun 2024
Speech Representation Analysis based on Inter- and Intra-Model Similarities
Yassine El Kheir
Ahmed M. Ali
Shammur A. Chowdhury
SSL
43
2
0
23 Jun 2024
Articulatory Encodec: Coding Speech through Vocal Tract Kinematics
Cheol Jun Cho
Peter Wu
Tejas S. Prabhune
Dhruv Agarwal
Gopala K. Anumanchipalli
36
1
0
18 Jun 2024
Interface Design for Self-Supervised Speech Models
Yi-Jen Shih
David Harwath
54
1
0
18 Jun 2024
Orthogonality and isotropy of speaker and phonetic information in self-supervised speech representations
Mukhtar Mohamed
Oli Danyi Liu
Hao Tang
Sharon Goldwater
SSL
44
2
0
13 Jun 2024
Self-Supervised Speech Representations are More Phonetic than Semantic
Kwanghee Choi
Ankita Pasad
Tomohiko Nakamura
Satoru Fukayama
Karen Livescu
Shinji Watanabe
31
14
0
12 Jun 2024
SCDNet: Self-supervised Learning Feature-based Speaker Change Detection
Yue Li
Xinsheng Wang
Li Zhang
Lei Xie
42
1
0
12 Jun 2024
Sustainable self-supervised learning for speech representations
Luis Lugo
Valentin Vielzeuf
31
2
0
11 Jun 2024
MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
49
1
0
09 Jun 2024
Towards objective and interpretable speech disorder assessment: a comparative analysis of CNN and transformer-based models
Malo Maisonneuve
C. Fredouille
M. Lalain
A. Ghio
Virginie Woisard
48
0
0
07 Jun 2024
Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting
Ihab Asaad
Maxime Jacquelin
Olivier Perrotin
Laurent Girin
Thomas Hueber
33
0
0
30 May 2024
Crossmodal ASR Error Correction with Discrete Speech Units
Yuanchao Li
Pinzhen Chen
Peter Bell
Catherine Lai
36
6
0
26 May 2024
Investigating the 'Autoencoder Behavior' in Speech Self-Supervised Models: a focus on HuBERT's Pretraining
Valentin Vielzeuf
SSL
44
0
0
14 May 2024
A predictive learning model can simulate temporal dynamics and context effects found in neural representations of continuous speech
Oli Danyi Liu
Hao Tang
Naomi H Feldman
Sharon Goldwater
24
1
0
13 May 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Compact Speech Translation Models via Discrete Speech Units Pretraining
Tsz Kin Lam
Alexandra Birch
Barry Haddow
53
2
0
29 Feb 2024
Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing
Jeong Hun Yeo
Seunghee Han
Minsu Kim
Y. Ro
53
22
0
23 Feb 2024
Establishing degrees of closeness between audio recordings along different dimensions using large-scale cross-lingual models
Maxime Fily
Guillaume Wisniewski
Severine Guillaume
Gilles Adda
Alexis Michaud
22
1
0
08 Feb 2024
Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition
Alexandra Saliba
Yuanchao Li
Ramon Sanabria
Catherine Lai
38
8
0
04 Feb 2024