ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.10005
  4. Cited By
DinoSR: Self-Distillation and Online Clustering for Self-supervised
  Speech Representation Learning
v1v2 (latest)

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

17 May 2023
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
ArXiv (abs)PDFHTML

Papers citing "DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning"

20 / 20 papers shown
Title
Leveraging LLM and Self-Supervised Training Models for Speech Recognition in Chinese Dialects: A Comparative Analysis
Leveraging LLM and Self-Supervised Training Models for Speech Recognition in Chinese Dialects: A Comparative Analysis
Tianyi Xu
Hongjie Chen
Wang Qing
Lv Hang
Jian Kang
Li Jie
Zhennan Lin
Yongxiang Li
Xie Lei
19
0
0
27 May 2025
K*-Means: A Parameter-free Clustering Algorithm
K*-Means: A Parameter-free Clustering Algorithm
Louis Mahon
Mirella Lapata
72
0
0
17 May 2025
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play
Yemin Shi
Yu Shu
Siwei Dong
Guangyi Liu
Jaward Sesay
Jingwen Li
Zhiting Hu
AuLLMVLM
98
0
0
05 May 2025
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Alexander H. Liu
Sang-gil Lee
Chao-Han Huck Yang
Yuan Gong
Yu-Chun Wang
James Glass
Rafael Valle
Bryan Catanzaro
SSL
99
1
0
02 Mar 2025
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
Sungnyun Kim
Sungwoo Cho
Sangmin Bae
Kangwook Jang
Se-Young Yun
SSL
169
1
0
23 Jan 2025
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models
Heng-Jui Chang
Hongyu Gong
Changhan Wang
James R. Glass
Yu-An Chung
120
0
0
31 Oct 2024
DM-Codec: Distilling Multimodal Representations for Speech Tokenization
DM-Codec: Distilling Multimodal Representations for Speech Tokenization
Md Mubtasim Ahasan
Md Fahim
Tasnim Mohiuddin
A K M Mahbubur Rahman
Aman Chadha
Tariq Iqbal
M. A. Amin
Md. Mofijul Islam
Amin Ahsan Ali
103
1
0
19 Oct 2024
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech
  Representation Learning
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning
Ashish Seth
Ramaneswaran Selvakumar
S. Sakshi
Sonal Kumar
Sreyan Ghosh
Dinesh Manocha
85
0
0
17 Oct 2024
SSR: Alignment-Aware Modality Connector for Speech Language Models
SSR: Alignment-Aware Modality Connector for Speech Language Models
Weiting Tan
Hirofumi Inaguma
Ning Dong
Paden Tomasello
Xutai Ma
128
6
0
30 Sep 2024
Advanced Clustering Techniques for Speech Signal Enhancement: A Review
  and Metanalysis of Fuzzy C-Means, K-Means, and Kernel Fuzzy C-Means Methods
Advanced Clustering Techniques for Speech Signal Enhancement: A Review and Metanalysis of Fuzzy C-Means, K-Means, and Kernel Fuzzy C-Means Methods
Abdulhady Abas Abdullah
A. Ahmed
Tarik Rashid
Hadi Veisi
Yassin Hussein Rassul
B. Hassan
Polla Fattah
Sabat Abdulhameed Ali
Ahmed S. Shamsaldin
67
1
0
28 Sep 2024
SSDM: Scalable Speech Dysfluency Modeling
SSDM: Scalable Speech Dysfluency Modeling
Jiachen Lian
Xuanru Zhou
Z. Ezzes
Jet M J Vonk
Brittany Morin
D. Baquirin
Zachary Mille
M. G. Tempini
Gopala Anumanchipalli
AuLLM
113
4
0
29 Aug 2024
Towards Robust Speech Representation Learning for Thousands of Languages
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
137
19
0
30 Jun 2024
Sustainable self-supervised learning for speech representations
Sustainable self-supervised learning for speech representations
Luis Lugo
Valentin Vielzeuf
94
2
0
11 Jun 2024
The Effect of Batch Size on Contrastive Self-Supervised Speech
  Representation Learning
The Effect of Batch Size on Contrastive Self-Supervised Speech Representation Learning
Nik Vaessen
David A. van Leeuwen
96
3
0
21 Feb 2024
Revisiting Self-supervised Learning of Speech Representation from a
  Mutual Information Perspective
Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective
Alexander H. Liu
Sung-Lin Yeh
James R. Glass
SSL
66
3
0
16 Jan 2024
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning
Danwei Cai
Zexin Cai
Ze Li
Ming Li
64
0
0
03 Jan 2024
Efficiency-oriented approaches for self-supervised speech representation
  learning
Efficiency-oriented approaches for self-supervised speech representation learning
Luis Lugo
Valentin Vielzeuf
SSL
62
1
0
18 Dec 2023
R-Spin: Efficient Speaker and Noise-invariant Representation Learning
  with Acoustic Pieces
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
Heng-Jui Chang
James R. Glass
72
3
0
15 Nov 2023
Are Soft Prompts Good Zero-shot Learners for Speech Recognition?
Are Soft Prompts Good Zero-shot Learners for Speech Recognition?
Dianwen Ng
Chong Zhang
Ruixi Zhang
Yukun Ma
Fabian Ritter-Gutierrez
Trung Hieu Nguyen
Chongjia Ni
Shengkui Zhao
Eng Siong Chng
B. Ma
VLM
65
1
0
18 Sep 2023
Mockingjay: Unsupervised Speech Representation Learning with Deep
  Bidirectional Transformer Encoders
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
163
374
0
25 Oct 2019
1