Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning

29 May 2023
Xuankai Chang
Brian Yan
Yuya Fujita
Takashi Maekaku
Shinji Watanabe
arXiv: 2305.18108

Papers citing "Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning"

31 / 31 papers shown
Discrete Speech Unit Extraction via Independent Component Analysis
Tomohiko Nakamura
Kwanghee Choi
Keigo Hojo
Yoshiaki Bando
Satoru Fukayama
Shinji Watanabe
43
0
0
11 Jan 2025
Exploring SSL Discrete Tokens for Multilingual ASR
Mingyu Cui
Daxin Tan
Yifan Yang
Dingdong Wang
Huimeng Wang
Xiao Chen
Xie Chen
Xunying Liu
38
1
0
13 Sep 2024
Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR
Mingyu Cui
Yifan Yang
Jiajun Deng
Jiawen Kang
Shujie Hu
Tianzi Wang
Zhaoqing Li
Shiliang Zhang
Xie Chen
Xunying Liu
33
1
0
13 Sep 2024
Estimating the Completeness of Discrete Speech Units
Sung-Lin Yeh
Hao Tang
36
1
0
09 Sep 2024
LAST: Language Model Aware Speech Tokenization
A. Turetzky
Yossi Adi
37
2
0
05 Sep 2024
SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
Kai-Wei Chang
Haibin Wu
Yu-Kai Wang
Yuan-Kuei Wu
Hua Shen
Wei-Cheng Tseng
Iu-thing Kang
Shang-Wen Li
Hung-yi Lee
53
3
0
23 Aug 2024
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations
Kunal Dhawan
Nithin Rao Koluguri
Ante Jukić
Ryan Langman
Jagadeesh Balam
Boris Ginsburg
47
1
0
03 Jul 2024
DASB -- Discrete Audio and Speech Benchmark
Pooneh Mousavi
Luca Della Libera
J. Duret
Artem Ploujnikov
Cem Subakan
Mirco Ravanelli
35
12
0
20 Jun 2024
Children's Speech Recognition through Discrete Token Enhancement
Vrunda N. Sukhadia
Shammur A. Chowdhury
48
1
0
19 Jun 2024
How Should We Extract Discrete Audio Tokens from Self-Supervised Models?
Pooneh Mousavi
J. Duret
Salah Zaiem
Luca Della Libera
Artem Ploujnikov
Cem Subakan
Mirco Ravanelli
42
9
0
15 Jun 2024
MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model
Jiatong Shi
Xutai Ma
Hirofumi Inaguma
Anna Y. Sun
Shinji Watanabe
60
7
0
14 Jun 2024
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
Suwon Shon
Kwangyoun Kim
Yi-Te Hsu
Prashant Sridhar
Shinji Watanabe
Karen Livescu
AuLLM
46
2
0
13 Jun 2024
ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis
Dehua Tao
Daxin Tan
Y. Yeung
Xiao Chen
Tan Lee
35
3
0
13 Jun 2024
TokSing: Singing Voice Synthesis based on Discrete Tokens
Yuning Wu
Chunlei Zhang
Jiatong Shi
Yuxun Tang
Shan Yang
Qin Jin
39
6
0
12 Jun 2024
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
Xuankai Chang
Jiatong Shi
Jinchuan Tian
Yuning Wu
Yuxun Tang
Yihan Wu
Shinji Watanabe
Yossi Adi
Xie Chen
Qin Jin
47
15
0
11 Jun 2024
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition
Yihan Wu
Soumi Maiti
Yifan Peng
Wangyou Zhang
Chenda Li
Yuyue Wang
Xihua Wang
Shinji Watanabe
Ruihua Song
33
3
0
31 Jan 2024
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
J. Choi
Se Jin Park
Minsu Kim
Y. Ro
33
12
0
05 Dec 2023
Loss Masking Is Not Needed in Decoder-only Transformer for Discrete-token-based ASR
Qian Chen
Wen Wang
Qinglin Zhang
Siqi Zheng
Shiliang Zhang
Chong Deng
Yukun Ma
Hai Yu
Jiaqing Liu
Chong Zhang
21
8
0
08 Nov 2023
Acoustic BPE for Speech Generation with Discrete Tokens
Feiyu Shen
Yiwei Guo
Chenpeng Du
Xie Chen
Kai Yu
20
9
0
23 Oct 2023
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing
Brian Yan
Xuankai Chang
Antonios Anastasopoulos
Yuya Fujita
Shinji Watanabe
26
2
0
27 Sep 2023
Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Xuankai Chang
Brian Yan
Kwanghee Choi
Jee-weon Jung
Yichen Lu
...
Pengcheng Guo
Yao-Fei Cheng
Pavel Denisov
Kohei Saijo
Hsiu-Hsuan Wang
31
36
0
27 Sep 2023
Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-Supervised Speech Units
Jakob Poncelet
Hugo Van hamme
23
3
0
25 Sep 2023
Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper
Jeong Hun Yeo
Minsu Kim
Shinji Watanabe
Y. Ro
VLM
34
12
0
15 Sep 2023
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens
Minsu Kim
J. Choi
Soumi Maiti
Jeong Hun Yeo
Shinji Watanabe
Y. Ro
VLM
26
6
0
15 Sep 2023
VoxtLM: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks
Soumi Maiti
Yifan Peng
Shukjae Choi
Jee-weon Jung
Xuankai Chang
Shinji Watanabe
VLM
AuLLM
24
56
0
14 Sep 2023
Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS
Yifan Yang
Feiyu Shen
Chenpeng Du
Ziyang Ma
K. Yu
Daniel Povey
Xie Chen
32
24
0
14 Sep 2023
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
Minsu Kim
Jeong Hun Yeo
J. Choi
Y. Ro
34
16
0
18 Aug 2023
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
Jeong Hun Yeo
Minsu Kim
J. Choi
Dae Hoe Kim
Y. Ro
26
18
0
15 Aug 2023
Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation
Minsu Kim
J. Choi
Dahun Kim
Y. Ro
40
12
0
03 Aug 2023
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
61
105
0
30 Sep 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
134
350
0
21 May 2022