Hard-Synth: Synthesizing Diverse Hard Samples for ASR using Zero-Shot TTS and LLM

20 November 2024
Jiawei Yu
Yongqian Li
Xiaosong Qiao
Huan Zhao
Xiaofeng Zhao
Wei Tang
Hao Fei
Hao Yang
Jinsong Su

Papers citing "Hard-Synth: Synthesizing Diverse Hard Samples for ASR using Zero-Shot TTS and LLM"

13 / 13 papers shown
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
Yushen Chen
Zhikang Niu
Ziyang Ma
Keqi Deng
Chunhui Wang
Jian Zhao
Kai Yu
Xie Chen
107
83
0
09 Oct 2024
Can Generative Large Language Models Perform ASR Error Correction?
Rao Ma
Mengjie Qian
Potsawee Manakul
Mark Gales
Kate Knill
AuLLM
KELM
48
57
0
09 Jul 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
1.4K
14,313
0
15 Mar 2023
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
344
1,091
0
05 Oct 2022
Synthesizing Dysarthric Speech Using Multi-talker TTS for Dysarthric Speech Recognition
M. Soleymanpour
Michael T. Johnson
Rahim Soleymanpour
J. Berry
66
30
0
27 Jan 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
227
1,857
0
26 Oct 2021
Human Perception of Audio Deepfakes
Nicolas Müller
Karla Markert
Konstantin Böttinger
55
49
0
20 Jul 2021
Quantifying Bias in Automatic Speech Recognition
Siyuan Feng
O. Kudina
B. Halpern
O. Scharenborg
63
87
0
28 Mar 2021
Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems
Xianrui Zheng
Yulan Liu
Deniz Gunceler
D. Willett
97
78
0
23 Nov 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
220
3,131
0
16 May 2020
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
102
1,504
0
30 Mar 2018
Attention-Based Models for Speech Recognition
J. Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio
121
2,606
0
24 Jun 2015
Sequence Transduction with Recurrent Neural Networks
Alex Graves
187
1,868
0
14 Nov 2012