ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2210.07353
  4. Cited By
JOIST: A Joint Speech and Text Streaming Model For ASR

JOIST: A Joint Speech and Text Streaming Model For ASR

13 October 2022
Tara N. Sainath
Rohit Prabhavalkar
Ankur Bapna
Yu Zhang
Zhouyuan Huo
Zhehuai Chen
Bo-wen Li
Weiran Wang
Trevor Strohman
    RALM
    AuLLM
ArXivPDFHTML

Papers citing "JOIST: A Joint Speech and Text Streaming Model For ASR"

26 / 26 papers shown
Title
A Non-autoregressive Model for Joint STT and TTS
A Non-autoregressive Model for Joint STT and TTS
Vishal Sunder
Brian Kingsbury
G. Saon
Samuel Thomas
Slava Shechtman Hagai Aronowitz
Hagai Aronowitz
Eric Fosler-Lussier
Luis A. Lastras
61
0
0
15 Jan 2025
Effective Text Adaptation for LLM-based ASR through Soft Prompt
  Fine-Tuning
Effective Text Adaptation for LLM-based ASR through Soft Prompt Fine-Tuning
Yingyi Ma
Zhe Liu
Ozlem Kalinli
70
0
0
09 Dec 2024
An efficient text augmentation approach for contextualized Mandarin
  speech recognition
An efficient text augmentation approach for contextualized Mandarin speech recognition
Naijun Zheng
Xucheng Wan
Kai Liu
Ziqing Du
Zhou Huan
40
1
0
14 Jun 2024
ASTRA: Aligning Speech and Text Representations for Asr without Sampling
ASTRA: Aligning Speech and Text Representations for Asr without Sampling
Neeraj Gaur
Rohan Agrawal
Gary Wang
Parisa Haghani
Andrew Rosenberg
Bhuvana Ramabhadran
39
0
0
10 Jun 2024
Text Injection for Neural Contextual Biasing
Text Injection for Neural Contextual Biasing
Zhong Meng
Zelin Wu
Rohit Prabhavalkar
Cal Peyser
Weiran Wang
Nanxin Chen
Tara N. Sainath
Bhuvana Ramabhadran
25
3
0
05 Jun 2024
High-precision Voice Search Query Correction via Retrievable Speech-text
  Embedings
High-precision Voice Search Query Correction via Retrievable Speech-text Embedings
Christopher Li
Gary Wang
Kyle Kastner
Heng Su
Allen Chen
...
Zelin Wu
L. Velikovich
Pat Rondon
D. Caseiro
Petar S. Aleksic
34
1
0
08 Jan 2024
FastInject: Injecting Unpaired Text Data into CTC-based ASR training
FastInject: Injecting Unpaired Text Data into CTC-based ASR training
Keqi Deng
Phil Woodland
18
2
0
14 Dec 2023
Improving Large-scale Deep Biasing with Phoneme Features and Text-only
  Data in Streaming Transducer
Improving Large-scale Deep Biasing with Phoneme Features and Text-only Data in Streaming Transducer
Jin Qiu
Lu Huang
Boyu Li
Jun Zhang
Lu Lu
Zejun Ma
21
3
0
15 Nov 2023
Augmenting text for spoken language understanding with Large Language
  Models
Augmenting text for spoken language understanding with Large Language Models
Roshan Sharma
Suyoun Kim
Daniel Lazar
Trang Le
Akshat Shrivastava
Kwanghoon Ahn
Piyush Kansal
Leda Sari
Ozlem Kalinli
Michael Seltzer
25
2
0
17 Sep 2023
Text-Only Domain Adaptation for End-to-End Speech Recognition through
  Down-Sampling Acoustic Representation
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation
Jiaxu Zhu
Weinan Tong
Yaoxun Xu
Chang Song
Zhiyong Wu
Zhao You
Dan Su
Dong Yu
Helen M. Meng
32
0
0
04 Sep 2023
Text Injection for Capitalization and Turn-Taking Prediction in Speech
  Models
Text Injection for Capitalization and Turn-Taking Prediction in Speech Models
Shaan Bijwadia
Shuo-yiin Chang
Weiran Wang
Zhong Meng
Hao Zhang
Tara N. Sainath
24
1
0
14 Aug 2023
Using Text Injection to Improve Recognition of Personal Identifiers in
  Speech
Using Text Injection to Improve Recognition of Personal Identifiers in Speech
Yochai Blau
Rohan Agrawal
Lior Madmony
Gary Wang
Andrew Rosenberg
Zhehuai Chen
Zorik Gekhman
Genady Beryozkin
Parisa Haghani
Bhuvana Ramabhadran
43
3
0
14 Aug 2023
Improving Joint Speech-Text Representations Without Alignment
Improving Joint Speech-Text Representations Without Alignment
Cal Peyser
Zhong Meng
Ke Hu
Rohit Prabhavalkar
Andrew Rosenberg
Tara N. Sainath
M. Picheny
Kyunghyun Cho
VLM
31
4
0
11 Aug 2023
Text-only Domain Adaptation using Unified Speech-Text Representation in
  Transducer
Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer
Lu Huang
Yangqiu Song
Jun Zhang
Lu Lu
Zejun Ma
29
2
0
07 Jun 2023
Understanding Shared Speech-Text Representations
Understanding Shared Speech-Text Representations
Gary Wang
Kyle Kastner
Ankur Bapna
Zhehuai Chen
Andrew Rosenberg
Bhuvana Ramabhadran
Yu Zhang
AuLLM
69
7
0
27 Apr 2023
A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at
  Scale
A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale
Cal Peyser
M. Picheny
Kyunghyun Cho
Rohit Prabhavalkar
Ronny Huang
Tara N. Sainath
AI4TS
32
1
0
19 Apr 2023
End-to-End Speech Recognition: A Survey
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schluter
Shinji Watanabe
VLM
26
149
0
03 Mar 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Franccoise Beaufays
Yonghui Wu
VLM
79
253
0
02 Mar 2023
Massively Multilingual Shallow Fusion with Large Language Models
Massively Multilingual Shallow Fusion with Large Language Models
Ke Hu
Tara N. Sainath
Bo-wen Li
Nan Du
Yanping Huang
Andrew M. Dai
Yu Zhang
Rodrigo Cabrera
Z. Chen
Trevor Strohman
35
13
0
17 Feb 2023
JEIT: Joint End-to-End Model and Internal Language Model Training for
  Speech Recognition
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition
Zhong Meng
Weiran Wang
Rohit Prabhavalkar
Tara N. Sainath
Tongzhou Chen
Ehsan Variani
Yu Zhang
Bo-wen Li
Andrew Rosenberg
Bhuvana Ramabhadran
AuLLM
VLM
36
11
0
16 Feb 2023
Efficient Domain Adaptation for Speech Foundation Models
Efficient Domain Adaptation for Speech Foundation Models
Bo-wen Li
DongSeon Hwang
Zhouyuan Huo
Junwen Bai
Guru Prakash
...
K. Sim
Yu Zhang
Wei Han
Trevor Strohman
F. Beaufays
AI4CE
41
23
0
03 Feb 2023
Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models
Mu2^{2}2SLAM: Multitask, Multilingual Speech and Language Models
Yong Cheng
Yu Zhang
Melvin Johnson
Wolfgang Macherey
Ankur Bapna
33
8
0
19 Dec 2022
SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data
SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data
Zi-Hua Zhang
Sanyuan Chen
Long Zhou
Yu Wu
Shuo Ren
...
Zhuoyuan Yao
Xun Gong
Lirong Dai
Jinyu Li
Furu Wei
35
55
0
30 Sep 2022
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language
  Processing
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
118
193
0
14 Oct 2021
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech
  Recognition
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition
Chao Weng
Chengzhu Yu
Jia Cui
Chunlei Zhang
Dong Yu
77
39
0
28 Nov 2019
Listening while Speaking: Speech Chain by Deep Learning
Listening while Speaking: Speech Chain by Deep Learning
Andros Tjandra
S. Sakti
Satoshi Nakamura
AuLLM
126
165
0
16 Jul 2017
1