ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.07677
  4. Cited By
Masked Audio Text Encoders are Effective Multi-Modal Rescorers
v1v2 (latest)

Masked Audio Text Encoders are Effective Multi-Modal Rescorers

11 May 2023
Jason (Jinglun) Cai
Monica Sunkara
Xilai Li
Anshu Bhatia
Xiao Pan
S. Bodapati
ArXiv (abs)PDFHTML

Papers citing "Masked Audio Text Encoders are Effective Multi-Modal Rescorers"

40 / 40 papers shown
Title
KG-ECO: Knowledge Graph Enhanced Entity Correction for Query Rewriting
KG-ECO: Knowledge Graph Enhanced Entity Correction for Query Rewriting
Jason (Jinglun) Cai
Mingda Li
Ziyan Jiang
Eunah Cho
Zheng Chen
Yang Liu
Xing Fan
Chenlei Guo
73
7
0
21 Feb 2023
Improving Deliberation by Text-Only and Semi-Supervised Training
Improving Deliberation by Text-Only and Semi-Supervised Training
Ke Hu
Tara N. Sainath
Yanzhang He
Rohit Prabhavalkar
Trevor Strohman
S. Mavandadi
Weiran Wang
64
13
0
29 Jun 2022
MAESTRO: Matched Speech Text Representations through Modality Matching
MAESTRO: Matched Speech Text Representations through Modality Matching
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Pedro J. Moreno
Ankur Bapna
Heiga Zen
70
108
0
07 Apr 2022
Effect and Analysis of Large-scale Language Model Rescoring on
  Competitive ASR Systems
Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems
Takuma Udagawa
Masayuki Suzuki
Gakuto Kurata
N. Itoh
G. Saon
99
24
0
01 Apr 2022
WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen
  Language Models
WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models
Heting Gao
Junrui Ni
Kaizhi Qian
Yang Zhang
Shiyu Chang
M. Hasegawa-Johnson
VLM
151
31
0
29 Mar 2022
RescoreBERT: Discriminative Speech Recognition Rescoring with BERT
RescoreBERT: Discriminative Speech Recognition Rescoring with BERT
Liyan Xu
Yile Gu
J. Kolehmainen
Haidar Khan
Ankur Gandhe
Ariya Rastrow
A. Stolcke
I. Bulyko
94
48
0
02 Feb 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech
  Processing
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Micheal Zeng
Xiangzhan Yu
Furu Wei
SSL
265
1,898
0
26 Oct 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text
  Joint Pre-Training
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Ankur Bapna
Yu-An Chung
Na Wu
Anmol Gulati
Ye Jia
J. Clark
Melvin Johnson
Jason Riesa
Alexis Conneau
Yu Zhang
VLM
103
96
0
20 Oct 2021
Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning
  for Low-Resource Speech Recognition
Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition
Guolin Zheng
Yubei Xiao
Ke Gong
Pan Zhou
Xiaodan Liang
Liang Lin
67
26
0
19 Sep 2021
Multimodal Few-Shot Learning with Frozen Language Models
Multimodal Few-Shot Learning with Frozen Language Models
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
183
788
0
25 Jun 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked
  Prediction of Hidden Units
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
184
2,993
0
14 Jun 2021
FastCorrect: Fast Error Correction with Edit Alignment for Automatic
  Speech Recognition
FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition
Yichong Leng
Xu Tan
Linchen Zhu
Jin Xu
Renqian Luo
Linquan Liu
Tao Qin
Xiang-Yang Li
Ed Lin
Tie-Yan Liu
KELM
68
64
0
09 May 2021
SUPERB: Speech processing Universal PERformance Benchmark
SUPERB: Speech processing Universal PERformance Benchmark
Shu-Wen Yang
Po-Han Chi
Yung-Sung Chuang
Cheng-I Jeff Lai
Kushal Lakhotia
...
Shuyan Dong
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
SSL
108
941
0
03 May 2021
Societal Biases in Retrieved Contents: Measurement Framework and
  Adversarial Mitigation for BERT Rankers
Societal Biases in Retrieved Contents: Measurement Framework and Adversarial Mitigation for BERT Rankers
Navid Rekabsaz
Simone Kopeinik
Markus Schedl
64
62
0
28 Apr 2021
Transformer Based Deliberation for Two-Pass Speech Recognition
Transformer Based Deliberation for Two-Pass Speech Recognition
Ke Hu
Ruoming Pang
Tara N. Sainath
Trevor Strohman
52
38
0
27 Jan 2021
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation
  Learning, Semi-Supervised Learning and Interpretation
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation
Changhan Wang
M. Rivière
Ann Lee
Anne Wu
Chaitanya Talnikar
Daniel Haziza
Mary Williamson
J. Pino
Emmanuel Dupoux
SSL
102
492
0
02 Jan 2021
Parameter-Efficient Transfer Learning with Diff Pruning
Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo
Alexander M. Rush
Yoon Kim
84
406
0
14 Dec 2020
SLURP: A Spoken Language Understanding Resource Package
SLURP: A Spoken Language Understanding Resource Package
E. Bastianelli
Andrea Vanzo
P. Swietojanski
Verena Rieser
VLM
93
231
0
26 Nov 2020
Align-Refine: Non-Autoregressive Speech Recognition via Iterative
  Realignment
Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment
Ethan A. Chi
Julian Salazar
Katrin Kirchhoff
AI4TS
75
52
0
24 Oct 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech
  Representations
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
297
5,837
0
20 Jun 2020
Language Models are Few-Shot Learners
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
877
42,379
0
28 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
229
3,155
0
16 May 2020
Deliberation Model Based Two-Pass End-to-End Speech Recognition
Deliberation Model Based Two-Pass End-to-End Speech Recognition
Ke Hu
Tara N. Sainath
Ruoming Pang
Rohit Prabhavalkar
81
87
0
17 Mar 2020
A Density Ratio Approach to Language Model Fusion in End-To-End
  Automatic Speech Recognition
A Density Ratio Approach to Language Model Fusion in End-To-End Automatic Speech Recognition
Erik McDermott
Hasim Sak
Ehsan Variani
57
113
0
26 Feb 2020
Transformer Transducer: A Streamable Speech Recognition Model with
  Transformer Encoders and RNN-T Loss
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss
Qian Zhang
Han Lu
Hasim Sak
Anshuman Tripathi
Erik McDermott
Stephen Koo
Shankar Kumar
92
481
0
07 Feb 2020
Audio-attention discriminative language model for ASR rescoring
Audio-attention discriminative language model for ASR rescoring
Ankur Gandhe
Ariya Rastrow
46
24
0
06 Dec 2019
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Alexei Baevski
Steffen Schneider
Michael Auli
SSL
163
667
0
12 Oct 2019
Two-Pass End-to-End Speech Recognition
Two-Pass End-to-End Speech Recognition
Tara N. Sainath
Ruoming Pang
David Rybach
Yanzhang He
Rohit Prabhavalkar
...
Qiao Liang
Trevor Strohman
Yonghui Wu
Ian McGraw
Chung-Cheng Chiu
80
148
0
29 Aug 2019
BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field
  Language Model
BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model
Alex Jinpeng Wang
Kyunghyun Cho
VLM
88
358
0
11 Feb 2019
Parameter-Efficient Transfer Learning for NLP
Parameter-Efficient Transfer Learning for NLP
N. Houlsby
A. Giurgiu
Stanislaw Jastrzebski
Bruna Morrone
Quentin de Laroussilhe
Andrea Gesmundo
Mona Attariyan
Sylvain Gelly
221
4,514
0
02 Feb 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language
  Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLMSSLSSeg
1.8K
95,175
0
11 Oct 2018
Cold Fusion: Training Seq2Seq Models Together with Language Models
Cold Fusion: Training Seq2Seq Models Together with Language Models
Anuroop Sriram
Heewoo Jun
S. Satheesh
Adam Coates
VLM
90
281
0
21 Aug 2017
Attention Is All You Need
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
786
132,363
0
12 Jun 2017
End-to-End Attention-based Large Vocabulary Speech Recognition
End-to-End Attention-based Large Vocabulary Speech Recognition
Dzmitry Bahdanau
J. Chorowski
Dmitriy Serdyuk
Philemon Brakel
Yoshua Bengio
89
1,152
0
18 Aug 2015
Listen, Attend and Spell
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
156
2,269
0
05 Aug 2015
EESEN: End-to-End Speech Recognition using Deep RNN Models and
  WFST-based Decoding
EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding
Yajie Miao
M. Gowayyed
Florian Metze
100
755
0
29 Jul 2015
Adam: A Method for Stochastic Optimization
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.0K
150,312
0
22 Dec 2014
Deep Speech: Scaling up end-to-end speech recognition
Deep Speech: Scaling up end-to-end speech recognition
Awni Y. Hannun
Carl Case
Jared Casper
Bryan Catanzaro
G. Diamos
...
R. Prenger
S. Satheesh
Shubho Sengupta
Adam Coates
A. Ng
188
2,128
0
17 Dec 2014
End-to-end Continuous Speech Recognition using Attention-based Recurrent
  NN: First Results
End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results
J. Chorowski
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
103
471
0
04 Dec 2014
Sequence Transduction with Recurrent Neural Networks
Sequence Transduction with Recurrent Neural Networks
Alex Graves
195
1,871
0
14 Nov 2012
1