ResearchTrend.AI
Language Modeling with Deep Transformers

10 May 2019
Kazuki Irie
Albert Zeyer
Ralf Schluter
Hermann Ney
    KELM

Papers citing "Language Modeling with Deep Transformers"

50 / 106 papers shown
Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More
Arvid Frydenlund
LRM
52
0
0
13 Mar 2025
Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
AuLLM
42
0
0
08 Jan 2025
Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
Shawn Tan
Songlin Yang
Aaron Courville
Rameswar Panda
Yikang Shen
30
4
0
23 Oct 2024
What Information Contributes to Log-based Anomaly Detection? Insights from a Configurable Transformer-Based Approach
Xingfang Wu
Heng Li
Foutse Khomh
AI4TS
32
0
0
30 Sep 2024
Beyond Over-smoothing: Uncovering the Trainability Challenges in Deep Graph Neural Networks
Jie Peng
Runlin Lei
Zhewei Wei
33
5
0
07 Aug 2024
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets
Xuelong Geng
Tianyi Xu
Kun Wei
Bingshen Mu
Hongfei Xue
...
Pengcheng Guo
Yuhang Dai
Longhao Li
Mingchen Shao
Lei Xie
44
9
0
03 May 2024
Transformers and Language Models in Form Understanding: A Comprehensive Review of Scanned Document Analysis
Abdelrahman Abdallah
Daniel Eberharter
Zoe Pfister
Adam Jatowt
40
12
0
06 Mar 2024
Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models
A. Ogawa
Naohiro Tawara
Marc Delcroix
S. Araki
35
3
0
20 Dec 2023
Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models
Victor Agostinelli
Max Wild
Matthew Raffel
Kazi Ahmed Asif Fuad
Lizhong Chen
26
6
0
07 Dec 2023
Server-side Rescoring of Spoken Entity-centric Knowledge Queries for Virtual Assistants
Youyuan Zhang
Sashank Gondala
Thiago Fraga-Silva
Christophe Van Gysel
48
2
0
02 Nov 2023
Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions
Kazuki Irie
Róbert Csordás
Jürgen Schmidhuber
36
11
0
24 Oct 2023
Correction Focused Language Model Training for Speech Recognition
Yingyi Ma
Zhe Liu
Ozlem Kalinli
KELM
33
3
0
17 Oct 2023
On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Nick Rossenbach
Benedikt Hilmes
Ralf Schluter
14
3
0
12 Oct 2023
Investigating the Effect of Language Models in Sequence Discriminative Training for Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
28
0
0
11 Oct 2023
Forgetting Private Textual Sequences in Language Models via Leave-One-Out Ensemble
Zhe Liu
Ozlem Kalinli
MU
KELM
28
2
0
28 Sep 2023
Unsupervised Pre-Training for Vietnamese Automatic Speech Recognition in the HYKIST Project
Khai Le-Duc
17
2
0
26 Sep 2023
Decoder-only Architecture for Speech Recognition with CTC Prompts and Text Data Augmentation
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
VLM
AuLLM
RALM
40
9
0
16 Sep 2023
Combining TF-GridNet and Mixture Encoder for Continuous Speech Separation for Meeting Transcription
Peter Vieting
Simon Berger
Thilo von Neumann
Christoph Boeddeker
Ralf Schluter
Reinhold Haeb-Umbach
26
0
0
15 Sep 2023
Recovering from Privacy-Preserving Masking with Large Language Models
A. Vats
Zhe Liu
Peng Su
Debjyoti Paul
Yingyi Ma
Yutong Pang
Zeeshan Ahmed
Ozlem Kalinli
31
9
0
12 Sep 2023
Evaluating Transformer's Ability to Learn Mildly Context-Sensitive Languages
Shunjie Wang
Shane Steinert-Threlkeld
30
4
0
02 Sep 2023
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
30
4
0
24 Jul 2023
Leveraging Cross-Utterance Context For ASR Decoding
Robert Flynn
Anton Ragni
33
1
0
29 Jun 2023
The Impact of Positional Encoding on Length Generalization in Transformers
Amirhossein Kazemnejad
Inkit Padhi
K. Ramamurthy
Payel Das
Siva Reddy
47
178
0
31 May 2023
RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition
Wei Zhou
Eugen Beck
Simon Berger
Ralf Schluter
Hermann Ney
VLM
30
4
0
28 May 2023
Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings
Ta-Chung Chi
Ting-Han Fan
Li-Wei Chen
Alexander I. Rudnicky
Peter J. Ramadge
VLM
MILM
60
12
0
23 May 2023
Application-Agnostic Language Modeling for On-Device ASR
M. Nußbaum-Thom
Lyan Verwimp
Youssef Oualil
11
2
0
16 May 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schluter
Shinji Watanabe
VLM
26
150
0
03 Mar 2023
Massively Multilingual Shallow Fusion with Large Language Models
Ke Hu
Tara N. Sainath
Bo-wen Li
Nan Du
Yanping Huang
Andrew M. Dai
Yu Zhang
Rodrigo Cabrera
Z. Chen
Trevor Strohman
35
13
0
17 Feb 2023
Memory Augmented Lookup Dictionary based Language Modeling for Automatic Speech Recognition
Yukun Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
RALM
37
0
0
30 Dec 2022
Jointly Learning Visual and Auditory Speech Representations from Raw Data
A. Haliassos
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
M. Pantic
SSL
45
48
0
12 Dec 2022
Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
32
4
0
07 Dec 2022
Adaptive Multi-Corpora Language Model Training for Speech Recognition
Yingyi Ma
Zhe Liu
Xuedong Zhang
33
2
0
09 Nov 2022
Is Encoder-Decoder Redundant for Neural Machine Translation?
Yingbo Gao
Christian Herold
Zijian Yang
Hermann Ney
27
4
0
21 Oct 2022
Mitigating Unintended Memorization in Language Models via Alternating Teaching
Zhe Liu
Xuedong Zhang
Fuchun Peng
38
3
0
13 Oct 2022
Multilingual Transformer Language Model for Speech Recognition in Low-resource Languages
Li Miao
Jian Wu
Piyush Behre
Shuangyu Chang
S. Parthasarathy
21
2
0
08 Sep 2022
Predict-and-Update Network: Audio-Visual Speech Recognition Inspired by Human Speech Perception
Jiadong Wang
Xinyuan Qian
Haizhou Li
38
14
0
05 Sep 2022
Bayesian Neural Network Language Modeling for Speech Recognition
Boyang Xue
Shoukang Hu
Junhao Xu
Mengzhe Geng
Xunying Liu
Helen M. Meng
UQCV
BDL
44
14
0
28 Aug 2022
Federated Select: A Primitive for Communication- and Memory-Efficient Federated Learning
Zachary B. Charles
Kallista A. Bonawitz
Stanislav Chiknavaryan
H. B. McMahan
Blaise Agüera y Arcas
FedML
23
13
0
19 Aug 2022
AFT-VO: Asynchronous Fusion Transformers for Multi-View Visual Odometry Estimation
Nimet Kaygusuz
Oscar Alejandro Mendez Maldonado
Richard Bowden
29
5
0
26 Jun 2022
Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules
Kazuki Irie
Francesco Faccio
Jürgen Schmidhuber
AI4TS
38
11
0
03 Jun 2022
Efficient Training of Neural Transducer for Speech Recognition
Wei Zhou
Wilfried Michel
Ralf Schluter
Hermann Ney
AI4TS
24
22
0
22 Apr 2022
DecBERT: Enhancing the Language Understanding of BERT with Causal Attention Masks
Ziyang Luo
Yadong Xi
Jing Ma
Zhiwei Yang
Xiaoxi Mao
Changjie Fan
Rongsheng Zhang
19
3
0
19 Apr 2022
Scaling Language Model Size in Cross-Device Federated Learning
Jae Hun Ro
Theresa Breiner
Lara McConnaughey
Mingqing Chen
A. Suresh
Shankar Kumar
Rajiv Mathews
FedML
26
24
0
31 Mar 2022
Transformer Language Models without Positional Encodings Still Learn Positional Information
Adi Haviv
Ori Ram
Ofir Press
Peter Izsak
Omer Levy
20
113
0
30 Mar 2022
Visual Speech Recognition for Multiple Languages in the Wild
Pingchuan Ma
Stavros Petridis
M. Pantic
VLM
128
144
0
26 Feb 2022
The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention
Kazuki Irie
Róbert Csordás
Jürgen Schmidhuber
14
42
0
11 Feb 2022
Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems
Saket Dingliwal
Ashish Shenoy
S. Bodapati
Ankur Gandhe
R. Gadde
Katrin Kirchhoff
VLM
25
4
0
16 Dec 2021
Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition
Junhao Xu
Jianwei Yu
Shoukang Hu
Xunying Liu
Helen Meng
MQ
27
13
0
29 Nov 2021
Mixed Precision of Quantization of Transformer Language Models for Speech Recognition
Junhao Xu
Shoukang Hu
Jianwei Yu
Xunying Liu
Helen M. Meng
MQ
40
15
0
29 Nov 2021
Self-Normalized Importance Sampling for Neural Language Modeling
Zijian Yang
Yingbo Gao
Alexander Gerstenberger
Jintao Jiang
Ralf Schluter
Hermann Ney
21
1
0
11 Nov 2021