Capturing Multi-Resolution Context by Dilated Self-Attention
Niko Moritz, Takaaki Hori, Jonathan Le Roux
7 April 2021 · arXiv:2104.02858

Papers citing "Capturing Multi-Resolution Context by Dilated Self-Attention"

19 papers shown.
1. Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory. Chunyang Wu, Yongqiang Wang, Yangyang Shi, Ching-Feng Yeh, Frank Zhang. 16 May 2020.
2. Exploring Self-attention for Image Recognition. Hengshuang Zhao, Jiaya Jia, V. Koltun. 28 Apr 2020.
3. Streaming automatic speech recognition with the transformer model. Niko Moritz, Takaaki Hori, Jonathan Le Roux. 08 Jan 2020.
4. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. Mohammad Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro. 17 Sep 2019.
5. A Comparative Study on Transformer vs RNN in Speech Applications. Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, ..., Ryuichi Yamamoto, Xiao-fei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang. 13 Sep 2019.
6. Stand-Alone Self-Attention in Vision Models. Prajit Ramachandran, Niki Parmar, Ashish Vaswani, Irwan Bello, Anselm Levskaya, Jonathon Shlens. 13 Jun 2019.
7. Language Modeling with Deep Transformers. Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney. 10 May 2019.
8. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, E. D. Cubuk, Quoc V. Le. 18 Apr 2019.
9. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. Zihang Dai, Zhilin Yang, Yiming Yang, J. Carbonell, Quoc V. Le, Ruslan Salakhutdinov. 09 Jan 2019.
10. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. 11 Oct 2018.
11. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing. Taku Kudo, John Richardson. 19 Aug 2018.
12. End-to-end Speech Recognition with Word-based RNN Language Models. Takaaki Hori, Jaejin Cho, Shinji Watanabe. 08 Aug 2018.
13. Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context. Urvashi Khandelwal, He He, Peng Qi, Dan Jurafsky. 12 May 2018.
14. Attention Is All You Need. Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin. 12 Jun 2017.
15. Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM. Takaaki Hori, Shinji Watanabe, Yu Zhang, William Chan. 08 Jun 2017.
16. A Decomposable Attention Model for Natural Language Inference. Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit. 06 Jun 2016.
17. Multi-Scale Context Aggregation by Dilated Convolutions. Fisher Yu, V. Koltun. 23 Nov 2015.
18. Attention-Based Models for Speech Recognition. J. Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, Yoshua Bengio. 24 Jun 2015.
19. Neural Machine Translation by Jointly Learning to Align and Translate. Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. 01 Sep 2014.