1904.03100
Cited By
Information Aggregation for Multi-Head Attention with Routing-by-Agreement
5 April 2019
Jian Li
Baosong Yang
Zi-Yi Dou
Xing Wang
Michael R. Lyu
Zhaopeng Tu
Papers citing "Information Aggregation for Multi-Head Attention with Routing-by-Agreement"
29 / 29 papers shown
Convolutional Self-Attention Networks
Baosong Yang
Longyue Wang
Derek F. Wong
Lidia S. Chao
Zhaopeng Tu
49
126
0
05 Apr 2019
Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement
Zi-Yi Dou
Zhaopeng Tu
Xing Wang
Longyue Wang
Shuming Shi
Tong Zhang
AI4CE
41
56
0
15 Feb 2019
Multi-Head Attention with Disagreement Regularization
Jian Li
Zhaopeng Tu
Baosong Yang
Michael R. Lyu
Tong Zhang
66
146
0
24 Oct 2018
Exploiting Deep Representations for Neural Machine Translation
Zi-Yi Dou
Zhaopeng Tu
Xing Wang
Shuming Shi
Tong Zhang
79
93
0
24 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.5K
94,511
0
11 Oct 2018
Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures
Gongbo Tang
Mathias Müller
Annette Rios Gonzales
Rico Sennrich
52
263
0
27 Aug 2018
Universal Transformers
Mostafa Dehghani
Stephan Gouws
Oriol Vinyals
Jakob Uszkoreit
Lukasz Kaiser
80
752
0
10 Jul 2018
Information Aggregation via Dynamic Routing for Sequence Encoding
Jingjing Gong
Xipeng Qiu
Shaojing Wang
Xuanjing Huang
36
65
0
05 Jun 2018
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau
Germán Kruszewski
Guillaume Lample
Loïc Barrault
Marco Baroni
311
892
0
03 May 2018
Linguistically-Informed Self-Attention for Semantic Role Labeling
Emma Strubell
Pat Verga
D. Andor
David J. Weiss
Andrew McCallum
OffRL
72
379
0
23 Apr 2018
Capsules for Object Segmentation
Rodney LaLonde
Ulas Bagci
SSeg
MedIm
57
270
0
11 Apr 2018
Investigating Capsule Networks with Dynamic Routing for Text Classification
Wei Zhao
Jianbo Ye
Min Yang
Zeyang Lei
Suofei Zhang
Zhou Zhao
62
367
0
29 Mar 2018
Achieving Human Parity on Automatic Chinese to English News Translation
Hany Hassan
Anthony Aue
Chang Chen
Vishal Chowdhary
Jonathan Clark
...
Shuangzhi Wu
Yingce Xia
Dongdong Zhang
Zhirui Zhang
Ming Zhou
66
605
0
15 Mar 2018
Deep contextualized word representations
Matthew E. Peters
Mark Neumann
Mohit Iyyer
Matt Gardner
Christopher Clark
Kenton Lee
Luke Zettlemoyer
NAI
184
11,542
0
15 Feb 2018
Capsule Network Performance on Complex Data
Edgar Xi
Selina Bing
Yang Jin
43
214
0
10 Dec 2017
Weighted Transformer Network for Machine Translation
Karim Ahmed
N. Keskar
R. Socher
63
133
0
06 Nov 2017
Dynamic Routing Between Capsules
S. Sabour
Nicholas Frosst
Geoffrey E. Hinton
144
4,589
0
26 Oct 2017
DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding
Tao Shen
Tianyi Zhou
Guodong Long
Jing Jiang
Shirui Pan
Chengqi Zhang
49
755
0
14 Sep 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
628
130,942
0
12 Jun 2017
MUTAN: Multimodal Tucker Fusion for Visual Question Answering
H. Ben-younes
Rémi Cadène
Matthieu Cord
Nicolas Thome
147
582
0
18 May 2017
Convolutional Sequence to Sequence Learning
Jonas Gehring
Michael Auli
David Grangier
Denis Yarats
Yann N. Dauphin
AIMat
148
3,283
0
08 May 2017
A Structured Self-attentive Sentence Embedding
Zhouhan Lin
Minwei Feng
Cicero Nogueira dos Santos
Mo Yu
Bing Xiang
Bowen Zhou
Yoshua Bengio
113
2,136
0
09 Mar 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Zhifeng Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
869
6,781
0
26 Sep 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
287
1,465
0
06 Jun 2016
Neural Machine Translation of Rare Words with Subword Units
Rico Sennrich
Barry Haddow
Alexandra Birch
195
7,729
0
31 Aug 2015
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
354
7,955
0
17 Aug 2015
Attention-Based Models for Speech Recognition
J. Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio
117
2,606
0
24 Jun 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
314
10,050
0
10 Feb 2015
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
AIMat
505
27,263
0
01 Sep 2014