Effective Approaches to Attention-based Neural Machine Translation

17 August 2015

Thang Luong

Hieu H. Pham

Christopher D. Manning

Papers citing "Effective Approaches to Attention-based Neural Machine Translation"

50 / 52 papers shown

Title / Authors / Topic tags / Counts / Date
TRA: Better Length Generalisation with Threshold Relative Attention Mattia Opper Roland Fernandez P. Smolensky Jianfeng Gao 111 0 0 29 Mar 2025
Informer in Algorithmic Investment Strategies on High Frequency Bitcoin Data Filip Stefaniuk Robert Ślepaczuk AIFin 164 0 0 23 Mar 2025
Generative Artificial Intelligence: Evolving Technology, Growing Societal Impact, and Opportunities for Information Systems Research Veda C. Storey Wei Thoo Yue J. Leon Zhao Roman Lukyanenko 66 1 0 25 Feb 2025
Synthetic generation of 2D data records based on Autoencoders Darius Couchard Oscar Olarte Rob Haelterman 86 0 0 20 Feb 2025
A Study of the Plausibility of Attention between RNN Encoders in Natural Language Inference Duc Hau Nguyen Pascale Sébillot 109 5 0 23 Jan 2025
Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation Duc Hau Nguyen Cyrielle Mallart Guillaume Gravier Pascale Sébillot 115 0 0 22 Jan 2025
Does Self-Attention Need Separate Weights in Transformers? Md. Kowsher Nusrat Jahan Prottasha Chun-Nam Yu O. Garibay Niloofar Yousefi 523 1 0 30 Nov 2024
Deep Convolutional Neural Networks on Multiclass Classification of Three-Dimensional Brain Images for Parkinson's Disease Stage Prediction Guan-Hua Huang Wan-Chen Lai Tai-Been Chen Chien-Chin Hsu Huei-Yung Chen Yi-Chen Wu Li-Ren Yeh MedIm 67 2 0 31 Oct 2024
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation Hanbo Cheng Limin Lin Chenyu Liu Pengcheng Xia Pengfei Hu Jiefeng Ma Jun Du Jia Pan DiffM VGen 420 0 0 17 Oct 2024
Lambda-Skip Connections: the architectural component that prevents Rank Collapse Federico Arangath Joseph Jerome Sieber Melanie Zeilinger Carmen Amo Alonso 187 0 0 14 Oct 2024
Attention layers provably solve single-location regression Pierre Marion Raphael Berthier Gérard Biau Claire Boyer 426 7 0 02 Oct 2024
The Conformer Encoder May Reverse the Time Dimension Robin Schmitt Albert Zeyer Mohammad Zeineldeen Ralf Schlüter Hermann Ney 70 0 0 01 Oct 2024
Towards the Terminator Economy: Assessing Job Exposure to AI through LLMs Emilio Colombo Fabio Mercorio Mario Mezzanzanica Antonio Serino 145 2 0 27 Jul 2024
Can Small Language Models Learn, Unlearn, and Retain Noise Patterns? Nicy Scaria Silvester John Joseph Kennedy Deepak N. Subramani MU 62 2 0 01 Jul 2024
Invariant Correlation of Representation with Label: Enhancing Domain Generalization in Noisy Environments Gaojie Jin Ronghui Mu Xinping Yi Xiaowei Huang Lijun Zhang 134 0 0 01 Jul 2024
Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization Chanyeon Kim Jongwoon Park Hyun-sool Bae Woo Chang Kim 72 3 0 03 Apr 2024
Learning Goal-Directed Object Pushing in Cluttered Scenes with Location-Based Attention Nils Dengler Juan Del Aguila Ferrandis João Moura S. Vijayakumar Maren Bennewitz 91 0 0 26 Mar 2024
Streaming Sequence Transduction through Dynamic Compression Weiting Tan Yunmo Chen Tongfei Chen Guanghui Qin Haoran Xu Heidi C. Zhang Benjamin Van Durme Philipp Koehn 128 2 0 02 Feb 2024
Explicitly Disentangled Representations in Object-Centric Learning Riccardo Majellaro Jonathan Collu Aske Plaat Thomas M. Moerland CoGe OOD OCL 141 1 0 18 Jan 2024
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling Jinchao Zhang Shuyang Jiang Jiangtao Feng Lin Zheng Dianbo Sui 3DV 126 9 0 14 Oct 2022
DeepRemaster: Temporal Source-Reference Attention Networks for Comprehensive Video Enhancement S. Iizuka E. Simo-Serra 143 39 0 18 Sep 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion Wen-Chin Huang Tomoki Hayashi Yi-Chiao Wu Hirokazu Kameoka Tomoki Toda 103 40 0 07 Aug 2020
An Effective Transition-based Model for Discontinuous NER Xiang Dai Sarvnaz Karimi Ben Hachey Cécile Paris BDL MU MedIm 87 79 0 28 Apr 2020
Dreem Open Datasets: Multi-Scored Sleep Datasets to compare Human and Automated sleep staging Antoine Guillot F. Sauvet E. During Valentin Thorey 105 106 0 31 Oct 2019
Algorithmic Copywriting: Automated Generation of Health-Related Advertisements to Improve their Performance Brit Youngmann Ran Gilad-Bachrach D. Karmon E. Yom-Tov MedIm 85 8 0 27 Oct 2019
Conversational Emotion Analysis via Attention Mechanisms Zheng Lian J. Tao Bin Liu Jian Huang 51 27 0 24 Oct 2019
Controlling the Output Length of Neural Machine Translation Surafel Melaku Lakew Mattia Antonino Di Gangi Marcello Federico 113 68 0 23 Oct 2019
A Transformer with Interleaved Self-attention and Convolution for Hybrid Acoustic Models Liang Lu 79 4 0 23 Oct 2019
Language model integration based on memory control for sequence to sequence speech recognition Aaron Springer Shinji Watanabe Takaaki Hori M. Baskar Hirofumi Inaguma Jesus Villalba Najim Dehak KELM 73 5 0 06 Nov 2018
Fine-Grained Attention Mechanism for Neural Machine Translation Heeyoul Choi Kyunghyun Cho Yoshua Bengio 72 174 0 30 Mar 2018
Train longer, generalize better: closing the generalization gap in large batch training of neural networks Elad Hoffer Itay Hubara Daniel Soudry ODL 178 799 0 24 May 2017
Data Augmentation for Low-Resource Neural Machine Translation Marzieh Fadaee Arianna Bisazza Christof Monz 103 469 0 01 May 2017
Rationalization: A Neural Machine Translation Approach to Generating Natural Language Explanations Upol Ehsan Brent Harrison Larry Chan Mark O. Riedl 118 219 0 25 Feb 2017
Trainable Greedy Decoding for Neural Machine Translation Jiatao Gu Kyunghyun Cho Victor O.K. Li 154 74 0 08 Feb 2017
Structured Attention Networks Yoon Kim Carl Denton Luong Hoang Alexander M. Rush 114 463 0 03 Feb 2017
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer Noam M. Shazeer Azalia Mirhoseini Krzysztof Maziarz Andy Davis Quoc V. Le Geoffrey E. Hinton J. Dean MoE 251 2,653 0 23 Jan 2017
A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue Mihail Eric Christopher D. Manning BDL 109 155 0 15 Jan 2017
SYSTRAN's Pure Neural Machine Translation Systems Josep Crego Jungi Kim Guillaume Klein Anabel Rebollo Kathy Yang ... Bo Wang Jin Yang Dakun Zhang Jing Zhou Peter Zoldan 86 125 0 18 Oct 2016
Fully Character-Level Neural Machine Translation without Explicit Segmentation Jason Lee Kyunghyun Cho Thomas Hofmann VLM 130 457 0 10 Oct 2016
Online Segment to Segment Neural Transduction Lei Yu Jan Buys Phil Blunsom 105 82 0 26 Sep 2016
Implicit Distortion and Fertility Models for Attention-based Encoder-Decoder NMT Model Shi Feng Shujie Liu Mu Li M. Zhou 96 44 0 13 Jan 2016
Language to Logical Form with Neural Attention Li Dong Mirella Lapata AI4CE NAI 109 729 0 06 Jan 2016
DRAW: A Recurrent Neural Network For Image Generation Karol Gregor Ivo Danihelka Alex Graves Danilo Jimenez Rezende Daan Wierstra GAN DRL 170 1,961 0 16 Feb 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention Kelvin Xu Jimmy Ba Ryan Kiros Kyunghyun Cho Aaron Courville Ruslan Salakhutdinov R. Zemel Yoshua Bengio DiffM 348 10,079 0 10 Feb 2015
On Using Very Large Target Vocabulary for Neural Machine Translation Sébastien Jean Kyunghyun Cho Roland Memisevic Yoshua Bengio 155 1,011 0 05 Dec 2014
End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results J. Chorowski Dzmitry Bahdanau Kyunghyun Cho Yoshua Bengio 98 471 0 04 Dec 2014
Addressing the Rare Word Problem in Neural Machine Translation Thang Luong Ilya Sutskever Quoc V. Le Oriol Vinyals Wojciech Zaremba AIMat AAML 114 788 0 30 Oct 2014
Sequence to Sequence Learning with Neural Networks Ilya Sutskever Oriol Vinyals Quoc V. Le AIMat 437 20,568 0 10 Sep 2014
Recurrent Neural Network Regularization Wojciech Zaremba Ilya Sutskever Oriol Vinyals ODL 148 2,777 0 08 Sep 2014
Neural Machine Translation by Jointly Learning to Align and Translate Dzmitry Bahdanau Kyunghyun Cho Yoshua Bengio AIMat 575 27,325 0 01 Sep 2014