Regularizing and Optimizing LSTM Language Models

7 August 2017

Papers citing "Regularizing and Optimizing LSTM Language Models"

50 / 509 papers shown

Title
Syntax-driven Iterative Expansion Language Models for Controllable Text Generation Noe Casas José A. R. Fonollosa Marta R. Costa-jussá 19 11 0 05 Apr 2020
Improving Perceptual Quality of Drum Transcription with the Expanded Groove MIDI Dataset Lee F. Callender Curtis Hawthorne Jesse Engel 43 20 0 01 Apr 2020
A Survey of Deep Learning for Scientific Discovery M. Raghu Erica Schmidt OOD AI4CE 40 120 0 26 Mar 2020
Multi-Class classification of vulnerabilities in Smart Contracts using AWD-LSTM, with pre-trained encoder inspired from natural language processing Ajay K. Gogineni S. Swayamjyoti Devadatta Sahoo K. Sahu R. Kishore 31 32 0 21 Mar 2020
Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies Paul Pu Liang Manzil Zaheer Yuan Wang Amr Ahmed BDL 21 1 0 18 Mar 2020
Iterative Averaging in the Quest for Best Test Error Diego Granziol Xingchen Wan Samuel Albanie Stephen J. Roberts 10 3 0 02 Mar 2020
Tensor Networks for Probabilistic Sequence Modeling Jacob Miller Guillaume Rabusseau John Terilla 16 5 0 02 Mar 2020
The Implicit and Explicit Regularization Effects of Dropout Colin Wei Sham Kakade Tengyu Ma 30 114 0 28 Feb 2020
Temporal Convolutional Attention-based Network For Sequence Modeling Hongyan Hao Yan Wang Siqiao Xue Yudi Xia Jian Zhao S. Furao 30 41 0 28 Feb 2020
Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity Thomas Miconi Aditya Rawal Jeff Clune Kenneth O. Stanley 13 90 0 24 Feb 2020
Addressing Some Limitations of Transformers with Feedback Memory Angela Fan Thibaut Lavril Edouard Grave Armand Joulin Sainbayar Sukhbaatar 26 11 0 21 Feb 2020
MaxUp: A Simple Way to Improve Generalization of Neural Network Training Chengyue Gong Tongzheng Ren Mao Ye Qiang Liu AAML 27 56 0 20 Feb 2020
A Systematic Comparison of Architectures for Document-Level Sentiment Classification Jeremy Barnes Vinit Ravishankar Lilja Ovrelid Erik Velldal 8 0 0 19 Feb 2020
SentenceMIM: A Latent Variable Language Model M. Livne Kevin Swersky David J. Fleet VLM 49 6 0 18 Feb 2020
Transformer on a Diet Chenguang Wang Zihao Ye Aston Zhang Zheng-Wei Zhang Alex Smola 32 8 0 14 Feb 2020
Deep Learning for Source Code Modeling and Generation: Models, Applications and Challenges T. H. Le Hao Chen Muhammad Ali Babar VLM 64 152 0 13 Feb 2020
fastai: A Layered API for Deep Learning Jeremy Howard Sylvain Gugger AI4CE 20 857 0 11 Feb 2020
Localized Flood Detection With Minimal Labeled Social Media Data Using Transfer Learning Neha Singh Nirmalya Roy A. Gangopadhyay 27 6 0 10 Feb 2020
Understanding and Improving Knowledge Distillation Jiaxi Tang Rakesh Shivanna Zhe Zhao Dong Lin Anima Singh Ed H. Chi Sagar Jain 27 129 0 10 Feb 2020
Blank Language Models T. Shen Victor Quach Regina Barzilay Tommi Jaakkola 203 73 0 08 Feb 2020
Consistency of a Recurrent Language Model With Respect to Incomplete Decoding Sean Welleck Ilia Kulikov Jaedeok Kim Richard Yuanzhe Pang Kyunghyun Cho 17 65 0 06 Feb 2020
SQWA: Stochastic Quantized Weight Averaging for Improving the Generalization Capability of Low-Precision Deep Neural Networks Sungho Shin Yoonho Boo Wonyong Sung MQ 27 3 0 02 Feb 2020
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training Weizhen Qi Yu Yan Yeyun Gong Dayiheng Liu Nan Duan Jiusheng Chen Ruofei Zhang Ming Zhou AI4TS 27 446 0 13 Jan 2020
A Continuous Space Neural Language Model for Bengali Language Hemayet Ahmed Chowdhury Md. Azizul Haque Imon Anisur Rahman Aisha Khatun Md. Saiful Islam 19 2 0 11 Jan 2020
CProp: Adaptive Learning Rate Scaling from Past Gradient Conformity Konpat Preechakul B. Kijsirikul ODL 30 3 0 24 Dec 2019
Hierarchical Character Embeddings: Learning Phonological and Semantic Representations in Languages of Logographic Origin using Recursive Neural Networks Minh Nguyen G. Ngo Nancy F. Chen 19 19 0 20 Dec 2019
Just Add Functions: A Neural-Symbolic Language Model David Demeter Doug Downey 8 11 0 11 Dec 2019
Why are Adaptive Methods Good for Attention Models? J.N. Zhang Sai Praneeth Karimireddy Andreas Veit Seungyeon Kim Sashank J. Reddi Surinder Kumar S. Sra 18 79 0 06 Dec 2019
Fantastic Generalization Measures and Where to Find Them Yiding Jiang Behnam Neyshabur H. Mobahi Dilip Krishnan Samy Bengio AI4CE 14 596 0 04 Dec 2019
Domain-independent Dominance of Adaptive Methods Pedro H. P. Savarese David A. McAllester Sudarshan Babu Michael Maire ODL 18 22 0 04 Dec 2019
A Comparative Study of Pretrained Language Models on Thai Social Text Categorization Thanapapas Horsuwan Kasidis Kanwatchara P. Vateekul B. Kijsirikul 14 9 0 03 Dec 2019
TX-Ray: Quantifying and Explaining Model-Knowledge Transfer in (Un-)Supervised NLP Nils Rethmeier V. Saxena Isabelle Augenstein FAtt 25 2 0 02 Dec 2019
How Can We Know What Language Models Know? Zhengbao Jiang Frank F. Xu Jun Araki Graham Neubig KELM 41 1,373 0 28 Nov 2019
SimpleBooks: Long-term dependency book dataset with simplified English vocabulary for word-level language modeling Huyen Nguyen 9 2 0 27 Nov 2019
DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling Sachin Mehta Rik Koncel-Kedziorski Mohammad Rastegari Hannaneh Hajishirzi AI4TS 38 23 0 27 Nov 2019
Autoencoding Undirected Molecular Graphs With Neural Networks Jeppe Johan Waarkjaer Olsen Peter Ebert Christensen Martin Hangaard Hansen Alexander Rosenberg Johansen AI4CE 19 0 0 26 Nov 2019
Relevance-Promoting Language Model for Short-Text Conversation Xin Li Piji Li Wei Bi Xiaojiang Liu Wai Lam 16 11 0 26 Nov 2019
Single Headed Attention RNN: Stop Thinking With Your Head Stephen Merity 27 68 0 26 Nov 2019
AutoShrink: A Topology-aware NAS for Discovering Efficient Neural Architecture Tunhou Zhang Hsin-Pai Cheng Zhenwen Li Feng Yan Chengyu Huang H. Li Yiran Chen 10 9 0 21 Nov 2019
Thick-Net: Parallel Network Structure for Sequential Modeling Yu-Xuan Li Jin-Yuan Liu Liang Li Xiang Guan 19 0 0 19 Nov 2019
RotationOut as a Regularization Method for Neural Network Kaiqin Hu Barnabás Póczós 33 1 0 18 Nov 2019
Multi-Zone Unit for Recurrent Neural Networks Fandong Meng Jinchao Zhang Yang Liu Jie Zhou AI4CE 19 2 0 17 Nov 2019
A Subword Level Language Model for Bangla Language Aisha Khatun Anisur Rahman Hemayet Ahmed Chowdhury Md. Saiful Islam A. Tasnim 17 4 0 15 Nov 2019
Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits Achyudh Ram Ji Xin M. Nagappan Yaoliang Yu Rocío Cabrera Lozoya A. Sabetta Jimmy J. Lin 27 3 0 15 Nov 2019
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation Xiaozhi Wang Tianyu Gao Zhaocheng Zhu Zhengyan Zhang Zhiyuan Liu Juan-Zi Li Jian Tang 15 647 0 13 Nov 2019
Optimizing Millions of Hyperparameters by Implicit Differentiation Jonathan Lorraine Paul Vicol David Duvenaud DD 30 403 0 06 Nov 2019
On the Effectiveness of the Pooling Methods for Biomedical Relation Extraction with Deep Learning Tuan Ngo Nguyen Franck Dernoncourt Thien Huu Nguyen 19 5 0 04 Nov 2019
An Adaptive and Momental Bound Method for Stochastic Learning Jianbang Ding Xuancheng Ren Ruixuan Luo Xu Sun ODL 19 46 0 27 Oct 2019
FineText: Text Classification via Attention-based Language Model Fine-tuning Yunzhe Tao Saurabh Gupta Satyapriya Krishna Xiong Zhou Orchid Majumder Vineet Khare 21 3 0 25 Oct 2019
Efficient Decoupled Neural Architecture Search by Structure and Operation Sampling Heung-Chang Lee Do-Guk Kim Bohyung Han 38 6 0 23 Oct 2019