Papers citing "Language Models are Few-Shot Learners"

50 / 12,200 papers shown

Title | Authors | Tags | Counts | Date
CoCon: A Self-Supervised Approach for Controlled Text Generation | Alvin Chan, Yew-Soon Ong, B. Pung, Aston Zhang, Jie Fu | | 79 86 0 | 05 Jun 2020
MLE-guided parameter search for task loss minimization in neural sequence modeling | Sean Welleck, Kyunghyun Cho | | 59 10 0 | 04 Jun 2020
Serving DNNs like Clockwork: Performance Predictability from the Bottom Up | A. Gujarati, Reza Karimi, Safya Alzayat, Wei Hao, Antoine Kaufmann, Ymir Vigfusson, Jonathan Mace | | 109 285 0 | 03 Jun 2020
A Survey on Transfer Learning in Natural Language Processing | Zaid Alyafeai, Maged S. Alshaibani, Irfan Ahmad | | 91 75 0 | 31 May 2020
Transferring Inductive Biases through Knowledge Distillation | Samira Abnar, Mostafa Dehghani, Willem H. Zuidema | | 90 60 0 | 31 May 2020
Predict-then-Decide: A Predictive Approach for Wait or Answer Task in Dialogue Systems | Zehao Lin, Shaobo Cui, Guodun Li, Xiaoming Kang, Feng Ji, Feng-Lin Li, Zhongzhou Zhao, Haiqing Chen, Yin Zhang | | 60 2 0 | 27 May 2020
Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction | L. Rasmy, Yang Xiang, Z. Xie, Cui Tao, Degui Zhi | AI4MH LM&MA | 101 698 0 | 22 May 2020
Movement Pruning: Adaptive Sparsity by Fine-Tuning | Victor Sanh, Thomas Wolf, Alexander M. Rush | | 79 487 0 | 15 May 2020
Tailoring and Evaluating the Wikipedia for in-Domain Comparable Corpora Extraction | C. España-Bonet, Alberto Barrón-Cedeño, Lluís Marquez | | 22 9 0 | 03 May 2020
Reinforcement Learning with Augmented Data | Michael Laskin, Kimin Lee, Adam Stooke, Lerrel Pinto, Pieter Abbeel, A. Srinivas | OffRL | 124 660 0 | 30 Apr 2020
Explainable Deep Learning: A Field Guide for the Uninitiated | Gabrielle Ras, Ning Xie, Marcel van Gerven, Derek Doran | AAML XAI | 111 379 0 | 30 Apr 2020
Deep Learning for Time Series Forecasting: Tutorial and Literature Survey | Konstantinos Benidis, Syama Sundar Rangapuram, Valentin Flunkert, Bernie Wang, Danielle C. Maddix, ..., David Salinas, Lorenzo Stella, François-Xavier Aubet, Laurent Callot, Tim Januschowski | AI4TS | 99 200 0 | 21 Apr 2020
Experience Grounds Language | Yonatan Bisk, Ari Holtzman, Jesse Thomason, Jacob Andreas, Yoshua Bengio, ..., Angeliki Lazaridou, Jonathan May, Aleksandr Nisnevich, Nicolas Pinto, Joseph P. Turian | | 102 360 0 | 21 Apr 2020
Improving Readability for Automatic Speech Recognition Transcription | Junwei Liao, Sefik Emre Eskimez, Liyang Lu, Yu Shi, Ming Gong, Linjun Shou, Hong Qu, Michael Zeng | | 67 56 0 | 09 Apr 2020
Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation | Dana Ruiter, Josef van Genabith, C. España-Bonet | SSL | 51 3 0 | 07 Apr 2020
Deep Learning Based Text Classification: A Comprehensive Review | Shervin Minaee, Nal Kalchbrenner, Min Zhang, Narjes Nikzad, M. Asgari-Chenaghlu, Jianfeng Gao | AILaw VLM AI4TS | 116 1,113 0 | 06 Apr 2020
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space | Chunyuan Li, Xiang Gao, Yuan Li, Baolin Peng, Xiujun Li, Yizhe Zhang, Jianfeng Gao | SSL DRL | 86 182 0 | 05 Apr 2020
A Low-cost Fault Corrector for Deep Neural Networks through Range Restriction | Zitao Chen, Guanpeng Li, Karthik Pattabiraman | AAML AI4CE | 87 17 0 | 30 Mar 2020
Machine learning as a model for cultural learning: Teaching an algorithm what it means to be fat | Alina Arseniev-Koehler, J. Foster | | 74 49 0 | 24 Mar 2020
Pre-trained Models for Natural Language Processing: A Survey | Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang | LM&MA VLM | 383 1,495 0 | 18 Mar 2020
ReZero is All You Need: Fast Convergence at Large Depth | Thomas C. Bachlechner, Bodhisattwa Prasad Majumder, H. H. Mao, G. Cottrell, Julian McAuley | AI4CE | 89 282 0 | 10 Mar 2020
Teaching Temporal Logics to Neural Networks | Christopher Hahn, Frederik Schmitt, Jens U. Kreber, M. Rabe, Bernd Finkbeiner | NAI | 109 67 0 | 06 Mar 2020
Iterative Averaging in the Quest for Best Test Error | Diego Granziol, Xingchen Wan, Samuel Albanie, Stephen J. Roberts | | 66 3 0 | 02 Mar 2020
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks | Chaoyue Liu, Libin Zhu, M. Belkin | ODL | 98 266 0 | 29 Feb 2020
On Biased Compression for Distributed Learning | Aleksandr Beznosikov, Samuel Horváth, Peter Richtárik, M. Safaryan | | 75 189 0 | 27 Feb 2020
A Primer in BERTology: What we know about how BERT works | Anna Rogers, Olga Kovaleva, Anna Rumshisky | OffRL | 125 1,507 0 | 27 Feb 2020
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT | Prakhar Ganesh, Yao Chen, Xin Lou, Mohammad Ali Khan, Yifan Yang, Hassan Sajjad, Preslav Nakov, Deming Chen, Marianne Winslett | AI4CE | 127 201 0 | 27 Feb 2020
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts | Max Ryabinin, Anton I. Gusev | FedML | 82 52 0 | 10 Feb 2020
Language Models Are An Effective Patient Representation Learning Technique For Electronic Health Record Data | E. Steinberg, Kenneth Jung, Jason Alan Fries, Conor K. Corbin, Stephen Pfohl, N. Shah | | 90 110 0 | 06 Jan 2020
Fast and energy-efficient neuromorphic deep learning with first-spike times | Julian Goltz, Laura Kriener, A. Baumbach, Sebastian Billaudelle, O. Breitwieser, ..., Á. F. Kungl, Walter Senn, Johannes Schemmel, K. Meier, Mihai A. Petrovici | | 137 132 0 | 24 Dec 2019
Extending Machine Language Models toward Human-Level Language Understanding | James L. McClelland, Felix Hill, Maja R. Rudolph, Jason Baldridge, Hinrich Schütze | LRM | 78 35 0 | 12 Dec 2019
Attentive Representation Learning with Adversarial Training for Short Text Clustering | Wei Zhang, Chao Dong, Jianhua Yin, Jianyong Wang | | 60 13 0 | 08 Dec 2019
Blockwise Self-Attention for Long Document Understanding | J. Qiu, Hao Ma, Omer Levy, Scott Yih, Sinong Wang, Jie Tang | | 109 254 0 | 07 Nov 2019
Discovering the Compositional Structure of Vector Representations with Role Learning Networks | Paul Soulos, R. Thomas McCoy, Tal Linzen, P. Smolensky | CoGe | 127 44 0 | 21 Oct 2019
Demon: Improved Neural Network Training with Momentum Decay | John Chen, Cameron R. Wolfe, Zhaoqi Li, Anastasios Kyrillidis | ODL | 106 15 0 | 11 Oct 2019
On the adequacy of untuned warmup for adaptive optimization | Jerry Ma, Denis Yarats | | 106 70 0 | 09 Oct 2019
DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators | Lu Lu, Pengzhan Jin, George Karniadakis | | 248 2,171 0 | 08 Oct 2019
Soft-Label Dataset Distillation and Text Dataset Distillation | Ilia Sucholutsky, Matthias Schonlau | DD | 141 135 0 | 06 Oct 2019
Distributed Learning of Deep Neural Networks using Independent Subnet Training | John Shelton Hyatt, Cameron R. Wolfe, Michael Lee, Yuxin Tang, Anastasios Kyrillidis, Christopher M. Jermaine | OOD | 92 39 0 | 04 Oct 2019
Semi-supervised Thai Sentence Segmentation Using Local and Distant Word Representations | Chanatip Saetia, Ekapol Chuangsuwanich, Tawunrat Chalothorn, P. Vateekul | | 72 5 0 | 04 Aug 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods | Aditya Mogadala, M. Kalimuthu, Dietrich Klakow | VLM | 141 136 0 | 22 Jul 2019
Norms for Beneficial A.I.: A Computational Analysis of the Societal Value Alignment Problem | Pedro M. Fernandes, Francisco C. Santos, Manuel Lopes | | 35 11 0 | 26 Jun 2019
Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation? | Tianxing He, Jingzhao Zhang, Zhiming Zhou, James R. Glass | | 99 32 0 | 25 May 2019
An Information Theoretic Interpretation to Deep Neural Networks | Shao-Lun Huang, Xiangxiang Xu, Lizhong Zheng, G. Wornell | FAtt | 90 44 0 | 16 May 2019
An Attentive Survey of Attention Models | S. Chaudhari, Varun Mithal, Gungor Polatkan, R. Ramanath | | 192 664 0 | 05 Apr 2019
A Brain-inspired Algorithm for Training Highly Sparse Neural Networks | Zahra Atashgahi, Joost Pieterse, Shiwei Liu, Decebal Constantin Mocanu, Raymond N. J. Veldhuis, Mykola Pechenizkiy | | 74 15 0 | 17 Mar 2019
Investigating Antigram Behaviour using Distributional Semantics | Saptarshi Sengupta | | 31 0 0 | 15 Jan 2019
Automated Machine Learning: From Principles to Practices | Quanming Yao, Mengshuo Wang, Hugo Jair Escalante, Huan Zhao, Qiang Yang | | 105 259 0 | 31 Oct 2018
Deep Learning for Genomics: A Concise Overview | Tianwei Yue, Yuanxin Wang, Longxiang Zhang, Chunming Gu, Haohan Wang, Wenping Wang, Qi Lyu, Yujie Dun | AILaw VLM BDL | 86 91 0 | 02 Feb 2018
Quantifying the probable approximation error of probabilistic inference programs | Marco F. Cusumano-Towner, Vikash K. Mansinghka | | 100 5 0 | 31 May 2016