Papers citing "Language Models are Few-Shot Learners"

50 / 11,513 papers shown

Title
Exploring the limits of Concurrency in ML Training on Google TPUs Sameer Kumar James Bradbury C. Young Yu Emma Wang Anselm Levskaya ... Tao Wang Tayo Oguntebi Yazhou Zu Yuanzhong Xu Andy Swing BDL AIMat MoE LRM 25 27 0 07 Nov 2020
Machine Generation and Detection of Arabic Manipulated and Fake News El Moatez Billah Nagoudi AbdelRahim Elmadany Muhammad Abdul-Mageed Tariq Alhindi H. Cavusoglu DeLMO 24 50 0 05 Nov 2020
Detecting Hallucinated Content in Conditional Neural Sequence Generation Chunting Zhou Graham Neubig Jiatao Gu Mona T. Diab P. Guzmán Luke Zettlemoyer Marjan Ghazvininejad HILM 39 195 0 05 Nov 2020
Rearrangement: A Challenge for Embodied AI Dhruv Batra Angel X. Chang Sonia Chernova Andrew J. Davison Jia Deng ... Jitendra Malik Igor Mordatch Roozbeh Mottaghi Manolis Savva Hao Su LM&Ro 38 217 0 03 Nov 2020
Emergent Communication Pretraining for Few-Shot Machine Translation Yaoyiran Li Edoardo Ponti Ivan Vulić Anna Korhonen 25 19 0 02 Nov 2020
Melody-Conditioned Lyrics Generation with SeqGANs Yihao Chen Alexander Lerch GAN MGen 32 29 0 28 Oct 2020
Scaling Laws for Autoregressive Generative Modeling T. Henighan Jared Kaplan Mor Katz Mark Chen Christopher Hesse ... Nick Ryder Daniel M. Ziegler John Schulman Dario Amodei Sam McCandlish 53 408 0 28 Oct 2020
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks Jianfei Chen Yujie Gai Z. Yao Michael W. Mahoney Joseph E. Gonzalez MQ 20 58 0 27 Oct 2020
Dutch Humor Detection by Generating Negative Examples Thomas Winters Pieter Delobelle 19 10 0 26 Oct 2020
Automatically Identifying Words That Can Serve as Labels for Few-Shot Text Classification Timo Schick Helmut Schmid Hinrich Schütze VLM 19 206 0 26 Oct 2020
Pre-trained Summarization Distillation Sam Shleifer Alexander M. Rush 26 98 0 24 Oct 2020
Text Editing by Command Felix Faltings Michel Galley Gerold Hintz Chris Brockett Chris Quirk Jianfeng Gao Bill Dolan KELM 147 37 0 24 Oct 2020
Rethinking embedding coupling in pre-trained language models Hyung Won Chung Thibault Févry Henry Tsai Melvin Johnson Sebastian Ruder 95 142 0 24 Oct 2020
Text Style Transfer: A Review and Experimental Evaluation Zhiqiang Hu Roy Ka-wei Lee Charu C. Aggarwal Aston Zhang AI4TS 42 26 0 24 Oct 2020
An Evaluation Protocol for Generative Conversational Systems Seolhwa Lee Heuiseok Lim João Sedoc ELM 35 10 0 24 Oct 2020
Learning to Recognize Dialect Features Dorottya Demszky D. Sharma J. Clark Vinodkumar Prabhakaran Jacob Eisenstein 123 38 0 23 Oct 2020
Long Document Ranking with Query-Directed Sparse Transformer Jyun-Yu Jiang Chenyan Xiong Chia-Jung Lee Wei Wang 33 25 0 23 Oct 2020
On the Transformer Growth for Progressive BERT Training Xiaotao Gu Liyuan Liu Hongkun Yu Jing Li Chong Chen Jiawei Han VLM 69 51 0 23 Oct 2020
The Turking Test: Can Language Models Understand Instructions? Avia Efrat Omer Levy ELM LRM 34 96 0 22 Oct 2020
Language Models are Open Knowledge Graphs Chenguang Wang Xiao Liu D. Song SSL KELM 26 135 0 22 Oct 2020
Limitations of Autoregressive Models and Their Alternatives Chu-cheng Lin Aaron Jaech Xin Li Matthew R. Gormley Jason Eisner 29 58 0 22 Oct 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale Alexey Dosovitskiy Lucas Beyer Alexander Kolesnikov Dirk Weissenborn Xiaohua Zhai ... Matthias Minderer G. Heigold Sylvain Gelly Jakob Uszkoreit N. Houlsby ViT 41 39,428 0 22 Oct 2020
AdapterDrop: On the Efficiency of Adapters in Transformers Andreas Rucklé Gregor Geigle Max Glockner Tilman Beck Jonas Pfeiffer Nils Reimers Iryna Gurevych 57 255 0 22 Oct 2020
Is Retriever Merely an Approximator of Reader? Sohee Yang Minjoon Seo RALM 24 39 0 21 Oct 2020
Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation Laurel J. Orr Megan Leszczynski Simran Arora Sen Wu Neel Guha Xiao Ling Christopher Ré 143 48 0 20 Oct 2020
Local Knowledge Powered Conversational Agents Sashank Santhanam Ming-Yu Liu Raul Puri M. Shoeybi M. Patwary Bryan Catanzaro 29 4 0 20 Oct 2020
Neural Language Modeling for Contextualized Temporal Graph Generation Aman Madaan Yiming Yang 45 20 0 20 Oct 2020
Optimism in the Face of Adversity: Understanding and Improving Deep Learning through Adversarial Robustness Guillermo Ortiz-Jiménez Apostolos Modas Seyed-Mohsen Moosavi-Dezfooli P. Frossard AAML 31 48 0 19 Oct 2020
Consistency and Coherency Enhanced Story Generation Wei Wang Piji Li Haitao Zheng 30 11 0 17 Oct 2020
For self-supervised learning, Rationality implies generalization, provably Yamini Bansal Gal Kaplun Boaz Barak OOD SSL 60 22 0 16 Oct 2020
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers Preetum Nakkiran Behnam Neyshabur Hanie Sedghi OffRL 29 11 0 16 Oct 2020
Masked Contrastive Representation Learning for Reinforcement Learning Jinhua Zhu Yingce Xia Lijun Wu Jiajun Deng Wen-gang Zhou Tao Qin Houqiang Li SSL OffRL 34 55 0 15 Oct 2020
Neural Databases James Thorne Majid Yazdani Marzieh Saeidi Fabrizio Silvestri Sebastian Riedel A. Halevy NAI 34 9 0 14 Oct 2020
Pretrained Transformers for Text Ranking: BERT and Beyond Jimmy J. Lin Rodrigo Nogueira Andrew Yates VLM 244 612 0 13 Oct 2020
MixCo: Mix-up Contrastive Learning for Visual Representation Sungnyun Kim Gihun Lee Sangmin Bae Seyoung Yun SSL 112 80 0 13 Oct 2020
Improving Text Generation with Student-Forcing Optimal Transport Guoyin Wang Chunyuan Li Jianqiao Li Hao Fu Yuh-Chen Lin ... Ruiyi Zhang Wenlin Wang Dinghan Shen Qian Yang Lawrence Carin OT 30 17 0 12 Oct 2020
Neural, Symbolic and Neural-Symbolic Reasoning on Knowledge Graphs Jing Zhang Bo Chen Lingxi Zhang Xirui Ke Haipeng Ding NAI 40 3 0 12 Oct 2020
SMYRF: Efficient Attention using Asymmetric Clustering Giannis Daras Nikita Kitaev Augustus Odena A. Dimakis 31 44 0 11 Oct 2020
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks Nikunj Saunshi Sadhika Malladi Sanjeev Arora 31 87 0 07 Oct 2020
A ground-truth dataset and classification model for detecting bots in GitHub issue and PR comments M. Golzadeh Alexandre Decan Damien Legay T. Mens 31 73 0 07 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components Junwen Bai Weiran Wang Yingbo Zhou Caiming Xiong SSL AI4TS 27 12 0 07 Oct 2020
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective Wei Ping Shuohang Wang Yu Cheng Zhe Gan R. Jia Bo-wen Li Jingjing Liu AAML 46 113 0 05 Oct 2020
Local Label Point Correction for Edge Detection of Overlapping Cervical Cells Jiawei Liu Huijie Fan Qiang Wang Wentao Li Yandong Tang Danbo Wang Mingyi Zhou Li Chen 13 9 0 05 Oct 2020
PMI-Masking: Principled masking of correlated spans Yoav Levine Barak Lenz Opher Lieber Omri Abend Kevin Leyton-Brown Moshe Tennenholtz Y. Shoham 22 72 0 05 Oct 2020
Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models Thuy-Trang Vu Dinh Q. Phung Gholamreza Haffari 19 24 0 05 Oct 2020
Data-Efficient Pretraining via Contrastive Self-Supervision Nils Rethmeier Isabelle Augenstein 28 20 0 02 Oct 2020
Where Does Trust Break Down? A Quantitative Trust Analysis of Deep Neural Networks via Trust Matrix and Conditional Trust Densities Andrew Hryniowski Xiao Yu Wang A. Wong 25 10 0 30 Sep 2020
Understanding Human Intelligence through Human Limitations Thomas Griffiths 28 64 0 29 Sep 2020
Utility is in the Eye of the User: A Critique of NLP Leaderboards Kawin Ethayarajh Dan Jurafsky ELM 24 51 0 29 Sep 2020
From Twitter to Traffic Predictor: Next-Day Morning Traffic Prediction Using Social Media Data Weiran Yao Sean Qian 19 47 0 29 Sep 2020