ResearchTrend.AI

Language Models are Few-Shot Learners
arXiv:2005.14165 · v4 (latest) · 28 May 2020
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, T. Henighan, R. Child, Aditya A. Ramesh, Daniel M. Ziegler, Jeff Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, B. Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei | BDL

Papers citing "Language Models are Few-Shot Learners"

Showing 50 of 12,243 citing papers.
Learning to Recognize Dialect Features
  Dorottya Demszky, D. Sharma, J. Clark, Vinodkumar Prabhakaran, Jacob Eisenstein | 216 · 39 · 0 | 23 Oct 2020
Long Document Ranking with Query-Directed Sparse Transformer
  Jyun-Yu Jiang, Chenyan Xiong, Chia-Jung Lee, Wei Wang | 71 · 25 · 0 | 23 Oct 2020
Robust Document Representations using Latent Topics and Metadata
  Natraj Raman, Armineh Nourbakhsh, Sameena Shah, Manuela Veloso | 21 · 0 · 0 | 23 Oct 2020
On the Transformer Growth for Progressive BERT Training
  Xiaotao Gu, Liyuan Liu, Hongkun Yu, Jing Li, Chong Chen, Jiawei Han | VLM | 120 · 54 · 0 | 23 Oct 2020
An Analysis of LIME for Text Data
  Dina Mardaoui, Damien Garreau | FAtt | 187 · 45 · 0 | 23 Oct 2020
Towards Zero-Shot Multilingual Synthetic Question and Answer Generation for Cross-Lingual Reading Comprehension
  Siamak Shakeri, Noah Constant, Mihir Kale, Linting Xue | SyDa | 72 · 28 · 0 | 22 Oct 2020
The Turking Test: Can Language Models Understand Instructions?
  Avia Efrat, Omer Levy | ELM, LRM | 114 · 96 · 0 | 22 Oct 2020
Language Models are Open Knowledge Graphs
  Chenguang Wang, Xiao Liu, Basel Alomair | SSL, KELM | 79 · 137 · 0 | 22 Oct 2020
Limitations of Autoregressive Models and Their Alternatives
  Chu-cheng Lin, Aaron Jaech, Xin Li, Matthew R. Gormley, Jason Eisner | 86 · 63 · 0 | 22 Oct 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
  Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby | ViT | 702 · 41,681 · 0 | 22 Oct 2020
AdapterDrop: On the Efficiency of Adapters in Transformers
  Andreas Rücklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych | 125 · 267 · 0 | 22 Oct 2020
Incorporating Stylistic Lexical Preferences in Generative Language Models
  Hrituraj Singh, Gaurav Verma, Balaji Vasan Srinivasan | 23 · 5 · 0 | 22 Oct 2020
MAM: Masked Acoustic Modeling for End-to-End Speech-to-Text Translation
  Junkun Chen, Mingbo Ma, Renjie Zheng, Liang Huang | 90 · 21 · 0 | 22 Oct 2020
Is Retriever Merely an Approximator of Reader?
  Sohee Yang, Minjoon Seo | RALM | 85 · 42 · 0 | 21 Oct 2020
Exploring Sequence-to-Sequence Models for SPARQL Pattern Composition
  Anand Panchbhai, Tommaso Soru, Edgard Marx | 18 · 5 · 0 | 21 Oct 2020
Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation
  Laurel J. Orr, Megan Leszczynski, Simran Arora, Sen Wu, Neel Guha, Xiao Ling, Christopher Ré | 209 · 48 · 0 | 20 Oct 2020
Local Knowledge Powered Conversational Agents
  Sashank Santhanam, Ming-Yu Liu, Raul Puri, Mohammad Shoeybi, M. Patwary, Bryan Catanzaro | 93 · 4 · 0 | 20 Oct 2020
Neural Language Modeling for Contextualized Temporal Graph Generation
  Aman Madaan, Yiming Yang | 101 · 20 · 0 | 20 Oct 2020
Optimism in the Face of Adversity: Understanding and Improving Deep Learning through Adversarial Robustness
  Guillermo Ortiz-Jiménez, Apostolos Modas, Seyed-Mohsen Moosavi-Dezfooli, P. Frossard | AAML | 121 · 48 · 0 | 19 Oct 2020
Consistency and Coherency Enhanced Story Generation
  Wei Wang, Piji Li, Haitao Zheng | 71 · 11 · 0 | 17 Oct 2020
CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding
  Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Jiawei Han, Weizhu Chen | 204 · 69 · 0 | 16 Oct 2020
Adaptive Dense-to-Sparse Paradigm for Pruning Online Recommendation System with Non-Stationary Data
  Mao Ye, Dhruv Choudhary, Jiecao Yu, Ellie Wen, Zeliang Chen, Jiyan Yang, Jongsoo Park, Qiang Liu, A. Kejariwal | 56 · 9 · 0 | 16 Oct 2020
Reflective Decoding: Beyond Unidirectional Generation with Off-the-Shelf Language Models
  Peter West, Ximing Lu, Ari Holtzman, Chandra Bhagavatula, Jena D. Hwang, Yejin Choi | OffRL | 59 · 13 · 0 | 16 Oct 2020
An Approximation Algorithm for Optimal Subarchitecture Extraction
  Adrian de Wynter | 70 · 1 · 0 | 16 Oct 2020
For self-supervised learning, Rationality implies generalization, provably
  Yamini Bansal, Gal Kaplun, Boaz Barak | OOD, SSL | 112 · 22 · 0 | 16 Oct 2020
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
  Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi | OffRL | 97 · 11 · 0 | 16 Oct 2020
PiRhDy: Learning Pitch-, Rhythm-, and Dynamics-aware Embeddings for Symbolic Music
  Hongru Liang, Wenqiang Lei, P. Chan, Zhenglu Yang, Maosong Sun, Tat-Seng Chua | 64 · 41 · 0 | 16 Oct 2020
Masked Contrastive Representation Learning for Reinforcement Learning
  Jinhua Zhu, Yingce Xia, Lijun Wu, Jiajun Deng, Wen-gang Zhou, Tao Qin, Houqiang Li | SSL, OffRL | 110 · 60 · 0 | 15 Oct 2020
Decoding Methods for Neural Narrative Generation
  Alexandra DeLucia, Aaron Mueller, Xiang Lisa Li, João Sedoc | 62 · 26 · 0 | 14 Oct 2020
Explaining Creative Artifacts
  Lav Varshney, Nazneen Rajani, R. Socher | 129 · 2 · 0 | 14 Oct 2020
Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries
  Xiaofei Sun, Zijun Sun, Yuxian Meng, Jiwei Li, Chun Fan | 59 · 20 · 0 | 14 Oct 2020
Length-Adaptive Transformer: Train Once with Length Drop, Use Anytime with Search
  Gyuwan Kim, Kyunghyun Cho | 94 · 98 · 0 | 14 Oct 2020
Neural Databases
  James Thorne, Majid Yazdani, Marzieh Saeidi, Fabrizio Silvestri, Sebastian Riedel, A. Halevy | NAI | 96 · 9 · 0 | 14 Oct 2020
Pretrained Transformers for Text Ranking: BERT and Beyond
  Jimmy J. Lin, Rodrigo Nogueira, Andrew Yates | VLM | 387 · 628 · 0 | 13 Oct 2020
MixCo: Mix-up Contrastive Learning for Visual Representation
  Sungnyun Kim, Gihun Lee, Sangmin Bae, Seyoung Yun | SSL | 167 · 81 · 0 | 13 Oct 2020
Improving Text Generation with Student-Forcing Optimal Transport
  Guoyin Wang, Chunyuan Li, Jianqiao Li, Hao Fu, Yuh-Chen Lin, ..., Ruiyi Zhang, Wenlin Wang, Dinghan Shen, Qian Yang, Lawrence Carin | OT | 78 · 18 · 0 | 12 Oct 2020
COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs
  Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sakaguchi, Antoine Bosselut, Yejin Choi | 81 · 415 · 0 | 12 Oct 2020
Neural, Symbolic and Neural-Symbolic Reasoning on Knowledge Graphs
  Jing Zhang, Bo Chen, Lingxi Zhang, Xirui Ke, Haipeng Ding | NAI | 97 · 3 · 0 | 12 Oct 2020
Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually)
  Alex Warstadt, Yian Zhang, Haau-Sing Li, Haokun Liu, Samuel R. Bowman | SSL, AI4CE | 78 · 21 · 0 | 11 Oct 2020
SMYRF: Efficient Attention using Asymmetric Clustering
  Giannis Daras, Nikita Kitaev, Augustus Odena, A. Dimakis | 101 · 46 · 0 | 11 Oct 2020
What causes the test error? Going beyond bias-variance via ANOVA
  Licong Lin, Yan Sun | 93 · 34 · 0 | 11 Oct 2020
AutoQA: From Databases To QA Semantic Parsers With Only Synthetic Training Data
  Silei Xu, Sina J. Semnani, Giovanni Campagna, M. Lam | 76 · 52 · 0 | 09 Oct 2020
On the importance of pre-training data volume for compact language models
  Vincent Micheli, Martin d'Hoffschmidt, François Fleuret | 67 · 42 · 0 | 08 Oct 2020
AxFormer: Accuracy-driven Approximation of Transformers for Faster, Smaller and more Accurate NLP Models
  Amrit Nagarajan, Sanchari Sen, Jacob R. Stevens, A. Raghunathan | 18 · 3 · 0 | 07 Oct 2020
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
  Nikunj Saunshi, Sadhika Malladi, Sanjeev Arora | 87 · 89 · 0 | 07 Oct 2020
A ground-truth dataset and classification model for detecting bots in GitHub issue and PR comments
  M. Golzadeh, Alexandre Decan, Damien Legay, T. Mens | 57 · 78 · 0 | 07 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components
  Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong | SSL, AI4TS | 75 · 12 · 0 | 07 Oct 2020
A Closer Look at Codistillation for Distributed Training
  Shagun Sodhani, Olivier Delalleau, Mahmoud Assran, Koustuv Sinha, Nicolas Ballas, Michael G. Rabbat | 129 · 8 · 0 | 06 Oct 2020
A Transformer-based Framework for Multivariate Time Series Representation Learning
  George Zerveas, Srideepika Jayaraman, Dhaval Patel, A. Bhamidipaty, Carsten Eickhoff | AI4TS | 109 · 940 · 0 | 06 Oct 2020
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective
  Wei Ping, Shuohang Wang, Yu Cheng, Zhe Gan, R. Jia, Yue Liu, Jingjing Liu | AAML | 215 · 116 · 0 | 05 Oct 2020