Papers citing "Language Models are Few-Shot Learners"

50 / 12,427 papers shown

Title
Fast Inference from Transformers via Speculative Decoding Yaniv Leviathan Matan Kalman Yossi Matias LRM 155 738 0 30 Nov 2022
Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification Jijie Wu Dongliang Chang Aneeshan Sain Xiaoxu Li Zhanyu Ma Jie Cao Jun Guo Yi-Zhe Song 78 37 0 30 Nov 2022
Data-driven Science and Machine Learning Methods in Laser-Plasma Physics Andreas Döpp C. Eberle S. Howard F. Irshad Jinpu Lin M. Streeter AI4CE 79 72 0 30 Nov 2022
KRLS: Improving End-to-End Response Generation in Task Oriented Dialog with Reinforced Keywords Learning Xiao Yu Qingyang Wu Kun Qian Zhou Yu OffRL 70 12 0 30 Nov 2022
Protein Language Models and Structure Prediction: Connection and Progression Bozhen Hu Jun Xia Jiangbin Zheng Cheng Tan Yufei Huang Yongjie Xu Stan Z. Li 70 41 0 30 Nov 2022
Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off Shaoyi Huang Bowen Lei Dongkuan Xu Hongwu Peng Yue Sun Mimi Xie Caiwen Ding 105 19 0 30 Nov 2022
CLIP-Nav: Using CLIP for Zero-Shot Vision-and-Language Navigation Vishnu Sashank Dorbala Gunnar Sigurdsson Robinson Piramuthu Jesse Thomason Gaurav Sukhatme LM&Ro 92 59 0 30 Nov 2022
Scalable Pathogen Detection from Next Generation DNA Sequencing with Deep Learning S. Narayanan Sathyanarayanan N. Aakur Priyadharsini Ramamurthy A. Bagavathi V. Ramnath A. Ramachandran 104 0 0 30 Nov 2022
Model Extraction Attack against Self-supervised Speech Models Tsung-Yuan Hsu Chen-An Li Tung-Yu Wu Hung-yi Lee 48 1 0 29 Nov 2022
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts Trevor Gale Deepak Narayanan C. Young Matei A. Zaharia MoE 81 109 0 29 Nov 2022
What learning algorithm is in-context learning? Investigations with linear models Ekin Akyürek Dale Schuurmans Jacob Andreas Tengyu Ma Denny Zhou 125 493 0 28 Nov 2022
Is Conditional Generative Modeling all you need for Decision-Making? Anurag Ajay Yilun Du Abhi Gupta J. Tenenbaum Tommi Jaakkola Pulkit Agrawal DiffM 162 408 0 28 Nov 2022
Action-GPT: Leveraging Large-scale Language Models for Improved and Generalized Action Generation Sai Shashank Kalakonda Shubh Maheshwari Ravi Kiran Sarvadevabhatla 114 28 0 28 Nov 2022
On the Effectiveness of Parameter-Efficient Fine-Tuning Z. Fu Haoran Yang Anthony Man-Cho So Wai Lam Lidong Bing Nigel Collier 78 162 0 28 Nov 2022
A Survey on Conversational Search and Applications in Biomedicine Naga Sai Krishna Adatrao Gowtham Reddy Gadireddy Jiho Noh MedIm 64 4 0 28 Nov 2022
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes Aviral Kumar Rishabh Agarwal Xinyang Geng George Tucker Sergey Levine OffRL 131 51 0 28 Nov 2022
MGFN: Magnitude-Contrastive Glance-and-Focus Network for Weakly-Supervised Video Anomaly Detection Yukang Chen Zhengzhe Liu Baoheng Zhang W. Fok Xiaojuan Qi Yik-Chung Wu 119 129 0 28 Nov 2022
Continuous diffusion for categorical data Sander Dieleman Laurent Sartran Arman Roshannai Nikolay Savinov Yaroslav Ganin ... Conor Durkan Curtis Hawthorne Rémi Leblond Will Grathwohl J. Adler DiffM 121 106 0 28 Nov 2022
DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models Zhengfu He Tianxiang Sun Kuan-Chieh Wang Xuanjing Huang Xipeng Qiu DiffM VLM 101 131 0 28 Nov 2022
Fine-tuning language models to find agreement among humans with diverse preferences Michiel A. Bakker Martin Chadwick Hannah R. Sheahan Michael Henry Tessler Lucy Campbell-Gillingham ... Nat McAleese Amelia Glaese John Aslanides M. Botvinick Christopher Summerfield ALM 110 237 0 28 Nov 2022
A Comprehensive Survey on Enterprise Financial Risk Analysis from Big Data Perspective Yu Zhao Huaming Du Qing Li Fuzhen Zhuang Ji Liu Gang Kou 157 1 0 28 Nov 2022
Understanding BLOOM: An empirical study on diverse NLP tasks Parag Dakle Sai Krishna Rallabandi Preethi Raghavan AI4CE 89 4 0 27 Nov 2022
Latent SHAP: Toward Practical Human-Interpretable Explanations Ron Bitton Alon Malach Amiel Meiseles Satoru Momiyama Toshinori Araki Jun Furukawa Yuval Elovici A. Shabtai FAtt 35 4 0 27 Nov 2022
Traditional Classification Neural Networks are Good Generators: They are Competitive with DDPMs and GANs Guangrun Wang Philip Torr 83 9 0 27 Nov 2022
How Crucial is Transformer in Decision Transformer? Max Siebenborn Boris Belousov Junning Huang Jan Peters 54 15 0 26 Nov 2022
Asymmetric Cross-Scale Alignment for Text-Based Person Search Zhong Ji Junhua Hu Deyin Liu Yuan Wu Ye Zhao 106 46 0 26 Nov 2022
Deep Learning Training Procedure Augmentations Cristian Simionescu 104 1 0 25 Nov 2022
A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation Hiroki Furuta Yusuke Iwasawa Yutaka Matsuo S. Gu 79 17 0 25 Nov 2022
Solving math word problems with process- and outcome-based feedback J. Uesato Nate Kushman Ramana Kumar Francis Song Noah Y. Siegel L. Wang Antonia Creswell G. Irving I. Higgins FaML ReLM AIMat LRM 133 362 0 25 Nov 2022
GPT-3-driven pedagogical agents for training children's curious question-asking skills Rania Abdelghani Yen-Hsiang Wang Xingdi Yuan Tong Wang Pauline Lucas Hélène Sauzéon Pierre-Yves Oudeyer 118 108 0 25 Nov 2022
Semantic Table Detection with LayoutLMv3 Ivan Silajev Niels Victor Phillip Mortimer 48 1 0 25 Nov 2022
Testing the effectiveness of saliency-based explainability in NLP using randomized survey-based experiments Adel Rahimi Shaurya Jain FAtt 99 0 0 25 Nov 2022
Complementary Explanations for Effective In-Context Learning Xi Ye Srini Iyer Asli Celikyilmaz Ves Stoyanov Greg Durrett Ramakanth Pasunuru ReLM LRM 112 96 0 25 Nov 2022
TPA-Net: Generate A Dataset for Text to Physics-based Animation Yuxing Qiu Feng Gao Minchen Li Govind Thattai Yin Yang Chenfanfu Jiang PINN DiffM VGen 58 0 0 25 Nov 2022
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism Xupeng Miao Yujie Wang Youhe Jiang Chunan Shi Xiaonan Nie Hailin Zhang Tengjiao Wang GNN MoE 110 64 0 25 Nov 2022
Signed Binary Weight Networks Sachit Kuhar Alexey Tumanov Judy Hoffman MQ 86 1 0 25 Nov 2022
Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt Zhichao Yang Sunjae Kwon Zonghai Yao Hongfeng Yu 74 18 0 24 Nov 2022
Undesirable Biases in NLP: Addressing Challenges of Measurement Oskar van der Wal Dominik Bachmann Alina Leidinger L. Maanen Willem H. Zuidema K. Schulz 86 7 0 24 Nov 2022
Prototypical Fine-tuning: Towards Robust Performance Under Varying Data Sizes Yiqiao Jin Xiting Wang Y. Hao Yizhou Sun Xing Xie 94 11 0 24 Nov 2022
TSGP: Two-Stage Generative Prompting for Unsupervised Commonsense Question Answering Yueqing Sun Yu Zhang Le Qi Qi Shi ReLM RALM LRM 56 6 0 24 Nov 2022
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets David Venuto Sherry Yang Pieter Abbeel Doina Precup Igor Mordatch Ofir Nachum OffRL 56 5 0 23 Nov 2022
Automatic Generation of Socratic Subquestions for Teaching Math Word Problems Kumar Shridhar Jakub Macina Mennatallah El-Assady Tanmay Sinha Manu Kapur Mrinmaya Sachan AIMat 98 49 0 23 Nov 2022
Learning Regularized Positional Encoding for Molecular Prediction Xiang Gao Weihao Gao Wen Xiao Zhirui Wang Chong Wang Liang Xiang AI4CE 80 2 0 23 Nov 2022
Masked Autoencoding for Scalable and Generalizable Decision Making Fangchen Liu Hao Liu Aditya Grover Pieter Abbeel OffRL 87 49 0 23 Nov 2022
AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies Weiyan Shi Emily Dinan Adithya Renduchintala Daniel Fried Athul Paul Jacob Zhou Yu M. Lewis AAML 111 2 0 22 Nov 2022
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks Wenhu Chen Xueguang Ma Xinyi Wang William W. Cohen ReLM ReCod LRM 244 829 0 22 Nov 2022
Retrieval-Augmented Multimodal Language Modeling Michihiro Yasunaga Armen Aghajanyan Weijia Shi Rich James J. Leskovec Percy Liang M. Lewis Luke Zettlemoyer Wen-tau Yih RALM 104 108 0 22 Nov 2022
HyperTuning: Toward Adapting Large Language Models without Back-propagation Jason Phang Yi Mao Pengcheng He Weizhu Chen 96 34 0 22 Nov 2022
Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions S. Bhattamishra Arkil Patel Varun Kanade Phil Blunsom 116 49 0 22 Nov 2022
Coreference Resolution through a seq2seq Transition-Based System Bernd Bohnet Chris Alberti Michael Collins 80 40 0 22 Nov 2022