It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners

15 September 2020
Timo Schick
Hinrich Schütze
arXiv: 2009.07118

Papers citing "It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners"

50 / 606 papers shown
Interviewer-Candidate Role Play: Towards Developing Real-World NLP Systems
Neeraj Varshney
Swaroop Mishra
Chitta Baral
14
0
0
01 Jul 2021
Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
Robert L Logan IV
Ivana Balažević
Eric Wallace
Fabio Petroni
Sameer Singh
Sebastian Riedel
VPVLM
39
207
0
24 Jun 2021
BARTScore: Evaluating Generated Text as Text Generation
Weizhe Yuan
Graham Neubig
Pengfei Liu
13
808
0
22 Jun 2021
CPM-2: Large-scale Cost-effective Pre-trained Language Models
Zhengyan Zhang
Yuxian Gu
Xu Han
Shengqi Chen
Chaojun Xiao
...
Minlie Huang
Wentao Han
Yang Liu
Xiaoyan Zhu
Maosong Sun
MoE
37
86
0
20 Jun 2021
Label prompt for multi-label text classification
Rui Song
Xingbing Chen
Zelong Liu
Haining An
Zhiqi Zhang
Xiaoguang Wang
Hao Xu
VLM
26
4
0
18 Jun 2021
Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases
Boxi Cao
Hongyu Lin
Xianpei Han
Le Sun
Lingyong Yan
M. Liao
Tong Xue
Jin Xu
21
130
0
17 Jun 2021
Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
Colin Wei
Sang Michael Xie
Tengyu Ma
24
97
0
17 Jun 2021
Question Answering Infused Pre-training of General-Purpose Contextualized Representations
Robin Jia
M. Lewis
Luke Zettlemoyer
18
28
0
15 Jun 2021
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP
Jiaao Chen
Derek Tam
Colin Raffel
Mohit Bansal
Diyi Yang
28
172
0
14 Jun 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
40
815
0
14 Jun 2021
Zero-Shot Controlled Generation with Encoder-Decoder Transformers
Devamanyu Hazarika
Mahdi Namazifar
Dilek Z. Hakkani-Tür
AI4CE
39
6
0
11 Jun 2021
A Semi-supervised Multi-task Learning Approach to Classify Customer Contact Intents
Li Dong
Matthew C. Spencer
Amir Biagi
21
3
0
10 Jun 2021
True Few-Shot Learning with Language Models
Ethan Perez
Douwe Kiela
Kyunghyun Cho
21
428
0
24 May 2021
A brain basis of dynamical intelligence for AI and computational neuroscience
J. Monaco
Kanaka Rajan
Grace M. Hwang
AI4CE
26
6
0
15 May 2021
Entailment as Few-Shot Learner
Sinong Wang
Han Fang
Madian Khabsa
Hanzi Mao
Hao Ma
35
183
0
29 Apr 2021
Text-to-Text Multi-view Learning for Passage Re-ranking
Jia-Huei Ju
Jheng-Hong Yang
Chuan-Ju Wang
AIMat
25
20
0
29 Apr 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
217
180
0
18 Apr 2021
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
Kang Min Yoo
Dongju Park
Jaewook Kang
Sang-Woo Lee
Woomyeong Park
31
235
0
18 Apr 2021
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
279
1,124
0
18 Apr 2021
Constrained Language Models Yield Few-Shot Semantic Parsers
Richard Shin
C. H. Lin
Sam Thomson
Charles C. Chen
Subhro Roy
Emmanouil Antonios Platanios
Adam Pauls
Dan Klein
J. Eisner
Benjamin Van Durme
301
198
0
18 Apr 2021
Moving on from OntoNotes: Coreference Resolution Model Transfer
Patrick Xia
Benjamin Van Durme
27
30
0
17 Apr 2021
Surface Form Competition: Why the Highest Probability Answer Isn't Always Right
Ari Holtzman
Peter West
Vered Shwartz
Yejin Choi
Luke Zettlemoyer
LRM
22
230
0
16 Apr 2021
KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction
Xiang Chen
Ningyu Zhang
Xin Xie
Shumin Deng
Yunzhi Yao
Chuanqi Tan
Fei Huang
Luo Si
Huajun Chen
38
401
0
15 Apr 2021
Generating Datasets with Pretrained Language Models
Timo Schick
Hinrich Schütze
13
234
0
15 Apr 2021
Pseudo Zero Pronoun Resolution Improves Zero Anaphora Resolution
Ryuto Konno
Shun Kiyono
Yuichiroh Matsubayashi
Hiroki Ouchi
Kentaro Inui
9
10
0
15 Apr 2021
Learning How to Ask: Querying LMs with Mixtures of Soft Prompts
Guanghui Qin
J. Eisner
7
533
0
14 Apr 2021
Factual Probing Is [MASK]: Learning vs. Learning to Recall
Zexuan Zhong
Dan Friedman
Danqi Chen
14
403
0
12 Apr 2021
Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
Ruiqi Zhong
Kristy Lee
Zheng-Wei Zhang
Dan Klein
30
166
0
10 Apr 2021
What Will it Take to Fix Benchmarking in Natural Language Understanding?
Samuel R. Bowman
George E. Dahl
ELM
ALM
30
156
0
05 Apr 2021
Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2
Gregor Betz
Kyle Richardson
Christian Voigt
ReLM
LRM
21
30
0
24 Mar 2021
Improving and Simplifying Pattern Exploiting Training
Derek Tam
Rakesh R Menon
Mohit Bansal
Shashank Srivastava
Colin Raffel
13
149
0
22 Mar 2021
Using Molecular Embeddings in QSAR Modeling: Does it Make a Difference?
María Virginia Sabando
I. Ponzoni
E. Milios
Axel J. Soto
27
26
0
20 Mar 2021
GPT Understands, Too
Xiao Liu
Yanan Zheng
Zhengxiao Du
Ming Ding
Yujie Qian
Zhilin Yang
Jie Tang
VLM
63
1,144
0
18 Mar 2021
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDL
AI4CE
31
1,478
0
18 Mar 2021
How Many Data Points is a Prompt Worth?
Teven Le Scao
Alexander M. Rush
VLM
66
296
0
15 Mar 2021
Inductive Relation Prediction by BERT
H. Zha
Zhiyu Zoey Chen
Xifeng Yan
29
54
0
12 Mar 2021
Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth
Yihe Dong
Jean-Baptiste Cordonnier
Andreas Loukas
52
373
0
05 Mar 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
259
374
0
28 Feb 2021
A Survey on Stance Detection for Mis- and Disinformation Identification
Momchil Hardalov
Arnav Arora
Preslav Nakov
Isabelle Augenstein
111
132
0
27 Feb 2021
COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Yu Meng
Chenyan Xiong
Payal Bajaj
Saurabh Tiwary
Paul N. Bennett
Jiawei Han
Xia Song
125
202
0
16 Feb 2021
Structured Prediction as Translation between Augmented Natural Languages
Giovanni Paolini
Ben Athiwaratkun
Jason Krone
Jie Ma
Alessandro Achille
Rishita Anubhai
Cicero Nogueira dos Santos
Bing Xiang
Stefano Soatto
19
284
0
14 Jan 2021
Analyzing Commonsense Emergence in Few-shot Knowledge Models
Jeff Da
Ronan Le Bras
Ximing Lu
Yejin Choi
Antoine Bosselut
AI4MH
KELM
64
40
0
01 Jan 2021
WARP: Word-level Adversarial ReProgramming
Karen Hambardzumyan
Hrant Khachatrian
Jonathan May
AAML
254
342
0
01 Jan 2021
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
243
1,919
0
31 Dec 2020
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters
Mengjie Zhao
Yi Zhu
Ehsan Shareghi
Ivan Vulić
Roi Reichart
Anna Korhonen
Hinrich Schütze
32
64
0
31 Dec 2020
Few-Shot Text Generation with Pattern-Exploiting Training
Timo Schick
Hinrich Schütze
31
145
0
22 Dec 2020
Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo
Alexander M. Rush
Yoon Kim
13
385
0
14 Dec 2020
UmBERTo-MTSA @ AcCompl-It: Improving Complexity and Acceptability Prediction with Multi-task Learning on Self-Supervised Annotations
Gabriele Sarti
11
5
0
10 Nov 2020
A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios
Michael A. Hedderich
Lukas Lange
Heike Adel
Jannik Strötgen
Dietrich Klakow
202
286
0
23 Oct 2020
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
Nikunj Saunshi
Sadhika Malladi
Sanjeev Arora
20
86
0
07 Oct 2020