2212.00196
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
1 December 2022
Hamish Ivison
Noah A. Smith
Hannaneh Hajishirzi
Pradeep Dasigi
Papers citing "Data-Efficient Finetuning Using Cross-Task Nearest Neighbors"
38 / 38 papers shown
Title
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang
Qing Yang
Zhiyuan Zeng
Liliang Ren
Liu Liu
...
Jianfeng Gao
Weizhu Chen
Shuaiqiang Wang
Simon Shaolei Du
Yelong Shen
OffRL
ReLM
LRM
304
47
0
29 Apr 2025
Compute-Constrained Data Selection
Junjie Oscar Yin
Alexander M. Rush
175
1
0
21 Oct 2024
Threshold Filtering Packing for Supervised Fine-Tuning: Training Related Samples within Packs
Jiancheng Dong
Lei Jiang
Wei Jin
Lu Cheng
103
1
0
18 Aug 2024
kNN-Prompt: Nearest Neighbor Zero-Shot Inference
Weijia Shi
Julian Michael
Suchin Gururangan
Luke Zettlemoyer
RALM
VLM
88
32
0
27 May 2022
ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data
Xiaochuang Han
Yulia Tsvetkov
92
31
0
25 May 2022
Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
Haokun Liu
Derek Tam
Mohammed Muqeeth
Jay Mohta
Tenghao Huang
Joey Tianyi Zhou
Colin Raffel
107
934
0
11 May 2022
Unsupervised Cross-Task Generalization via Retrieval Augmentation
Bill Yuchen Lin
Kangmin Tan
Chris Miller
Beiwen Tian
Xiang Ren
LRM
RALM
84
49
0
17 Apr 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
125
861
0
16 Apr 2022
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
V. Aribandi
Yi Tay
Tal Schuster
J. Rao
H. Zheng
...
Jianmo Ni
Jai Gupta
Kai Hui
Sebastian Ruder
Donald Metzler
MoE
115
216
0
22 Nov 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
363
1,710
0
15 Oct 2021
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
Ilias Chalkidis
Abhik Jana
D. Hartung
M. Bommarito
Ion Androutsopoulos
Daniel Martin Katz
Nikolaos Aletras
AILaw
ELM
269
267
0
03 Oct 2021
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
258
3,793
0
03 Sep 2021
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
528
10,563
0
17 Jun 2021
A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers
Pradeep Dasigi
Kyle Lo
Iz Beltagy
Arman Cohan
Noah A. Smith
Matt Gardner
RALM
125
311
0
07 May 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
623
4,099
0
18 Apr 2021
What to Pre-Train on? Efficient Intermediate Task Selection
Clifton A. Poth
Jonas Pfeiffer
Andreas Rucklé
Iryna Gurevych
102
100
0
16 Apr 2021
An Empirical Comparison of Instance Attribution Methods for NLP
Pouya Pezeshkpour
Sarthak Jain
Byron C. Wallace
Sameer Singh
TDI
117
35
0
09 Apr 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
93
2,234
0
11 Jan 2021
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
911
42,520
0
28 May 2020
UnifiedQA: Crossing Format Boundaries With a Single QA System
Daniel Khashabi
Sewon Min
Tushar Khot
Ashish Sabharwal
Oyvind Tafjord
Peter Clark
Hannaneh Hajishirzi
160
742
0
02 May 2020
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
172
2,444
0
23 Apr 2020
Estimating Training Data Influence by Tracing Gradient Descent
G. Pruthi
Frederick Liu
Mukund Sundararajan
Satyen Kale
TDI
116
419
0
19 Feb 2020
REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu
Kenton Lee
Zora Tung
Panupong Pasupat
Ming-Wei Chang
RALM
147
2,121
0
10 Feb 2020
Generalization through Memorization: Nearest Neighbor Language Models
Urvashi Khandelwal
Omer Levy
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
RALM
179
842
0
01 Nov 2019
Adversarial NLI: A New Benchmark for Natural Language Understanding
Yixin Nie
Adina Williams
Emily Dinan
Joey Tianyi Zhou
Jason Weston
Douwe Kiela
137
1,012
0
31 Oct 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
521
20,376
0
23 Oct 2019
HellaSwag: Can a Machine Really Finish Your Sentence?
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
191
2,532
0
19 May 2019
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
Alex Jinpeng Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
281
2,327
0
02 May 2019
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
Dheeru Dua
Yizhong Wang
Pradeep Dasigi
Gabriel Stanovsky
Sameer Singh
Matt Gardner
AIMat
113
966
0
01 Mar 2019
Parameter-Efficient Transfer Learning for NLP
N. Houlsby
A. Giurgiu
Stanislaw Jastrzebski
Bruna Morrone
Quentin de Laroussilhe
Andrea Gesmundo
Mona Attariyan
Sylvain Gelly
226
4,529
0
02 Feb 2019
Representer Point Selection for Explaining Deep Neural Networks
Chih-Kuan Yeh
Joon Sik Kim
Ian En-Hsu Yen
Pradeep Ravikumar
TDI
94
254
0
23 Nov 2018
Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks
Jason Phang
Thibault Févry
Samuel R. Bowman
111
470
0
02 Nov 2018
WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations
Mohammad Taher Pilehvar
Jose Camacho-Collados
209
493
0
28 Aug 2018
Understanding Black-box Predictions via Influence Functions
Pang Wei Koh
Percy Liang
TDI
232
2,910
0
14 Mar 2017
Billion-scale similarity search with GPUs
Jeff Johnson
Matthijs Douze
Hervé Jégou
282
3,746
0
28 Feb 2017
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Noam M. Shazeer
Azalia Mirhoseini
Krzysztof Maziarz
Andy Davis
Quoc V. Le
Geoffrey E. Hinton
J. Dean
MoE
255
2,697
0
23 Jan 2017
Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
Yury Malkov
Dmitry A. Yashunin
AI4TS
143
1,486
0
30 Mar 2016
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.1K
150,433
0
22 Dec 2014