2306.11670
GIO: Gradient Information Optimization for Training Dataset Selection
20 June 2023
Dante Everaert
Christopher Potts
Papers citing "GIO: Gradient Information Optimization for Training Dataset Selection"
28 / 28 papers shown
Title
MixMin: Finding Data Mixtures via Convex Minimization
Anvith Thudi
Evianne Rovers
Yangjun Ruan
Tristan Thrush
Chris J. Maddison
101
0
0
14 Feb 2025
Improving Pretraining Data Using Perplexity Correlations
Tristan Thrush
Christopher Potts
Tatsunori Hashimoto
92
22
0
09 Sep 2024
Data Selection for Language Models via Importance Resampling
Sang Michael Xie
Shibani Santurkar
Tengyu Ma
Percy Liang
123
196
0
06 Feb 2023
Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher
Robert Geirhos
Shashank Shekhar
Surya Ganguli
Ari S. Morcos
100
444
0
29 Jun 2022
Dataset Pruning: Reducing Training Data by Examining Generalization Influence
Shuo Yang
Zeke Xie
Hanyu Peng
Minjing Xu
Mingming Sun
P. Li
DD
228
114
0
19 May 2022
ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction
Keshav Santhanam
Omar Khattab
Jon Saad-Falcon
Christopher Potts
Matei A. Zaharia
112
417
0
02 Dec 2021
NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework
Xingcheng Yao
Yanan Zheng
Xiaocong Yang
Zhilin Yang
59
45
0
07 Nov 2021
The Perils of Using Mechanical Turk to Evaluate Open-Ended Text Generation
Marzena Karpinska
Nader Akoury
Mohit Iyyer
281
108
0
14 Sep 2021
Deep Learning on a Data Diet: Finding Important Examples Early in Training
Mansheej Paul
Surya Ganguli
Gintare Karolina Dziugaite
121
462
0
15 Jul 2021
What's in the Box? A Preliminary Analysis of Undesirable Content in the Common Crawl Corpus
A. Luccioni
J. Viviano
86
118
0
06 May 2021
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
389
6,813
0
23 Dec 2020
NeuSpell: A Neural Spelling Correction Toolkit
Sai Muralidhar Jayanthi
Danish Pruthi
Graham Neubig
KELM
LRM
82
67
0
21 Oct 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
904
42,520
0
28 May 2020
MPNet: Masked and Permuted Pre-training for Language Understanding
Kaitao Song
Xu Tan
Tao Qin
Jianfeng Lu
Tie-Yan Liu
111
1,138
0
20 Apr 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
186
1,284
0
25 Feb 2020
Scalable and Generalizable Social Bot Detection through Data Selection
Kai-Cheng Yang
Onur Varol
Pik-Mai Hui
Filippo Menczer
68
327
0
20 Nov 2019
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
119
658
0
01 Nov 2019
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMat
VLM
266
10,880
0
29 Oct 2019
BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning
Andreas Kirsch
Joost R. van Amersfoort
Y. Gal
FedML
89
629
0
19 Jun 2019
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Myle Ott
Sergey Edunov
Alexei Baevski
Angela Fan
Sam Gross
Nathan Ng
David Grangier
Michael Auli
VLM
FaML
132
3,156
0
01 Apr 2019
Impact of Data Pruning on Machine Learning Algorithm Performance
Arun Thundyill Saseendran
Lovish Setia
V. Chhabria
D. Chakraborty
A. Roy
30
6
0
11 Jan 2019
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
808
132,725
0
12 Jun 2017
Spelling Correction as a Foreign Language
Yingbo Zhou
U. Porwal
Roberto Konow
45
21
0
21 May 2017
Deep Bayesian Active Learning with Image Data
Y. Gal
Riashat Islam
Zoubin Ghahramani
BDL
UQCV
75
1,739
0
08 Mar 2017
Billion-scale similarity search with GPUs
Jeff Johnson
Matthijs Douze
Hervé Jégou
257
3,741
0
28 Feb 2017
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.3K
194,641
0
10 Dec 2015
Neural Machine Translation of Rare Words with Subword Units
Rico Sennrich
Barry Haddow
Alexandra Birch
238
7,765
0
31 Aug 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.1K
150,433
0
22 Dec 2014