ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2405.15613
  4. Cited By
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based
  Approach

Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach

24 May 2024
Huy V. Vo
Vasil Khalidov
Timothée Darcet
Théo Moutakanni
Nikita Smetanin
Marc Szafraniec
Hugo Touvron
Camille Couprie
Maxime Oquab
Armand Joulin
Hervé Jégou
Patrick Labatut
Piotr Bojanowski
    SSL
ArXivPDFHTML

Papers citing "Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach"

50 / 56 papers shown
Title
Demystifying CLIP Data
Demystifying CLIP Data
Hu Xu
Saining Xie
Xiaoqing Ellen Tan
Po-Yao (Bernie) Huang
Russell Howes
Vasu Sharma
Shang-Wen Li
Gargi Ghosh
Luke Zettlemoyer
Christoph Feichtenhofer
VLM
CLIP
90
121
0
31 Dec 2024
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
David Grangier
Simin Fan
Skyler Seto
Pierre Ablin
168
4
0
30 Sep 2024
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
376
2,382
0
09 Nov 2022
Open Long-Tailed Recognition in a Dynamic World
Open Long-Tailed Recognition in a Dynamic World
Ziwei Liu
Zhongqi Miao
Xiaohang Zhan
Jiayun Wang
Boqing Gong
Stella X. Yu
VLM
78
21
0
17 Aug 2022
Active Learning Strategies for Weakly-supervised Object Detection
Active Learning Strategies for Weakly-supervised Object Detection
Huy V. Vo
Oriane Siméoni
Spyros Gidaris
Andrei Bursuc
Patrick Pérez
Jean Ponce
89
19
0
25 Jul 2022
Beyond neural scaling laws: beating power law scaling via data pruning
Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher
Robert Geirhos
Shashank Shekhar
Surya Ganguli
Ari S. Morcos
85
439
0
29 Jun 2022
Masked Siamese Networks for Label-Efficient Learning
Masked Siamese Networks for Label-Efficient Learning
Mahmoud Assran
Mathilde Caron
Ishan Misra
Piotr Bojanowski
Florian Bordes
Pascal Vincent
Armand Joulin
Michael G. Rabbat
Nicolas Ballas
SSL
84
320
0
14 Apr 2022
PaLM: Scaling Language Modeling with Pathways
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
449
6,222
0
05 Apr 2022
Training Compute-Optimal Large Language Models
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
187
1,944
0
29 Mar 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
811
12,893
0
04 Mar 2022
A Self-Supervised Descriptor for Image Copy Detection
A Self-Supervised Descriptor for Image Copy Detection
Ed Pizzi
Sreya . Dutta Roy
Sugosh Nagavara Ravindra
Priya Goyal
Matthijs Douze
SSL
66
125
0
21 Feb 2022
Vision Models Are More Robust And Fair When Pretrained On Uncurated
  Images Without Supervision
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision
Priya Goyal
Quentin Duval
Isaac Seessel
Mathilde Caron
Ishan Misra
Levent Sagun
Armand Joulin
Piotr Bojanowski
VLM
SSL
74
111
0
16 Feb 2022
Fairness Indicators for Systematic Assessments of Visual Feature
  Extractors
Fairness Indicators for Systematic Assessments of Visual Feature Extractors
Priya Goyal
Adriana Romero Soriano
C. Hazirbas
Levent Sagun
Nicolas Usunier
EGVM
50
31
0
15 Feb 2022
PASS: An ImageNet replacement for self-supervised pretraining without
  humans
PASS: An ImageNet replacement for self-supervised pretraining without humans
Yuki M. Asano
Christian Rupprecht
Andrew Zisserman
Andrea Vedaldi
VLM
SSL
77
58
0
27 Sep 2021
Deep Learning on a Data Diet: Finding Important Examples Early in
  Training
Deep Learning on a Data Diet: Finding Important Examples Early in Training
Mansheej Paul
Surya Ganguli
Gintare Karolina Dziugaite
105
457
0
15 Jul 2021
Divide and Contrast: Self-supervised Learning from Uncurated Data
Divide and Contrast: Self-supervised Learning from Uncurated Data
Yonglong Tian
Olivier J. Hénaff
Aaron van den Oord
SSL
110
100
0
17 May 2021
VICReg: Variance-Invariance-Covariance Regularization for
  Self-Supervised Learning
VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning
Adrien Bardes
Jean Ponce
Yann LeCun
SSL
DML
149
933
0
11 May 2021
With a Little Help from My Friends: Nearest-Neighbor Contrastive
  Learning of Visual Representations
With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations
Debidatta Dwibedi
Y. Aytar
Jonathan Tompson
P. Sermanet
Andrew Zisserman
SSL
228
467
0
29 Apr 2021
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
650
6,059
0
29 Apr 2021
Benchmarking Representation Learning for Natural World Image Collections
Benchmarking Representation Learning for Natural World Image Collections
Grant Van Horn
Elijah Cole
Sara Beery
Kimberly Wilber
Serge J. Belongie
Oisin Mac Aodha
SSL
VLM
62
173
0
30 Mar 2021
Active Learning for Deep Object Detection via Probabilistic Modeling
Active Learning for Deep Object Detection via Probabilistic Modeling
Jiwoong Choi
Ismail Elezi
Hyuk-Jae Lee
C. Farabet
J. Álvarez
50
123
0
30 Mar 2021
Vision Transformers for Dense Prediction
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
128
1,729
0
24 Mar 2021
Barlow Twins: Self-Supervised Learning via Redundancy Reduction
Barlow Twins: Self-Supervised Learning via Redundancy Reduction
Jure Zbontar
Li Jing
Ishan Misra
Yann LeCun
Stéphane Deny
SSL
300
2,343
0
04 Mar 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
389
4,937
0
24 Feb 2021
BYOL works even without batch statistics
BYOL works even without batch statistics
Pierre Harvey Richemond
Jean-Bastien Grill
Florent Altché
Corentin Tallec
Florian Strub
...
Samuel L. Smith
Soham De
Razvan Pascanu
Bilal Piot
Michal Valko
SSL
286
115
0
20 Oct 2020
What Neural Networks Memorize and Why: Discovering the Long Tail via
  Influence Estimation
What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation
Vitaly Feldman
Chiyuan Zhang
TDI
140
462
0
09 Aug 2020
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution
  Generalization
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
Dan Hendrycks
Steven Basart
Norman Mu
Saurav Kadavath
Frank Wang
...
Samyak Parajuli
Mike Guo
D. Song
Jacob Steinhardt
Justin Gilmer
OOD
320
1,732
0
29 Jun 2020
Unsupervised Learning of Visual Features by Contrasting Cluster
  Assignments
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
Mathilde Caron
Ishan Misra
Julien Mairal
Priya Goyal
Piotr Bojanowski
Armand Joulin
OCL
SSL
218
4,073
0
17 Jun 2020
Bootstrap your own latent: A new approach to self-supervised Learning
Bootstrap your own latent: A new approach to self-supervised Learning
Jean-Bastien Grill
Florian Strub
Florent Altché
Corentin Tallec
Pierre Harvey Richemond
...
M. G. Azar
Bilal Piot
Koray Kavukcuoglu
Rémi Munos
Michal Valko
SSL
355
6,792
0
13 Jun 2020
A Simple Framework for Contrastive Learning of Visual Representations
A Simple Framework for Contrastive Learning of Visual Representations
Ting-Li Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
350
18,739
0
13 Feb 2020
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
557
4,797
0
23 Jan 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
439
42,393
0
03 Dec 2019
Momentum Contrast for Unsupervised Visual Representation Learning
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
187
12,073
0
13 Nov 2019
Self-labelling via simultaneous clustering and representation learning
Self-labelling via simultaneous clustering and representation learning
Yuki M. Asano
Christian Rupprecht
Andrea Vedaldi
SSL
118
770
0
13 Nov 2019
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
81
654
0
01 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text
  Transformer
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
404
20,114
0
23 Oct 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.2K
12,181
0
27 Aug 2019
Natural Adversarial Examples
Natural Adversarial Examples
Dan Hendrycks
Kevin Zhao
Steven Basart
Jacob Steinhardt
D. Song
OODD
195
1,469
0
16 Jul 2019
Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
Jordan T. Ash
Chicheng Zhang
A. Krishnamurthy
John Langford
Alekh Agarwal
BDL
UQCV
85
772
0
09 Jun 2019
Learning Robust Global Representations by Penalizing Local Predictive
  Power
Learning Robust Global Representations by Penalizing Local Predictive Power
Haohan Wang
Songwei Ge
Eric Xing
Zachary Chase Lipton
OOD
114
957
0
29 May 2019
Billion-scale semi-supervised learning for image classification
Billion-scale semi-supervised learning for image classification
I. Z. Yalniz
Hervé Jégou
Kan Chen
Manohar Paluri
D. Mahajan
SSL
90
463
0
02 May 2019
Large-Scale Long-Tailed Recognition in an Open World
Large-Scale Long-Tailed Recognition in an Open World
Ziwei Liu
Zhongqi Miao
Xiaohang Zhan
Jiayun Wang
Boqing Gong
Stella X. Yu
145
1,158
0
10 Apr 2019
Benchmarking Neural Network Robustness to Common Corruptions and
  Perturbations
Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
Dan Hendrycks
Thomas G. Dietterich
OOD
VLM
167
3,429
0
28 Mar 2019
Do ImageNet Classifiers Generalize to ImageNet?
Do ImageNet Classifiers Generalize to ImageNet?
Benjamin Recht
Rebecca Roelofs
Ludwig Schmidt
Vaishaal Shankar
OOD
SSeg
VLM
111
1,714
0
13 Feb 2019
Diverse mini-batch Active Learning
Diverse mini-batch Active Learning
Fedor Zhdanov
57
155
0
17 Jan 2019
An Empirical Study of Example Forgetting during Deep Neural Network
  Learning
An Empirical Study of Example Forgetting during Deep Neural Network Learning
Mariya Toneva
Alessandro Sordoni
Rémi Tachet des Combes
Adam Trischler
Yoshua Bengio
Geoffrey J. Gordon
107
733
0
12 Dec 2018
Active Learning for Deep Object Detection
Active Learning for Deep Object Detection
C. Brust
Christoph Käding
Joachim Denzler
VLM
ObjD
48
118
0
26 Sep 2018
Unsupervised Feature Learning via Non-Parametric Instance-level
  Discrimination
Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination
Zhirong Wu
Yuanjun Xiong
Stella X. Yu
Dahua Lin
SSL
170
3,452
0
05 May 2018
Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking
Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking
Filip Radenovic
Ahmet Iscen
Giorgos Tolias
Yannis Avrithis
Ondřej Chum
49
379
0
29 Mar 2018
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for
  Reading Comprehension
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
Mandar Joshi
Eunsol Choi
Daniel S. Weld
Luke Zettlemoyer
RALM
197
2,646
0
09 May 2017
12
Next