ResearchTrend.AI

LAION-5B: An open large-scale dataset for training next generation image-text models

16 October 2022
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
Mehdi Cherti
Theo Coombes
Aarush Katta
Clayton Mullis
Mitchell Wortsman
P. Schramowski
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
    VLM
    MLLM
    CLIP

Papers citing "LAION-5B: An open large-scale dataset for training next generation image-text models"

50 / 673 papers shown
MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation
Marco Bellagente
Manuel Brack
H. Teufel
Felix Friedrich
Bjorn Deiseroth
...
Koen Oostermeijer
Andres Felipe Cruz Salinas
P. Schramowski
Kristian Kersting
Samuel Weinbach
45
16
0
24 May 2023
In-Context Impersonation Reveals Large Language Models' Strengths and Biases
Leonard Salewski
Stephan Alaniz
Isabel Rio-Torto
Eric Schulz
Zeynep Akata
44
151
0
24 May 2023
Text encoders bottleneck compositionality in contrastive vision-language models
Amita Kamath
Jack Hessel
Kai-Wei Chang
CoGe
CLIP
VLM
30
19
0
24 May 2023
Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence
Grace Luo
Lisa Dunlap
Dong Huk Park
Aleksander Holynski
Trevor Darrell
42
119
0
23 May 2023
CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model
Shuai Zhao
Xiaohan Wang
Linchao Zhu
Yezhou Yang
CLIP
VLM
28
25
0
23 May 2023
Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
Y. Qu
Xinyue Shen
Xinlei He
Michael Backes
Savvas Zannettou
Yang Zhang
21
106
0
23 May 2023
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data
Ziyi Yang
Mahmoud Khademi
Yichong Xu
Reid Pryzant
Yuwei Fang
...
Yu Shi
Lu Yuan
Takuya Yoshioka
Michael Zeng
Xuedong Huang
17
2
0
21 May 2023
Data Redaction from Conditional Generative Models
Zhifeng Kong
Kamalika Chaudhuri
KELM
21
7
0
18 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
48
115
0
18 May 2023
TextDiffuser: Diffusion Models as Text Painters
Jingye Chen
Yupan Huang
Tengchao Lv
Lei Cui
Qifeng Chen
Furu Wei
52
113
0
18 May 2023
OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding
Minghua Liu
Ruoxi Shi
Kaiming Kuang
Yinhao Zhu
Xuanlin Li
Shizhong Han
H. Cai
Fatih Porikli
Hao Su
3DPC
39
116
0
18 May 2023
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Sang Michael Xie
Hieu H. Pham
Xuanyi Dong
Nan Du
Hanxiao Liu
Yifeng Lu
Percy Liang
Quoc V. Le
Tengyu Ma
Adams Wei Yu
MoMe
MoE
56
178
0
17 May 2023
Consensus and Subjectivity of Skin Tone Annotation for ML Fairness
Candice Schumann
Gbolahan O. Olanubi
Auriel Wright
Ellis P. Monk
Courtney Heldreth
Susanna Ricco
30
21
0
16 May 2023
Common Diffusion Noise Schedules and Sample Steps are Flawed
Shanchuan Lin
Bingchen Liu
Jiashi Li
Xiao Yang
DiffM
17
202
0
15 May 2023
Self-Chained Image-Language Model for Video Localization and Question Answering
Shoubin Yu
Jaemin Cho
Prateek Yadav
Mohit Bansal
56
130
0
11 May 2023
Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning
Shengfang Zhai
Yinpeng Dong
Qingni Shen
Shih-Chieh Pu
Yuejian Fang
Hang Su
35
71
0
07 May 2023
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
137
3,055
0
14 Apr 2023
Memory Efficient Diffusion Probabilistic Models via Patch-based Generation
Shinei Arakawa
Hideki Tsunashima
Daichi Horita
Keitaro Tanaka
Shigeo Morishima
DiffM
16
3
0
14 Apr 2023
On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence
Gengchen Mai
Weiming Huang
Jin Sun
Suhang Song
Deepak Mishra
...
Yingjie Hu
Chris Cundy
Ziyuan Li
Rui Zhu
Ni Lao
AI4CE
35
123
0
13 Apr 2023
Expressive Text-to-Image Generation with Rich Text
Songwei Ge
Taesung Park
Jun-Yan Zhu
Jia-Bin Huang
DiffM
79
78
0
13 Apr 2023
Control3Diff: Learning Controllable 3D Diffusion Models from Single-view Images
Jiatao Gu
Qingzhe Gao
Shuangfei Zhai
Baoquan Chen
Lingjie Liu
J. Susskind
41
29
0
13 Apr 2023
HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
Eslam Mohamed Bakr
Pengzhan Sun
Xiaoqian Shen
Faizan Farooq Khan
Li Erran Li
Mohamed Elhoseiny
VLM
24
76
0
11 Apr 2023
Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models
Nikita Starodubcev
Dmitry Baranchuk
Valentin Khrulkov
Artem Babenko
DiffM
49
4
0
10 Apr 2023
Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions
Jun Chen
Deyao Zhu
Kilichbek Haydarov
Xiang Li
Mohamed Elhoseiny
36
37
0
09 Apr 2023
Exploring Vision-Language Models for Imbalanced Learning
Yidong Wang
Zhuohao Yu
Jindong Wang
Qiang Heng
Haoxing Chen
Wei Ye
Rui Xie
Xingxu Xie
Shi-Bo Zhang
VLM
46
30
0
04 Apr 2023
Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images
Roberto Amoroso
Davide Morelli
Marcella Cornia
Lorenzo Baraldi
A. Bimbo
Rita Cucchiara
DiffM
39
29
0
02 Apr 2023
DIME-FM: DIstilling Multimodal and Efficient Foundation Models
Ximeng Sun
Pengchuan Zhang
Peizhao Zhang
Hardik Shah
Kate Saenko
Xide Xia
VLM
25
20
0
31 Mar 2023
Trade-offs in Fine-tuned Diffusion Models Between Accuracy and Interpretability
Mischa Dombrowski
Hadrien Reynaud
Johanna P. Müller
Matthew Baugh
Bernhard Kainz
MedIm
24
6
0
31 Mar 2023
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
Eric Zhang
Kai Wang
Xingqian Xu
Zhangyang Wang
Humphrey Shi
DiffM
46
175
0
30 Mar 2023
Your Diffusion Model is Secretly a Zero-Shot Classifier
Alexander C. Li
Mihir Prabhudesai
Shivam Duggal
Ellis L Brown
Deepak Pathak
DiffM
VLM
55
226
0
28 Mar 2023
Anti-DreamBooth: Protecting users from personalized text-to-image synthesis
T. Le
Hao Phung
Thuan Hoang Nguyen
Quan Dao
Ngoc N. Tran
Anh Tran
28
92
0
27 Mar 2023
EVA-CLIP: Improved Training Techniques for CLIP at Scale
Quan-Sen Sun
Yuxin Fang
Ledell Yu Wu
Xinlong Wang
Yue Cao
CLIP
VLM
81
470
0
27 Mar 2023
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
36
960
0
27 Mar 2023
Freestyle Layout-to-Image Synthesis
Han Xue
Z. Huang
Qianru Sun
Li-Na Song
Wenjun Zhang
DiffM
17
62
0
25 Mar 2023
ReVersion: Diffusion-Based Relation Inversion from Images
Ziqi Huang
Tianxing Wu
Yuming Jiang
Kelvin C. K. Chan
Ziwei Liu
51
67
0
23 Mar 2023
Medical diffusion on a budget: textual inversion for medical image generation
B. D. Wilde
A. Saha
R. T. Broek
Henkjan Huisman
DiffM
MedIm
42
15
0
23 Mar 2023
MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models
Jing Zhao
Heliang Zheng
Chaoyue Wang
L. Lan
Wenjing Yang
VLM
45
17
0
23 Mar 2023
Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models
Lukas Höllein
Ang Cao
Andrew Owens
Justin Johnson
Matthias Nießner
DiffM
38
177
0
21 Mar 2023
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models
Weijia Wu
Yuzhong Zhao
Mike Zheng Shou
Hong Zhou
Chunhua Shen
50
140
0
21 Mar 2023
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
Can Qin
Ning Yu
Chen Xing
Shu Zhen Zhang
Zeyuan Chen
Stefano Ermon
Yun Fu
Caiming Xiong
Ran Xu
DiffM
42
20
0
17 Mar 2023
Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models
D. Kothandaraman
Tianyi Zhou
Ming Lin
Dinesh Manocha
26
5
0
15 Mar 2023
Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
Junyoung Seo
Wooseok Jang
Minseop Kwak
Ines Hyeonsu Kim
Jaehoon Ko
Junho Kim
Jin-Hwa Kim
Jiyoung Lee
Seung Wook Kim
DiffM
46
135
0
14 Mar 2023
Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases
Aengus Lynch
G. Dovonon
Jean Kaddour
Ricardo M. A. Silva
205
30
0
09 Mar 2023
Prismer: A Vision-Language Model with Multi-Task Experts
Shikun Liu
Linxi Fan
Edward Johns
Zhiding Yu
Chaowei Xiao
Anima Anandkumar
VLM
MLLM
49
21
0
04 Mar 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD
VLM
MDE
163
217
0
03 Mar 2023
Language-Driven Representation Learning for Robotics
Siddharth Karamcheti
Suraj Nair
Annie S. Chen
Thomas Kollar
Chelsea Finn
Dorsa Sadigh
Percy Liang
LM&Ro
SSL
47
145
0
24 Feb 2023
Modulating Pretrained Diffusion Models for Multimodal Image Synthesis
Cusuh Ham
James Hays
Jingwan Lu
Krishna Kumar Singh
Zhifei Zhang
Tobias Hinz
DiffM
21
24
0
24 Feb 2023
Poisoning Web-Scale Training Datasets is Practical
Nicholas Carlini
Matthew Jagielski
Christopher A. Choquette-Choo
Daniel Paleka
Will Pearce
Hyrum S. Anderson
Andreas Terzis
Kurt Thomas
Florian Tramèr
SILM
31
182
0
20 Feb 2023
Composer: Creative and Controllable Image Synthesis with Composable Conditions
Lianghua Huang
Di Chen
Yu Liu
Yujun Shen
Deli Zhao
Jingren Zhou
DiffM
22
279
0
20 Feb 2023
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
Omer Bar-Tal
Lior Yariv
Y. Lipman
Tali Dekel
45
365
1
16 Feb 2023