ResearchTrend.AI
LAION-5B: An open large-scale dataset for training next generation image-text models

16 October 2022
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
Mehdi Cherti
Theo Coombes
Aarush Katta
Clayton Mullis
Mitchell Wortsman
P. Schramowski
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
    VLM
    MLLM
    CLIP

Papers citing "LAION-5B: An open large-scale dataset for training next generation image-text models"

50 / 656 papers shown
Avoid Wasted Annotation Costs in Open-set Active Learning with Pre-trained Vision-Language Model
Jaehyuk Heo
Pilsung Kang
VLM
28
1
0
09 Aug 2024
ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative Generation
Jack Lu
Ryan Teehan
Mengye Ren
DiffM
29
3
0
05 Aug 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Ping Luo
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
76
48
0
05 Aug 2024
ScalingGaussian: Enhancing 3D Content Creation with Generative Gaussian Splatting
Shen Chen
Jiale Zhou
Zhongyu Jiang
Tianfang Zhang
Zongkai Wu
Lei Li
3DGS
46
3
0
26 Jul 2024
Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints
Lei Guo
Wei Chen
Yuxuan Sun
Bo Ai
Nikolaos Pappas
T. Quek
DiffM
42
5
0
26 Jul 2024
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
Vikash Sehwag
Xianghao Kong
Jingtao Li
Michael Spranger
Lingjuan Lyu
DiffM
47
9
0
22 Jul 2024
HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions
Haiyang Zhou
Xinhua Cheng
Wangbo Yu
Yonghong Tian
Li-ming Yuan
3DGS
DiffM
67
10
0
21 Jul 2024
Assessing Sample Quality via the Latent Space of Generative Models
Jingyi Xu
Hieu M. Le
Dimitris Samaras
MedIm
42
2
0
21 Jul 2024
Audio-visual training for improved grounding in video-text LLMs
Shivprasad Sagare
Hemachandran S
Kinshuk Sarabhai
Prashant Ullegaddi
SA Rajeshkumar
30
0
0
21 Jul 2024
Adapt2Reward: Adapting Video-Language Models to Generalizable Robotic Rewards via Failure Prompts
Yanting Yang
Minghao Chen
Qibo Qiu
Jiahao Wu
Wenxiao Wang
Binbin Lin
Ziyu Guan
Xiaofei He
LM&Ro
45
2
0
20 Jul 2024
Not All Noises Are Created Equally: Diffusion Noise Selection and Optimization
Zipeng Qi
Lichen Bai
Haoyi Xiong
Zeke Xie
DiffM
39
18
0
19 Jul 2024
Learning Visual Grounding from Generative Vision and Language Model
Shijie Wang
Dahun Kim
A. Taalimi
Chen Sun
Weicheng Kuo
ObjD
36
5
0
18 Jul 2024
DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion
Huiguo He
Huan Yang
Zixi Tuo
Yuan Zhou
Qiuyue Wang
Yuhang Zhang
Zeyu Liu
Wenhao Huang
Hongyang Chao
Jian Yin
DiffM
VGen
62
12
0
17 Jul 2024
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang
Teng Wang
Haigang Zhang
Ping Lu
Feng Zheng
MLLM
LRM
VLM
34
3
0
16 Jul 2024
Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions
Yu-Guan Hsieh
Cheng-Yu Hsieh
Shih-Ying Yeh
Louis Béthune
Hadi Pour Ansari
Pavan Kumar Anasosalu Vasu
Chun-Liang Li
Ranjay Krishna
Oncel Tuzel
Marco Cuturi
66
4
0
09 Jul 2024
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation
Ethan Chern
Jiadi Su
Yan Ma
Pengfei Liu
MLLM
29
29
0
08 Jul 2024
StyleShot: A Snapshot on Any Style
Junyao Gao
Yanchen Liu
Yanan Sun
Yinhao Tang
Yanhong Zeng
Kai Chen
Cairong Zhao
TTA
3DH
VLM
82
15
0
01 Jul 2024
GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing
Yisong Xiao
Aishan Liu
QianJia Cheng
Zhenfei Yin
Siyuan Liang
Jiapeng Li
Jing Shao
Xianglong Liu
Dacheng Tao
51
4
0
30 Jun 2024
SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix
Peng Dai
Feitong Tan
Qiangeng Xu
David Futschik
Ruofei Du
S. Fanello
Xiaojuan Qi
Yinda Zhang
VGen
25
4
0
29 Jun 2024
ScoreFusion: Fusing Score-based Generative Models via Kullback-Leibler Barycenters
Hao Liu
Junze Tony Ye
Jose H. Blanchet
DiffM
FedML
36
1
0
28 Jun 2024
Curriculum Learning with Quality-Driven Data Selection
Biao Wu
Fang Meng
Ling-Hao Chen
36
2
0
27 Jun 2024
MATE: Meet At The Embedding -- Connecting Images with Long Texts
Young Kyun Jang
Junmo Kang
Yong Jae Lee
Donghyun Kim
VLM
44
5
0
26 Jun 2024
MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning
Xiangyu Zhao
Xiangtai Li
Haodong Duan
Haian Huang
Yining Li
Kai Chen
Hua Yang
VLM
MLLM
45
10
0
25 Jun 2024
Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text
Xinyang Li
Zhangyu Lai
Linning Xu
Yansong Qu
Liujuan Cao
Shengchuan Zhang
Bo Dai
Rongrong Ji
VGen
58
8
0
25 Jun 2024
StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal
Chongjie Ye
Lingteng Qiu
Xiaodong Gu
Qi Zuo
Yushuang Wu
Zilong Dong
Liefeng Bo
Yuliang Xiu
Xiaoguang Han
DiffM
43
40
0
24 Jun 2024
Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation
Katherine M. Collins
Najoung Kim
Yonatan Bitton
Verena Rieser
Shayegan Omidshafiei
...
Gang Li
Adrian Weller
Junfeng He
Deepak Ramachandran
Krishnamurthy Dvijotham
EGVM
47
3
0
24 Jun 2024
Blind Baselines Beat Membership Inference Attacks for Foundation Models
Debeshee Das
Jie Zhang
Florian Tramèr
MIALM
85
28
1
23 Jun 2024
A3D: Does Diffusion Dream about 3D Alignment?
Savva Ignatyev
Nina Konovalova
Daniil Selikhanovych
Nikolay Patakin
...
Anton Konushin
Peter Wonka
Alexander Filippov
Evgeny Burnaev
DiffM
68
0
0
21 Jun 2024
Six-CD: Benchmarking Concept Removals for Benign Text-to-image Diffusion Models
Jie Ren
Kangrui Chen
Yingqian Cui
Shenglai Zeng
Hui Liu
Yue Xing
Jiliang Tang
Lingjuan Lyu
53
1
0
21 Jun 2024
Younger: The First Dataset for Artificial Intelligence-Generated Neural Network Architecture
Zhengxin Yang
Wanling Gao
Luzhou Peng
Yunyou Huang
Fei Tang
Jianfeng Zhan
33
0
0
20 Jun 2024
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Luxi He
Yangsibo Huang
Weijia Shi
Tinghao Xie
Haotian Liu
Yue Wang
Luke Zettlemoyer
Chiyuan Zhang
Danqi Chen
Peter Henderson
46
9
0
20 Jun 2024
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
Yongting Zhang
Lu Chen
Guodong Zheng
Yifeng Gao
Rui Zheng
...
Yu Qiao
Xuanjing Huang
Feng Zhao
Tao Gui
Jing Shao
VLM
85
24
0
17 Jun 2024
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
Robert Honig
Javier Rando
Nicholas Carlini
Florian Tramèr
WIGM
AAML
55
16
0
17 Jun 2024
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
Wei Chen
Lin Li
Yongqi Yang
Bin Wen
Fan Yang
Tingting Gao
Yu Wu
Long Chen
VLM
VGen
47
6
0
15 Jun 2024
Neural Pose Representation Learning for Generating and Transferring Non-Rigid Object Poses
Seungwoo Yoo
Juil Koo
Kyeongmin Yeo
Minhyuk Sung
3DH
DRL
32
0
0
14 Jun 2024
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
Holy Lovenia
Rahmad Mahendra
Salsabil Maulana Akbar
Lester James V. Miranda
Jennifer Santoso
...
Genta Indra Winata
Ruochen Zhang
Fajri Koto
Zheng-Xin Yong
Samuel Cahyawijaya
95
9
0
14 Jun 2024
What If We Recaption Billions of Web Images with LLaMA-3?
Xianhang Li
Haoqin Tu
Mude Hui
Zeyu Wang
Bingchen Zhao
...
Jieru Mei
Qing Liu
Huangjie Zheng
Yuyin Zhou
Cihang Xie
VLM
MLLM
44
35
0
12 Jun 2024
An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu
Mark Weber
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
VLM
ViT
60
85
0
11 Jun 2024
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models
Tianle Gu
Zeyang Zhou
Kexin Huang
Dandan Liang
Yixu Wang
...
Keqing Wang
Yujiu Yang
Yan Teng
Yu Qiao
Yingchun Wang
ELM
50
13
0
11 Jun 2024
Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models
Athanasios Tragakis
Marco Aversa
Chaitanya Kaul
Roderick Murray-Smith
Daniele Faccio
57
2
0
11 Jun 2024
Haptic Repurposing with GenAI
Haoyu Wang
44
0
0
11 Jun 2024
OVMR: Open-Vocabulary Recognition with Multi-Modal References
Zehong Ma
Shiliang Zhang
Longhui Wei
Qi Tian
VLM
44
0
0
07 Jun 2024
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Shengqiong Wu
Hao Fei
Xiangtai Li
Jiayi Ji
Hanwang Zhang
Tat-Seng Chua
Shuicheng Yan
MLLM
65
32
0
07 Jun 2024
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs
Lingchen Meng
Jianwei Yang
Rui Tian
Xiyang Dai
Zuxuan Wu
Jianfeng Gao
Yu-Gang Jiang
VLM
30
9
0
06 Jun 2024
Bayesian Power Steering: An Effective Approach for Domain Adaptation of Diffusion Models
Ding Huang
Ting Li
Jian Huang
DiffM
46
1
0
06 Jun 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
Zeyue Tian
Zhaoyang Liu
Ruibin Yuan
Jiahao Pan
Xiaoqiang Huang
Xu Tan
Qifeng Chen
Y. Guo
VGen
104
16
0
06 Jun 2024
Interpreting the Second-Order Effects of Neurons in CLIP
Yossi Gandelsman
Alexei A. Efros
Jacob Steinhardt
MILM
62
16
0
06 Jun 2024
Balancing Performance and Efficiency in Zero-shot Robotic Navigation
Dmytro Kuzmenko
N. Shvai
LM&Ro
34
0
0
05 Jun 2024
Inv-Adapter: ID Customization Generation via Image Inversion and Lightweight Adapter
Peng-Fei Xing
Ning Wang
Jianbo Ouyang
Zechao Li
DiffM
44
1
0
05 Jun 2024
Tiny models from tiny data: Textual and null-text inversion for few-shot distillation
Erik Landolsi
Fredrik Kahl
DiffM
58
1
0
05 Jun 2024