ResearchTrend.AI
LAION-5B: An open large-scale dataset for training next generation image-text models

16 October 2022
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
Mehdi Cherti
Theo Coombes
Aarush Katta
Clayton Mullis
Mitchell Wortsman
P. Schramowski
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
    VLM
    MLLM
    CLIP
arXiv:2210.08402

Papers citing "LAION-5B: An open large-scale dataset for training next generation image-text models"

50 / 656 papers shown
Exploring Local Memorization in Diffusion Models via Bright Ending Attention
Chong Chen
Daochang Liu
M. Shah
Chang Xu
65
3
0
29 Oct 2024
Investigating Memorization in Video Diffusion Models
Chong Chen
Enhuai Liu
Daochang Liu
M. Shah
Chang Xu
VGen
DiffM
83
1
0
29 Oct 2024
Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad?
Antonia Wüst
Tim Nelson Tobiasch
Lukas Helff
Inga Ibs
Wolfgang Stammer
Devendra Singh Dhami
Constantin Rothkopf
Kristian Kersting
CoGe
ReLM
VLM
LRM
71
1
0
25 Oct 2024
Fast constrained sampling in pre-trained diffusion models
Alexandros Graikos
Nebojsa Jojic
Dimitris Samaras
DiffM
30
1
0
24 Oct 2024
Probabilistic Language-Image Pre-Training
Sanghyuk Chun
Wonjae Kim
Song Park
Sangdoo Yun
MLLM
VLM
CLIP
161
4
2
24 Oct 2024
Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation
Xiaoyu Zhang
Teng Zhou
Xinlong Zhang
Jia Wei
Yongchuan Tang
44
1
0
24 Oct 2024
TIPS: Text-Image Pretraining with Spatial awareness
Kevis-Kokitsi Maninis
Kaifeng Chen
Soham Ghosh
Arjun Karpur
Koert Chen
...
Jan Dlabal
Dan Gnanapragasam
Mojtaba Seyedhosseini
Howard Zhou
Andre Araujo
VLM
35
3
0
21 Oct 2024
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples
Baiqi Li
Zhiqiu Lin
Wenxuan Peng
Jean de Dieu Nyandwi
Daniel Jiang
Zixian Ma
Simran Khanuja
Ranjay Krishna
Graham Neubig
Deva Ramanan
AAML
CoGe
VLM
71
21
0
18 Oct 2024
Influence Functions for Scalable Data Attribution in Diffusion Models
Bruno Mlodozeniec
Runa Eschenhagen
Juhan Bae
Alexander Immer
David Krueger
Richard E. Turner
TDI
DiffM
75
4
0
17 Oct 2024
Sensitivity of Generative VLMs to Semantically and Lexically Altered Prompts
Sri Harsha Dumpala
Aman Jaiswal
Chandramouli Shama Sastry
E. Milios
Sageev Oore
Hassan Sajjad
VLM
26
2
0
16 Oct 2024
Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling
Guiyu Zhang
Huan-ang Gao
Zijian Jiang
Hao Zhao
Zhedong Zheng
EGVM
52
6
0
15 Oct 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Luping Liu
Chao Du
Tianyu Pang
Zehan Wang
Chongxuan Li
Dong Xu
VLM
53
4
0
15 Oct 2024
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Zou
Tatsunori Hashimoto
VLM
72
4
0
14 Oct 2024
Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation
Kun Ding
Qiang Yu
Haojian Zhang
Gaofeng Meng
Shiming Xiang
VLM
32
0
0
11 Oct 2024
Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism via Dual Diffusion Models and GPT Prompting
Purushothaman Natarajan
Kamal Basha
Athira Nambiar
DiffM
32
0
0
11 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Hefei Ling
Zhen Dong
Lei Zhu
69
14
0
10 Oct 2024
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Gen Luo
Xue Yang
Wenhan Dou
Zhaokai Wang
Jifeng Dai
Yu Qiao
Xizhou Zhu
VLM
MLLM
67
25
0
10 Oct 2024
T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design
Jiachen Li
Qian Long
Jian Zheng
Xiaofeng Gao
Robinson Piramuthu
Wenhu Chen
William Yang Wang
VGen
33
22
0
08 Oct 2024
Pyramidal Flow Matching for Efficient Video Generative Modeling
Yang Jin
Zhicheng Sun
Ningyuan Li
Kun Xu
K. Xu
...
Nan Zhuang
Quzhe Huang
Yang Song
Yadong Mu
Zhouchen Lin
VGen
66
66
0
08 Oct 2024
Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning
Saemi Moon
M. Lee
Sangdon Park
Dongwoo Kim
44
1
0
08 Oct 2024
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models
Muhammad Jehanzeb Mirza
Mengjie Zhao
Zhuoyuan Mao
Sivan Doveh
Wei Lin
...
Yuki Mitsufuji
Horst Possegger
Rogerio Feris
Leonid Karlinsky
James Glass
VLM
84
1
0
08 Oct 2024
Image Watermarks are Removable Using Controllable Regeneration from Clean Noise
Yepeng Liu
Yiren Song
Hai Ci
Yu Zhang
Haofan Wang
Mike Zheng Shou
Yuheng Bu
WIGM
58
3
0
07 Oct 2024
VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models
Harshit
Tolga Tasdizen
CoGe
VLM
28
1
0
06 Oct 2024
Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization
Zichen Miao
Zhengyuan Yang
Kevin Lin
Ze Wang
Zicheng Liu
Lijuan Wang
Qiang Qiu
48
3
0
04 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
56
23
0
03 Oct 2024
ControlAR: Controllable Image Generation with Autoregressive Models
Zongming Li
Tianheng Cheng
Shoufa Chen
Peize Sun
Haocheng Shen
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
DiffM
136
15
0
03 Oct 2024
From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities
Wanpeng Zhang
Zilong Xie
Yicheng Feng
Yijiang Li
Xingrun Xing
Sipeng Zheng
Zongqing Lu
MLLM
30
0
0
03 Oct 2024
Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner
Chenyou Fan
Chenjia Bai
Zhao Shan
Haoran He
Yang Zhang
Zhen Wang
33
3
0
30 Sep 2024
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Jing He
Haodong Li
Wei Yin
Yixun Liang
Leheng Li
Kaiqiang Zhou
Hongbo Zhang
Bingbing Liu
Ying-Cong Chen
DiffM
VLM
49
40
0
26 Sep 2024
Stable Video Portraits
Mirela Ostrek
Justus Thies
VGen
DiffM
33
1
0
26 Sep 2024
PACE: Marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization
Yao Ni
Shan Zhang
Piotr Koniusz
175
2
0
25 Sep 2024
VLEU: a Method for Automatic Evaluation for Generalizability of Text-to-Image Models
Jingtao Cao
Zheng Zhang
Hongru Wang
Kam-Fai Wong
39
0
0
23 Sep 2024
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin
Xinyu Wei
Renrui Zhang
Le Zhuo
Shitian Zhao
...
Junlin Xie
Yu Qiao
Peng Gao
Hongsheng Li
MLLM
DiffM
63
10
0
23 Sep 2024
Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval
A. Mahbod
Nematollah Saeidi
Sepideh Hatamikia
Ramona Woitek
VLM
MedIm
31
2
0
14 Sep 2024
Risks When Sharing LoRA Fine-Tuned Diffusion Model Weights
Dixi Yao
20
1
0
13 Sep 2024
ComAlign: Compositional Alignment in Vision-Language Models
Ali Abdollah
Amirmohammad Izadi
Armin Saghafian
Reza Vahidimajd
Mohammad Mozafari
Amirreza Mirzaei
Mohammadmahdi Samiei
M. Baghshah
CoGe
VLM
30
0
0
12 Sep 2024
FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process
Yang Luo
Y. Zhang
Zhaofan Qiu
Ting Yao
Zhineng Chen
Yu-Gang Jiang
Tao Mei
DiffM
43
4
0
11 Sep 2024
Alignment of Diffusion Models: Fundamentals, Challenges, and Future
Buhua Liu
Shitong Shao
Bao Li
Lichen Bai
Zhiqiang Xu
Haoyi Xiong
James Kwok
Sumi Helal
Zeke Xie
45
12
0
11 Sep 2024
DiffQRCoder: Diffusion-based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement
Jia-Wei Liao
Winston Wang
Tzu-Sian Wang
Li-Xuan Peng
Ju-Hsuan Weng
Cheng-Fu Chou
Jun-Cheng Chen
DiffM
51
1
0
10 Sep 2024
Enhanced Generative Data Augmentation for Semantic Segmentation via Stronger Guidance
Quang-Huy Che
Duc-Tri Le
Vinh-Tiep Nguyen
D. Lam
DiffM
44
1
0
09 Sep 2024
A Novel Dataset for Video-Based Autism Classification Leveraging Extra-Stimulatory Behavior
Manuel Serna-Aguilera
Xuan-Bac Nguyen
Han-Seok Seo
Khoa Luu
49
1
0
06 Sep 2024
iSeg: An Iterative Refinement-based Framework for Training-free Segmentation
Lin Sun
Jiale Cao
J. Xie
Fahad Shahbaz Khan
Yanwei Pang
DiffM
43
1
0
05 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-xiong Wang
78
15
0
05 Sep 2024
No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning
Manu Gaur
Darshan Singh
Makarand Tapaswi
142
1
0
04 Sep 2024
Optimizing CLIP Models for Image Retrieval with Maintained Joint-Embedding Alignment
Konstantin Schall
Kai Uwe Barthel
Nico Hezel
Klaus Jung
VLM
36
3
0
03 Sep 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni
Yulin Wang
Renping Zhou
Rui Lu
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Yuan Yao
Gao Huang
37
7
0
31 Aug 2024
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Min Shi
Fuxiao Liu
Shihao Wang
Shijia Liao
Subhashree Radhakrishnan
...
Andrew Tao
Zhiding Yu
Guilin Liu
MLLM
33
53
0
28 Aug 2024
The Benefits of Balance: From Information Projections to Variance Reduction
Lang Liu
Ronak R. Mehta
Soumik Pal
Zaïd Harchaoui
33
0
0
27 Aug 2024
Perception-guided Jailbreak against Text-to-Image Models
Yihao Huang
Le Liang
Tianlin Li
Xiaojun Jia
Run Wang
Weikai Miao
G. Pu
Yang Liu
41
7
0
20 Aug 2024
Understanding Generative AI Content with Embedding Models
Max Vargas
Reilly Cannon
A. Engel
Anand D. Sarwate
Tony Chiang
54
3
0
19 Aug 2024