ResearchTrend.AI
LAION-5B: An open large-scale dataset for training next generation image-text models

16 October 2022
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
Mehdi Cherti
Theo Coombes
Aarush Katta
Clayton Mullis
Mitchell Wortsman
P. Schramowski
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
    VLM
    MLLM
    CLIP
arXiv: 2210.08402

Papers citing "LAION-5B: An open large-scale dataset for training next generation image-text models"

50 / 664 papers shown
Title
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs
Lingchen Meng
Jianwei Yang
Rui Tian
Xiyang Dai
Zuxuan Wu
Jianfeng Gao
Yu-Gang Jiang
VLM
30
9
0
06 Jun 2024
Bayesian Power Steering: An Effective Approach for Domain Adaptation of Diffusion Models
Ding Huang
Ting Li
Jian Huang
DiffM
46
1
0
06 Jun 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
Zeyue Tian
Zhaoyang Liu
Ruibin Yuan
Jiahao Pan
Xiaoqiang Huang
Xu Tan
Qifeng Chen
Y. Guo
VGen
104
16
0
06 Jun 2024
Interpreting the Second-Order Effects of Neurons in CLIP
Yossi Gandelsman
Alexei A. Efros
Jacob Steinhardt
MILM
62
16
0
06 Jun 2024
Balancing Performance and Efficiency in Zero-shot Robotic Navigation
Dmytro Kuzmenko
N. Shvai
LM&Ro
34
0
0
05 Jun 2024
Inv-Adapter: ID Customization Generation via Image Inversion and Lightweight Adapter
Peng-Fei Xing
Ning Wang
Jianbo Ouyang
Zechao Li
DiffM
44
1
0
05 Jun 2024
Tiny models from tiny data: Textual and null-text inversion for few-shot distillation
Erik Landolsi
Fredrik Kahl
DiffM
58
1
0
05 Jun 2024
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
Xingrui Wang
Wufei Ma
Angtian Wang
Shuo Chen
Adam Kortylewski
Alan Yuille
34
3
0
02 Jun 2024
Don't drop your samples! Coherence-aware training benefits Conditional diffusion
Nicolas Dufour
Victor Besnier
Vicky Kalogeiton
David Picard
DiffM
61
2
0
30 May 2024
CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
Yiping Wang
Yifang Chen
Wendan Yan
Alex Fang
Wenjing Zhou
Kevin G. Jamieson
S. Du
36
7
0
29 May 2024
X-VILA: Cross-Modality Alignment for Large Language Model
Hanrong Ye
De-An Huang
Yao Lu
Zhiding Yu
Ming-Yu Liu
...
Jan Kautz
Song Han
Dan Xu
Pavlo Molchanov
Hongxu Yin
MLLM
VLM
45
30
0
29 May 2024
Multi-Modal Generative Embedding Model
Feipeng Ma
Hongwei Xue
Guangting Wang
Yizhou Zhou
Fengyun Rao
Shilin Yan
Yueyi Zhang
Siying Wu
Mike Zheng Shou
Xiaoyan Sun
VLM
39
3
0
29 May 2024
Does Diffusion Beat GAN in Image Super Resolution?
Denis Kuznedelev
Valerii Startsev
Daniil Shlenskii
Sergey Kastryulin
44
4
0
27 May 2024
Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling
Cristian Rodriguez-Opazo
Ehsan Abbasnejad
Damien Teney
Edison Marrese-Taylor
Hamed Damirchi
Anton Van Den Hengel
VLM
43
1
0
27 May 2024
ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling
F. Babiloni
Alexandros Lattas
Jiankang Deng
S. Zafeiriou
DiffM
35
4
0
26 May 2024
Pruning for Robust Concept Erasing in Diffusion Models
Tianyun Yang
Juan Cao
Chang Xu
35
13
0
26 May 2024
Composed Image Retrieval for Remote Sensing
Bill Psomas
Ioannis Kakogeorgiou
Nikos Efthymiadis
Giorgos Tolias
Ondřej Chum
Yannis Avrithis
Konstantinos Karantzalos
48
5
0
24 May 2024
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Byung-Kwan Lee
Chae Won Kim
Beomchan Park
Yonghyun Ro
MLLM
LRM
41
18
0
24 May 2024
FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing
Kai Huang
Wei Gao
42
2
0
24 May 2024
Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient
Yongliang Wu
Shiji Zhou
Mingzhuo Yang
Lianzhe Wang
Wenbo Zhu
Heng Chang
Xiao Zhou
Xu Yang
61
19
0
24 May 2024
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception
Run Luo
Yunshui Li
Longze Chen
Wanwei He
Ting-En Lin
...
Zikai Song
Xiaobo Xia
Tongliang Liu
Min Yang
Binyuan Hui
VLM
DiffM
75
15
0
24 May 2024
A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis
Yue Yang
Mona Gandhi
Yufei Wang
Yifan Wu
Michael S. Yao
Christopher Callison-Burch
James C. Gee
Mark Yatskar
58
3
0
23 May 2024
Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models
Jiaqi Li
Qianshan Wei
Chuanyi Zhang
Guilin Qi
Miaozeng Du
Yongrui Chen
Sheng Bi
Fan Liu
VLM
MU
81
12
0
21 May 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
62
261
0
16 May 2024
PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator
Hanshu Yan
Xingchao Liu
Jiachun Pan
Jun Hao Liew
Qiang Liu
Jiashi Feng
42
41
0
13 May 2024
Fractals as Pre-training Datasets for Anomaly Detection and Localization
C. Ugwu
S. Casarin
Oswald Lanz
32
0
0
11 May 2024
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
44
2
0
11 May 2024
Distilling Diffusion Models into Conditional GANs
Minguk Kang
Richard Zhang
Connelly Barnes
Sylvain Paris
Suha Kwak
Jaesik Park
Eli Shechtman
Jun-Yan Zhu
Taesung Park
46
37
0
09 May 2024
A Survey on Personalized Content Synthesis with Diffusion Models
Xu-Lu Zhang
Xiao Wei
Wengyu Zhang
Jinlin Wu
Zhaoxiang Zhang
Zhen Lei
Qing Li
EGVM
143
19
0
09 May 2024
MVIP-NeRF: Multi-view 3D Inpainting on NeRF Scenes via Diffusion Prior
Honghua Chen
Chen Change Loy
Xingang Pan
39
13
0
05 May 2024
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
Yifei Ming
Yixuan Li
VLM
41
7
0
02 May 2024
DOCCI: Descriptions of Connected and Contrasting Images
Yasumasa Onoe
Sunayana Rane
Zachary Berger
Yonatan Bitton
Jaemin Cho
...
Zarana Parekh
Jordi Pont-Tuset
Garrett Tanzer
Su Wang
Jason Baldridge
41
48
0
30 Apr 2024
TwinDiffusion: Enhancing Coherence and Efficiency in Panoramic Image Generation with Diffusion Models
Teng Zhou
Yongchuan Tang
DiffM
48
2
0
30 Apr 2024
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
An Yan
Zhengyuan Yang
Junda Wu
Wanrong Zhu
Jianwei Yang
...
K. Lin
Jianfeng Wang
Julian McAuley
Jianfeng Gao
Lijuan Wang
LRM
34
12
0
25 Apr 2024
CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models
Qinghe Wang
Baolu Li
Xiaomin Li
Bing Cao
Liqian Ma
Huchuan Lu
Xu Jia
DiffM
42
6
0
24 Apr 2024
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Sachin Mehta
Maxwell Horton
Fartash Faghri
Mohammad Hossein Sekhavat
Mahyar Najibi
Mehrdad Farajtabar
Oncel Tuzel
Mohammad Rastegari
VLM
CLIP
44
6
0
24 Apr 2024
SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models
Bo Lin
Yingjing Xu
Xuanwen Bao
Zhou Zhao
Zuyong Zhang
Zhouyang Wang
61
2
0
23 Apr 2024
MultiBooth: Towards Generating All Your Concepts in an Image from Text
Chenyang Zhu
Kai Li
Yue Ma
Chunming He
Li Xiu
DiffM
109
23
0
22 Apr 2024
Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images
Ali Naseh
Katherine Thai
Mohit Iyyer
Amir Houmansadr
47
6
0
21 Apr 2024
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
Yuxi Ren
Xin Xia
Yanzuo Lu
Jiacheng Zhang
Jie Wu
Pan Xie
Xing Wang
Xuefeng Xiao
45
65
0
21 Apr 2024
Pixel is a Barrier: Diffusion Models Are More Adversarially Robust Than We Think
Haotian Xue
Yongxin Chen
DiffM
AAML
43
3
0
20 Apr 2024
Adaptive Memory Replay for Continual Learning
James Seale Smith
Lazar Valkov
Shaunak Halbe
V. Gutta
Rogerio Feris
Z. Kira
Leonid Karlinsky
44
6
0
18 Apr 2024
MMInA: Benchmarking Multihop Multimodal Internet Agents
Ziniu Zhang
Shulin Tian
Liangyu Chen
Ziwei Liu
LLMAG
LM&Ro
35
13
0
15 Apr 2024
MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models
Nithin Gopalakrishnan Nair
Jeya Maria Jose Valanarasu
Vishal M. Patel
MoMe
33
7
0
15 Apr 2024
Knowledge-enhanced Visual-Language Pretraining for Computational Pathology
Xiao Zhou
Xiaoman Zhang
Chaoyi Wu
Ya Zhang
Weidi Xie
Yanfeng Wang
VLM
35
7
0
15 Apr 2024
Taming Stable Diffusion for Text to 360° Panorama Image Generation
Cheng Zhang
Qianyi Wu
Camilo Cruz Gambardella
Xiaoshui Huang
Dinh Q. Phung
Wanli Ouyang
Jianfei Cai
MDE
21
8
0
11 Apr 2024
Implicit and Explicit Language Guidance for Diffusion-based Visual Perception
Hefeng Wang
Jiale Cao
Jin Xie
Aiping Yang
Yanwei Pang
VLM
DiffM
50
2
0
11 Apr 2024
DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation
Junkai Yan
Yipeng Gao
Q. Yang
Xihan Wei
Xuansong Xie
Ancong Wu
Wei-Shi Zheng
40
1
0
09 Apr 2024
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Shenghai Yuan
Jinfa Huang
Yujun Shi
Yongqi Xu
Ruijie Zhu
Bin Lin
Xinhua Cheng
Li-xin Yuan
Jiebo Luo
VGen
81
33
0
07 Apr 2024
SDFR: Synthetic Data for Face Recognition Competition
Hatef Otroshi-Shahreza
Christophe Ecabert
Anjith George
A. Unnervik
Sébastien Marcel
...
R. Vera-Rodríguez
Gianpaolo Perelli
G. Orrú
G. L. Marcialis
Julian Fierrez
38
19
0
06 Apr 2024