ResearchTrend.AI
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs

3 November 2021
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
    VLM
    MLLM
    CLIP

Papers citing "LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs"

50 / 1,097 papers shown
Title
Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion
Yijun Liang
Shweta Bhardwaj
Dinesh Manocha
45
0
0
17 Oct 2024
3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation
Dewei Zhou
Ji Xie
Zongxin Yang
Yi Yang
DiffM
70
7
0
16 Oct 2024
Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models
Shicheng Xu
Liang Pang
Yunchang Zhu
Huawei Shen
Xueqi Cheng
MLLM
38
1
0
16 Oct 2024
Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling
Guiyu Zhang
Huan-ang Gao
Zijian Jiang
Hao Zhao
Zhedong Zheng
EGVM
52
6
0
15 Oct 2024
Learning to Customize Text-to-Image Diffusion In Diverse Context
Taewook Kim
Wei Chen
Qiang Qiu
DiffM
38
2
0
14 Oct 2024
Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors
Hritam Basak
Hadi Tabatabaee
Shreekant Gayaka
Ming-feng Li
Xin Yang
Cheng-Hao Kuo
Arnie Sen
Min Sun
Zhaozheng Yin
3DGS
33
0
0
12 Oct 2024
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping
Yue Yang
S. Zhang
Wenqi Shao
Kaipeng Zhang
Yi Bin
Yu Wang
Ping Luo
30
3
0
11 Oct 2024
OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling
Linhui Xiao
Xiaoshan Yang
Fang Peng
Yaowei Wang
Changsheng Xu
ObjD
34
5
0
10 Oct 2024
Invisibility Cloak: Disappearance under Human Pose Estimation via Backdoor Attacks
Minxing Zhang
Michael Backes
Xiao Zhang
AAML
34
0
0
10 Oct 2024
Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate
Qidong Huang
Xiaoyi Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Jiaqi Wang
Dahua Lin
Weiming Zhang
Nenghai Yu
57
5
0
09 Oct 2024
Enhancing Vision-Language Model Pre-training with Image-text Pair Pruning Based on Word Frequency
Mingliang Liang
Martha Larson
VLM
CLIP
26
0
0
09 Oct 2024
HERM: Benchmarking and Enhancing Multimodal LLMs for Human-Centric Understanding
Keliang Li
Zaifei Yang
Jiahe Zhao
Hongze Shen
Ruibing Hou
Hong Chang
Shiguang Shan
Xilin Chen
VLM
31
0
0
09 Oct 2024
Aria: An Open Multimodal Native Mixture-of-Experts Model
Dongxu Li
Yudong Liu
Haoning Wu
Yue Wang
Zhiqi Shen
...
Lihuan Zhang
Hanshu Yan
Guoyin Wang
Bei Chen
Junnan Li
MoE
51
48
0
08 Oct 2024
LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
Wei Wu
Kecheng Zheng
Shuailei Ma
Fan Lu
Yuxin Guo
Yifei Zhang
Wei Chen
Qingpei Guo
Yujun Shen
Zheng-Jun Zha
VLM
32
9
0
07 Oct 2024
HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation
Xinyu Zhou
Simin Fan
Martin Jaggi
TDI
31
0
0
07 Oct 2024
SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification
Benjamin Feuer
Jiawei Xu
Niv Cohen
Patrick Yubeaton
Govind Mittal
Chinmay Hegde
26
1
0
07 Oct 2024
AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models
Jiaming Zhang
Junhong Ye
Xingjun Ma
Yige Li
Yunfan Yang
Jitao Sang
Dit-Yan Yeung
AAML
VLM
36
0
0
07 Oct 2024
VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models
Harshit
Tolga Tasdizen
CoGe
VLM
28
1
0
06 Oct 2024
The Visualization JUDGE : Can Multimodal Foundation Models Guide Visualization Design Through Visual Perception?
Matthew Berger
Shusen Liu
33
1
0
05 Oct 2024
Real-World Benchmarks Make Membership Inference Attacks Fail on Diffusion Models
Chumeng Liang
Jiaxuan You
40
0
0
04 Oct 2024
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
Zhengfeng Lai
Vasileios Saveris
Chia-Ju Chen
Hong-You Chen
Haotian Zhang
...
Wenze Hu
Zhe Gan
Peter Grasch
Meng Cao
Yinfei Yang
VLM
38
3
0
03 Oct 2024
Harnessing the Latent Diffusion Model for Training-Free Image Style Transfer
Kento Masui
Mayu Otani
Masahiro Nomura
Hideki Nakayama
DiffM
27
1
0
02 Oct 2024
A Hitchhikers Guide to Fine-Grained Face Forgery Detection Using Common Sense Reasoning
Niki Maria Foteinopoulou
Enjie Ghorbel
Djamila Aouada
30
2
0
01 Oct 2024
Optimising EEG decoding with refined sampling and multimodal feature integration
Arash Akbarinia
25
0
0
30 Sep 2024
SurgPETL: Parameter-Efficient Image-to-Surgical-Video Transfer Learning for Surgical Phase Recognition
Shu Yang
Zhiyuan Cai
Luyang Luo
Ning Ma
Shuchang Xu
Hao Chen
27
0
0
30 Sep 2024
RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models
Jangyeong Kim
Donggoo Kang
Junyoung Choi
Jeonga Wi
Junho Gwon
Jiun Bae
Dumim Yoon
Junghyun Han
DiffM
39
1
0
30 Sep 2024
Textual Training for the Hassle-Free Removal of Unwanted Visual Data: Case Studies on OOD and Hateful Image Detection
Saehyung Lee
J. Mok
Sangha Park
Yongho Shin
Dahuin Jung
Sungroh Yoon
27
0
0
30 Sep 2024
Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement
Zhehao Huang
Xinwen Cheng
JingHao Zheng
Haoran Wang
Zhengbao He
Tao Li
X. Huang
MU
47
5
0
29 Sep 2024
Conditional Image Synthesis with Diffusion Models: A Survey
Zheyuan Zhan
Defang Chen
Jian-Ping Mei
Zhenghe Zhao
Jiawei Chen
Chun Chen
Siwei Lyu
Can Wang
VLM
45
5
0
28 Sep 2024
Amodal Instance Segmentation with Diffusion Shape Prior Estimation
Minh Tran
Khoa T. Vo
Tri Nguyen
Ngan Le
DiffM
34
0
0
26 Sep 2024
FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner
Wenliang Zhao
Minglei Shi
Xumin Yu
Jie Zhou
Jiwen Lu
37
0
0
26 Sep 2024
Adversarial Backdoor Defense in CLIP
Junhao Kuang
Siyuan Liang
Jiawei Liang
Kuanrong Liu
Xiaochun Cao
AAML
36
2
0
24 Sep 2024
Zero-Shot Detection of AI-Generated Images
D. Cozzolino
Giovanni Poggi
Matthias Nießner
L. Verdoliva
50
11
0
24 Sep 2024
Understanding Implosion in Text-to-Image Generative Models
Wenxin Ding
Cathy Y. Li
Shawn Shan
Ben Y. Zhao
Haitao Zheng
36
0
0
18 Sep 2024
NVLM: Open Frontier-Class Multimodal LLMs
Wenliang Dai
Nayeon Lee
Wei Ping
Zhuoling Yang
Zihan Liu
Jon Barker
Tuomas Rintamaki
M. Shoeybi
Bryan Catanzaro
Ming-Yu Liu
MLLM
VLM
LRM
45
51
0
17 Sep 2024
One-Shot Learning for Pose-Guided Person Image Synthesis in the Wild
Dongqi Fan
Tao Chen
Mingjie Wang
Rui Ma
Qiang Tang
Zili Yi
Qian Wang
Liang Chang
27
0
0
15 Sep 2024
Detect Fake with Fake: Leveraging Synthetic Data-driven Representation for Synthetic Image Detection
Hina Otake
Yoshihiro Fukuhara
Yoshiki Kubotani
Shigeo Morishima
ViT
56
0
0
13 Sep 2024
Rethinking Prompting Strategies for Multi-Label Recognition with Partial Annotations
Samyak Rawlekar
Shubhang Bhatnagar
Narendra Ahuja
VLM
31
1
0
12 Sep 2024
NeIn: Telling What You Don't Want
Nhat-Tan Bui
Dinh-Hieu Hoang
Quoc-Huy Trinh
Minh-Triet Tran
Truong Nguyen
Susan Gauch
43
2
0
09 Sep 2024
WebQuest: A Benchmark for Multimodal QA on Web Page Sequences
Maria Wang
Srinivas Sunkara
Gilles Baechler
Jason Lin
Yun Zhu
Fedir Zubach
Lei Shu
Jindong Chen
LRM
LLMAG
29
1
0
06 Sep 2024
UNIT: Unifying Image and Text Recognition in One Vision Encoder
Yi Zhu
Yanpeng Zhou
Chunwei Wang
Yang Cao
Jianhua Han
Lu Hou
Hang Xu
ViT
VLM
34
4
0
06 Sep 2024
DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation
Wenliang Zhao
Haolin Wang
Jie Zhou
Jiwen Lu
DiffM
27
1
0
05 Sep 2024
MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model
Junjie Li
Yang Liu
Weiqing Liu
Shikai Fang
Lewen Wang
Chang Xu
Jiang Bian
VGen
46
4
0
04 Sep 2024
CV-Probes: Studying the interplay of lexical and world knowledge in visually grounded verb understanding
Ivana Beňová
Michal Gregor
Albert Gatt
40
0
0
02 Sep 2024
Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning
Jinglin Liang
Jin Zhong
Hanlin Gu
Zhongqi Lu
Xingxing Tang
Gang Dai
Shuangping Huang
Lixin Fan
Qiang Yang
DiffM
44
7
0
02 Sep 2024
FADE: Few-shot/zero-shot Anomaly Detection Engine using Large Vision-Language Model
Yuanwei Li
Elizaveta Ivanova
Martins Bruveris
VLM
24
1
0
31 Aug 2024
Building Better Datasets: Seven Recommendations for Responsible Design from Dataset Creators
Will Orr
Kate Crawford
46
3
0
30 Aug 2024
CogVLM2: Visual Language Models for Image and Video Understanding
Wenyi Hong
Weihan Wang
Ming Ding
Wenmeng Yu
Qingsong Lv
...
Debing Liu
Bin Xu
Juanzi Li
Yuxiao Dong
Jie Tang
VLM
MLLM
50
88
0
29 Aug 2024
Training-free Video Temporal Grounding using Large-scale Pre-trained Models
Minghang Zheng
Xinhao Cai
Qingchao Chen
Yuxin Peng
Yang Liu
40
4
0
29 Aug 2024
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation
Fangxun Shu
Yue Liao
Le Zhuo
Chenning Xu
Guanghao Zhang
...
Bolin Li
Zhelun Yu
Si Liu
Hongsheng Li
Hao Jiang
VLM
MoE
32
8
0
28 Aug 2024