ResearchTrend.AI
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs


3 November 2021
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
    VLM
    MLLM
    CLIP

Papers citing "LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs"

50 / 1,098 papers shown
Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images
Santosh
Li Lin
Irene Amerini
Xin Wang
Shu Hu
47
11
0
19 Apr 2024
BLINK: Multimodal Large Language Models Can See but Not Perceive
Xingyu Fu
Yushi Hu
Bangzheng Li
Yu Feng
Haoyu Wang
Xudong Lin
Dan Roth
Noah A. Smith
Wei-Chiu Ma
Ranjay Krishna
VLM
LRM
MLLM
43
110
0
18 Apr 2024
Lazy Diffusion Transformer for Interactive Image Editing
Yotam Nitzan
Zongze Wu
Richard Zhang
Eli Shechtman
Daniel Cohen-Or
Taesung Park
Michael Gharbi
43
9
0
18 Apr 2024
Customizing Text-to-Image Diffusion with Camera Viewpoint Control
Nupur Kumari
Grace Su
Richard Zhang
Taesung Park
Eli Shechtman
Jun-Yan Zhu
DiffM
44
3
0
18 Apr 2024
Vocabulary-free Image Classification and Semantic Segmentation
Alessandro Conti
Enrico Fini
Massimiliano Mancini
Paolo Rota
Yiming Wang
Elisa Ricci
VLM
43
2
0
16 Apr 2024
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection
Lewei Yao
Renjie Pi
Jianhua Han
Xiaodan Liang
Hang Xu
Wei Zhang
Zhenguo Li
Dan Xu
VLM
ObjD
53
20
0
14 Apr 2024
Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani
Amit Raj
Kevis-Kokitsi Maninis
Abhishek Kar
Yuanzhen Li
Michael Rubinstein
Deqing Sun
Leonidas J. Guibas
Justin Johnson
Varun Jampani
40
79
0
12 Apr 2024
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
Ming Li
Taojiannan Yang
Huafeng Kuang
Jie Wu
Zhaoning Wang
Xuefeng Xiao
Chong Chen
45
63
0
11 Apr 2024
Taming Stable Diffusion for Text to 360° Panorama Image Generation
Cheng Zhang
Qianyi Wu
Camilo Cruz Gambardella
Xiaoshui Huang
Dinh Q. Phung
Wanli Ouyang
Jianfei Cai
MDE
21
8
0
11 Apr 2024
Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic
Sachin Goyal
Pratyush Maini
Zachary Chase Lipton
Aditi Raghunathan
J. Zico Kolter
56
43
0
10 Apr 2024
Disguised Copyright Infringement of Latent Diffusion Models
Yiwei Lu
Matthew Y.R. Yang
Zuoqiu Liu
Gautam Kamath
Yaoliang Yu
WIGM
36
7
0
10 Apr 2024
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
...
Xingcheng Zhang
Jifeng Dai
Yuxin Qiao
Dahua Lin
Jiaqi Wang
VLM
MLLM
41
114
0
09 Apr 2024
DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation
Junkai Yan
Yipeng Gao
Q. Yang
Xihan Wei
Xuansong Xie
Ancong Wu
Wei-Shi Zheng
40
1
0
09 Apr 2024
PromptAD: Learning Prompts with only Normal Samples for Few-Shot Anomaly Detection
Xiaofan Li
Zhizhong Zhang
Xin Tan
Chengwei Chen
Yanyun Qu
Yuan Xie
Lizhuang Ma
VLM
61
36
0
08 Apr 2024
PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal Model
Amrin Kareem
Jean Lahoud
Hisham Cholakkal
LRM
50
4
0
04 Apr 2024
The More You See in 2D, the More You Perceive in 3D
Xinyang Han
Zelin Gao
Angjoo Kanazawa
Shubham Goel
Yossi Gandelsman
DiffM
56
3
0
04 Apr 2024
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Vishaal Udandarao
Ameya Prabhu
Adhiraj Ghosh
Yash Sharma
Philip Torr
Adel Bibi
Samuel Albanie
Matthias Bethge
VLM
128
45
0
04 Apr 2024
LCM-Lookahead for Encoder-based Text-to-Image Personalization
Rinon Gal
Or Lichter
Elad Richardson
Or Patashnik
Amit H. Bermano
Gal Chechik
Daniel Cohen-Or
DiffM
44
30
0
04 Apr 2024
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens
Kirolos Ataallah
Xiaoqian Shen
Eslam Abdelrahman
Essam Sleiman
Deyao Zhu
Jian Ding
Mohamed Elhoseiny
VLM
47
67
0
04 Apr 2024
Would Deep Generative Models Amplify Bias in Future Models?
Tianwei Chen
Yusuke Hirota
Mayu Otani
Noa Garcia
Yuta Nakashima
45
12
0
04 Apr 2024
Which Model Generated This Image? A Model-Agnostic Approach for Origin Attribution
Fengyuan Liu
Haochen Luo
Yiming Li
Philip Torr
Jindong Gu
VLM
34
5
0
03 Apr 2024
Bi-LORA: A Vision-Language Approach for Synthetic Image Detection
Mamadou Keita
W. Hamidouche
Hessen Bougueffa Eutamene
Abdenour Hadid
Abdelmalik Taleb-Ahmed
69
7
0
02 Apr 2024
VLRM: Vision-Language Models act as Reward Models for Image Captioning
Maksim Dzabraev
Alexander Kunitsyn
Andrei Ivaniuta
VLM
MLLM
31
3
0
02 Apr 2024
Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models
Kyuyoung Kim
Jongheon Jeong
Minyong An
Mohammad Ghavamzadeh
Krishnamurthy Dvijotham
Jinwoo Shin
Kimin Lee
EGVM
42
6
0
02 Apr 2024
Vision-language models for decoding provider attention during neonatal resuscitation
Felipe Parodi
Jordan K Matelsky
Alejandra Regla-Vargas
Elizabeth E. Foglia
Charis Lim
Danielle Weinberg
Konrad Kording
Heidi Herrick
Michael L Platt
32
0
0
01 Apr 2024
LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction
Bo Zou
Chao Yang
Yu Qiao
Chengbin Quan
Youjian Zhao
47
6
0
01 Apr 2024
Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations
Jaisidh Singh
Ishaan Shrivastava
Mayank Vatsa
Richa Singh
Aparna Bharati
VLM
CoGe
34
14
0
29 Mar 2024
MIST: Mitigating Intersectional Bias with Disentangled Cross-Attention Editing in Text-to-Image Diffusion Models
Hidir Yesiltepe
Kiymet Akdemir
Pinar Yanardag
29
3
0
28 Mar 2024
LocCa: Visual Pretraining with Location-aware Captioners
Bo Wan
Michael Tschannen
Yongqin Xian
Filip Pavetić
Ibrahim M. Alabdulmohsin
Xiao Wang
André Susano Pinto
Andreas Steiner
Lucas Beyer
Xiao-Qi Zhai
VLM
51
6
0
28 Mar 2024
ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation
Suraj Patni
Aradhye Agarwal
Chetan Arora
VLM
DiffM
MDE
33
26
0
27 Mar 2024
Toward Interactive Regional Understanding in Vision-Large Language Models
Jungbeom Lee
Sanghyuk Chun
Sangdoo Yun
VLM
28
1
0
27 Mar 2024
Recommendation of data-free class-incremental learning algorithms by simulating future data
Eva Feillet
Adrian Daniel Popescu
Céline Hudelot
49
0
0
26 Mar 2024
Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation
Sanyam Lakhanpal
Shivang Chopra
Vinija Jain
Aman Chadha
Man Luo
40
9
0
25 Mar 2024
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
Neeloy Chakraborty
Melkior Ornik
Katherine Driggs-Campbell
LRM
57
9
0
25 Mar 2024
Centered Masking for Language-Image Pre-Training
Mingliang Liang
Martha Larson
VLM
CLIP
36
4
0
23 Mar 2024
A Multimodal Approach for Cross-Domain Image Retrieval
Lucas Iijima
Tania Stathaki
36
1
0
22 Mar 2024
CLIP-VQDiffusion : Langauge Free Training of Text To Image generation using CLIP and vector quantized diffusion model
S. Han
Joohee Kim
DiffM
CLIP
34
1
0
22 Mar 2024
VidLA: Video-Language Alignment at Scale
Mamshad Nayeem Rizve
Fan Fei
Jayakrishnan Unnikrishnan
Son Tran
Benjamin Z. Yao
Belinda Zeng
Mubarak Shah
Trishul Chilimbi
VLM
AI4TS
58
4
0
21 Mar 2024
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Qing Jiang
Feng Li
Zhaoyang Zeng
Tianhe Ren
Shilong Liu
Lei Zhang
VLM
32
37
0
21 Mar 2024
IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models
Siying Cui
Jia Guo
Xiang An
Jiankang Deng
Yongle Zhao
Xinyu Wei
Ziyong Feng
DiffM
42
21
0
20 Mar 2024
HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data
Mengqi Zhang
Yang Fu
Zheng Ding
Sifei Liu
Zhuowen Tu
Xiaolong Wang
44
17
0
18 Mar 2024
GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning
Xiaojie Li
Yibo Yang
Hefei Ling
Jianlong Wu
Yue Yu
Guohao Li
Min Zhang
SSL
34
6
0
18 Mar 2024
LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
Runhu Huang
Kaixin Cai
Jianhua Han
Xiaodan Liang
Renjing Pei
Guansong Lu
Songcen Xu
Wei Zhang
Hang Xu
DiffM
44
4
0
18 Mar 2024
InTeX: Interactive Text-to-texture Synthesis via Unified Depth-aware Inpainting
Jiaxiang Tang
Ruijie Lu
Xiaokang Chen
Xiang Wen
Gang Zeng
Ziwei Liu
DiffM
MDE
32
15
0
18 Mar 2024
Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm
Yi Wu
Ziqiang Li
Heliang Zheng
Chaoyue Wang
Bin Li
DiffM
63
18
0
18 Mar 2024
SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant
Guohao Sun
Can Qin
Jiamian Wang
Zeyuan Chen
Ran Xu
Zhiqiang Tao
MLLM
VLM
LRM
37
9
0
17 Mar 2024
MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation
Yasufumi Kawano
Yoshimitsu Aoki
DiffM
35
4
0
17 Mar 2024
Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts
Michael Stephen Saxon
Yiran Luo
Sharon Levy
Chitta Baral
Yezhou Yang
William Y. Wang
EGVM
38
3
0
17 Mar 2024
OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models
Zhe Kong
Yong Zhang
Tianyu Yang
Tao Wang
Kaihao Zhang
Bizhu Wu
Guanying Chen
Wei Liu
Wenhan Luo
DiffM
51
27
0
16 Mar 2024
LuoJiaHOG: A Hierarchy Oriented Geo-aware Image Caption Dataset for Remote Sensing Image-Text Retrival
Yuanxin Zhao
Mi Zhang
Bingnan Yang
Zhan Zhang
Jiaju Kang
Jianya Gong
35
2
0
16 Mar 2024