ResearchTrend.AI
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs
arXiv: 2111.02114

3 November 2021
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
    VLM
    MLLM
    CLIP

Papers citing "LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs"

50 / 1,102 papers shown
Title
GeRA: Label-Efficient Geometrically Regularized Alignment
Dustin Klebe
Tal Shnitzer
Mikhail Yurochkin
Leonid Karlinsky
Justin Solomon
25
2
0
01 Oct 2023
Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning
Mustafa Shukor
Alexandre Ramé
Corentin Dancette
Matthieu Cord
LRM
MLLM
48
20
0
01 Oct 2023
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Junsong Chen
Jincheng Yu
Chongjian Ge
Lewei Yao
Enze Xie
...
Zhongdao Wang
James T. Kwok
Ping Luo
Huchuan Lu
Zhenguo Li
DiffM
39
395
0
30 Sep 2023
Region-centric Image-Language Pretraining for Open-Vocabulary Detection
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD
VLM
21
3
0
29 Sep 2023
Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study
Myeongseob Ko
Ming Jin
Chenguang Wang
Ruoxi Jia
35
27
0
29 Sep 2023
Robustness of AI-Image Detectors: Fundamental Limits and Practical Attacks
Mehrdad Saberi
Vinu Sankar Sadasivan
Keivan Rezaei
Aounon Kumar
Atoosa Malemir Chegini
Wenxiao Wang
S. Feizi
WIGM
AAML
40
40
0
29 Sep 2023
Data Filtering Networks
Alex Fang
Albin Madappally Jose
Amit Jain
Ludwig Schmidt
Alexander Toshev
Vaishaal Shankar
CLIP
46
127
0
29 Sep 2023
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks
Hao Chen
Jindong Wang
Ankit Shah
Ran Tao
Hongxin Wei
Berfin Şimşek
Masashi Sugiyama
Bhiksha Raj
49
26
0
29 Sep 2023
Text-image Alignment for Diffusion-based Perception
Neehar Kondapaneni
Markus Marks
Manuel Knott
Rogério Guimarães
Pietro Perona
VLM
DiffM
24
32
0
29 Sep 2023
ELIP: Efficient Language-Image Pre-training with Fewer Vision Tokens
Yangyang Guo
Haoyu Zhang
Yongkang Wong
Liqiang Nie
Mohan Kankanhalli
VLM
30
3
0
28 Sep 2023
The Devil is in the Details: A Deep Dive into the Rabbit Hole of Data Filtering
Hai-ping Yu
Yu Tian
Sateesh Kumar
Linjie Yang
Heng Wang
VLM
38
17
0
27 Sep 2023
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition
Pan Zhang
Xiaoyi Wang
Bin Wang
Yuhang Cao
Chao Xu
...
Conghui He
Xingcheng Zhang
Yu Qiao
Da Lin
Jiaqi Wang
MLLM
80
226
0
26 Sep 2023
BLIP-Adapter: Parameter-Efficient Transfer Learning for Mobile Screenshot Captioning
Ching-Yu Chiang
I-Hua Chang
Shih-Wei Liao
55
1
0
26 Sep 2023
Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator
Hanzhuo Huang
Yufan Feng
Cheng Shi
Lan Xu
Jingyi Yu
Sibei Yang
DiffM
VGen
31
63
0
25 Sep 2023
DECORAIT -- DECentralized Opt-in/out Registry for AI Training
Karthika Balan
Alexander Black
Simon Jenni
Andrew Gilbert
Andy Parsons
John Collomosse
27
7
0
25 Sep 2023
Devil in the Number: Towards Robust Multi-modality Data Filter
Yichen Xu
Zihan Xu
Wenhao Chai
Zhonghan Zhao
Enxin Song
Gaoang Wang
14
2
0
24 Sep 2023
Accurate and Fast Compressed Video Captioning
Yaojie Shen
Xin Gu
Kai Xu
Hengrui Fan
Longyin Wen
Libo Zhang
ViT
23
26
0
22 Sep 2023
TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance
Kan Wu
Houwen Peng
Zhenghong Zhou
Bin Xiao
Mengchen Liu
...
Xi Chen
Xinggang Wang
Hongyang Chao
Han Hu
VLM
OODD
29
54
0
21 Sep 2023
DreamLLM: Synergistic Multimodal Comprehension and Creation
Runpei Dong
Chunrui Han
Yuang Peng
Zekun Qi
Zheng Ge
...
Hao-Ran Wei
Xiangwen Kong
Xiangyu Zhang
Kaisheng Ma
Li Yi
MLLM
50
176
0
20 Sep 2023
Face Aging via Diffusion-based Editing
Xiangyi Chen
Stéphane Lathuilière
DiffM
18
13
0
20 Sep 2023
Improving CLIP Robustness with Knowledge Distillation and Self-Training
Clement Laroudie
Andrei Bursuc
Mai Lan Ha
Gianni Franchi
VLM
33
5
0
19 Sep 2023
Image-Text Pre-Training for Logo Recognition
Mark Hubenthal
Suren Kumar
VLM
41
3
0
18 Sep 2023
In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval
Nina Shvetsova
Anna Kukleva
Bernt Schiele
Hilde Kuehne
DiffM
33
3
0
16 Sep 2023
Viewpoint Integration and Registration with Vision Language Foundation Model for Image Change Understanding
Xiaonan Lu
Jianlong Yuan
Ruigang Niu
Yuan Hu
Fan Wang
26
1
0
15 Sep 2023
PatFig: Generating Short and Long Captions for Patent Figures
Dana Aubakirova
Kim Gerdes
Lufei Liu
17
9
0
15 Sep 2023
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning
Haozhe Zhao
Zefan Cai
Shuzheng Si
Xiaojian Ma
Kaikai An
Liang Chen
Zixuan Liu
Sheng Wang
Wenjuan Han
Baobao Chang
MLLM
VLM
28
135
0
14 Sep 2023
PROGrasp: Pragmatic Human-Robot Communication for Object Grasping
Gi-Cheon Kang
Junghyun Kim
Jaein Kim
Byoung-Tak Zhang
37
4
0
14 Sep 2023
Language Models as Black-Box Optimizers for Vision-Language Models
Shihong Liu
Zhiqiu Lin
Samuel Yu
Ryan Lee
Tiffany Ling
Deepak Pathak
Deva Ramanan
VLM
35
28
0
12 Sep 2023
OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data
Giuseppe Cartella
Alberto Baldrati
Davide Morelli
Marcella Cornia
Marco Bertini
Rita Cucchiara
VLM
CLIP
40
7
0
11 Sep 2023
Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization
Yang Jin
Kun Xu
Kun Xu
Liwei Chen
Chao Liao
...
Xiaoqiang Lei
Di Zhang
Wenwu Ou
Kun Gai
Yadong Mu
MLLM
VLM
27
41
0
09 Sep 2023
Language Prompt for Autonomous Driving
Dongming Wu
Wencheng Han
Tiancai Wang
Yingfei Liu
Cheng-zhong Xu
Jianbing Shen
VLM
46
74
0
08 Sep 2023
Toward High Quality Facial Representation Learning
Yue Wang
Jinlong Peng
Jiangning Zhang
Ran Yi
Lu Liu
Yabiao Wang
Chengjie Wang
CVBM
SSL
57
7
0
07 Sep 2023
My Art My Choice: Adversarial Protection Against Unruly AI
Anthony Rhodes
Ram Bhagat
U. Ciftci
Ilke Demir
DiffM
50
4
0
06 Sep 2023
Exploring Limits of Diffusion-Synthetic Training with Weakly Supervised Semantic Segmentation
Ryota Yoshihashi
Yuya Otsuka
Kenji Doi
Tomohiro Tanaka
Hirokatsu Kataoka
44
1
0
04 Sep 2023
EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment
Cheng Shi
Sibei Yang
VLM
ObjD
38
38
0
03 Sep 2023
RevColV2: Exploring Disentangled Representations in Masked Image Modeling
Qi Han
Yuxuan Cai
Xiangyu Zhang
46
7
0
02 Sep 2023
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
Fengxiang Bie
Yibo Yang
Zhongzhu Zhou
Adam Ghanem
Minjia Zhang
...
Pareesa Ameneh Golnari
David A. Clifton
Yuxiong He
Dacheng Tao
Shuaiwen Leon Song
EGVM
36
20
0
02 Sep 2023
Contrastive Feature Masking Open-Vocabulary Vision Transformer
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD
VLM
28
27
0
02 Sep 2023
VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation
Xin Li
Wenqing Chu
Ye Wu
Weihang Yuan
Fanglong Liu
Qi Zhang
Fu Li
Haocheng Feng
Errui Ding
Jingdong Wang
VGen
45
52
0
01 Sep 2023
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
Yupan Huang
Zaiqiao Meng
Fangyu Liu
Yixuan Su
Nigel Collier
Yutong Lu
MLLM
41
22
0
31 Aug 2023
Cross-Modal Retrieval Meets Inference: Improving Zero-Shot Classification with Cross-Modal Retrieval
Seong-Hoon Eom
Namgyu Ho
Jaehoon Oh
Se-Young Yun
CLIP
VLM
38
0
0
29 Aug 2023
How to Evaluate the Generalization of Detection? A Benchmark for Comprehensive Open-Vocabulary Detection
Yi Yao
Peng Liu
Tiancheng Zhao
Qianqian Zhang
Jiajia Liao
Chunxin Fang
Kyusong Lee
Qing Wang
VLM
ObjD
29
10
0
25 Aug 2023
SCoRD: Subject-Conditional Relation Detection with Text-Augmented Data
Ziyan Yang
Kushal Kafle
Zhe Lin
Scott D. Cohen
Zhihong Ding
Vicente Ordonez
30
1
0
24 Aug 2023
Uniformly Distributed Category Prototype-Guided Vision-Language Framework for Long-Tail Recognition
Siming Fu
Xiaoxuan He
Xinpeng Ding
Yuchen Cao
Hualiang Wang
VLM
32
6
0
24 Aug 2023
Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages
Jinyi Hu
Yuan Yao
Chong Wang
Shanonan Wang
Yinxu Pan
...
Yankai Lin
Jiao Xue
Dahai Li
Zhiyuan Liu
Maosong Sun
MLLM
VLM
37
49
0
23 Aug 2023
A Benchmark Study on Calibration
Linwei Tao
Younan Zhu
Haolan Guo
Minjing Dong
Chang Xu
26
9
0
23 Aug 2023
Random Word Data Augmentation with CLIP for Zero-Shot Anomaly Detection
Masato Tamura
VLM
31
9
0
22 Aug 2023
WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models
Conghui He
Zhenjiang Jin
Chaoxi Xu
Jiantao Qiu
Bin Wang
Wei Li
Hang Yan
Jiaqi Wang
Da Lin
65
35
0
21 Aug 2023
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
Wenbo Hu
Y. Xu
Yuante Li
W. Li
Zhengzhang Chen
Zhuowen Tu
MLLM
VLM
30
123
0
19 Aug 2023
The Unreasonable Effectiveness of Large Language-Vision Models for Source-free Video Domain Adaptation
Giacomo Zara
Alessandro Conti
Subhankar Roy
Stéphane Lathuilière
Paolo Rota
Elisa Ricci
35
11
0
17 Aug 2023