LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs

3 November 2021
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
    VLM
    MLLM
    CLIP

Papers citing "LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs"

50 / 1,099 papers shown
Hyperbolic Image-Text Representations
Karan Desai
Maximilian Nickel
Tanmay Rajpurohit
Justin Johnson
Ramakrishna Vedantam
VLM
47
57
0
18 Apr 2023
DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training
Yihao Chen
Xianbiao Qi
Jianan Wang
Lei Zhang
23
16
0
17 Apr 2023
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
140
3,070
0
14 Apr 2023
Intriguing properties of synthetic images: from generative adversarial networks to diffusion models
Riccardo Corvi
D. Cozzolino
Giovanni Poggi
Koki Nagano
L. Verdoliva
DiffM
29
90
0
13 Apr 2023
RECLIP: Resource-efficient CLIP by Training with Small Images
Runze Li
Dahun Kim
B. Bhanu
Weicheng Kuo
VLM
CLIP
36
13
0
12 Apr 2023
Learning Transferable Pedestrian Representation from Multimodal Information Supervision
Li-Na Bao
Longhui Wei
Xiaoyu Qiu
Wen-gang Zhou
Houqiang Li
Qi Tian
SSL
39
5
0
12 Apr 2023
HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
Eslam Mohamed Bakr
Pengzhan Sun
Xiaoqian Shen
Faizan Farooq Khan
Li Erran Li
Mohamed Elhoseiny
VLM
24
76
0
11 Apr 2023
Controllable Textual Inversion for Personalized Text-to-Image Generation
Jianan Yang
Haobo Wang
Yanming Zhang
Rui Xiao
Sai Wu
Gang Chen
Jun Zhao
DiffM
32
12
0
11 Apr 2023
Mask-conditioned latent diffusion for generating gastrointestinal polyp images
Roman Macháček
Leila Mozaffari
Z. Sepasdar
Sravanthi Parasa
P. Halvorsen
Michael A. Riegler
Vajira Thambawita
MedIm
DiffM
16
17
0
11 Apr 2023
A Billion-scale Foundation Model for Remote Sensing Images
Keumgang Cha
Junghoon Seo
Taekyung Lee
38
64
0
11 Apr 2023
Improving Image Recognition by Retrieving from Web-Scale Image-Text Data
Ahmet Iscen
Alireza Fathi
Cordelia Schmid
VLM
3DV
36
25
0
11 Apr 2023
SATR: Zero-Shot Semantic Segmentation of 3D Shapes
Ahmed Abdelreheem
Ivan Skorokhodov
M. Ovsjanikov
Peter Wonka
3DPC
37
38
0
11 Apr 2023
EKILA: Synthetic Media Provenance and Attribution for Generative Art
Kar Balan
S. Agarwal
Simon Jenni
Andy Parsons
Andrew Gilbert
John Collomosse
27
12
0
10 Apr 2023
HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation
Xu Ju
Ailing Zeng
Chenchen Zhao
Jianan Wang
Lei Zhang
Qian Xu
DiffM
31
87
0
09 Apr 2023
Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis
Qiucheng Wu
Yujian Liu
Handong Zhao
T. Bui
Zhe-nan Lin
Yang Zhang
Shiyu Chang
DiffM
42
45
0
07 Apr 2023
Probing Conceptual Understanding of Large Visual-Language Models
Madeline Chantry Schiappa
Raiyaan Abdullah
Shehreen Azad
Jared Claypoole
Michael Cogswell
Ajay Divakaran
Yogesh S Rawat
58
14
0
07 Apr 2023
Uncurated Image-Text Datasets: Shedding Light on Demographic Bias
Noa Garcia
Yusuke Hirota
Yankun Wu
Yuta Nakashima
EGVM
43
51
0
06 Apr 2023
ERM++: An Improved Baseline for Domain Generalization
Piotr Teterwak
Kuniaki Saito
Theodoros Tsiligkaridis
Kate Saenko
Bryan A. Plummer
OOD
44
9
0
04 Apr 2023
AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation
Jheng-Hong Yang
Carlos Lassance
Rafael Sampaio de Rezende
Krishna Srinivasan
Miriam Redi
S. Clinchant
Jimmy J. Lin
42
12
0
04 Apr 2023
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos
Yue Ma
Yin-Yin He
Xiaodong Cun
Xintao Wang
Siran Chen
Ying Shan
Xiu Li
Qifeng Chen
DiffM
VGen
37
177
0
03 Apr 2023
Vision-Language Models for Vision Tasks: A Survey
Jingyi Zhang
Jiaxing Huang
Sheng Jin
Shijian Lu
VLM
45
493
0
03 Apr 2023
Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images
Roberto Amoroso
Davide Morelli
Marcella Cornia
Lorenzo Baraldi
A. Bimbo
Rita Cucchiara
DiffM
39
29
0
02 Apr 2023
DIME-FM: DIstilling Multimodal and Efficient Foundation Models
Ximeng Sun
Pengchuan Zhang
Peizhao Zhang
Hardik Shah
Kate Saenko
Xide Xia
VLM
25
20
0
31 Mar 2023
Social Biases through the Text-to-Image Generation Lens
Ranjita Naik
Besmira Nushi
119
114
0
30 Mar 2023
RusTitW: Russian Language Text Dataset for Visual Text in-the-Wild Recognition
Igor L. Markov
S. Nesteruk
Andrey Kuznetsov
Denis Dimitrov
18
0
0
29 Mar 2023
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
Renrui Zhang
Jiaming Han
Chris Liu
Peng Gao
Aojun Zhou
Xiangfei Hu
Shilin Yan
Pan Lu
Hongsheng Li
Yu Qiao
MLLM
74
747
0
28 Mar 2023
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
Yiwei Ma
Xiaoqing Zhang
Xiaoshuai Sun
Jiayi Ji
Haowei Wang
Guannan Jiang
Weilin Zhuang
Rongrong Ji
23
39
0
28 Mar 2023
GeoNet: Benchmarking Unsupervised Adaptation across Geographies
Tarun Kalluri
Wangdong Xu
Manmohan Chandraker
OOD
34
15
0
27 Mar 2023
EVA-CLIP: Improved Training Techniques for CLIP at Scale
Quan-Sen Sun
Yuxin Fang
Ledell Yu Wu
Xinlong Wang
Yue Cao
CLIP
VLM
81
470
0
27 Mar 2023
Text-to-Image Diffusion Models are Zero-Shot Classifiers
Kevin Clark
P. Jaini
DiffM
VLM
38
107
0
27 Mar 2023
Seer: Language Instructed Video Prediction with Latent Diffusion Models
Xianfan Gu
Chuan Wen
Weirui Ye
Jiaming Song
Yang Gao
DiffM
VGen
26
40
0
27 Mar 2023
WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation
Jongheon Jeong
Yang Zou
Taewan Kim
Dongqing Zhang
Avinash Ravichandran
Onkar Dabeer
VLM
75
187
0
26 Mar 2023
CelebV-Text: A Large-Scale Facial Text-Video Dataset
Jianhui Yu
Hao Zhu
Liming Jiang
Chen Change Loy
Weidong (Tom) Cai
Wayne Wu
30
57
0
26 Mar 2023
Equivariant Similarity for Vision-Language Foundation Models
Tan Wang
Kevin Qinghong Lin
Linjie Li
Chung-Ching Lin
Zhengyuan Yang
Hanwang Zhang
Zicheng Liu
Lijuan Wang
CoGe
46
44
0
25 Mar 2023
Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression
Denis Kuznedelev
Soroush Tabesh
Kimia Noorbakhsh
Elias Frantar
Sara Beery
Eldar Kurtic
Dan Alistarh
MQ
VLM
26
2
0
25 Mar 2023
Ablating Concepts in Text-to-Image Diffusion Models
Nupur Kumari
Bin Zhang
Sheng-Yu Wang
Eli Shechtman
Richard Y. Zhang
Jun-Yan Zhu
VLM
21
184
0
23 Mar 2023
Text with Knowledge Graph Augmented Transformer for Video Captioning
Xin Gu
G. Chen
Yufei Wang
Libo Zhang
Tiejian Luo
Longyin Wen
32
47
0
22 Mar 2023
MAGVLT: Masked Generative Vision-and-Language Transformer
Sungwoong Kim
DaeJin Jo
Donghoon Lee
Jongmin Kim
VLM
47
12
0
21 Mar 2023
MV-MR: multi-views and multi-representations for self-supervised learning and knowledge distillation
Vitaliy Kinakh
M. Drozdova
Slava Voloshynovskiy
40
1
0
21 Mar 2023
Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models
Lukas Höllein
Ang Cao
Andrew Owens
Justin Johnson
Matthias Nießner
DiffM
38
179
0
21 Mar 2023
Attribute-preserving Face Dataset Anonymization via Latent Code Optimization
Simone Barattin
Christos Tzelepis
Ioannis Patras
N. Sebe
PICV
CVBM
27
43
0
20 Mar 2023
SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage
Song Park
Sanghyuk Chun
Byeongho Heo
Wonjae Kim
Sangdoo Yun
VLM
ViT
14
8
0
20 Mar 2023
On the De-duplication of LAION-2B
Ryan Webster
Julien Rabin
Loïc Simon
F. Jurie
DiffM
12
40
0
17 Mar 2023
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
Can Qin
Ning Yu
Chen Xing
Shu Zhen Zhang
Zeyuan Chen
Stefano Ermon
Yun Fu
Caiming Xiong
Ran Xu
DiffM
45
20
0
17 Mar 2023
VEIL: Vetting Extracted Image Labels from In-the-Wild Captions for Weakly-Supervised Object Detection
Arushi Rai
Adriana Kovashka
27
0
0
16 Mar 2023
SemDeDup: Data-efficient learning at web-scale through semantic deduplication
Amro Abbas
Kushal Tirumala
Daniel Simig
Surya Ganguli
Ari S. Morcos
31
164
0
16 Mar 2023
Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation
Yi Ma
Huan Yang
Wenjing Wang
Jianlong Fu
Jiaying Liu
22
65
0
16 Mar 2023
Automatic Geo-alignment of Artwork in Children's Story Books
Jakub J Dylag
V. Suarez
James Wald
Aneesha Amodini Uvara
DiffM
46
0
0
16 Mar 2023
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
Zhengxiong Luo
Dayou Chen
Yingya Zhang
Yan Huang
Liangsheng Wang
Yujun Shen
Deli Zhao
Jinren Zhou
Tien-Ping Tan
DiffM
VGen
132
309
0
15 Mar 2023
Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
Junyoung Seo
Wooseok Jang
Minseop Kwak
Ines Hyeonsu Kim
Jaehoon Ko
Junho Kim
Jin-Hwa Kim
Jiyoung Lee
Seung Wook Kim
DiffM
46
136
0
14 Mar 2023