ResearchTrend.AI
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs


3 November 2021
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
    VLM
    MLLM
    CLIP

Papers citing "LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs"

47 / 1,097 papers shown
Title
American == White in Multimodal Language-and-Image AI
Robert Wolfe
Aylin Caliskan
VLM
27
46
0
01 Jul 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
107
1,062
0
22 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
48
348
0
17 Jun 2022
MixGen: A New Multi-Modal Data Augmentation
Xiaoshuai Hao
Yi Zhu
Srikar Appalaraju
Aston Zhang
Wanqian Zhang
Boyang Li
Mu Li
VLM
22
83
0
16 Jun 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
72
527
0
13 Jun 2022
Blended Latent Diffusion
Omri Avrahami
Ohad Fried
Dani Lischinski
DiffM
59
373
0
06 Jun 2022
Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Basil Mustafa
C. Riquelme
J. Puigcerver
Rodolphe Jenatton
N. Houlsby
VLM
MoE
28
183
0
06 Jun 2022
Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training
Yan Zeng
Wangchunshu Zhou
Ao Luo
Ziming Cheng
Xinsong Zhang
VLM
27
30
0
01 Jun 2022
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
Gi-Cheon Kang
Sungdong Kim
Jin-Hwa Kim
Donghyun Kwak
Byoung-Tak Zhang
32
10
0
25 May 2022
Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization
Aishwarya Agrawal
Ivana Kajić
Emanuele Bugliarello
Elnaz Davoodi
Anita Gergely
Phil Blunsom
Aida Nematzadeh
OOD
40
17
0
24 May 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
66
5,778
0
23 May 2022
Visually-Augmented Language Modeling
Weizhi Wang
Li Dong
Hao Cheng
Haoyu Song
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
VLM
36
18
0
20 May 2022
LANTERN-RD: Enabling Deep Learning for Mitigation of the Invasive Spotted Lanternfly
Srivatsa Kundurthy
22
1
0
12 May 2022
Explain to Not Forget: Defending Against Catastrophic Forgetting with XAI
Sami Ede
Serop Baghdadlian
Leander Weber
A. Nguyen
Dario Zanca
Wojciech Samek
Sebastian Lapuschkin
CLL
27
6
0
04 May 2022
Data Governance in the Age of Large-Scale Data-Driven Language Technology
Yacine Jernite
Huu Nguyen
Stella Biderman
A. Rogers
Maraim Masoud
...
Jorg Frohberg
Aaron Gokaslan
Peter Henderson
Rishi Bommasani
Margaret Mitchell
26
52
0
04 May 2022
Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)
Alex Fang
Gabriel Ilharco
Mitchell Wortsman
Yu Wan
Vaishaal Shankar
Achal Dave
Ludwig Schmidt
VLM
OOD
33
138
0
03 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
46
3,349
0
29 Apr 2022
PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining
Yuting Gao
Jinfeng Liu
Zihan Xu
Jinchao Zhang
Ke Li
Rongrong Ji
Chunhua Shen
VLM
CLIP
29
100
0
29 Apr 2022
Semi-Parametric Neural Image Synthesis
A. Blattmann
Robin Rombach
Kaan Oktay
Jonas Muller
Bjorn Ommer
DiffM
33
28
0
25 Apr 2022
iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition
Yixuan Wei
Yue Cao
Zheng-Wei Zhang
Zhuliang Yao
Zhenda Xie
Han Hu
B. Guo
VLM
23
11
0
22 Apr 2022
Considerations for Multilingual Wikipedia Research
Isaac Johnson
Emily A. Lescak
24
3
0
05 Apr 2022
PromptDet: Towards Open-vocabulary Detection using Uncurated Images
Chengjian Feng
Yujie Zhong
Zequn Jie
Xiangxiang Chu
Haibing Ren
Xiaolin K. Wei
Weidi Xie
Lin Ma
VPVLM
VLM
19
152
0
30 Mar 2022
Large-scale Bilingual Language-Image Contrastive Learning
ByungSoo Ko
Geonmo Gu
VLM
32
14
0
28 Mar 2022
CLIP-Mesh: Generating textured meshes from text using pretrained image-text models
N. Khalid
Tianhao Xie
Eugene Belilovsky
Tiberiu Popa
CLIP
10
291
0
24 Mar 2022
Complex Scene Image Editing by Scene Graph Comprehension
Zhongping Zhang
Huiwen He
Bryan A. Plummer
Z. Liao
Huayan Wang
DiffM
27
6
0
24 Mar 2022
WuDaoMM: A large-scale Multi-Modal Dataset for Pre-training models
Shan Yuan
Shuai Zhao
Jiahong Leng
Zhao Xue
Hanyu Zhao
Peiyu Liu
Zheng Gong
Wayne Xin Zhao
Junyi Li
Tang Jie
VLM
29
5
0
22 Mar 2022
Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy
Yuanhan Zhang
Qi Sun
Yichun Zhou
Zexin He
Zhen-fei Yin
Kunze Wang
Lu Sheng
Yu Qiao
Jing Shao
Ziwei Liu
ObjD
VLM
32
19
0
15 Mar 2022
Movies2Scenes: Using Movie Metadata to Learn Scene Representation
Shixing Chen
Chundi Liu
Xiang Hao
Xiaohan Nie
Maxim Arap
Raffay Hamid
31
17
0
22 Feb 2022
Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders
Huangjie Zheng
Pengcheng He
Weizhu Chen
Mingyuan Zhou
DiffM
17
46
0
19 Feb 2022
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Jiaxi Gu
Xiaojun Meng
Guansong Lu
Lu Hou
Minzhe Niu
...
Runhu Huang
Wei Zhang
Xingda Jiang
Chunjing Xu
Hang Xu
VLM
43
88
0
14 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
53
850
0
07 Feb 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Steven Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
Optimal Representations for Covariate Shift
Yangjun Ruan
Yann Dubois
Chris J. Maddison
OOD
28
68
0
31 Dec 2021
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
29
48
0
27 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
150
14,641
0
20 Dec 2021
Show, Write, and Retrieve: Entity-aware Article Generation and Retrieval
Zhongping Zhang
Yiwen Gu
Bryan A. Plummer
42
2
0
11 Dec 2021
MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning
C. Eichenberg
Sid Black
Samuel Weinbach
Letitia Parcalabescu
Anette Frank
MLLM
VLM
15
100
0
09 Dec 2021
General Facial Representation Learning in a Visual-Linguistic Manner
Yinglin Zheng
Hao Yang
Ting Zhang
Jianmin Bao
Dongdong Chen
Yangyu Huang
Lu Yuan
Dong Chen
Ming Zeng
Fang Wen
CVBM
146
163
0
06 Dec 2021
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Shuyang Gu
Dong Chen
Jianmin Bao
Fang Wen
Bo Zhang
Dongdong Chen
Lu Yuan
B. Guo
DiffM
71
757
0
29 Nov 2021
Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets
Marcella Cornia
Lorenzo Baraldi
G. Fiameni
Rita Cucchiara
20
12
0
24 Nov 2021
Scaling Up Vision-Language Pre-training for Image Captioning
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Zhengyuan Yang
Zicheng Liu
Yumao Lu
Lijuan Wang
MLLM
VLM
34
246
0
24 Nov 2021
L-Verse: Bidirectional Generation Between Image and Text
Taehoon Kim
Gwangmo Song
Sihaeng Lee
Sangyun Kim
Yewon Seo
Soonyoung Lee
S. Kim
Honglak Lee
Kyunghoon Bae
23
25
0
22 Nov 2021
Making Images Real Again: A Comprehensive Survey on Deep Image Composition
Li Niu
Wenyan Cong
Liu Liu
Yan Hong
Bo Zhang
Jing Liang
Liqing Zhang
VLM
DiffM
CoGe
30
76
0
28 Jun 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
322
3,708
0
11 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
279
1,996
0
31 Dec 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,489
0
23 Jan 2020