Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2103.00823
Cited By
M6: A Chinese Multimodal Pretrainer
1 March 2021
Junyang Lin
Rui Men
An Yang
Chan Zhou
Ming Ding
Yichang Zhang
Peng Wang
Ang Wang
Le Jiang
Xianyan Jia
Jie Zhang
Jianwei Zhang
Xu Zou
Zhikang Li
X. Deng
Jie Liu
Jinbao Xue
Huiling Zhou
Jianxin Ma
Jin Yu
Yong Li
Wei Lin
Jingren Zhou
J. Tang
Hongxia Yang
VLM
MoE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"M6: A Chinese Multimodal Pretrainer"
44 / 44 papers shown
Title
TG-LMM: Enhancing Medical Image Segmentation Accuracy through Text-Guided Large Multi-Modal Model
Yihao Zhao
Enhao Zhong
Cuiyun Yuan
Yang Li
Man Zhao
Chunxia Li
Jun Hu
Chenbin Liu
VLM
MedIm
45
0
0
05 Sep 2024
SWIFT:A Scalable lightWeight Infrastructure for Fine-Tuning
Yuze Zhao
Jintao Huang
Jinghan Hu
Xingjun Wang
Yunlin Mao
...
Zhikai Wu
Baole Ai
Ang Wang
W. Zhou
Yingda Chen
44
30
0
10 Aug 2024
Differentiable Retrieval Augmentation via Generative Language Modeling for E-commerce Query Intent Classification
Chenyu Zhao
Yunjiang Jiang
Yiming Qiu
Han Zhang
Wen-Yun Yang
RALM
26
5
0
18 Aug 2023
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Shihao Zhao
Dongdong Chen
Yen-Chun Chen
Jianmin Bao
Shaozhe Hao
Lu Yuan
Kwan-Yee K. Wong
27
234
0
25 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
48
115
0
18 May 2023
OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning
Youhe Jiang
Fangcheng Fu
Xupeng Miao
Xiaonan Nie
Tengjiao Wang
36
11
0
17 May 2023
Auto-Parallelizing Large Models with Rhino: A Systematic Approach on Production AI Platform
Shiwei Zhang
Lansong Diao
Siyu Wang
Zongyan Cao
Yiliang Gu
Chang Si
Ziji Shi
Zhen Zheng
Chuan Wu
W. Lin
AI4CE
27
4
0
16 Feb 2023
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Ming Tao
Bingkun Bao
Hao Tang
Changsheng Xu
DiffM
VLM
74
101
0
30 Jan 2023
Transferring General Multimodal Pretrained Models to Text Recognition
Junyang Lin
Xuancheng Ren
Yichang Zhang
Gao Liu
Peng Wang
An Yang
Chang Zhou
34
4
0
19 Dec 2022
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
Sheng Tang
Yaqing Wang
Zhenglun Kong
Tianchi Zhang
Yao Li
Caiwen Ding
Yanzhi Wang
Yi Liang
Dongkuan Xu
30
31
0
21 Nov 2022
Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image Generation
Zhihong Pan
Xiaoxia Zhou
Hao Tian
DiffM
20
11
0
14 Nov 2022
Fine-grained Visual-Text Prompt-Driven Self-Training for Open-Vocabulary Object Detection
Yanxin Long
Jianhua Han
Runhu Huang
Xu Hang
Yi Zhu
Chunjing Xu
Xiaodan Liang
VLM
ObjD
32
18
0
02 Nov 2022
Masked Vision-Language Transformer in Fashion
Ge-Peng Ji
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Daniel Gehrig
Luc Van Gool
21
25
0
27 Oct 2022
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training
Wenliang Dai
Zihan Liu
Ziwei Ji
Dan Su
Pascale Fung
MLLM
VLM
32
62
0
14 Oct 2022
Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding
Quan Liu
Youpeng Wen
Jianhua Han
Chunjing Xu
Hang Xu
Xiaodan Liang
VLM
26
67
0
18 Jul 2022
Knowledge Distillation of Transformer-based Language Models Revisited
Chengqiang Lu
Jianwei Zhang
Yunfei Chu
Zhengyu Chen
Jingren Zhou
Fei Wu
Haiqing Chen
Hongxia Yang
VLM
27
10
0
29 Jun 2022
Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval
Xiao Dong
Xunlin Zhan
Yunchao Wei
Xiaoyong Wei
Yaowei Wang
Minlong Lu
Xiaochun Cao
Xiaodan Liang
24
11
0
17 Jun 2022
DE-Net: Dynamic Text-guided Image Editing Adversarial Networks
Ming Tao
Bingkun Bao
Hao Tang
Fei Wu
Longhui Wei
Qi Tian
DiffM
24
15
0
02 Jun 2022
Modeling Image Composition for Complex Scene Generation
Zuopeng Yang
Daqing Liu
Chaoyue Wang
J. Yang
Dacheng Tao
ViT
36
50
0
02 Jun 2022
DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder
Jie Shi
Chenfei Wu
Jian Liang
Xiang Liu
Nan Duan
DiffM
14
25
0
01 Jun 2022
Improved Vector Quantized Diffusion Models
Zhicong Tang
Shuyang Gu
Jianmin Bao
Dong Chen
Fang Wen
DiffM
181
63
0
31 May 2022
M6-Fashion: High-Fidelity Multi-modal Image Generation and Editing
Zhikang Li
Huiling Zhou
Shuai Bai
Peike Li
Chang Zhou
Hongxia Yang
34
4
0
24 May 2022
Nebula-I: A General Framework for Collaboratively Training Deep Learning Models on Low-Bandwidth Cloud Clusters
Yang Xiang
Zhihua Wu
Weibao Gong
Siyu Ding
Xianjie Mo
...
Yue Yu
Ge Li
Yu Sun
Yanjun Ma
Dianhai Yu
24
4
0
19 May 2022
WuDaoMM: A large-scale Multi-Modal Dataset for Pre-training models
Shan Yuan
Shuai Zhao
Jiahong Leng
Zhao Xue
Hanyu Zhao
Peiyu Liu
Zheng Gong
Wayne Xin Zhao
Junyi Li
Tang Jie
VLM
29
5
0
22 Mar 2022
Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation
Wenliang Dai
Lu Hou
Lifeng Shang
Xin Jiang
Qun Liu
Pascale Fung
VLM
22
90
0
12 Mar 2022
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Jiaxi Gu
Xiaojun Meng
Guansong Lu
Lu Hou
Minzhe Niu
...
Runhu Huang
Wei Zhang
Xingda Jiang
Chunjing Xu
Hang Xu
VLM
43
88
0
14 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
53
850
0
07 Feb 2022
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Shaden Smith
M. Patwary
Brandon Norick
P. LeGresley
Samyam Rajbhandari
...
M. Shoeybi
Yuxiong He
Michael Houston
Saurabh Tiwary
Bryan Catanzaro
MoE
90
730
0
28 Jan 2022
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Shuyang Gu
Dong Chen
Jianmin Bao
Fang Wen
Bo Zhang
Dongdong Chen
Lu Yuan
B. Guo
DiffM
71
757
0
29 Nov 2021
FILIP: Fine-grained Interactive Language-Image Pre-Training
Lewei Yao
Runhu Huang
Lu Hou
Guansong Lu
Minzhe Niu
Hang Xu
Xiaodan Liang
Zhenguo Li
Xin Jiang
Chunjing Xu
VLM
CLIP
30
615
0
09 Nov 2021
Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning
Shaohua Wu
Xudong Zhao
Tong Yu
Rongguo Zhang
C. Shen
...
Feng Li
Hong Zhu
Jiangang Luo
Liang Xu
Xuanwei Zhang
ALM
29
59
0
10 Oct 2021
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay
Mostafa Dehghani
J. Rao
W. Fedus
Samira Abnar
Hyung Won Chung
Sharan Narang
Dani Yogatama
Ashish Vaswani
Donald Metzler
206
110
0
22 Sep 2021
M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining
Xiao Dong
Xunlin Zhan
Yangxin Wu
Yunchao Wei
Michael C. Kampffmeyer
Xiaoyong Wei
Minlong Lu
Yaowei Wang
Xiaodan Liang
33
36
0
09 Sep 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
53
1,088
0
08 Jun 2021
M6-UFC: Unifying Multi-Modal Controls for Conditional Image Synthesis via Non-Autoregressive Generative Transformers
Zhu Zhang
Jianxin Ma
Chang Zhou
Rui Men
Zhikang Li
Ming Ding
Jie Tang
Jingren Zhou
Hongxia Yang
25
46
0
29 May 2021
CogView: Mastering Text-to-Image Generation via Transformers
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
...
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT
VLM
21
762
0
26 May 2021
PanGu-
α
α
α
: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng
Xiaozhe Ren
Teng Su
Hui Wang
Yi-Lun Liao
...
Gaojun Fan
Yaowei Wang
Xuefeng Jin
Qun Liu
Yonghong Tian
ALM
MoE
AI4CE
32
212
0
26 Apr 2021
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Chenyi Lei
Shixian Luo
Yong-jin Liu
Wanggui He
Jiamang Wang
Guoxin Wang
Haihong Tang
Chunyan Miao
Houqiang Li
30
41
0
19 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
227
2,430
0
04 Jan 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,821
0
17 Sep 2019
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
297
10,220
0
16 Nov 2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016
1