arXiv: 2211.09117
MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
16 November 2022
Tianhong Li
Huiwen Chang
Shlok Kumar Mishra
Han Zhang
Dina Katabi
Dilip Krishnan
Papers citing
"MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis"
50 / 124 papers shown
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
Yuancheng Wang
Haoyue Zhan
Liwei Liu
Ruihong Zeng
Haotian Guo
Jiachen Zheng
Qiang Zhang
Xueyao Zhang
Shunsi Zhang
Zhizheng Wu
34
38
0
01 Sep 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni
Yulin Wang
Renping Zhou
Rui Lu
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Yuan Yao
Gao Huang
29
7
0
31 Aug 2024
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
64
6
0
13 Aug 2024
VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling
Qian Zhang
Xiangzi Dai
Ninghua Yang
Xiang An
Ziyong Feng
Xingyu Ren
VLM
CLIP
43
17
0
02 Aug 2024
Self-supervised transformer-based pre-training method with General Plant Infection dataset
Zhengle Wang
Ruifeng Wang
Minjuan Wang
Tianyun Lai
Man Zhang
31
8
0
20 Jul 2024
COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation
Liu He
Daniel G. Aliaga
AI4TS
49
8
0
16 Jul 2024
On the Role of Discrete Tokenization in Visual Representation Learning
Tianqi Du
Yifei Wang
Yisen Wang
44
7
0
12 Jul 2024
Diffusion Models and Representation Learning: A Survey
Michael Fuest
Pingchuan Ma
Ming Gui
Johannes S. Fischer
Vincent Tao Hu
Björn Ommer
DiffM
30
19
0
30 Jun 2024
Video Occupancy Models
Manan Tomar
Philippe Hansen-Estruch
Philip Bachman
Alex Lamb
John Langford
Matthew E. Taylor
Sergey Levine
40
1
0
25 Jun 2024
Unified Auto-Encoding with Masked Diffusion
Philippe Hansen-Estruch
S. Vishwanath
Amy Zhang
Manan Tomar
DiffM
60
1
0
25 Jun 2024
SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond
Marco Comunità
Zhi-Wei Zhong
Akira Takahashi
Shiqi Yang
Mengjie Zhao
Koichi Saito
Yukara Ikemiya
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
63
2
0
25 Jun 2024
Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds
Hongliang Zeng
Ping Zhang
Fang Li
Jiahua Wang
Tingyu Ye
Pengteng Guo
3DPC
37
0
0
25 Jun 2024
Autoregressive Image Generation without Vector Quantization
Tianhong Li
Yonglong Tian
He Li
Mingyang Deng
Kaiming He
DiffM
50
171
0
17 Jun 2024
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%
Lei Zhu
Fangyun Wei
Yanye Lu
Dong Chen
VLM
41
33
0
17 Jun 2024
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
Roman Bachmann
Oğuzhan Fatih Kar
David Mizrahi
Ali Garjani
Mingfei Gao
David Griffiths
Jiaming Hu
Afshin Dehghan
Amir Zamir
MoE
VLM
MLLM
36
14
0
13 Jun 2024
Visual Representation Learning with Stochastic Frame Prediction
Huiwon Jang
Dongyoung Kim
Junsu Kim
Jinwoo Shin
Pieter Abbeel
Younggyo Seo
34
2
0
11 Jun 2024
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Shiji Song
Yuan Yao
Gao Huang
30
14
0
08 Jun 2024
TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction
Yinda Chen
Haoyuan Shi
Xiaoyu Liu
Te Shi
Ruobing Zhang
Dong Liu
Zhiwei Xiong
Feng Wu
42
9
0
27 May 2024
LAM3D: Large Image-Point-Cloud Alignment Model for 3D Reconstruction from Single Image
Ruikai Cui
Xibin Song
Weixuan Sun
Senbo Wang
Weizhe Liu
...
Taizhang Shang
Yang Li
Nick Barnes
Hongdong Li
Pan Ji
3DV
45
5
0
24 May 2024
ParamReL: Learning Parameter Space Representation via Progressively Encoding Bayesian Flow Networks
Zhangkai Wu
Xuhui Fan
Jin Li
Zhilin Zhao
Hui Chen
LongBing Cao
44
2
0
24 May 2024
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
Shiqi Yang
Zhi-Wei Zhong
Mengjie Zhao
Shusuke Takahashi
Masato Ishii
Takashi Shibuya
Yuki Mitsufuji
43
2
0
23 May 2024
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Siyuan Li
Zedong Wang
Zicheng Liu
Di Wu
Cheng Tan
Jiangbin Zheng
Yufei Huang
Stan Z. Li
29
7
0
13 May 2024
FORESEE: Multimodal and Multi-view Representation Learning for Robust Prediction of Cancer Survival
Liangrui Pan
Yijun Peng
Yan Li
Yiyi Liang
Liwen Xu
Qingchun Liang
Shaoliang Peng
34
0
0
13 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
81
36
0
06 May 2024
HybridFlow: Infusing Continuity into Masked Codebook for Extreme Low-Bitrate Image Compression
Lei Lu
Yanyue Xie
Wei Jiang
Wei Wang
Xue Lin
Yanzhi Wang
31
4
0
20 Apr 2024
Cross-Modal Conditioned Reconstruction for Language-guided Medical Image Segmentation
Xiaoshuang Huang
Hongxiang Li
Meng Cao
Long Chen
Chenyu You
Dong An
VLM
41
5
0
03 Apr 2024
Transformer based Pluralistic Image Completion with Reduced Information Loss
Qiankun Liu
Yuqi Jiang
Zhentao Tan
Dongdong Chen
Ying Fu
Qi Chu
Gang Hua
Nenghai Yu
ViT
60
11
0
31 Mar 2024
SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer
Rui Zhu
Yingwei Pan
Yehao Li
Ting Yao
Zhenglong Sun
Tao Mei
C. Chen
50
23
0
25 Mar 2024
SELECTOR: Heterogeneous graph network with convolutional masked autoencoder for multimodal robust prediction of cancer survival
Liangrui Pan
Yijun Peng
Yan Li
Xiang Wang
Wenjuan Liu
Liwen Xu
Qingchun Liang
Shaoliang Peng
35
3
0
14 Mar 2024
WildFake: A Large-scale Challenging Dataset for AI-Generated Images Detection
Yan Hong
Jianfu Zhang
74
9
0
19 Feb 2024
Improving Token-Based World Models with Parallel Observation Prediction
Lior Cohen
Kaixin Wang
Bingyi Kang
Shie Mannor
18
2
0
08 Feb 2024
Machine Unlearning for Image-to-Image Generative Models
Guihong Li
Hsiang Hsu
Chun-Fu Chen
R. Marculescu
MU
VLM
68
25
0
01 Feb 2024
Rethinking Patch Dependence for Masked Autoencoders
Letian Fu
Long Lian
Renhao Wang
Baifeng Shi
Xudong Wang
Adam Yala
Trevor Darrell
Alexei A. Efros
Ken Goldberg
26
14
0
25 Jan 2024
PIXAR: Auto-Regressive Language Modeling in Pixel Space
Yintao Tai
Xiyang Liao
Alessandro Suglia
Antonio Vergari
MLLM
21
7
0
06 Jan 2024
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun-Xiong Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
33
14
0
31 Dec 2023
Learning Vision from Models Rivals Learning Vision from Data
Yonglong Tian
Lijie Fan
Kaifeng Chen
Dina Katabi
Dilip Krishnan
Phillip Isola
27
45
0
28 Dec 2023
Structural Information Guided Multimodal Pre-training for Vehicle-centric Perception
Xiao Wang
Wentao Wu
Chenglong Li
Zhicheng Zhao
Zhe Chen
Yukai Shi
Jin Tang
46
4
0
15 Dec 2023
SeiT++: Masked Token Modeling Improves Storage-efficient Training
Min-Seob Lee
Song Park
Byeongho Heo
Dongyoon Han
Hyunjung Shim
MQ
VLM
21
1
0
15 Dec 2023
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Oğuzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
41
62
0
11 Dec 2023
Two-Stage Adaptive Network for Semi-Supervised Cross-Domain Crater Detection under Varying Scenario Distributions
Yifan Liu
Tiecheng Song
Chengye Xian
Ruiyuan Chen
Yi Zhao
Rui Li
Tan Guo
32
1
0
11 Dec 2023
GIVT: Generative Infinite-Vocabulary Transformers
Michael Tschannen
Cian Eastwood
Fabian Mentzer
12
34
0
04 Dec 2023
Improve Supervised Representation Learning with Masked Image Modeling
Kaifeng Chen
Daniel M. Salz
Huiwen Chang
Kihyuk Sohn
Dilip Krishnan
Mojtaba Seyedhosseini
SSL
ViT
37
3
0
01 Dec 2023
MoMask: Generative Masked Modeling of 3D Human Motions
Chuan Guo
Yuxuan Mu
Muhammad Gohar Javed
Sen Wang
Li Cheng
VGen
19
117
0
29 Nov 2023
Do text-free diffusion models learn discriminative visual representations?
Soumik Mukhopadhyay
M. Gwilliam
Yosuke Yamaguchi
Vatsal Agarwal
Namitha Padmanabhan
Archana Swaminathan
Tianyi Zhou
Abhinav Shrivastava
DiffM
24
11
1
29 Nov 2023
Reinforcement Learning with Maskable Stock Representation for Portfolio Management in Customizable Stock Pools
Wentao Zhang
Yilei Zhao
Shuo Sun
Jie Ying
Yonggang Xie
Zitao Song
Xinrun Wang
Bo An
AIFin
OOD
8
5
0
17 Nov 2023
Blind Image Super-resolution with Rich Texture-Aware Codebooks
Rui Qin
Ming-hui Sun
Fangyuan Zhang
Xingsen Wen
Bin Wang
26
6
0
26 Oct 2023
Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder
Huiwon Jang
Jihoon Tack
Daewon Choi
Jongheon Jeong
Jinwoo Shin
18
2
0
25 Oct 2023
A Pytorch Reproduction of Masked Generative Image Transformer
Victor Besnier
Mickael Chen
ViT
56
12
0
22 Oct 2023
Excision And Recovery: Visual Defect Obfuscation Based Self-Supervised Anomaly Detection Strategy
Yeonghyeon Park
Sungho Kang
Myung Jin Kim
Yeonho Lee
Hyeong Seok Kim
Juneho Yi
AAML
18
2
0
06 Oct 2023
Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency
Tianhong Li
Sangnie Bhardwaj
Yonglong Tian
Han Zhang
Jarred Barber
Dina Katabi
Guillaume Lajoie
Huiwen Chang
Dilip Krishnan
VLM
36
4
0
05 Oct 2023