Denoising with a Joint-Embedding Predictive Architecture

2 October 2024
Dengsheng Chen
Jie Hu
Xiaoming Wei
Enhua Wu
    DiffM

Papers citing "Denoising with a Joint-Embedding Predictive Architecture"

50 / 110 papers shown
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Chunting Zhou
Lili Yu
Arun Babu
Kushal Tirumala
Michihiro Yasunaga
Leonid Shamis
Jacob Kahn
Xuezhe Ma
Luke Zettlemoyer
Omer Levy
DiffM
115
191
0
20 Aug 2024
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
...
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffM, VGen
237
565
0
12 Aug 2024
Qwen2 Technical Report
An Yang
Baosong Yang
Binyuan Hui
Jian Xu
Bowen Yu
...
Yuqiong Liu
Zeyu Cui
Zhenru Zhang
Zhifang Guo
Zhi-Wei Fan
OSLM, VLM, MU
175
981
0
15 Jul 2024
Autoregressive Image Generation without Vector Quantization
Tianhong Li
Yonglong Tian
He Li
Mingyang Deng
Kaiming He
DiffM
129
235
0
17 Jun 2024
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
Le Zhuo
Ruoyi Du
Han Xiao
Yangguang Li
Dongyang Liu
...
Wanli Ouyang
Ziwei Liu
Ping Luo
Hongsheng Li
Peng Gao
100
58
0
05 Jun 2024
Computational Tradeoffs in Image Synthesis: Diffusion, Masked-Token, and Next-Token Prediction
Maciej Kilian
Varun Jampani
Luke Zettlemoyer
DiffM
77
8
0
21 May 2024
Octo: An Open-Source Generalist Robot Policy
Octo Model Team
Dibya Ghosh
Homer Walke
Karl Pertsch
Kevin Black
...
Quan Vuong
Ted Xiao
Dorsa Sadigh
Chelsea Finn
Sergey Levine
201
452
0
20 May 2024
KAN: Kolmogorov-Arnold Networks
Ziming Liu
Yixuan Wang
Sachin Vaidya
Fabian Ruehle
James Halverson
Marin Soljacic
Thomas Y. Hou
Max Tegmark
259
569
0
30 Apr 2024
S-JEPA: towards seamless cross-dataset transfer through dynamic spatial attention
Pierre Guetschel
Thomas Moreau
Michael Tangermann
61
9
0
18 Mar 2024
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
...
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
291
1,388
0
05 Mar 2024
Revisiting Feature Prediction for Learning Visual Representations from Video
Adrien Bardes
Q. Garrido
Jean Ponce
Xinlei Chen
Michael G. Rabbat
Yann LeCun
Mahmoud Assran
Nicolas Ballas
MDE, VLM
139
87
0
15 Feb 2024
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers
Nanye Ma
Mark Goldstein
M. S. Albergo
Nicholas M. Boffi
Eric Vanden-Eijnden
Saining Xie
DiffM
121
214
0
16 Jan 2024
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Ziqiang Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM, VGen
255
278
0
05 Jan 2024
FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
Sicheng Mo
Fangzhou Mu
Kuan Heng Lin
Yanli Liu
Bochen Guan
Yin Li
Bolei Zhou
DiffM
97
67
0
12 Dec 2023
DiffiT: Diffusion Vision Transformers for Image Generation
Ali Hatamizadeh
Jiaming Song
Guilin Liu
Jan Kautz
Arash Vahdat
71
74
0
04 Dec 2023
GIVT: Generative Infinite-Vocabulary Transformers
Michael Tschannen
Cian Eastwood
Fabian Mentzer
65
41
0
04 Dec 2023
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
Lijun Yu
José Lezama
N. B. Gundavarapu
Luca Versari
Kihyuk Sohn
...
Boqing Gong
Ming-Hsuan Yang
Irfan Essa
David A. Ross
Lu Jiang
109
323
0
09 Oct 2023
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Junsong Chen
Jincheng Yu
Chongjian Ge
Lewei Yao
Enze Xie
...
Zhongdao Wang
James T. Kwok
Ping Luo
Huchuan Lu
Zhenguo Li
DiffM
104
456
0
30 Sep 2023
Finite Scalar Quantization: VQ-VAE Made Simple
Fabian Mentzer
David C. Minnen
E. Agustsson
Michael Tschannen
98
190
0
27 Sep 2023
MC-JEPA: A Joint-Embedding Predictive Architecture for Self-Supervised Learning of Motion and Content Features
Adrien Bardes
Jean Ponce
Yann LeCun
MDE
85
27
0
24 Jul 2023
JourneyDB: A Benchmark for Generative Image Understanding
Keqiang Sun
Junting Pan
Yuying Ge
Hao Li
Haodong Duan
...
Yi Wang
Jifeng Dai
Yu Qiao
Limin Wang
Hongsheng Li
105
120
0
03 Jul 2023
Diffusion Models as Masked Autoencoders
Chen Wei
K. Mangalam
Po-Yao (Bernie) Huang
Yanghao Li
Haoqi Fan
Hu Xu
Huiyu Wang
Cihang Xie
Alan Yuille
Christoph Feichtenhofer
DiffM, SyDa
79
53
0
06 Apr 2023
MoStGAN-V: Video Generation with Temporal Motion Styles
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VGen
61
32
0
05 Apr 2023
Stochastic Interpolants: A Unifying Framework for Flows and Diffusions
M. S. Albergo
Nicholas M. Boffi
Eric Vanden-Eijnden
DiffM
303
325
0
15 Mar 2023
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Cheng Chi
Zhenjia Xu
S. Feng
Eric A. Cousineau
Yilun Du
Benjamin Burchfiel
Russ Tedrake
Shuran Song
349
1,231
0
07 Mar 2023
Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation
Diederik P. Kingma
Ruiqi Gao
DiffM
54
142
0
01 Mar 2023
Video Probabilistic Diffusion Models in Projected Latent Space
Sihyun Yu
Kihyuk Sohn
Subin Kim
Jinwoo Shin
VGen, DiffM
92
170
0
15 Feb 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
Rongjie Huang
Jia-Bin Huang
Dongchao Yang
Yi Ren
Luping Liu
Mingze Li
Zhenhui Ye
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
DiffM
219
344
0
30 Jan 2023
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
Mahmoud Assran
Quentin Duval
Ishan Misra
Piotr Bojanowski
Pascal Vincent
Michael G. Rabbat
Yann LeCun
Nicolas Ballas
SSL, AI4TS, MDE
98
360
0
19 Jan 2023
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
120
2,434
0
19 Dec 2022
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM, VGen
77
248
0
10 Dec 2022
Latent Video Diffusion Models for High-Fidelity Long Video Generation
Yin-Yin He
Tianyu Yang
Yong Zhang
Ying Shan
Qifeng Chen
DiffM, VGen
95
238
0
23 Nov 2022
MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
Tianhong Li
Huiwen Chang
Shlok Kumar Mishra
Han Zhang
Dina Katabi
Dilip Krishnan
74
169
0
16 Nov 2022
Flow Matching for Generative Modeling
Y. Lipman
Ricky T. Q. Chen
Heli Ben-Hamu
Maximilian Nickel
Matt Le
OOD
213
1,389
0
06 Oct 2022
Diffusion Models in Vision: A Survey
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
M. Shah
DiffM, VLM, MedIm
327
1,238
0
10 Sep 2022
Contrastive Masked Autoencoders are Stronger Vision Learners
Zhicheng Huang
Xiaojie Jin
Cheng Lu
Qibin Hou
Ming-Ming Cheng
Dongmei Fu
Xiaohui Shen
Jiashi Feng
119
153
0
27 Jul 2022
Elucidating the Design Space of Diffusion-Based Generative Models
Tero Karras
M. Aittala
Timo Aila
S. Laine
DiffM
215
2,033
0
01 Jun 2022
UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes
Alexander Kolesnikov
André Susano Pinto
Lucas Beyer
Xiaohua Zhai
Jeremiah Harmsen
N. Houlsby
158
71
0
20 May 2022
Masked Siamese Networks for Label-Efficient Learning
Mahmoud Assran
Mathilde Caron
Ishan Misra
Piotr Bojanowski
Florian Bordes
Pascal Vincent
Armand Joulin
Michael G. Rabbat
Nicolas Ballas
SSL
105
323
0
14 Apr 2022
Video Diffusion Models
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM, VGen
209
1,640
0
07 Apr 2022
MVP: Multimodality-guided Visual Pre-training
Longhui Wei
Lingxi Xie
Wen-gang Zhou
Houqiang Li
Qi Tian
58
107
0
10 Mar 2022
Autoregressive Image Generation using Residual Quantization
Doyup Lee
Chiheon Kim
Saehoon Kim
Minsu Cho
Wook-Shin Han
VGen
277
378
0
03 Mar 2022
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
298
30,150
0
01 Mar 2022
Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks
Sihyun Yu
Jihoon Tack
Sangwoo Mo
Hyunsu Kim
Junho Kim
Jung-Woo Ha
Jinwoo Shin
DiffM, VGen
104
201
0
21 Feb 2022
MaskGIT: Masked Generative Image Transformer
Huiwen Chang
Han Zhang
Lu Jiang
Ce Liu
William T. Freeman
ViT
153
695
0
08 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL, VLM, ViT
99
859
0
07 Feb 2022
Context Autoencoder for Self-Supervised Representation Learning
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
123
396
0
07 Feb 2022
StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2
Ivan Skorokhodov
Sergey Tulyakov
Mohamed Elhoseiny
VGen
93
288
0
29 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
496
15,768
0
20 Dec 2021
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
Baining Guo
ViT
122
244
0
24 Nov 2021