arXiv: 2309.15807
Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack

27 September 2023
Xiaoliang Dai
Ji Hou
Chih-Yao Ma
Sam S. Tsai
Jialiang Wang
Rui Wang
Peizhao Zhang
Simon Vandenhende
Xiaofang Wang
Abhimanyu Dubey
Matthew Yu
Abhishek Kadian
Filip Radenovic
D. Mahajan
Kunpeng Li
Yue Zhao
Vladan Petrovic
Mitesh Kumar Singh
Simran Motwani
Yiqian Wen
Yi-Zhe Song
Roshan Sumbaly
Vignesh Ramanathan
Zijian He
Peter Vajda
Devi Parikh
    VLM

Papers citing "Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack"

50 / 173 papers shown
InstanceGen: Image Generation with Instance-level Instructions
Etai Sella
Yanir Kleiman
Hadar Averbuch-Elor
33
0
0
08 May 2025
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Xu Ma
Peize Sun
Haoyu Ma
Hao Tang
Chih-Yao Ma
...
Matt Feiszli
Peizhao Zhang
Peter Vajda
Sam S. Tsai
Y. Fu
76
1
0
24 Apr 2025
SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization
Liang Peng
Boxi Wu
Haoran Cheng
Yibo Zhao
Xiaofei He
36
0
0
20 Apr 2025
Autoregressive Distillation of Diffusion Transformers
Yeongmin Kim
Sotiris Anagnostidis
Yuming Du
Edgar Schönfeld
Jonas Kohler
Markos Georgopoulos
Albert Pumarola
Ali K. Thabet
A. Sanakoyeu
28
0
0
15 Apr 2025
Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities
Raman Dutt
Harleen Hanspal
Guoxuan Xia
Petru-Daniel Tudosiu
Alexander Black
Yongxin Yang
Jingyu Sun
Sarah Parisot
MoE
43
0
0
28 Mar 2025
Can Video Diffusion Model Reconstruct 4D Geometry?
Jinjie Mai
Wenxuan Zhu
Haozhe Liu
Bing Li
Cheng Zheng
Jürgen Schmidhuber
Bernard Ghanem
VGen
MDE
77
0
0
27 Mar 2025
FullDiT: Multi-Task Video Generative Foundation Model with Full Attention
Xuan Ju
Weicai Ye
Quande Liu
Qiulin Wang
Xintao Wang
Pengfei Wan
Di Zhang
Kun Gai
Qiang Xu
VGen
46
1
0
25 Mar 2025
InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment
Yunfan LU
Qichao Wang
H. Cao
Xierui Wang
Xiaoyin Xu
Min Zhang
64
0
0
24 Mar 2025
RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation
Zhiqiang Yuan
Ting Zhang
Ying Deng
Jiapei Zhang
Yeshuang Zhu
Zexi Jia
Jie Zhou
Jinchao Zhang
VGen
44
0
0
22 Mar 2025
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang
Haoran Chen
Haoyu Zhao
Guansong Lu
Yanwei Fu
Hang Xu
Zuxuan Wu
VGen
DiffM
74
1
0
20 Mar 2025
Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation
Yihong Luo
Tianyang Hu
Weijian Luo
Kenji Kawaguchi
Jing Tang
EGVM
186
0
0
17 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
71
0
0
13 Mar 2025
Adding Additional Control to One-Step Diffusion with Joint Distribution Matching
Yihong Luo
Tianyang Hu
Yifan Song
Jiacheng Sun
Zechao Li
Jing Tang
DiffM
81
1
0
13 Mar 2025
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation
Chen Chen
Rui Qian
Wenze Hu
Tsu-jui Fu
Jialing Tong
...
Lezhi Li
Bowen Zhang
A. Schwing
Wei Liu
Yuqing Yang
64
0
0
13 Mar 2025
Efficient Distillation of Classifier-Free Guidance using Adapters
Cristian Perez Jensen
Seyedmorteza Sadat
53
1
0
10 Mar 2025
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
Rui Zhao
Weijia Mao
Mike Zheng Shou
66
0
0
05 Mar 2025
All-atom Diffusion Transformers: Unified generative modelling of molecules and materials
Chaitanya K. Joshi
Xiang Fu
Yi-Lun Liao
Vahe Gharakhanyan
Benjamin Kurt Miller
Anuroop Sriram
Zachary W. Ulissi
DiffM
55
4
0
05 Mar 2025
WeGen: A Unified Model for Interactive Multimodal Generation as We Chat
Zhipeng Huang
Shaobin Zhuang
Canmiao Fu
Binxin Yang
Ying Zhang
Chong Sun
Zhizheng Zhang
Yali Wang
Chen Li
Zheng-Jun Zha
DiffM
69
2
0
03 Mar 2025
DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models
Zhendong Wang
Jianmin Bao
Shuyang Gu
Dong Chen
Wengang Zhou
Yiming Li
DiffM
53
0
0
03 Mar 2025
A Review on Generative AI For Text-To-Image and Image-To-Image Generation and Implications To Scientific Images
Zineb Sordo
Eric Chagnon
Daniela Ushizima
EGVM
MedIm
69
1
0
28 Feb 2025
Avat3r: Large Animatable Gaussian Reconstruction Model for High-fidelity 3D Head Avatars
Tobias Kirschstein
Javier Romero
Artem Sevastopolsky
Matthias Nießner
Shunsuke Saito
3DGS
70
0
0
27 Feb 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
92
0
0
27 Feb 2025
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
Guoqing Ma
Haoyang Huang
K. Yan
L. Chen
Nan Duan
...
Yansen Wang
Yuanwei Lu
Yu-Cheng Chen
Yu-Juan Luo
Y. Luo
DiffM
VGen
175
18
0
14 Feb 2025
Understanding Classifier-Free Guidance: High-Dimensional Theory and Non-Linear Generalizations
Krunoslav Lehman Pavasovic
Jakob Verbeek
Giulio Biroli
Marc Mézard
64
0
0
11 Feb 2025
Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model
Omid Saghatchian
Atiyeh Gh. Moghadam
Ahmad Nickabadi
MoMe
49
1
0
03 Jan 2025
ILDiff: Generate Transparent Animated Stickers by Implicit Layout Distillation
Ting Zhang
Zhiqiang Yuan
Yeshuang Zhu
Jinchao Zhang
DiffM
103
0
0
31 Dec 2024
DreamOmni: Unified Image Generation and Editing
Bin Xia
Yuechen Zhang
Jingyao Li
Chengyao Wang
Yitong Wang
Xinglong Wu
Bei Yu
Jiaya Jia
SyDa
MLLM
91
3
0
22 Dec 2024
VideoDPO: Omni-Preference Alignment for Video Diffusion Generation
Runtao Liu
Haoyu Wu
Zheng Ziqiang
Chen Wei
Yingqing He
Renjie Pi
Qifeng Chen
VGen
83
12
0
18 Dec 2024
ColorFlow: Retrieval-Augmented Image Sequence Colorization
Junhao Zhuang
Xuan Ju
Zhe Zhang
Yong-Jin Liu
Shiyi Zhang
Chun Yuan
Ying Shan
DiffM
110
1
0
16 Dec 2024
LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity
Hongjie Wang
Chih-Yao Ma
Yen-Cheng Liu
Ji Hou
Tao Xu
...
Peizhao Zhang
Tingbo Hou
Peter Vajda
N. Jha
Xiaoliang Dai
LMTD
DiffM
VGen
VLM
83
5
0
13 Dec 2024
Coordinate In and Value Out: Training Flow Transformers in Ambient Space
Yuyang Wang
Anurag Ranjan
J. Susskind
Miguel Angel Bautista
3DPC
81
0
0
05 Dec 2024
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
Bolin Lai
F. Xu
Miao Liu
Xiaoliang Dai
Nikhil Mehta
...
Zeyi Huang
James M. Rehg
Sangmin Lee
Ning Zhang
Tong Xiao
73
2
0
02 Dec 2024
IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models
Khaled Abud
Sergey Lavrushkin
Alexey Kirillov
D. Vatolin
94
0
0
02 Dec 2024
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
Zongjian Li
Bin Lin
Yang Ye
Liuhan Chen
Xinhua Cheng
Shenghai Yuan
Li-xin Yuan
VGen
DiffM
115
16
0
26 Nov 2024
Multidimensional Byte Pair Encoding: Shortened Sequences for Improved Visual Data Generation
Tim Elsner
Paula Usinger
Julius Nehring-Wirxel
Gregor Kobsik
Victor Czech
Yanjiang He
I. Lim
Leif Kobbelt
39
0
0
15 Nov 2024
Inconsistencies In Consistency Models: Better ODE Solving Does Not Imply Better Samples
Noël Vouitsis
Rasa Hosseinzadeh
Brendan Leigh Ross
Valentin Villecroze
S. Gorti
Jesse C. Cresswell
G. Loaiza-Ganem
DiffM
48
0
0
13 Nov 2024
AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
Anil Kag
Huseyin Coskun
Jierun Chen
Junli Cao
Willi Menapace
Aliaksandr Siarohin
Sergey Tulyakov
Jian Ren
51
3
0
07 Nov 2024
Boosting Latent Diffusion with Perceptual Objectives
Tariq Berrada
Pietro Astolfi
Jakob Verbeek
Melissa Hall
Marton Havasi
M. Drozdzal
Yohann Benchetrit
Adriana Romero Soriano
Karteek Alahari
48
0
0
06 Nov 2024
On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models
Tariq Berrada Ifriqi
Pietro Astolfi
Melissa Hall
Reyhane Askari Hemmat
Yohann Benchetrit
...
Matthew Muckley
Karteek Alahari
Adriana Romero Soriano
Jakob Verbeek
M. Drozdzal
AI4CE
VLM
68
2
0
05 Nov 2024
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
Wei Cheng
Juncheng Mu
Xianfang Zeng
Xin Chen
Anqi Pang
...
Zhibin Wang
Bin-Bin Fu
Gang Yu
Z. Liu
Liang Pan
44
9
0
04 Nov 2024
Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning
Penghui Ruan
Pichao Wang
Divya Saxena
Jiannong Cao
Yuhui Shi
DiffM
VGen
36
78
0
31 Oct 2024
Revisiting Reliability in Large-Scale Machine Learning Research Clusters
Apostolos Kokolis
Michael Kuchnik
John Hoffman
Adithya Kumar
Parth Malani
Faye Ma
Zachary DeVito
Shri Kiran Srinivasan
Kalyan Saladi
Carole-Jean Wu
175
7
0
29 Oct 2024
Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models
Weijian Luo
C. Zhang
Debing Zhang
Zhengyang Geng
28
3
0
28 Oct 2024
CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians
Chongjian Ge
Chenfeng Xu
Yuanfeng Ji
C-T.John Peng
Masayoshi Tomizuka
Ping Luo
Mingyu Ding
Varun Jampani
Weidong Zhan
3DGS
34
4
0
28 Oct 2024
Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences
Weijian Luo
EGVM
36
6
0
24 Oct 2024
Scalable Ranked Preference Optimization for Text-to-Image Generation
Shyamgopal Karthik
Huseyin Coskun
Zeynep Akata
Sergey Tulyakov
J. Ren
Anil Kag
EGVM
57
5
0
23 Oct 2024
Residual vector quantization for KV cache compression in large language model
Ankur Kumar
MQ
34
0
0
21 Oct 2024
Group Diffusion Transformers are Unsupervised Multitask Learners
Lianghua Huang
Wei Wang
Zhi-Fan Wu
Huanzhang Dou
Yupeng Shi
Yutong Feng
C. Liang
Yu Liu
Jingren Zhou
VLM
49
12
0
19 Oct 2024
Preference Optimization with Multi-Sample Comparisons
Chaoqi Wang
Zhuokai Zhao
Chen Zhu
Karthik Abinav Sankararaman
Michal Valko
...
Zhaorun Chen
Madian Khabsa
Yuxin Chen
Hao Ma
Sinong Wang
72
10
0
16 Oct 2024
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Enze Xie
Junsong Chen
Junyu Chen
Han Cai
Haotian Tang
...
Zhekai Zhang
Muyang Li
Ligeng Zhu
Yunfan LU
Song Han
VLM
46
51
0
14 Oct 2024