The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation

16 April 2025
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang

Papers citing "The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation"

44 / 44 papers shown
LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation
Shuai Yang
Jing Tan
Mengchen Zhang
Tong Wu
Yongqian Li
Gordon Wetzstein
Ziwei Liu
Dahua Lin
MDE VGen
153
9
0
24 Feb 2025
WorldSimBench: Towards Video Generation Models as World Simulators
Yiran Qin
Zhelun Shi
Jiwen Yu
Xijun Wang
Enshen Zhou
...
Lu Sheng
Jing Shao
Junlin Wu
Wanli Ouyang
Ruimao Zhang
EGVM VGen
197
474
0
23 Oct 2024
Pyramidal Flow Matching for Efficient Video Generative Modeling
Yang Jin
Zhicheng Sun
Ningyuan Li
Kun Xu
K. Xu
...
Nan Zhuang
Quzhe Huang
Yang Song
Yadong Mu
Zhouchen Lin
VGen
158
86
0
08 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
148
35
0
03 Oct 2024
A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization
Chieh-Yun Chen
Chiang Tseng
Li-Wu Tsao
Hong-Han Shuai
74
8
0
01 Oct 2024
AUCSeg: AUC-oriented Pixel-level Long-tail Semantic Segmentation
Boyu Han
Qianqian Xu
Zhiyong Yang
Shilong Bao
Peisong Wen
Yangbangyan Jiang
Qingming Huang
101
4
0
30 Sep 2024
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function
Chenyi Zhuang
Ying Hu
Pan Gao
DiffM VLM
88
11
0
30 Sep 2024
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
...
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffM VGen
241
565
0
12 Aug 2024
Prompt Refinement with Image Pivot for Text-to-Image Generation
Jingtao Zhan
Qingyao Ai
Yiqun Liu
Yingwei Pan
Ting Yao
Jiaxin Mao
Shaoping Ma
Tao Mei
EGVM
65
4
0
28 Jun 2024
Dynamic Prompt Optimizing for Text-to-Image Generation
Wenyi Mo
Tianyu Zhang
Yalong Bai
Fuchun Sun
Ji-Rong Wen
Qing Yang
75
13
0
05 Apr 2024
Capability-aware Prompt Reformulation Learning for Text-to-Image Generation
Jingtao Zhan
Qingyao Ai
Yiqun Liu
Jia Chen
Shaoping Ma
DiffM
75
4
0
27 Mar 2024
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
...
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
310
1,403
0
05 Mar 2024
A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis
Nailei Hei
Qianyu Guo
Zihao Wang
Yan Wang
Haofen Wang
Wenqiang Zhang
DiffM
63
2
0
20 Feb 2024
Vlogger: Make Your Dream A Vlog
Shaobin Zhuang
Kunchang Li
Xinyuan Chen
Yaohui Wang
Ziwei Liu
Yu Qiao
Yali Wang
VGen DiffM
74
38
0
17 Jan 2024
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Haoxin Chen
Yong Zhang
Xiaodong Cun
Menghan Xia
Xintao Wang
Chao-Liang Weng
Ying Shan
VGen DiffM
246
322
0
17 Jan 2024
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Ziqiang Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM VGen
271
278
0
05 Jan 2024
VBench: Comprehensive Benchmark Suite for Video Generative Models
Ziqi Huang
Yinan He
Jiashuo Yu
Fan Zhang
Chenyang Si
...
Xinyuan Chen
Limin Wang
Dahua Lin
Yu Qiao
Ziwei Liu
VGen
198
451
0
29 Nov 2023
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
...
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
281
1,188
0
25 Nov 2023
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
Xinyuan Chen
Yaohui Wang
Lingjun Zhang
Shaobin Zhuang
Xin Ma
Jiashuo Yu
Yali Wang
Dahua Lin
Yu Qiao
Ziwei Liu
VGen DiffM
75
145
0
31 Oct 2023
FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling
Haonan Qiu
Menghan Xia
Yong Zhang
Yin-Yin He
Xintao Wang
Ying Shan
Ziwei Liu
DiffM VGen
87
101
0
23 Oct 2023
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
Yaofang Liu
Xiaodong Cun
Xuebo Liu
Xintao Wang
Yong Zhang
Haoxin Chen
Yang Liu
Tieyong Zeng
Raymond H. F. Chan
Ying Shan
VGen EGVM
93
144
0
17 Oct 2023
Tailored Visions: Enhancing Text-to-Image Generation with Personalized Prompt Rewriting
Zijie Chen
Lichao Zhang
Fangsheng Weng
Lili Pan
Zhenzhong Lan
75
10
0
12 Oct 2023
Mistral 7B
Albert Q. Jiang
Alexandre Sablayrolles
A. Mensch
Chris Bamford
Devendra Singh Chaplot
...
Teven Le Scao
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE LRM
113
2,246
0
10 Oct 2023
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
David Junhao Zhang
Jay Zhangjie Wu
Jia-Wei Liu
Rui Zhao
L. Ran
Yuchao Gu
Difei Gao
Mike Zheng Shou
DiffM VGen
107
222
0
27 Sep 2023
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models
Yaohui Wang
Xinyuan Chen
Xin Ma
Shangchen Zhou
Ziqi Huang
...
Chen Change Loy
Bo Dai
Dahua Lin
Yu Qiao
Ziwei Liu
VGen DiffM
102
232
0
26 Sep 2023
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
Yin-Yin He
Menghan Xia
Haoxin Chen
Xiaodong Cun
Yuan Gong
...
Yong Zhang
Xintao Wang
Chao-Liang Weng
Ying Shan
Qifeng Chen
DiffM VGen
40
79
0
13 Jul 2023
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
Yuwei Guo
Ceyuan Yang
Anyi Rao
Zhengyang Liang
Yaohui Wang
Yu Qiao
Maneesh Agrawala
Dahua Lin
Bo Dai
VGen
134
877
0
10 Jul 2023
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach
266
2,450
0
04 Jul 2023
Grounded Text-to-Image Synthesis with Attention Refocusing
Quynh Phung
Songwei Ge
Jia-Bin Huang
DiffM
90
113
0
08 Jun 2023
Probabilistic Adaptation of Text-to-Video Models
Mengjiao Yang
Yilun Du
Bo Dai
Dale Schuurmans
J. Tenenbaum
Pieter Abbeel
VGen DiffM
121
25
0
02 Jun 2023
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
Fu Lee Wang
Wenshuo Chen
Guanglu Song
Han-Jia Ye
Yu Liu
Hongsheng Li
VGen DiffM
114
93
0
29 May 2023
Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation
Rui Chen
Yuxiao Chen
Ningxin Jiao
Kui Jia
DiffM
109
591
0
24 Mar 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG MLLM
1.5K
14,761
0
15 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM PILM
1.5K
13,490
0
27 Feb 2023
Structure and Content-Guided Video Synthesis with Diffusion Models
Patrick Esser
Johnathan Chiu
Parmida Atighehchian
Jonathan Granskog
Anastasis Germanidis
DiffM VGen
180
538
0
06 Feb 2023
Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
Hila Chefer
Yuval Alaluf
Yael Vinker
Lior Wolf
Daniel Cohen-Or
DiffM
122
516
0
31 Jan 2023
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
122
2,436
0
19 Dec 2022
Optimizing Prompts for Text-to-Image Generation
Y. Hao
Zewen Chi
Li Dong
Furu Wei
109
151
0
19 Dec 2022
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng
Xuehai He
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
Xinze Wang
William Yang Wang
CoGe
152
318
0
09 Dec 2022
Magic3D: High-Resolution Text-to-3D Content Creation
Chen-Hsuan Lin
Jun Gao
Luming Tang
Towaki Takikawa
Fangyin Wei
Xun Huang
Karsten Kreis
Sanja Fidler
Ming-Yuan Liu
Nayeon Lee
201
1,166
0
18 Nov 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
466
6,083
0
23 May 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM DiffM
425
6,921
0
13 Apr 2022
Video Diffusion Models
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM VGen
230
1,642
0
07 Apr 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
511
15,788
0
20 Dec 2021