2212.11565
Cited By
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
22 December 2022
Jay Zhangjie Wu
Yixiao Ge
Xintao Wang
Weixian Lei
Yuchao Gu
Yufei Shi
W. Hsu
Ying Shan
Xiaohu Qie
Mike Zheng Shou
VGen
Papers citing
"Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation"
50 / 565 papers shown
Title
On the Generalization Properties of Diffusion Models
Puheng Li
Zhong Li
Huishuai Zhang
Jiang Bian
74
29
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
84
8
0
13 Mar 2025
Accelerating Diffusion Sampling via Exploiting Local Transition Coherence
Shangwen Zhu
Han Zhang
Zhantao Yang
Qianyu Peng
Zhao Pu
Haoran Wang
Fan Cheng
DiffM
48
0
0
12 Mar 2025
TPDiff: Temporal Pyramid Video Diffusion Model
L. Ran
Mike Zheng Shou
80
0
0
12 Mar 2025
Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space
Yifan Zhou
Zeqi Xiao
Shuai Yang
Xingang Pan
69
2
0
12 Mar 2025
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion
Lehan Yang
Jincen Song
Tianlong Wang
Daiqing Qi
Weili Shi
Yuheng Liu
Sheng Li
DiffM
VOS
VGen
74
0
0
11 Mar 2025
Automated Movie Generation via Multi-Agent CoT Planning
Weijia Wu
Zeyu Zhu
Mike Zheng Shou
VGen
77
2
0
10 Mar 2025
IMFine: 3D Inpainting via Geometry-guided Multi-view Refinement
Zhihao Shi
Dong Huo
Yuhongze Zhou
Kejia Yin
Yan Min
Juwei Lu
Wei Ji
69
1
0
06 Mar 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
92
1
0
06 Mar 2025
DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance
Zhao Yang
Zezhong Qian
Xiaofan Li
Weixiang Xu
Gongpeng Zhao
Ruohong Yu
Lingsi Zhu
Longjun Liu
DiffM
VGen
65
1
0
05 Mar 2025
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models
Jay Zhangjie Wu
Yuxuan Zhang
Haithem Turki
Xuanchi Ren
Jun Gao
Mike Zheng Shou
Sanja Fidler
Zan Gojcic
Huan Ling
142
1
0
03 Mar 2025
Learning to Animate Images from A Few Videos to Portray Delicate Human Actions
Haoxin Li
Yingchen Yu
Qilong Wu
Hanwang Zhang
Boyang Li
Song Bai
3DH
VGen
150
0
0
01 Mar 2025
Training-free and Adaptive Sparse Attention for Efficient Long Video Generation
Yifei Xia
Suhan Ling
Fangcheng Fu
Y. Wang
Huixia Li
Xuefeng Xiao
Bin Cui
VGen
65
2
0
28 Feb 2025
Speculative Decoding and Beyond: An In-Depth Survey of Techniques
Y. Hu
Zining Liu
Zhenyuan Dong
Tianfan Peng
Bradley McDanel
S. Zhang
93
0
0
27 Feb 2025
Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation
Baptiste Chopin
Tashvik Dhamija
P. Balaji
Yaohui Wang
A. Dantcheva
DiffM
VGen
49
0
0
24 Feb 2025
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing
Xiangpeng Yang
Linchao Zhu
Hehe Fan
Yi Yang
DiffM
VGen
49
5
0
24 Feb 2025
Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
Sicheng Xie
Haidong Cao
Zejia Weng
Zhen Xing
Shiwei Shen
Jiaqi Leng
Xipeng Qiu
Yanwei Fu
Zuxuan Wu
Yu Jiang
56
0
0
23 Feb 2025
CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers
D. She
Mushui Liu
Jingxuan Pang
Jin Wang
Zhen Yang
...
Yi Wang
Qihan Huang
Haobin Tang
Yunlong Yu
Siming Fu
VGen
96
4
0
21 Feb 2025
SMITE: Segment Me In TimE
Amirhossein Alimohammadi
Sauradip Nag
Saeid Asgari Taghanaki
Andrea Tagliasacchi
Ghassan Hamarneh
Ali Mahdavi-Amiri
VLM
VOS
137
2
0
20 Feb 2025
Animate Your Thoughts: Decoupled Reconstruction of Dynamic Natural Vision from Slow Brain Activity
Yizhuo Lu
Changde Du
Chong Wang
Xuanliu Zhu
Liuyun Jiang
Xujin Li
Huiguang He
VGen
125
4
0
20 Feb 2025
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
Junxian Ma
Shiwen Wang
Jian Yang
Junyi Hu
Jian Liang
Guosheng Lin
Jingbo Chen
Kai Li
Yu Meng
DiffM
VGen
61
3
0
17 Feb 2025
UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths
Weijia Mao
Z. Yang
Mike Zheng Shou
MoE
78
0
0
10 Feb 2025
Dual Caption Preference Optimization for Diffusion Models
Amir Saeidi
Yiran Luo
Agneet Chatterjee
Shamanthak Hegde
Bimsara Pathiraja
Yezhou Yang
Chitta Baral
DiffM
63
0
0
09 Feb 2025
AdaFlow: Efficient Long Video Editing via Adaptive Attention Slimming And Keyframe Selection
Shuheng Zhang
Yong-Jin Liu
Hongbo Zhou
Jun Peng
Yiyi Zhou
Xiaoshuai Sun
Rongrong Ji
VGen
43
0
0
08 Feb 2025
A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-guided Frame Prediction
Yongfan Chen
Xiuwen Zhu
Tianyu Li
EGVM
VGen
56
3
0
08 Feb 2025
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Gaojie Lin
Jianwen Jiang
Jiaqi Yang
Zerong Zheng
Chao Liang
DiffM
VGen
183
11
0
03 Feb 2025
Parameter-Efficient Fine-Tuning for Foundation Models
Dan Zhang
Tao Feng
Lilong Xue
Yuandong Wang
Yuxiao Dong
J. Tang
46
8
0
23 Jan 2025
CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation
Zheng Chong
Wenqing Zhang
Shiyue Zhang
Jun Zheng
Xiao Dong
Haoxiang Li
Yiling Wu
D. Jiang
Xiaodan Liang
DiffM
32
1
0
20 Jan 2025
SST-EM: Advanced Metrics for Evaluating Semantic, Spatial and Temporal Aspects in Video Editing
Varun Biyyala
Bharat Chanderprakash Kathuria
Jialu Li
Youshan Zhang
52
0
0
13 Jan 2025
IP-FaceDiff: Identity-Preserving Facial Video Editing with Diffusion
Tharun Anand
Aryan Garg
Kaushik Mitra
VGen
DiffM
52
0
0
13 Jan 2025
Qffusion: Controllable Portrait Video Editing via Quadrant-Grid Attention Learning
Maomao Li
Lijian Lin
Yunfei Liu
Ye Zhu
Yu Li
DiffM
VGen
39
0
0
11 Jan 2025
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning
Yuzhou Huang
Ziyang Yuan
Quande Liu
Qiulin Wang
Xintao Wang
Ruimao Zhang
Pengfei Wan
Di Zhang
Kun Gai
VGen
DiffM
45
10
0
08 Jan 2025
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control
Yuanpeng Tu
Hao Luo
Xi Chen
S. Ji
Xiang Bai
Hengshuang Zhao
VGen
DiffM
42
3
0
08 Jan 2025
TDM: Temporally-Consistent Diffusion Model for All-in-One Real-World Video Restoration
Yizhou Li
Zihua Liu
Yusuke Monno
Masatoshi Okutomi
DiffM
VGen
31
1
0
04 Jan 2025
MAKIMA: Tuning-free Multi-Attribute Open-domain Video Editing via Mask-Guided Attention Modulation
Haoyu Zheng
Wenqiao Zhang
Zheqi Lv
Yu Zhong
Yang Dai
...
Yongliang Shen
Juncheng Billy Li
Dongping Zhang
Siliang Tang
Yueting Zhuang
DiffM
VGen
57
0
0
31 Dec 2024
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Hao Fei
Shengqiong Wu
H. Zhang
Tat-Seng Chua
Shuicheng Yan
64
38
0
31 Dec 2024
AdaDiff: Adaptive Step Selection for Fast Diffusion Models
Hui Zhang
Zuxuan Wu
Zhen Xing
Jie Shao
Yu-Gang Jiang
51
9
0
31 Dec 2024
ILDiff: Generate Transparent Animated Stickers by Implicit Layout Distillation
Ting Zhang
Zhiqiang Yuan
Yeshuang Zhu
Jinchao Zhang
DiffM
101
0
0
31 Dec 2024
VidTwin: Video VAE with Decoupled Structure and Dynamics
Yuchi Wang
Junliang Guo
Xinyi Xie
Tianyu He
Xu Sun
Jiang Bian
DRL
VGen
77
3
0
23 Dec 2024
Follow-Your-MultiPose: Tuning-Free Multi-Character Text-to-Video Generation via Pose Guidance
Beiyuan Zhang
Yue Ma
Chunlei Fu
Xinyang Song
Zhenan Sun
Ziqiang Li
DiffM
VGen
84
1
0
21 Dec 2024
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman
Haiwen Feng
M. Black
Dimitrios Tzionas
Victoria Fernandez-Abrevaya
VGen
AI4CE
105
3
0
16 Dec 2024
OccScene: Semantic Occupancy-based Cross-task Mutual Learning for 3D Scene Generation
Bohan Li
Xin Jin
J. Wang
Yukai Shi
Yasheng Sun
...
Zhuang Ma
Baao Xie
Chao Ma
Xiaokang Yang
Wenjun Zeng
DiffM
169
1
0
15 Dec 2024
EVLM: Self-Reflective Multimodal Reasoning for Cross-Dimensional Visual Editing
Umar Khalid
Hasan Iqbal
Azib Farooq
Nazanin Rahnavard
Jing Hua
Chen Chen
77
0
0
13 Dec 2024
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
Yushu Wu
Zhixing Zhang
Yanyu Li
Yanwu Xu
Anil Kag
...
Ju Hu
Dimitris N. Metaxas
Yanzhi Wang
Sergey Tulyakov
Jian Ren
DiffM
VGen
100
4
0
13 Dec 2024
Olympus: A Universal Task Router for Computer Vision Tasks
Yuanze Lin
Yunsheng Li
Dongdong Chen
Weijian Xu
Ronald Clark
Philip H. S. Torr
VLM
ObjD
197
0
0
12 Dec 2024
DIVE: Taming DINO for Subject-Driven Video Editing
Yi Huang
Wei Xiong
He Zhang
Chaoqi Chen
Jianzhuang Liu
Mingfu Yan
Shifeng Chen
VGen
DiffM
76
0
0
04 Dec 2024
CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion
Kai He
Chin-Hsuan Wu
Igor Gilitschenski
DiffM
3DGS
81
0
0
02 Dec 2024
CPA: Camera-pose-awareness Diffusion Transformer for Video Generation
Yuelei Wang
Jian Zhang
Pengtao Jiang
H. Zhang
Jinwei Chen
Bo Li
VGen
DiffM
107
4
0
02 Dec 2024
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses
Yatian Pang
Bin Zhu
Bin Lin
Mingzhe Zheng
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
VGen
3DH
79
4
0
30 Nov 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
F. Khan
Mubarak Shah
89
3
0
29 Nov 2024