2402.17177
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
27 February 2024
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
Ruoxi Chen
Zhengqing Yuan
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM
VGen
EGVM
Papers citing
"Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models"
50 / 77 papers shown
Title
The Role of Video Generation in Enhancing Data-Limited Action Understanding
Wei Li
Dezhao Luo
Dongbao Yang
Zhenhang Li
Weiping Wang
Yu Zhou
DiffM
VGen
140
0
0
26 May 2025
Temporal Differential Fields for 4D Motion Modeling via Image-to-Video Synthesis
Xin You
Minghui Zhang
Hanxiao Zhang
J. Yang
Nassir Navab
DiffM
VGen
MedIm
153
0
0
22 May 2025
You Only Look One Step: Accelerating Backpropagation in Diffusion Sampling with Gradient Shortcuts
Hongkun Dou
Zeyu Li
Xingyu Jiang
Haoyang Li
Lijun Yang
Wen Yao
Yue Deng
DiffM
128
0
0
12 May 2025
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Teng Hu
Zhentao Yu
Zhengguang Zhou
Sen Liang
Yuan Zhou
Qin Lin
Qinglin Lu
DiffM
VGen
98
1
0
07 May 2025
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
Jiaxin Huang
Sheng Miao
Bangbang Yang
Yuewen Ma
Yiyi Liao
VGen
MDE
105
0
0
15 Apr 2025
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang
Haoran Chen
Haoyu Zhao
Guansong Lu
Yanwei Fu
Hang Xu
Zuxuan Wu
VGen
DiffM
112
2
0
20 Mar 2025
ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos
Haolin Yang
Feilong Tang
Ming Hu
Yulong Li
Junjie Guo
...
Zelin Peng
Junjun He
Zongyuan Ge
Imran Razzak
DiffM
VGen
172
2
0
20 Mar 2025
V2Edit: Versatile Video Diffusion Editor for Videos and 3D Scenes
Yanming Zhang
Jun-Kun Chen
Jipeng Lyu
Yu-Xiong Wang
DiffM
VGen
69
0
0
13 Mar 2025
DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness
Yiming Zhong
Qi Jiang
Jingyi Yu
Yuexin Ma
119
3
0
11 Mar 2025
Post-Training Quantization for Diffusion Transformer via Hierarchical Timestep Grouping
Ning Ding
Jing Han
Yuchuan Tian
Chao Xu
Kai Han
Yehui Tang
MQ
107
0
0
10 Mar 2025
LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation
Quanjian Song
Zhihang Lin
Zhanpeng Zeng
Ziyue Zhang
Liujuan Cao
Rongrong Ji
VGen
75
0
0
09 Mar 2025
X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation
Jian Ma
Qirong Peng
Xu Guo
Chen Chen
H. Lu
Zhenyu Yang
VLM
103
1
0
08 Mar 2025
BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modelling
Hao Li
Yu Huang
Chang Xu
Viktor Schlegel
Ren-He Jiang
Riza Batista-Navarro
Goran Nenadic
Jiang Bian
DiffM
AI4CE
319
3
0
04 Mar 2025
LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation
Junchen Fu
Xuri Ge
Kaiwen Zheng
Ioannis Arapakis
Xin Xin
J. Jose
107
1
0
20 Feb 2025
DiffGuard: Text-Based Safety Checker for Diffusion Models
Massine El Khader
Elias Al Bouzidi
Abdellah Oumida
Mohammed Sbaihi
Eliott Binard
Jean-Philippe Poli
Wassila Ouerdane
Boussad Addad
Katarzyna Kapusta
DiffM
169
0
0
20 Feb 2025
A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-guided Frame Prediction
Yongfan Chen
Xiuwen Zhu
Tianyu Li
EGVM
VGen
88
3
0
08 Feb 2025
Parameter-Efficient Fine-Tuning for Foundation Models
Dan Zhang
Tao Feng
Lilong Xue
Yuandong Wang
Yuxiao Dong
J. Tang
118
10
0
23 Jan 2025
Accelerate High-Quality Diffusion Models with Inner Loop Feedback
M. Gwilliam
Han Cai
Di Wu
Abhinav Shrivastava
Zhiyu Cheng
132
0
0
22 Jan 2025
TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions
Vriksha Srihari
R. Bhavya
Shruti Jayaraman
V. Mary Anita Rajam
DiffM
VGen
66
0
0
02 Jan 2025
ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation
Lujia Zhong
Shuo Huang
Yonggang Shi
84
0
0
31 Dec 2024
Label-Efficient Data Augmentation with Video Diffusion Models for Guidewire Segmentation in Cardiac Fluoroscopy
Shaoyan Pan
Yikang Liu
Lin Zhao
Eric Z. Chen
Xiao Chen
Terrence Chen
Shanhui Sun
VGen
MedIm
116
0
0
20 Dec 2024
Towards Efficient and Explainable Hate Speech Detection via Model Distillation
Paloma Piot
Javier Parapar
140
169
0
18 Dec 2024
Wonderland: Navigating 3D Scenes from a Single Image
Hanwen Liang
Junli Cao
Vidit Goel
Guocheng Qian
Sergei Korolev
Demetri Terzopoulos
Konstantinos N. Plataniotis
Sergey Tulyakov
Jian Ren
VGen
173
12
0
16 Dec 2024
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman
Haiwen Feng
M. Black
Dimitrios Tzionas
Victoria Fernandez-Abrevaya
VGen
AI4CE
140
3
0
16 Dec 2024
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
Ximing Xing
Juncheng Hu
Jing Zhang
Dong Xu
Qian Yu
120
2
0
11 Dec 2024
Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer
Jiahao Cui
Hui Li
Yun Zhan
Hanlin Shang
K. Cheng
Yuqi Ma
Shan Mu
Hang Zhou
Jingdong Wang
Siyu Zhu
ViT
VGen
126
7
0
01 Dec 2024
On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models
Tariq Berrada Ifriqi
Pietro Astolfi
Melissa Hall
Reyhane Askari Hemmat
Yohann Benchetrit
...
Matthew Muckley
Karteek Alahari
Adriana Romero Soriano
Jakob Verbeek
M. Drozdzal
AI4CE
VLM
95
3
0
05 Nov 2024
Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation
Xiaoyu Zhang
Teng Zhou
Xinlong Zhang
Jia Wei
Yongchuan Tang
67
2
0
24 Oct 2024
Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis
Hongkang Li
Songtao Lu
Pin-Yu Chen
Xiaodong Cui
Meng Wang
LRM
39
6
0
03 Oct 2024
Implicit Dynamical Flow Fusion (IDFF) for Generative Modeling
Mohammad R. Rezaei
Rahul G. Krishnan
Milos R. Popovic
M. Lankarany
DiffM
40
0
0
22 Sep 2024
LT3SD: Latent Trees for 3D Scene Diffusion
Quan Meng
Lei Li
Matthias Nießner
Angela Dai
126
12
0
12 Sep 2024
MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model
Junjie Li
Yang Liu
Weiqing Liu
Shikai Fang
Lewen Wang
Chang Xu
Jiang Bian
VGen
63
4
0
04 Sep 2024
Differentially Private Kernel Density Estimation
Erzhi Liu
Jerry Yao-Chieh Hu
Alex Reneau
Zhao Song
Han Liu
84
3
0
03 Sep 2024
Understanding Generative AI Content with Embedding Models
Max Vargas
Reilly Cannon
A. Engel
Anand D. Sarwate
Tony Chiang
116
3
0
19 Aug 2024
CATD: Unified Representation Learning for EEG-to-fMRI Cross-Modal Generation
Weiheng Yao
Shuqiang Wang
Mufti Mahmud
Ning Zhong
Baiying Lei
Shuqiang Wang
MedIm
DiffM
43
2
0
16 Jul 2024
LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control
Delin Qu
Qizhi Chen
Pingrui Zhang
Xianqiang Gao
Bin Zhao
Dong Wang
Xuelong Li
AI4CE
61
8
0
23 Jun 2024
TerDiT: Ternary Diffusion Models with Transformers
Xudong Lu
Aojun Zhou
Ziyi Lin
Qi Liu
Yuhui Xu
Renrui Zhang
Yafei Wen
Shuai Ren
Peng Gao
Junchi Yan
MQ
76
3
0
23 May 2024
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
Willi Menapace
Aliaksandr Siarohin
Ivan Skorokhodov
Ekaterina Deyneka
Tsai-Shien Chen
...
Yuwei Fang
A. Stoliar
Elisa Ricci
Jian Ren
Sergey Tulyakov
VGen
79
57
0
22 Feb 2024
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Ziqiang Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM
VGen
158
252
0
05 Jan 2024
Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis
Bichen Wu
Ching-Yao Chuang
Xiaoyan Wang
Yichen Jia
K. Krishnakumar
Tong Xiao
Feng Liang
Licheng Yu
Peter Vajda
DiffM
VGen
29
22
0
20 Dec 2023
Beyond Detection: Unveiling Fairness Vulnerabilities in Abusive Language Models
Yueqing Liang
Lu Cheng
Ali Payani
Kai Shu
51
3
0
15 Nov 2023
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
Xinyuan Chen
Yaohui Wang
Lingjun Zhang
Shaobin Zhuang
Xin Ma
Jiashuo Yu
Yali Wang
Dahua Lin
Yu Qiao
Ziwei Liu
VGen
DiffM
29
137
0
31 Oct 2023
CCEdit: Creative and Controllable Video Editing via Diffusion Models
Danfeng Hong
Wenming Weng
Hao Li
Yuhui Yuan
Jing Yao
Chong Luo
Zhibo Chen
Baining Guo
DiffM
VGen
31
44
0
28 Sep 2023
LawBench: Benchmarking Legal Knowledge of Large Language Models
Zhiwei Fei
Xiaoyu Shen
D. Zhu
Fengzhe Zhou
Zhuo Han
Songyang Zhang
Kai-xiang Chen
Zongwen Shen
Jidong Ge
ELM
AILaw
66
42
0
28 Sep 2023
A Practical Survey on Zero-shot Prompt Design for In-context Learning
Yinheng Li
LRM
40
47
0
22 Sep 2023
DermoSegDiff: A Boundary-aware Segmentation Diffusion Model for Skin Lesion Delineation
Afshin Bozorgpour
Yousef Sadegheih
Amirhossein Kazerouni
Reza Azad
Dorit Merhof
DiffM
MedIm
46
26
0
05 Aug 2023
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
Wei Ping
Weixin Chen
Hengzhi Pei
Chulin Xie
Mintong Kang
...
Zinan Lin
Yuk-Kit Cheng
Sanmi Koyejo
D. Song
Yue Liu
58
405
0
20 Jun 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
55
12
0
22 May 2023
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
Haomiao Ni
Changhao Shi
Kaican Li
Sharon X. Huang
Martin Renqiang Min
VGen
DiffM
57
171
0
24 Mar 2023
A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao
Siyu Li
Yixin Liu
Zhiling Yan
Yutong Dai
Philip S. Yu
Lichao Sun
60
523
0
07 Mar 2023