Innovative methods and technologies for generating high-quality video content using AI and machine learning techniques.
| Title | Authors |
|---|---|
| WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling | Wenqiang Sun Haiyu Zhang Haoyuan Wang Junta Wu Zehan Wang Zhenwei Wang Yunhong Wang Jun Zhang Tengfei Wang Chunchao Guo |
| End-to-End Learning-based Video Streaming Enhancement Pipeline: A Generative AI Approach | Emanuele Artioli Farzad Tashtarian Christian Timmerer |
| Distill Video Datasets into Images | Zhenghao Zhao Haoxuan Wang Kai Wang Yuzhang Shang Yuan Hong Yan Yan |
| S2D: Sparse-To-Dense Keymask Distillation for Unsupervised Video Instance Segmentation | Leon Sick Lukas Hoyer Dominik Engel Pedro Hermosilla Timo Ropinski |
| MMGR: Multi-Modal Generative Reasoning | Zefan Cai Haoyi Qiu Tianyi Ma Haozhe Zhao Gengze Zhou ... Minjia Zhang Xiao Wen Jiuxiang Gu Nanyun Peng Junjie Hu |
| DRAW2ACT: Turning Depth-Encoded Trajectories into Robotic Demonstration Videos | Yang Bai Liudi Yang George Eskandar Fengyi Shen Mohammad Altillawi Ziyuan Liu Gitta Kutyniok |
| MobileWorldBench: Towards Semantic World Modeling For Mobile Agents | Shufan Li Konstantinos Kallidromitis Akash Gokul Yusuke Kato Kazuki Kozuka Aditya Grover |
| AnimaMimic: Imitating 3D Animation from Video Priors | Tianyi Xie Yunuo Chen Yaowei Guo Yin Yang Bolei Zhou Demetri Terzopoulos Ying Jiang Chenfanfu Jiang |
| ViBES: A Conversational Agent with Behaviorally-Intelligent 3D Virtual Body | Juze Zhang Changan Chen Xin Chen Heng Yu Tiange Xiang Ali Sartaz Khan Shrinidhi K. Lakshmikanth Ehsan Adeli |
| Elastic3D: Controllable Stereo Video Conversion with Guided Latent Decoding | Nando Metzger Prune Truong Goutam Bhat Konrad Schindler Federico Tombari |
| Recurrent Video Masked Autoencoders | Daniel Zoran Nikhil Parthasarathy Yi Yang Drew A Hudson Joao Carreira Andrew Zisserman |
| Beyond the Visible: Disocclusion-Aware Editing via Proxy Dynamic Graphs | Anran Qi Changjian Li Adrien Bousseau Niloy J. Mitra |
| JoVA: Unified Multimodal Learning for Joint Video-Audio Generation | Xiaohu Huang Hao Zhou Qiangpeng Yang Shilei Wen Kai Han |
| PoseAnything: Universal Pose-guided Video Generation with Part-aware Temporal Coherence | Ruiyan Wang Teng Hu Kaihui Huang Zihan Su Ran Yi Lizhuang Ma |
| Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model | Heyi Chen Siyan Chen Xin Chen Yanfei Chen Ying Chen ... Xueqiong Qu Yuxi Ren Kai Shen Guang Shi Lei Shi |
| KlingAvatar 2.0 Technical Report | Kling Team Jialu Chen Yikang Ding Zhixue Fang Kun Gai ... Chao Wang Xuebo Wang Haoxian Zhang Yuanxing Zhang Yan Zhou |
| LongVie 2: Multimodal Controllable Ultra-Long Video World Model | Jianxiong Gao Zhaoxi Chen Xian Liu Junhao Zhuang Chengming Xu Jianfeng Feng Yu Qiao Yanwei Fu Chenyang Si Ziwei Liu |
| Content Adaptive based Motion Alignment Framework for Learned Video Compression | Tiange Zhang Xiandong Meng Siwei Ma |
| Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans? | Jiaqi Wang Weijia Wu Yi Zhan Rui Zhao Ming Hu James Cheng Wei Liu Philip Torr Kevin Qinghong Lin |
| World Models Can Leverage Human Videos for Dexterous Manipulation | Raktim Gautam Goswami Amir Bar David Fan Tsung-Yen Yang Gaoyue Zhou Prashanth Krishnamurthy Michael Rabbat Farshad Khorrami Yann LeCun |
| DiffusionBrowser: Interactive Diffusion Previews via Multi-Branch Decoders | Susung Hong Chongjian Ge Zhifei Zhang Jui-Hsien Wang |
| Do-Undo: Generating and Reversing Physical Actions in Vision-Language Models | Shweta Mahajan Shreya Kadambi Hoang Le Munawar Hayat Fatih Porikli |
| STARCaster: Spatio-Temporal AutoRegressive Video Diffusion for Identity- and View-Aware Talking Portraits | Foivos Paraperas Papantoniou Stathis Galanakis Rolandos Alexandros Potamias Bernhard Kainz Stefanos Zafeiriou |
| UniVCD: A New Method for Unsupervised Change Detection in the Open-Vocabulary Era | Ziqiang Zhu Bowei Yang |
| SneakPeek: Future-Guided Instructional Streaming Video Generation | Cheeun Hong German Barquero Fadime Sener Markos Georgopoulos Edgar Schönfeld Stefan Popov Yuming Du Oscar Mañas Albert Pumarola |
| Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation | Jiangning Zhang Junwei Zhu Zhenye Gan Donghao Luo Chuming Lin ... Xu Chen Chencan Fu Keke He Xiaobin Hu Chengjie Wang |
| GenieDrive: Towards Physics-Aware Driving World Model with 4D Occupancy Guided Video Generation | Zhenya Yang Zhe Liu Yuxiang Lu Liping Hou Chenxuan Miao Siyi Peng Bailan Feng Xiang Bai Hengshuang Zhao |
| FysicsWorld: A Unified Full-Modality Benchmark for Any-to-Any Understanding, Generation, and Reasoning | Yue Jiang Dingkang Yang Minghao Han Jinghang Han Zizhi Chen Yizhou Liu Mingcheng Li Peng Zhai Lihua Zhang |
| Robust Motion Generation using Part-level Reliable Data from Videos | Boyuan Li Sipeng Zheng Bin Cao Ruihua Song Zongqing Lu |
| Animus3D: Text-driven 3D Animation via Motion Score Distillation | Qi Sun Can Wang Jiaxiang Shang Wensen Feng Jing Liao |
| Generative Spatiotemporal Data Augmentation | Jinfan Zhou Lixin Luo Sungmin Eum Heesung Kwon Jeong Joon Park |
| Schrodinger Audio-Visual Editor: Object-Level Audiovisual Removal | Weihan Xu Kan Jen Cheng Koichi Saito Muhammad Jehanzeb Mirza Tingle Li ... Masato Ishii Takashi Shibuya Yuki Mitsufuji Gopala Anumanchipalli Paul Pu Liang |
| V-Warper: Appearance-Consistent Video Diffusion Personalization via Value Warping | Hyunkoo Lee Wooseok Jang Jini Yang Taehwan Kim Sangoh Kim Sangwon Jung Seungryong Kim |
| STAGE: Storyboard-Anchored Generation for Cinematic Multi-shot Narrative | Peixuan Zhang Zijian Jia Kaiqi Liu Shuchen Weng Si Li Boxin Shi |
| ArtGen: Conditional Generative Modeling of Articulated Objects in Arbitrary Part-Level States | Haowen Wang Xiaoping Yuan Fugang Zhang Rui Jian Yuanwei Zhu Xiuquan Qiao Yakun Huang |
| AutoMV: An Automatic Multi-Agent System for Music Video Generation | Xiaoxuan Tang Xinping Lei Chaoran Zhu Shiyun Chen Ruibin Yuan ... Wenhao Huang Emmanouil Benetos Yang Liu Jiaheng Liu Yinghao Ma |
| SMRABooth: Subject and Motion Representation Alignment for Customized Video Generation | Xuancheng Xu Yaning Li Sisi You Bing-Kun Bao |
| CineLOG: A Training Free Approach for Cinematic Long Video Generation | Zahra Dehghanian Morteza Abolghasemi Hamid Beigy Hamid R. Rabiee |
| Endless World: Real-Time 3D-Aware Long Video Generation | Ke Zhang Yiqun Mei Jiacong Xu Vishal M. Patel |
| Audio-Visual Camera Pose Estimation with Passive Scene Sounds and In-the-Wild Video | Daniel Adebi Sagnik Majumder Kristen Grauman |
| AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path | Zhengyang Yu Akio Hayakawa Masato Ishii Qingtao Yu Takashi Shibuya Jing Zhang Yuki Mitsufuji |
| FactorPortrait: Controllable Portrait Animation via Disentangled Expression, Pose, and Viewpoint | Jiapeng Tang Kai Li Chengxiang Yin Liuhao Ge Fei Jiang ... Matthias Nießner Christian Häne Timur Bagautdinov Egor Zakharov Peihong Guo |
| Structure From Tracking: Distilling Structure-Preserving Motion for Video Generation | Yang Fei George Stoica Jingyuan Liu Qifeng Chen Ranjay Krishna Xiaojuan Wang Benlin Liu |
| V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties | Ye Fang Tong Wu Valentin Deschaintre Duygu Ceylan Iliyan Georgiev Chun-Hao Paul Huang Yiwei Hu Xuelin Chen Tuanfeng Yang Wang |
| Referring Change Detection in Remote Sensing Imagery | Yilmaz Korkmaz Jay N. Paranjape Celso M. de Melo Vishal M. Patel |
| JoyAvatar: Real-time and Infinite Audio-Driven Avatar Generation with Autoregressive Diffusion | Chaochao Li Ruikui Wang Liangbo Zhou Jinheng Feng Huaishao Luo Huan Zhang Youzheng Wu Xiaodong He |
| Flowception: Temporally Expansive Flow Matching for Video Generation | Tariq Berrada Ifriqi John Nguyen Karteek Alahari Jakob Verbeek Ricky T. Q. Chen |
| FutureX: Enhance End-to-End Autonomous Driving via Latent Chain-of-Thought World Model | Hongbin Lin Yiming Yang Yifan Zhang Chaoda Zheng Jie Feng ... Boyang Wang Yu Zhang Xianming Liu Shuguang Cui Zhen Li |
| BAgger: Backwards Aggregation for Mitigating Drift in Autoregressive Video Diffusion Models | Ryan Po Eric Ryan Chan Changan Chen Gordon Wetzstein |