ResearchTrend.AI
The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion

8 September 2023
Yujin Jeong
Won-Wha Ryoo
Seunghyun Lee
Dabin Seo
Wonmin Byeon
Sangpil Kim
Jinkyu Kim
    DiffM
arXiv: 2309.04509

Papers citing "The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion"

22 / 22 papers shown
Replay-Based Continual Learning with Dual-Layered Distillation and a Streamlined U-Net for Efficient Text-to-Image Generation
Md. Naimur Asif Borno
Md Sakib Hossain Shovon
Asmaa Soliman Al-Moisheer
Mohammad Ali Moni
36
0
0
11 May 2025
KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
Xingrui Wang
Jiang-Long Liu
Zihan Wang
Xiaodong Yu
Jialian Wu
Xingchen Sun
Yusheng Su
Alan L. Yuille
Zicheng Liu
Emad Barsoum
DiffM
VGen
51
0
0
13 Apr 2025
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
63
1
0
12 Nov 2024
Language-Guided Joint Audio-Visual Editing via One-Shot Adaptation
Susan Liang
Chao Huang
Yapeng Tian
Anurag Kumar
Chenliang Xu
DiffM
34
7
0
09 Oct 2024
D$^4$M: Dataset Distillation via Disentangled Diffusion Model
Duo Su
Junjie Hou
Weizhi Gao
Yingjie Tian
Bowen Tang
DD
53
19
0
21 Jul 2024
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
Santiago Pascual
Chunghsin Yeh
Ioannis Tsiamas
Joan Serrà
DiffM
VGen
47
15
0
15 Jul 2024
Read, Watch and Scream! Sound Generation from Text and Video
Yujin Jeong
Yunji Kim
Sanghyuk Chun
Jiyoung Lee
VGen
DiffM
34
12
0
08 Jul 2024
Sequential Contrastive Audio-Visual Learning
Ioannis Tsiamas
Santiago Pascual
Chunghsin Yeh
Joan Serrà
44
2
0
08 Jul 2024
VCoME: Verbal Video Composition with Multimodal Editing Effects
Weibo Gong
Xiaojie Jin
Xin Li
Dongliang He
Xinglong Wu
43
0
0
05 Jul 2024
Diffusion Model-Based Video Editing: A Survey
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Dacheng Tao
VGen
66
22
0
26 Jun 2024
AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection
Trevine Oorloff
Surya Koppisetti
Nicolo Bonettini
Divyaraj Solanki
Ben Colman
Yaser Yacoob
Ali Shahriyari
Gaurav Bharaj
46
21
0
05 Jun 2024
SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models
Burak Can Biner
Farrin Marouf Sofian
Umur Berkay Karakaş
Duygu Ceylan
Erkut Erdem
Aykut Erdem
23
8
0
01 May 2024
Audio-Synchronized Visual Animation
Lin Zhang
Shentong Mo
Yijing Zhang
Pedro Morgado
DiffM
45
20
0
08 Mar 2024
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation
Seung Hyun Lee
Yinxiao Li
Junjie Ke
Innfarn Yoo
Han Zhang
...
Junfeng He
Gang Li
Sangpil Kim
Irfan Essa
Feng Yang
EGVM
41
18
0
11 Jan 2024
Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy
Weijian Mai
Jian Zhang
Pengfei Fang
Zhijun Zhang
54
9
0
31 Dec 2023
Diffusion for Natural Image Matting
Yihan Hu
Yiheng Lin
Wei Wang
Yao-Min Zhao
Yunchao Wei
Humphrey Shi
28
7
0
10 Dec 2023
CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional Modeling
Ruihan Yang
H. Gamper
Sebastian Braun
DiffM
32
5
0
08 Dec 2023
ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation
Moayed Haji-Ali
Guha Balakrishnan
Vicente Ordonez
56
24
0
30 Nov 2023
Synthetic Shifts to Initial Seed Vector Exposes the Brittle Nature of Latent-Based Diffusion Models
Poyuan Mao
Shashank Kotyan
Tham Yik Foong
Danilo Vasconcellos Vargas
29
5
0
24 Nov 2023
A Survey on Video Diffusion Models
Zhen Xing
Qijun Feng
Haoran Chen
Qi Dai
Hang-Rui Hu
Hang Xu
Zuxuan Wu
Yu-Gang Jiang
EGVM
VGen
57
117
0
16 Oct 2023
EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model
Xinya Ji
Hang Zhou
Kaisiyuan Wang
Qianyi Wu
Wayne Wu
Feng Xu
Xun Cao
CVBM
60
157
0
30 May 2022
Sound2Sight: Generating Visual Dynamics from Sound and Context
A. Cherian
Moitreya Chatterjee
Narendra Ahuja
VGen
77
35
0
23 Jul 2020