Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2505.16195
Cited By
SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet
22 May 2025
Zhi-Wei Zhong
Akira Takahashi
Shuyang Cui
Keisuke Toyama
Shusuke Takahashi
Yuki Mitsufuji
VGen
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet"
4 / 4 papers shown
Title
Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer
Siyuan Hou
Shansong Liu
Ruibin Yuan
Wei Xue
Ying Shan
Mangsuo Zhao
Chao Zhang
143
6
0
17 Jan 2025
COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations
Ruben Ciranni
Emilian Postolache
Giorgio Mariani
Michele Mancusi
Giorgio Fabbro
Emanuele Rodolà
Luca Cosmo
275
8
0
10 Jan 2025
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Ho Kei Cheng
Masato Ishii
Akio Hayakawa
Takashi Shibuya
Alex Schwing
Yuki Mitsufuji
VGen
285
18
0
19 Dec 2024
Music Foundation Model as Generic Booster for Music Downstream Tasks
Weihsiang Liao
Yuhta Takida
Yukara Ikemiya
Zhi-Wei Zhong
Chieh-Hsin Lai
...
Stefan Uhlich
Taketo Akama
Woosung Choi
Yuichiro Koyama
Yuki Mitsufuji
215
1
0
02 Nov 2024
1