Audio-to-Image Cross-Modal Generation


27 September 2021
Maciej Żelaszczyk
Jacek Mańdziuk
    DiffM

Papers citing "Audio-to-Image Cross-Modal Generation"

11 / 11 papers shown
Multimodal Emotion Recognition using Audio-Video Transformer Fusion with Cross Attention
Joe Dhanith
Shravan Venkatraman
Modigari Narendra
Vigya Sharma
Santhosh Malarvannan
76
0
0
20 Feb 2025
SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text
Haohe Liu
Gaël Le Lan
Xinhao Mei
Zhaoheng Ni
Anurag Kumar
Varun K. Nagaraja
Wenwu Wang
Mark D. Plumbley
Yangyang Shi
Vikas Chandra
VGen
64
1
0
03 Dec 2024
Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
Yazhou Xing
Yin-Yin He
Zeyue Tian
Xintao Wang
Qifeng Chen
27
50
0
27 Feb 2024
Image Anything: Towards Reasoning-coherent and Training-free Multi-modal Image Generation
Yuanhuiyi Lyu
Xueye Zheng
Lin Wang
DiffM
35
9
0
31 Jan 2024
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
Guy Yariv
Itai Gat
Sagie Benaim
Lior Wolf
Idan Schwartz
Yossi Adi
DiffM
VGen
31
36
0
28 Sep 2023
Audio Generation with Multiple Conditional Diffusion Model
Zhifang Guo
Jianguo Mao
Ruijie Tao
Long Yan
Kazushige Ouchi
Hong Liu
Xiangdong Wang
DiffM
21
11
0
23 Aug 2023
AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation
Guy Yariv
Itai Gat
Lior Wolf
Yossi Adi
Idan Schwartz
DiffM
20
20
0
22 May 2023
Soundini: Sound-Guided Diffusion for Natural Video Editing
Seung Hyun Lee
Si-Yeol Kim
Innfarn Yoo
Feng Yang
Donghyeon Cho
Youngseo Kim
Huiwen Chang
Jinkyu Kim
Sangpil Kim
VGen
DiffM
35
15
0
13 Apr 2023
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
Haohe Liu
Zehua Chen
Yi Yuan
Xinhao Mei
Xubo Liu
Danilo P. Mandic
Wenwu Wang
Mark D. Plumbley
DiffM
33
467
0
29 Jan 2023
Audio-guided Album Cover Art Generation with Genetic Algorithms
James Marien
Sam Leroux
Bart Dhoedt
Cedric De Boom
22
1
0
14 Jul 2022
ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation
Han Zhang
Weichong Yin
Yewei Fang
Lanxin Li
Boqiang Duan
Zhihua Wu
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
27
58
0
31 Dec 2021