1704.08292
Cited By
Deep Cross-Modal Audio-Visual Generation
26 April 2017
Lele Chen
Sudhanshu Srivastava
Z. Duan
Chenliang Xu
Papers citing "Deep Cross-Modal Audio-Visual Generation"
43 / 43 papers shown
Seeing Soundscapes: Audio-Visual Generation and Separation from Soundscapes Using Audio-Visual Separator
Minjae Kang
Martim Brandão
64
0
0
25 Apr 2025
MM-NeRF: Multimodal-Guided 3D Multi-Style Transfer of Neural Radiance Field
Zijian Győző Yang
Zhongwei Qiu
Chang Xu
Dongmei Fu
50
2
0
28 Jan 2025
Gotta Hear Them All: Sound Source Aware Vision to Audio Generation
Wei Guo
Heng Wang
Jianbo Ma
Weidong Cai
DiffM
90
3
0
23 Nov 2024
X-Drive: Cross-modality consistent multi-sensor data synthesis for driving scenarios
Yichen Xie
Chenfeng Xu
C-T.John Peng
Shuqi Zhao
Nhat Ho
Alexander T. Pham
Mingyu Ding
Masayoshi Tomizuka
W. Zhan
DiffM
41
2
0
02 Nov 2024
Read, Watch and Scream! Sound Generation from Text and Video
Yujin Jeong
Yunji Kim
Sanghyuk Chun
Jiyoung Lee
VGen
DiffM
31
12
0
08 Jul 2024
MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing
Yu-Fen Huang
Nikki Moran
Simon Coleman
Jon Kelly
Shun-Hwa Wei
...
Chih-Hsuan Li
Da-Yu Huang
Hsuan-Kai Kao
Ting-Wei Lin
Li Su
38
1
0
10 Jun 2024
Complete Cross-triplet Loss in Label Space for Audio-visual Cross-modal Retrieval
Donghuo Zeng
Yanan Wang
Jianming Wu
K. Ikeda
24
4
0
07 Nov 2022
Multimodal Transformer for Parallel Concatenated Variational Autoencoders
Stephen D. Liang
J. Mendel
ViT
27
5
0
28 Oct 2022
Robust Sound-Guided Image Manipulation
Seung Hyun Lee
Gyeongrok Oh
Wonmin Byeon
Sang Ho Yoon
Jinkyu Kim
Sangpil Kim
DiffM
26
7
0
30 Aug 2022
Auto-regressive Image Synthesis with Integrated Quantization
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Kai Cui
Changgong Zhang
Shijian Lu
35
10
0
21 Jul 2022
Cross-Modal Contrastive Representation Learning for Audio-to-Image Generation
Haechun Chung
JooYong Shim
Jong-Kook Kim
27
3
0
20 Jul 2022
ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation
Han Zhang
Weichong Yin
Yewei Fang
Lanxin Li
Boqiang Duan
Zhihua Wu
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
27
58
0
31 Dec 2021
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
29
48
0
27 Dec 2021
Automated Side Channel Analysis of Media Software with Manifold Learning
Yuanyuan Yuan
Qi Pang
Shuai Wang
AAML
40
18
0
09 Dec 2021
Sound-Guided Semantic Image Manipulation
Seung Hyun Lee
Wonseok Roh
Wonmin Byeon
Sang Ho Yoon
Chanyoung Kim
Jinkyu Kim
Sangpil Kim
DiffM
24
43
0
30 Nov 2021
Learning Signal-Agnostic Manifolds of Neural Fields
Yilun Du
Katherine M. Collins
J. Tenenbaum
Vincent Sitzmann
MedIm
29
47
0
11 Nov 2021
Taming Visually Guided Sound Generation
Vladimir E. Iashin
Esa Rahtu
VLM
28
121
0
17 Oct 2021
Cross-Modal Virtual Sensing for Combustion Instability Monitoring
Tryambak Gangopadhyay
V. Ramanan
S. Chakravarthy
S. Sarkar
21
1
0
04 Oct 2021
Audio-to-Image Cross-Modal Generation
Maciej Żelaszczyk
Jacek Mańdziuk
DiffM
53
15
0
27 Sep 2021
Cross-modal Spectrum Transformation Network For Acoustic Scene classification
Yang Liu
A. Neophytou
Sunando Sengupta
Eric Sommerlade
21
9
0
13 Aug 2021
FoleyGAN: Visually Guided Generative Adversarial Network-Based Synchronous Sound Generation in Silent Videos
Sanchita Ghose
John J. Prevost
GAN
27
26
0
20 Jul 2021
End-to-End Video-To-Speech Synthesis using Generative Adversarial Networks
Rodrigo Mira
Konstantinos Vougioukas
Pingchuan Ma
Stavros Petridis
Björn W. Schuller
M. Pantic
26
43
0
27 Apr 2021
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
Hang Zhou
Yasheng Sun
Wayne Wu
Chen Change Loy
Xiaogang Wang
Ziwei Liu
CVBM
28
360
0
22 Apr 2021
Can audio-visual integration strengthen robustness under multimodal attacks?
Yapeng Tian
Chenliang Xu
AAML
31
37
0
05 Apr 2021
Sim-to-Real for Robotic Tactile Sensing via Physics-Based Simulation and Learned Latent Projections
Yashraj S. Narang
Balakumar Sundaralingam
Miles Macklin
Arsalan Mousavian
Dieter Fox
30
58
0
31 Mar 2021
Learning Audio-Visual Correlations from Variational Cross-Modal Generation
Ye Zhu
Yu Wu
Hugo Latapie
Yi Yang
Yan Yan
SSL
32
20
0
05 Feb 2021
Sound Synthesis, Propagation, and Rendering: A Survey
Shiguang Liu
Tianyi Zhou
24
26
0
11 Nov 2020
Video Generative Adversarial Networks: A Review
Nuha Aldausari
Arcot Sowmya
Nadine Marcus
Gelareh Mohammadi
EGVM
21
102
0
04 Nov 2020
Temporally Guided Music-to-Body-Movement Generation
Hsuan-Kai Kao
Li Su
39
42
0
17 Sep 2020
A Systematic Survey on Deep Generative Models for Graph Generation
Xiaojie Guo
Liang Zhao
MedIm
44
147
0
13 Jul 2020
Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry and Fusion
Yang Wang
33
195
0
15 Jun 2020
Direct Speech-to-image Translation
Jiguo Li
Xinfeng Zhang
Chuanmin Jia
Jizheng Xu
Li Zhang
Y. Wang
Siwei Ma
Wen Gao
36
29
0
07 Apr 2020
Deep Audio-Visual Learning: A Survey
Hao Zhu
Mandi Luo
Rui Wang
A. Zheng
Ran He
31
156
0
14 Jan 2020
Vision-Infused Deep Audio Inpainting
Hang Zhou
Ziwei Liu
Lingfeng Guo
Ping Luo
Dahua Lin
29
88
0
24 Oct 2019
Translating Visual Art into Music
Max Müller-Eberstein
Nanne van Noord
DRL
21
7
0
03 Sep 2019
Realistic Speech-Driven Facial Animation with GANs
Konstantinos Vougioukas
Stavros Petridis
M. Pantic
39
289
0
14 Jun 2019
Co-Separating Sounds of Visual Objects
Ruohan Gao
Kristen Grauman
30
205
0
16 Apr 2019
Talking Face Generation by Conditional Recurrent Adversarial Network
Yang Song
Jingwen Zhu
Dawei Li
Xiaolong Wang
Hairong Qi
CVBM
27
192
0
13 Apr 2018
Lip Movements Generation at a Glance
Lele Chen
Zhiheng Li
R. Maddox
Z. Duan
Chenliang Xu
25
259
0
28 Mar 2018
Audio-Visual Event Localization in Unconstrained Videos
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
33
425
0
23 Mar 2018
Adversarial Audio Synthesis
Chris Donahue
Julian McAuley
M. Puckette
GAN
27
602
0
12 Feb 2018
Creating A Multi-track Classical Musical Performance Dataset for Multimodal Music Analysis: Challenges, Insights, and Applications
Bochen Li
Xinzhao Liu
K. Dinesh
Z. Duan
Gaurav Sharma
23
148
0
27 Dec 2016
Learning Deep Representations of Fine-grained Visual Descriptions
Scott E. Reed
Zeynep Akata
Bernt Schiele
Honglak Lee
OCL
VLM
170
840
0
17 May 2016