2208.14114
Robust Sound-Guided Image Manipulation
30 August 2022
Seung Hyun Lee
Gyeongrok Oh
Wonmin Byeon
Sang Ho Yoon
Jinkyu Kim
Sangpil Kim
DiffM
Papers citing
"Robust Sound-Guided Image Manipulation"
41 / 41 papers shown
Title
Sound-Guided Semantic Image Manipulation
Seung Hyun Lee
Wonseok Roh
Wonmin Byeon
Sang Ho Yoon
Chanyoung Kim
Jinkyu Kim
Sangpil Kim
DiffM
68
43
0
30 Nov 2021
Wav2CLIP: Learning Robust Audio Representations From CLIP
Ho-Hsiang Wu
Prem Seetharaman
Kundan Kumar
J. P. Bello
CLIP
VLM
85
268
0
21 Oct 2021
AudioCLIP: Extending CLIP to Image, Text and Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
CLIP
VLM
57
362
0
24 Jun 2021
Alias-Free Generative Adversarial Networks
Tero Karras
M. Aittala
S. Laine
Erik Härkönen
Janne Hellsten
J. Lehtinen
Timo Aila
GAN
148
1,582
0
23 Jun 2021
Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
Yanbei Chen
Yongqin Xian
A. Sophia Koepke
Ying Shan
Zeynep Akata
95
82
0
22 Apr 2021
Aligning Latent and Image Spaces to Connect the Unconnectable
Ivan Skorokhodov
Grigorii Sotnikov
Mohamed Elhoseiny
DiffM
33
79
0
14 Apr 2021
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
Or Patashnik
Zongze Wu
Eli Shechtman
Daniel Cohen-Or
Dani Lischinski
CLIP
VLM
64
1,204
0
31 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
324
21,175
0
25 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
681
28,659
0
26 Feb 2021
TräumerAI: Dreaming Music with StyleGAN
Dasaem Jeong
Seungheon Doh
Taegyun Kwon
GAN
22
16
0
09 Feb 2021
Designing an Encoder for StyleGAN Image Manipulation
Omer Tov
Yuval Alaluf
Yotam Nitzan
Or Patashnik
Daniel Cohen-Or
246
780
0
04 Feb 2021
Crossing You in Style: Cross-modal Style Transfer from Music to Visual Arts
Cheng-Che Lee
Wan-Yi Lin
Yen-Ting Shih
P. Kuo
Li Su
34
15
0
17 Sep 2020
Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
Elad Richardson
Yuval Alaluf
Or Patashnik
Yotam Nitzan
Yaniv Azar
Stav Shapiro
Daniel Cohen-Or
106
1,103
0
03 Aug 2020
Self-Supervised MultiModal Versatile Networks
Jean-Baptiste Alayrac
Adrià Recasens
R. Schneider
Relja Arandjelović
Jason Ramapuram
J. Fauw
Lucas Smaira
Sander Dieleman
Andrew Zisserman
SSL
109
373
0
29 Jun 2020
AVGZSLNet: Audio-Visual Generalized Zero-Shot Learning by Reconstructing Label Features from Multi-Modal Embeddings
Pratik Mazumder
Pravendra Singh
Kranti K. Parida
Vinay P. Namboodiri
35
33
0
27 May 2020
VGGSound: A Large-scale Audio-Visual Dataset
Honglie Chen
Weidi Xie
Andrea Vedaldi
Andrew Zisserman
60
564
0
29 Apr 2020
Deep Audio-Visual Learning: A Survey
Hao Zhu
Mandi Luo
Rui Wang
A. Zheng
Ran He
54
157
0
14 Jan 2020
ManiGAN: Text-Guided Image Manipulation
Bowen Li
Xiaojuan Qi
Thomas Lukasiewicz
Philip Torr
EGVM
79
285
0
12 Dec 2019
StarGAN v2: Diverse Image Synthesis for Multiple Domains
Yunjey Choi
Youngjung Uh
Jaejun Yoo
Jung-Woo Ha
3DH
94
1,732
0
04 Dec 2019
Analyzing and Improving the Image Quality of StyleGAN
Tero Karras
S. Laine
M. Aittala
Janne Hellsten
J. Lehtinen
Timo Aila
GAN
256
5,769
0
03 Dec 2019
Learning Relationships between Text, Audio, and Video via Deep Canonical Correlation for Multimodal Language Analysis
Zhongkai Sun
P. Sarma
W. Sethares
Yingyu Liang
38
319
0
13 Nov 2019
Speech2Face: Learning the Face Behind a Voice
Tae-Hyun Oh
Tali Dekel
Changil Kim
Inbar Mosseri
William T. Freeman
Michael Rubinstein
Wojciech Matusik
SSL
CVBM
97
163
0
23 May 2019
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
149
3,435
0
18 Apr 2019
Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?
Rameen Abdal
Yipeng Qin
Peter Wonka
GAN
93
1,109
0
05 Apr 2019
Semantic Image Synthesis with Spatially-Adaptive Normalization
Taesung Park
Ming-Yuan Liu
Ting-Chun Wang
Jun-Yan Zhu
125
2,679
0
18 Mar 2019
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
508
10,500
0
12 Dec 2018
Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction
Alaaeldin El-Nouby
Shikhar Sharma
Hannes Schulz
Devon Hjelm
Layla El Asri
Samira Ebrahimi Kahou
Yoshua Bengio
Graham W. Taylor
VLM
76
122
0
24 Nov 2018
Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language
Seonghyeon Nam
Yunji Kim
Seon Joo Kim
GAN
65
207
0
29 Oct 2018
Towards Audio to Scene Image Synthesis using Generative Adversarial Network
Chia-Hung Wan
Shun-Po Chuang
Hung-yi Lee
GAN
42
61
0
13 Aug 2018
Learnable PINs: Cross-Modal Embeddings for Person Identity
Arsha Nagrani
Samuel Albanie
Andrew Zisserman
SSL
97
141
0
02 May 2018
Cross-modal Embeddings for Video and Audio Retrieval
Dídac Surís
A. Duarte
Amaia Salvador
Jordi Torres
Xavier Giró-i-Nieto
SSL
41
69
0
07 Jan 2018
CMCGAN: A Uniform Framework for Cross-Modal Visual-Audio Mutual Generation
Wangli Hao
Zhaoxiang Zhang
He Guan
61
88
0
22 Nov 2017
Semantic Image Synthesis via Adversarial Learning
Hao Dong
Simiao Yu
Chao Wu
Yike Guo
GAN
38
265
0
21 Jul 2017
See, Hear, and Read: Deep Aligned Representations
Y. Aytar
Carl Vondrick
Antonio Torralba
VLM
AI4TS
80
136
0
03 Jun 2017
Deep Cross-Modal Audio-Visual Generation
Lele Chen
Sudhanshu Srivastava
Z. Duan
Chenliang Xu
76
221
0
26 Apr 2017
CNN Architectures for Large-Scale Audio Classification
Shawn Hershey
Sourish Chaudhuri
D. Ellis
J. Gemmeke
A. Jansen
...
Rif A. Saurous
Bryan Seybold
M. Slaney
Ron J. Weiss
K. Wilson
92
2,488
0
29 Sep 2016
A Neural Algorithm of Artistic Style
Leon A. Gatys
Alexander S. Ecker
Matthias Bethge
GAN
65
2,851
0
26 Aug 2015
LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop
Feng Yu
Ari Seff
Yinda Zhang
Shuran Song
Thomas Funkhouser
Jianxiong Xiao
52
2,320
0
10 Jun 2015
Cyclical Learning Rates for Training Neural Networks
L. Smith
ODL
118
2,515
0
03 Jun 2015
Large-scale Classification of Fine-Art Paintings: Learning The Right Metric on The Right Feature
Babak Saleh
Ahmed Elgammal
42
288
0
05 May 2015
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
238
19,523
0
09 Mar 2015