Learning Visual Styles from Audio-Visual Associations

10 May 2022

Hang Zhao

Papers citing "Learning Visual Styles from Audio-Visual Associations"

36 / 36 papers shown

Title
VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs | Moayed Haji-Ali, Andrew Bond, Tolga Birdal, Duygu Ceylan, Levent Karacan, Erkut Erdem, Aykut Erdem | VGen, DiffM | 187 | 2 | 0 | 12 Apr 2023
Multimodal Deep Learning | Cem Akkus, Jiquan Ngiam, Vladana Djakovic, Steffen Jauch-Walser, A. Khosla, ..., Jann Goschenhofer, Honglak Lee, A. Ng, Daniel Schalk, Matthias Aßenmacher | | 120 | 3,174 | 0 | 12 Jan 2023
Wav2CLIP: Learning Robust Audio Representations From CLIP | Ho-Hsiang Wu, Prem Seetharaman, Kundan Kumar, J. P. Bello | CLIP, VLM | 145 | 271 | 0 | 21 Oct 2021
Taming Visually Guided Sound Generation | Vladimir E. Iashin, Esa Rahtu | VLM | 97 | 128 | 0 | 17 Oct 2021
Localizing Visual Sounds the Hard Way | Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman | ObjD | 85 | 190 | 0 | 06 Apr 2021
Paint by Word | A. Andonian, David Bau, Audrey Cui, YeonHwan Park, Ali Jahanian, Antonio Torralba, A. Oliva | DiffM | 67 | 125 | 0 | 19 Mar 2021
Zero-Shot Text-to-Image Generation | Aditya A. Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever | VLM | 418 | 4,987 | 0 | 24 Feb 2021
A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild | Prajwal K R, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar | EGVM | 103 | 787 | 0 | 23 Aug 2020
Describing Textures using Natural Language | Chenyun Wu, Mikayla Timm, Subhransu Maji | 3DV | 42 | 10 | 0 | 03 Aug 2020
Contrastive Learning for Unpaired Image-to-Image Translation | Taesung Park, Alexei A. Efros, Richard Y. Zhang, Jun-Yan Zhu | SSL | 86 | 1,232 | 0 | 30 Jul 2020
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis | Prajwal K R, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar | | 63 | 113 | 0 | 17 May 2020
VGGSound: A Large-scale Audio-Visual Dataset | Honglie Chen, Weidi Xie, Andrea Vedaldi, Andrew Zisserman | | 89 | 582 | 0 | 29 Apr 2020
Audio-Visual Instance Discrimination with Cross-Modal Agreement | Pedro Morgado, Nuno Vasconcelos, Ishan Misra | SSL | 80 | 276 | 0 | 27 Apr 2020
Music Gesture for Visual Sound Separation | Chuang Gan, Deng Huang, Hang Zhao, J. Tenenbaum, Antonio Torralba | | 95 | 205 | 0 | 20 Apr 2020
Learning Individual Styles of Conversational Gesture | Shiry Ginosar, Amir Bar, Gefen Kohavi, Caroline Chan, Andrew Owens, Jitendra Malik | SLR | 48 | 332 | 0 | 10 Jun 2019
The Sound of Motions | Hang Zhao, Chuang Gan, Wei-Chiu Ma, Antonio Torralba | | 83 | 254 | 0 | 11 Apr 2019
Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language | Seonghyeon Nam, Yunji Kim, Seon Joo Kim | GAN | 79 | 207 | 0 | 29 Oct 2018
Deep Audio-Visual Speech Recognition | Triantafyllos Afouras, Joon Son Chung, A. Senior, Oriol Vinyals, Andrew Zisserman | | 95 | 707 | 0 | 06 Sep 2018
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation | Hang Zhou, Yu Liu, Ziwei Liu, Ping Luo, Xiaogang Wang | CVBM | 92 | 442 | 0 | 20 Jul 2018
Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization | Bruno Korbar, Du Tran, Lorenzo Torresani | | 99 | 476 | 0 | 30 Jun 2018
Exploring the Limits of Weakly Supervised Pretraining | D. Mahajan, Ross B. Girshick, Vignesh Ramanathan, Kaiming He, Manohar Paluri, Yixuan Li, Ashwin R. Bharambe, Laurens van der Maaten | VLM | 196 | 1,369 | 0 | 02 May 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features | Andrew Owens, Alexei A. Efros | SSL | 98 | 753 | 0 | 10 Apr 2018
The Sound of Pixels | Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh H. McDermott, Antonio Torralba | VLM | 102 | 536 | 0 | 09 Apr 2018
Image Generation from Scene Graphs | Justin Johnson, Agrim Gupta, Li Fei-Fei | GNN | 303 | 820 | 0 | 04 Apr 2018
Audio to Body Dynamics | Eli Shlizerman, Lucio Dery, Hayden Schoen, Ira Kemelmacher-Shlizerman | VGen | 85 | 154 | 0 | 19 Dec 2017
Look, Listen and Learn | Relja Arandjelović, Andrew Zisserman | SSL | 120 | 906 | 0 | 23 May 2017
You said that? | Joon Son Chung, A. Jamaludin, Andrew Zisserman | CVBM | 72 | 259 | 0 | 08 May 2017
DualGAN: Unsupervised Dual Learning for Image-to-Image Translation | Zili Yi, Hao Zhang, P. Tan, Minglun Gong | GAN, VLM | 133 | 1,943 | 0 | 08 Apr 2017
Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization | Xun Huang, Serge J. Belongie | OOD | 181 | 4,368 | 0 | 20 Mar 2017
Learning to Discover Cross-Domain Relations with Generative Adversarial Networks | Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jung Kwon Lee, Jiwon Kim | GAN, OOD | 91 | 1,980 | 0 | 15 Mar 2017
Least Squares Generative Adversarial Networks | Xudong Mao, Qing Li, Haoran Xie, Raymond Y. K. Lau, Zhen Wang, Stephen Paul Smolley | GAN | 333 | 4,575 | 0 | 13 Nov 2016
CNN Architectures for Large-Scale Audio Classification | Shawn Hershey, Sourish Chaudhuri, D. Ellis, J. Gemmeke, A. Jansen, ..., Rif A. Saurous, Bryan Seybold, M. Slaney, Ron J. Weiss, K. Wilson | | 123 | 2,506 | 0 | 29 Sep 2016
Generative Adversarial Text to Image Synthesis | Scott E. Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee | GAN | 205 | 3,148 | 0 | 17 May 2016
Perceptual Losses for Real-Time Style Transfer and Super-Resolution | Justin Johnson, Alexandre Alahi, Li Fei-Fei | SupR | 237 | 10,262 | 0 | 27 Mar 2016
Rethinking the Inception Architecture for Computer Vision | Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Z. Wojna | 3DV, BDL | 886 | 27,412 | 0 | 02 Dec 2015
Efficient Estimation of Word Representations in Vector Space | Tomas Mikolov, Kai Chen, G. Corrado, J. Dean | 3DV | 680 | 31,538 | 0 | 16 Jan 2013