Objects that Sound

18 December 2017

Papers citing "Objects that Sound"

36 / 136 papers shown

Title
Audio-Visual Event Localization via Recursive Fusion by Joint Co-Attention Bin Duan Hao Tang Wei Wang Ziliang Zong Guowei Yang Yan Yan 33 59 0 14 Aug 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning Ying Cheng Ruize Wang Zhihao Pan Rui Feng Yuejie Zhang SSL 30 106 0 13 Aug 2020
Self-Supervised Learning of Audio-Visual Objects from Video Triantafyllos Afouras Andrew Owens Joon Son Chung Andrew Zisserman SSL 19 252 0 10 Aug 2020
Multiple Sound Sources Localization from Coarse to Fine Rui Qian Di Hu Heinrich Dinkel Mengyue Wu N. Xu Weiyao Lin 28 155 0 13 Jul 2020
Self-Supervised MultiModal Versatile Networks Jean-Baptiste Alayrac Adrià Recasens R. Schneider Relja Arandjelović Jason Ramapuram J. Fauw Lucas Smaira Sander Dieleman Andrew Zisserman SSL 40 371 0 29 Jun 2020
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos Andrew Rouditchenko Angie Boggust David Harwath Brian Chen D. Joshi ... Rogerio Feris Brian Kingsbury M. Picheny Antonio Torralba James R. Glass SSL 22 141 0 16 Jun 2020
Telling Left from Right: Learning Spatial Correspondence of Sight and Sound Karren D. Yang Bryan C. Russell Justin Salamon SSL 24 75 0 11 Jun 2020
Visually Guided Sound Source Separation using Cascaded Opponent Filter Network Lingyu Zhu Esa Rahtu 22 23 0 04 Jun 2020
What Makes for Good Views for Contrastive Learning? Yonglong Tian Chen Sun Ben Poole Dilip Krishnan Cordelia Schmid Phillip Isola SSL 39 1,305 0 20 May 2020
VisualEchoes: Spatial Image Representation Learning through Echolocation Ruohan Gao Changan Chen Ziad Al-Halah Carl Schissler Kristen Grauman MDE SSL 171 83 0 04 May 2020
Conditioned Source Separation for Music Instrument Performances Olga Slizovskaia G. Haro E. Gómez 30 38 0 08 Apr 2020
Learning word-referent mappings and concepts from raw inputs Wai Keen Vong Brenden Lake 17 5 0 12 Mar 2020
Evolving Losses for Unsupervised Video Representation Learning A. Piergiovanni A. Angelova Michael S. Ryoo SSL 24 138 0 26 Feb 2020
Audiovisual SlowFast Networks for Video Recognition Fanyi Xiao Yong Jae Lee Kristen Grauman Jitendra Malik Christoph Feichtenhofer 197 206 0 23 Jan 2020
Deep Audio-Visual Learning: A Survey Hao Zhu Mandi Luo Rui Wang A. Zheng Ran He 31 156 0 14 Jan 2020
STAViS: Spatio-Temporal AudioVisual Saliency Network A. Tsiami Petros Koutras Petros Maragos 24 73 0 09 Jan 2020
Listen to Look: Action Recognition by Previewing Audio Ruohan Gao Tae-Hyun Oh Kristen Grauman Lorenzo Torresani VLM 29 251 0 10 Dec 2019
Learning to Localize Sound Sources in Visual Scenes: Analysis and Applications Arda Senocak Tae-Hyun Oh Junsik Kim Ming-Hsuan Yang In So Kweon SSL 33 52 0 20 Nov 2019
Vision-Infused Deep Audio Inpainting Hang Zhou Ziwei Liu Lingfeng Guo Ping Luo Dahua Lin 27 88 0 24 Oct 2019
Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zeroshot Classification and Retrieval of Videos Kranti K. Parida Neeraj Matiyali T. Guha Gaurav Sharma VLM 27 41 0 19 Oct 2019
Learning to Have an Ear for Face Super-Resolution Givi Meishvili Simon Jenni Paolo Favaro SupR CVBM 33 23 0 27 Sep 2019
Recursive Visual Sound Separation Using Minus-Plus Net Xudong Xu Bo Dai Dahua Lin 32 91 0 30 Aug 2019
EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition Evangelos Kazakos Arsha Nagrani Andrew Zisserman Dima Damen EgoV 16 332 0 22 Aug 2019
Multi-task Self-Supervised Learning for Human Activity Detection Aaqib Saeed T. Ozcelebi J. Lukkien SSL 21 268 0 27 Jul 2019
Learning Representations by Maximizing Mutual Information Across Views Philip Bachman R. Devon Hjelm William Buchwalter SSL 67 1,452 0 03 Jun 2019
Data-Efficient Image Recognition with Contrastive Predictive Coding Olivier J. Hénaff A. Srinivas J. Fauw Ali Razavi Carl Doersch S. M. Ali Eslami Aaron van den Oord SSL 58 1,415 0 22 May 2019
Scaling and Benchmarking Self-Supervised Visual Representation Learning Priya Goyal D. Mahajan Abhinav Gupta Ishan Misra SSL 24 396 0 03 May 2019
Audio-Visual Model Distillation Using Acoustic Images Andrés F. Pérez Valentina Sanguineti Pietro Morerio Vittorio Murino VLM 15 27 0 16 Apr 2019
Co-Separating Sounds of Visual Objects Ruohan Gao Kristen Grauman 27 205 0 16 Apr 2019
The Sound of Motions Hang Zhao Chuang Gan Wei-Chiu Ma Antonio Torralba 17 250 0 11 Apr 2019
An Attempt towards Interpretable Audio-Visual Video Captioning Yapeng Tian Chenxiao Guan Justin Goodman Marc Moore Chenliang Xu 36 20 0 07 Dec 2018
Learnable PINs: Cross-Modal Embeddings for Person Identity Arsha Nagrani Samuel Albanie Andrew Zisserman SSL 26 140 0 02 May 2018
Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events Sanjeel Parekh S. Essid A. Ozerov Ngoc Q. K. Duong P. Pérez G. Richard SSL 8 19 0 19 Apr 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features Andrew Owens Alexei A. Efros SSL 51 743 0 10 Apr 2018
The Sound of Pixels Hang Zhao Chuang Gan Andrew Rouditchenko Carl Vondrick Josh H. McDermott Antonio Torralba VLM 22 527 0 09 Apr 2018
Audio-Visual Event Localization in Unconstrained Videos Yapeng Tian Jing Shi Bochen Li Zhiyao Duan Chenliang Xu 31 425 0 23 Mar 2018