2.5D Visual Sound

11 December 2018

Papers citing "2.5D Visual Sound"

32 / 32 papers shown

Title
OmniAudio: Generating Spatial Audio from 360-Degree Video Huadai Liu Tianyi Luo Qikai Jiang Kaicheng Luo Peiwen Sun ... Xin Li Shiliang Zhang Zhijie Yan Zhou Zhao Wei Xue VGen 58 0 0 21 Apr 2025
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding Mingfei Chen I. D. Gebru Ishwarya Ananthabhotla Christian Richardt Dejan Marković Jake Sandakly Steven Krenn Todd Keebler Eli Shlizerman Alexander Richard 24 0 0 08 Apr 2025
Images that Sound: Composing Images and Sounds on a Single Canvas Ziyang Chen Daniel Geng Andrew Owens DiffM 50 9 0 20 May 2024
RealImpact: A Dataset of Impact Sound Fields for Real Objects Samuel Clarke Ruohan Gao Mason Wang M. Rau Julia Xu Jui-Hsien Wang Doug L. James Jiajun Wu 40 9 0 16 Jun 2023
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation Ruixin Zheng Yang Ai Zhenhua Ling 32 8 0 24 May 2023
A Closer Look at Weakly-Supervised Audio-Visual Source Localization Shentong Mo Pedro Morgado 83 64 0 30 Aug 2022
Visually Supervised Speaker Detection and Localization via Microphone Array Davide Berghi A. Hilton Philip J. B. Jackson 24 11 0 07 Mar 2022
Visual Acoustic Matching Changan Chen Ruohan Gao P. Calamia Kristen Grauman 21 56 0 14 Feb 2022
Active Audio-Visual Separation of Dynamic Sound Sources Sagnik Majumder Kristen Grauman 27 21 0 02 Feb 2022
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video Rishabh Garg Ruohan Gao Kristen Grauman 15 28 0 21 Nov 2021
A trained humanoid robot can perform human-like crossmodal social attention and conflict resolution Di Fu Fares Abawi Hugo C. C. Carneiro Matthias Kerzel Ziwei Chen Erik Strahl Xun Liu S. Wermter 17 6 0 02 Nov 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video Kristen Grauman Andrew Westbury Eugene Byrne Zachary Chavis Antonino Furnari ... Mike Zheng Shou Antonio Torralba Lorenzo Torresani Mingfei Yan Jitendra Malik EgoV 269 1,026 0 13 Oct 2021
Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos Heeseung Yun Youngjae Yu Wonsuk Yang Kangil Lee Gunhee Kim 25 79 0 11 Oct 2021
FoleyGAN: Visually Guided Generative Adversarial Network-Based Synchronous Sound Generation in Silent Videos Sanchita Ghose John J. Prevost GAN 27 26 0 20 Jul 2021
Unsupervised Sound Localization via Iterative Contrastive Learning Yan-Bo Lin Hung-Yu Tseng Hsin-Ying Lee Yen-Yu Lin Ming-Hsuan Yang SSL 27 34 0 01 Apr 2021
Robust Audio-Visual Instance Discrimination Pedro Morgado Ishan Misra Nuno Vasconcelos SSL 19 110 0 29 Mar 2021
Semantic Audio-Visual Navigation Changan Chen Ziad Al-Halah Kristen Grauman 50 104 0 21 Dec 2020
Learning Representations from Audio-Visual Spatial Alignment Pedro Morgado Yi Li Nuno Vasconcelos SSL 27 121 0 03 Nov 2020
Self-Supervised Learning of Audio-Visual Objects from Video Triantafyllos Afouras Andrew Owens Joon Son Chung Andrew Zisserman SSL 19 253 0 10 Aug 2020
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing Yapeng Tian Dingzeyu Li Chenliang Xu 34 180 0 21 Jul 2020
Multiple Sound Sources Localization from Coarse to Fine Rui Qian Di Hu Heinrich Dinkel Mengyue Wu N. Xu Weiyao Lin 28 155 0 13 Jul 2020
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos Andrew Rouditchenko Angie Boggust David Harwath Brian Chen D. Joshi ... Rogerio Feris Brian Kingsbury M. Picheny Antonio Torralba James R. Glass SSL 22 141 0 16 Jun 2020
Telling Left from Right: Learning Spatial Correspondence of Sight and Sound Karren D. Yang Bryan C. Russell Justin Salamon SSL 24 75 0 11 Jun 2020
VisualEchoes: Spatial Image Representation Learning through Echolocation Ruohan Gao Changan Chen Ziad Al-Halah Carl Schissler Kristen Grauman MDE SSL 171 84 0 04 May 2020
The State of Lifelong Learning in Service Robots: Current Bottlenecks in Object Perception and Manipulation S. Kasaei J. Melsen Floris van Beers Christiaan Steenkist K. Vončina 22 12 0 18 Mar 2020
Listen to Look: Action Recognition by Previewing Audio Ruohan Gao Tae-Hyun Oh Kristen Grauman Lorenzo Torresani VLM 29 251 0 10 Dec 2019
Learning to Localize Sound Sources in Visual Scenes: Analysis and Applications Arda Senocak Tae-Hyun Oh Junsik Kim Ming-Hsuan Yang In So Kweon SSL 33 52 0 20 Nov 2019
Vision-Infused Deep Audio Inpainting Hang Zhou Ziwei Liu Lingfeng Guo Ping Luo Dahua Lin 35 88 0 24 Oct 2019
EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition Evangelos Kazakos Arsha Nagrani Andrew Zisserman Dima Damen EgoV 16 332 0 22 Aug 2019
Audio-Visual Model Distillation Using Acoustic Images Andrés F. Pérez Valentina Sanguineti Pietro Morerio Vittorio Murino VLM 15 27 0 16 Apr 2019
Co-Separating Sounds of Visual Objects Ruohan Gao Kristen Grauman 33 206 0 16 Apr 2019
The Sound of Motions Hang Zhao Chuang Gan Wei-Chiu Ma Antonio Torralba 17 251 0 11 Apr 2019