Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds

9 March 2020
A. Vasudevan
Dengxin Dai
Luc Van Gool
    ObjD

Papers citing "Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds"

15 / 15 papers shown
OmniAudio: Generating Spatial Audio from 360-Degree Video
Huadai Liu
Tianyi Luo
Qikai Jiang
Kaicheng Luo
Peiwen Sun
...
Xin Li
Shiliang Zhang
Zhijie Yan
Zhou Zhao
Wei Xue
VGen
58
0
0
21 Apr 2025
AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness
Yizhuo Yang
Shenghai Yuan
Muqing Cao
Jianfei Yang
Lihua Xie
59
7
0
11 Nov 2024
Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions
Jinzheng Zhao
Yong-mei Xu
Xinyuan Qian
Davide Berghi
Peipei Wu
Meng Cui
Jianyuan Sun
Philip J. B. Jackson
Wenwu Wang
BDL
50
7
0
23 Oct 2023
Tragic Talkers: A Shakespearean Sound- and Light-Field Dataset for Audio-Visual Machine Learning Research
Davide Berghi
M. Volino
Philip J. B. Jackson
VGen
23
6
0
04 Dec 2022
Deep Learning for Omnidirectional Vision: A Survey and New Perspectives
Hao Ai
Zidong Cao
Jin Zhu
Haotian Bai
Yucheng Chen
Ling Wang
44
35
0
21 May 2022
Invisible-to-Visible: Privacy-Aware Human Segmentation using Airborne Ultrasound via Collaborative Learning Probabilistic U-Net
Risako Tanigawa
Yasunori Ishii
Kazuki Kozuka
Takayoshi Yamashita
35
1
0
11 May 2022
Visually Supervised Speaker Detection and Localization via Microphone Array
Davide Berghi
A. Hilton
Philip J. B. Jackson
27
11
0
07 Mar 2022
Sound and Visual Representation Learning with Multiple Pretraining Tasks
A. Vasudevan
Dengxin Dai
Luc Van Gool
SSL
38
6
0
04 Jan 2022
Beyond Mono to Binaural: Generating Binaural Audio from Mono Audio with Depth and Cross Modal Attention
Kranti K. Parida
Siddharth Srivastava
Gaurav Sharma
MDE
41
20
0
15 Nov 2021
Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos
Heeseung Yun
Youngjae Yu
Wonsuk Yang
Kangil Lee
Gunhee Kim
30
79
0
11 Oct 2021
Visually Informed Binaural Audio Generation without Binaural Audios
Xudong Xu
Hang Zhou
Ziwei Liu
Bo Dai
Xiaogang Wang
Dahua Lin
DiffM
13
55
0
13 Apr 2021
Can audio-visual integration strengthen robustness under multimodal attacks?
Yapeng Tian
Chenliang Xu
AAML
38
37
0
05 Apr 2021
Beyond Image to Depth: Improving Depth Prediction using Echoes
Kranti K. Parida
Siddharth Srivastava
Gaurav Sharma
MDE
50
37
0
15 Mar 2021
Capturing Omni-Range Context for Omnidirectional Segmentation
Kailun Yang
Jiaming Zhang
Simon Reiß
Xinxin Hu
Rainer Stiefelhagen
37
71
0
09 Mar 2021
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
Francisco Rivera Valverde
Juana Valeria Hurtado
Abhinav Valada
28
72
0
01 Mar 2021