ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2206.04006
  4. Cited By
Few-Shot Audio-Visual Learning of Environment Acoustics
v1v2 (latest)

Few-Shot Audio-Visual Learning of Environment Acoustics

8 June 2022
Sagnik Majumder
Changan Chen
Ziad Al-Halah
Kristen Grauman
ArXiv (abs)PDFHTML

Papers citing "Few-Shot Audio-Visual Learning of Environment Acoustics"

44 / 44 papers shown
Title
Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech
Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech
Rui Liu
Shuwei He
Yifan Hu
Hong Li
VLM
145
3
0
16 Dec 2024
Transforming Game Play: A Comparative Study of DCQN and DTQN
  Architectures in Reinforcement Learning
Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning
William A. Stigall
117
0
0
14 Oct 2024
A Comprehensive Review of Few-shot Action Recognition
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
157
3
0
20 Jul 2024
SOAF: Scene Occlusion-aware Neural Acoustic Field
SOAF: Scene Occlusion-aware Neural Acoustic Field
Huiyu Gao
Jiahao Ma
David Ahmedt-Aristizabal
Chuong H. Nguyen
Miaomiao Liu
115
2
0
02 Jul 2024
AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis
AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis
Swapnil Bhosale
Haosen Yang
Diptesh Kanojia
Jiankang Deng
Xiatian Zhu
106
5
0
13 Jun 2024
NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields
NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields
Amandine Brunetto
Sascha Hornauer
Fabien Moutarde
120
2
0
28 May 2024
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Changan Chen
Carl Schissler
Sanchit Garg
Philip Kobernik
Alexander Clegg
P. Calamia
Dhruv Batra
Philip Robinson
Kristen Grauman
3DGS
102
86
0
16 Jun 2022
Learning Neural Acoustic Fields
Learning Neural Acoustic Fields
Andrew F. Luo
Yilun Du
Michael J. Tarr
J. Tenenbaum
Antonio Torralba
Chuang Gan
AI4CE
67
84
0
04 Apr 2022
Sound Adversarial Audio-Visual Navigation
Sound Adversarial Audio-Visual Navigation
Yinfeng Yu
Wenbing Huang
Gang Hua
Changan Chen
Yikai Wang
Xiaohong Liu
AAML
69
29
0
22 Feb 2022
Visual Acoustic Matching
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
71
58
0
14 Feb 2022
Active Audio-Visual Separation of Dynamic Sound Sources
Active Audio-Visual Separation of Dynamic Sound Sources
Sagnik Majumder
Kristen Grauman
93
21
0
02 Feb 2022
Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization
Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization
Hao Jiang
Calvin Murdock
V. Ithapu
EgoV
83
41
0
06 Jan 2022
FAST-RIR: Fast neural diffuse room impulse response generator
FAST-RIR: Fast neural diffuse room impulse response generator
Anton Ratnarajah
Shi-Xiong Zhang
Meng Yu
Zhenyu Tang
Tianyi Zhou
Dong Yu
68
56
0
07 Oct 2021
Learning Audio-Visual Dereverberation
Learning Audio-Visual Dereverberation
Changan Chen
Wei-Ju Sun
David Harwath
Kristen Grauman
69
32
0
14 Jun 2021
Move2Hear: Active Audio-Visual Source Separation
Move2Hear: Active Audio-Visual Source Separation
Sagnik Majumder
Ziad Al-Halah
Kristen Grauman
58
44
0
15 May 2021
Visually Informed Binaural Audio Generation without Binaural Audios
Visually Informed Binaural Audio Generation without Binaural Audios
Xudong Xu
Hang Zhou
Ziwei Liu
Bo Dai
Xiaogang Wang
Dahua Lin
DiffM
49
59
0
13 Apr 2021
TS-RIR: Translated synthetic room impulse responses for speech
  augmentation
TS-RIR: Translated synthetic room impulse responses for speech augmentation
Anton Ratnarajah
Zhenyu Tang
Tianyi Zhou
55
18
0
31 Mar 2021
Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis
Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis
Nikhil Singh
Jeff Mentch
Jerry Ng
Matthew Beveridge
Iddo Drori
60
47
0
26 Mar 2021
Audio-Visual Floorplan Reconstruction
Audio-Visual Floorplan Reconstruction
Senthil Purushwalkam
S. V. A. Garí
V. Ithapu
Carl Schissler
Philip Robinson
Abhinav Gupta
Kristen Grauman
VGen3DV
140
41
0
31 Dec 2020
Semantic Audio-Visual Navigation
Semantic Audio-Visual Navigation
Changan Chen
Ziad Al-Halah
Kristen Grauman
96
106
0
21 Dec 2020
Discriminative Sounding Objects Localization via Self-supervised
  Audiovisual Matching
Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching
Di Hu
Rui Qian
Minyue Jiang
Xiao Tan
Shilei Wen
Errui Ding
Weiyao Lin
Dejing Dou
77
136
0
12 Oct 2020
Learning to Set Waypoints for Audio-Visual Navigation
Learning to Set Waypoints for Audio-Visual Navigation
Changan Chen
Sagnik Majumder
Ziad Al-Halah
Ruohan Gao
Santhosh Kumar Ramakrishnan
Kristen Grauman
SSL
87
5
0
21 Aug 2020
Self-Supervised Learning of Audio-Visual Objects from Video
Self-Supervised Learning of Audio-Visual Objects from Video
Triantafyllos Afouras
Andrew Owens
Joon Son Chung
Andrew Zisserman
SSL
121
256
0
10 Aug 2020
Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating
  Source Separation
Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation
Hang Zhou
Xudong Xu
Dahua Lin
Xiaogang Wang
Ziwei Liu
DiffM
80
84
0
20 Jul 2020
See, Hear, Explore: Curiosity via Audio-Visual Association
See, Hear, Explore: Curiosity via Audio-Visual Association
Victoria Dean
Shubham Tulsiani
Abhinav Gupta
103
59
0
07 Jul 2020
Implicit Neural Representations with Periodic Activation Functions
Implicit Neural Representations with Periodic Activation Functions
Vincent Sitzmann
Julien N. P. Martel
Alexander W. Bergman
David B. Lindell
Gordon Wetzstein
AI4TS
174
2,579
0
17 Jun 2020
VisualEchoes: Spatial Image Representation Learning through Echolocation
VisualEchoes: Spatial Image Representation Learning through Echolocation
Ruohan Gao
Changan Chen
Ziad Al-Halah
Carl Schissler
Kristen Grauman
MDESSL
229
84
0
04 May 2020
Look, Listen, and Act: Towards Audio-Visual Embodied Navigation
Look, Listen, and Act: Towards Audio-Visual Embodied Navigation
Chuang Gan
Yiwei Zhang
Jiajun Wu
Boqing Gong
J. Tenenbaum
82
139
0
25 Dec 2019
BatVision: Learning to See 3D Spatial Layout with Two Ears
BatVision: Learning to See 3D Spatial Layout with Two Ears
J. H. Christensen
Sascha Hornauer
Stella X. Yu
51
57
0
15 Dec 2019
Vision-Infused Deep Audio Inpainting
Vision-Infused Deep Audio Inpainting
Hang Zhou
Ziwei Liu
Lingfeng Guo
Ping Luo
Dahua Lin
142
88
0
24 Oct 2019
Audio-visual Speech Enhancement Using Conditional Variational
  Auto-Encoders
Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoders
M. Sadeghi
Simon Leglaive
Xavier Alameda-Pineda
Laurent Girin
Radu Horaud
DiffM
96
66
0
07 Aug 2019
MOSNet: Deep Learning based Objective Assessment for Voice Conversion
MOSNet: Deep Learning based Objective Assessment for Voice Conversion
Chen-Chou Lo
Szu-Wei Fu
Wen-Chin Huang
Xin Wang
Junichi Yamagishi
Yu Tsao
H. Wang
63
275
0
17 Apr 2019
The Sound of Motions
The Sound of Motions
Hang Zhao
Chuang Gan
Wei-Chiu Ma
Antonio Torralba
86
254
0
11 Apr 2019
Habitat: A Platform for Embodied AI Research
Habitat: A Platform for Embodied AI Research
Manolis Savva
Abhishek Kadian
Oleksandr Maksymets
Yili Zhao
Erik Wijmans
...
Jia-Wei Liu
V. Koltun
Jitendra Malik
Devi Parikh
Dhruv Batra
LM&Ro
131
1,423
0
02 Apr 2019
The Conversation: Deep Audio-Visual Speech Enhancement
The Conversation: Deep Audio-Visual Speech Enhancement
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
87
360
0
11 Apr 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
Andrew Owens
Alexei A. Efros
SSL
100
754
0
10 Apr 2018
Matterport3D: Learning from RGB-D Data in Indoor Environments
Matterport3D: Learning from RGB-D Data in Indoor Environments
Angel X. Chang
Angela Dai
Thomas Funkhouser
Maciej Halber
Matthias Nießner
Manolis Savva
Shuran Song
Andy Zeng
Yinda Zhang
3DV3DPC
211
1,918
0
18 Sep 2017
Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional
  Neural Networks
Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks
Jen-Cheng Hou
Syu-Siang Wang
Ying-Hui Lai
Yu Tsao
Hsiu-Wen Chang
H. Wang
97
198
0
01 Sep 2017
Attention Is All You Need
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
819
132,725
0
12 Jun 2017
Deep Residual Learning for Image Recognition
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
2.3K
194,641
0
10 Dec 2015
U-Net: Convolutional Networks for Biomedical Image Segmentation
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg3DV
1.9K
77,520
0
18 May 2015
Batch Normalization: Accelerating Deep Network Training by Reducing
  Internal Covariate Shift
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
469
43,357
0
11 Feb 2015
Adam: A Method for Stochastic Optimization
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.1K
150,433
0
22 Dec 2014
Deeply learned face representations are sparse, selective, and robust
Deeply learned face representations are sparse, selective, and robust
Yi Sun
Xiaogang Wang
Xiaoou Tang
CVBM
329
924
0
03 Dec 2014
1