arXiv:2408.15771
wav2pos: Sound Source Localization using Masked Autoencoders

28 August 2024
Axel Berg
Jens Gulin
Mark O'Connor
Chuteng Zhou
Karl Åström
Magnus Oskarsson

Papers citing "wav2pos: Sound Source Localization using Masked Autoencoders"

12 / 12 papers shown
Graph neural networks for sound source localization on distributed microphone networks
Eric Grinstein
Mike Brookes
Patrick A. Naylor
34
9
0
28 Jun 2023
The LuViRA Dataset: Synchronized Vision, Radio, and Audio Sensors for Indoor Localization
Ilayda Yaman
Guoda Tian
Martin Larsson
Patrik Persson
Michiel Sandra
...
Fredrik Tufvesson
Karl Åström
O. Edfors
Steffen Malkowsky
Liang Liu
33
3
0
10 Feb 2023
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
74
280
0
13 Jul 2022
Multimodal Masked Autoencoders Learn Transferable Representations
Xinyang Geng
Hao Liu
Lisa Lee
Dale Schuurmans
Sergey Levine
Pieter Abbeel
64
118
0
27 May 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
451
7,739
0
11 Nov 2021
Robust Sound Source Tracking Using SRP-PHAT and 3D Convolutional Neural Networks
David Diaz-Guerra
A. Miguel
J. R. Beltrán
64
88
0
16 Jun 2020
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.7K
94,770
0
11 Oct 2018
Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates
Juan Manuel Vera-Diaz
Daniel Pizarro-Perez
Javier Macias-Guarasa
44
116
0
29 Jul 2018
Pyroomacoustics: A Python package for audio room simulations and array processing algorithms
Robin Scheibler
Eric Bezzam
Ivan Dokmanić
57
517
0
11 Oct 2017
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
410
10,482
0
21 Jul 2016
Gaussian Error Linear Units (GELUs)
Dan Hendrycks
Kevin Gimpel
169
5,000
0
27 Jun 2016
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
463
43,289
0
11 Feb 2015