Deep Neural Networks for Multiple Speaker Detection and Localization


30 November 2017
Weipeng He
P. Motlícek
J. Odobez

Papers citing "Deep Neural Networks for Multiple Speaker Detection and Localization"

25 / 25 papers shown
Mask-Weighted Spatial Likelihood Coding for Speaker-Independent Joint Localization and Mask Estimation
Jakob Kienegger
Alina Mannanova
Timo Gerkmann
46
0
0
10 Jan 2025
AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness
Yizhuo Yang
Shenghai Yuan
Muqing Cao
Jianfei Yang
Lihua Xie
51
7
0
11 Nov 2024
The Neural-SRP method for positional sound source localization
Eric Grinstein
Toon van Waterschoot
Mike Brookes
Patrick A. Naylor
26
2
0
14 Mar 2024
Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions
Jinzheng Zhao
Yong-mei Xu
Xinyuan Qian
Davide Berghi
Peipei Wu
Meng Cui
Jianyuan Sun
Philip J. B. Jackson
Wenwu Wang
BDL
45
7
0
23 Oct 2023
Lightweight Neural Architecture Search for Temporal Convolutional Networks at the Edge
Matteo Risso
Alessio Burrello
Francesco Conti
Lorenzo Lamberti
Yukai Chen
Luca Benini
Enrico Macii
M. Poncino
Daniele Jahier Pagliari
30
33
0
24 Jan 2023
MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation
Yanjie Fu
Haoran Yin
Meng Ge
Longbiao Wang
Gaoyan Zhang
J. Dang
Chengyun Deng
Fei Wang
CVBM
18
2
0
07 Dec 2022
Tragic Talkers: A Shakespearean Sound- and Light-Field Dataset for Audio-Visual Machine Learning Research
Davide Berghi
M. Volino
Philip J. B. Jackson
VGen
23
6
0
04 Dec 2022
Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function
Qing Wang
Hang Chen
Yannan Jiang
Zhe Wang
Yuyang Wang
Jun Du
Chin-Hui Lee
16
4
0
26 Oct 2022
Extending GCC-PHAT using Shift Equivariant Neural Networks
Axel Berg
Mark O'Connor
Kalle Åström
Magnus Oskarsson
28
10
0
09 Aug 2022
MIMO-DoAnet: Multi-channel Input and Multiple Outputs DoA Network with Unknown Number of Sound Sources
Haoran Yin
Meng Ge
Yanjie Fu
Gaoyan Zhang
Longbiao Wang
Lei Zhang
Lin Qiu
J. Dang
37
4
0
15 Jul 2022
Pruning In Time (PIT): A Lightweight Network Architecture Optimizer for Temporal Convolutional Networks
Matteo Risso
Alessio Burrello
Daniele Jahier Pagliari
Francesco Conti
Lorenzo Lamberti
Enrico Macii
Luca Benini
M. Poncino
23
10
0
28 Mar 2022
A Deep Reinforcement Learning Approach for Audio-based Navigation and Audio Source Localization in Multi-speaker Environments
Petros Giannakopoulos
Aggelos Pikrakis
Y. Cotronis
19
3
0
25 Oct 2021
A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection
A. Politis
Sharath Adavanne
D. Krause
Antoine Deleforge
Prerak Srivastava
Tuomas Virtanen
25
66
0
13 Jun 2021
PILOT: Introducing Transformers for Probabilistic Sound Event Localization
C. Schymura
Benedikt T. Bönninghoff
Tsubasa Ochiai
Marc Delcroix
K. Kinoshita
Tomohiro Nakatani
S. Araki
D. Kolossa
27
24
0
07 Jun 2021
Multi-target DoA Estimation with an Audio-visual Fusion Mechanism
Xinyuan Qian
Maulik C. Madhavi
Zexu Pan
Jiadong Wang
Haizhou Li
27
44
0
13 May 2021
A Deep Reinforcement Learning Approach to Audio-Based Navigation in a Multi-Speaker Environment
Petros Giannakopoulos
A. Pikrakis
Y. Cotronis
14
7
0
10 May 2021
BeamLearning: an end-to-end Deep Learning approach for the angular localization of sound sources using raw multichannel acoustic pressure data
Hadrien Pujol
Éric Bavu
Alexandre Garcia
44
22
0
27 Apr 2021
Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition
Aswin Shanmugam Subramanian
Chao Weng
Shinji Watanabe
Meng Yu
Dong Yu
34
78
0
16 Feb 2021
Efficient Training Data Generation for Phase-Based DOA Estimation
Fabian Hübner
Wolfgang Mack
Emanuel Habets
6
7
0
09 Nov 2020
Multi-Tones' Phase Coding (MTPC) of Interaural Time Difference by Spiking Neural Network
Zihan Pan
Malu Zhang
Jibin Wu
Haizhou Li
21
9
0
07 Jul 2020
Robust Sound Source Tracking Using SRP-PHAT and 3D Convolutional Neural Networks
David Diaz-Guerra
A. Miguel
J. R. Beltrán
32
87
0
16 Jun 2020
Multimodal active speaker detection and virtual cinematography for video conferencing
Ross Cutler
Ramin Mehran
Sam Johnson
Cha Zhang
Adam G. Kirk
Oliver Whyte
Adarsh Kowdle
18
7
0
10 Feb 2020
Time Difference of Arrival Estimation from Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks
Luca Comanducci
M. Cobos
Fabio Antonacci
Augusto Sarti
6
26
0
03 Feb 2020
MuMMER: Socially Intelligent Human-Robot Interaction in Public Spaces
Mary Ellen Foster
B. Craenen
A. Deshmukh
Oliver Lemon
E. Bastianelli
...
Maxime Caniot
Marketta Niemelä
Päivi Heikkilä
Hanna Lammi
Antti Tammela
10
34
0
15 Sep 2019
Listening for Sirens: Locating and Classifying Acoustic Alarms in City Scenes
Letizia Marchegiani
Paul Newman
11
35
0
11 Oct 2018