AAD-LLM: Neural Attention-Driven Auditory Scene Understanding

24 February 2025
Xilin Jiang
Sukru Samet Dindar
Vishal B. Choudhari
Stephan Bickel
Ashesh Mehta
Guy M McKhann
Adeen Flinker
Daniel Friedman
Nima Mesgarani
Main: 7 pages · 9 figures · Bibliography: 4 pages · 14 tables · Appendix: 12 pages
Abstract

Auditory foundation models, including auditory large language models (LLMs), process all sound inputs equally, independent of listener perception. However, human auditory perception is inherently selective: listeners focus on specific speakers while ignoring others in complex auditory scenes. Existing models do not incorporate this selectivity, limiting their ability to generate perception-aligned responses. To address this, we introduce Intention-Informed Auditory Scene Understanding (II-ASU) and present Auditory Attention-Driven LLM (AAD-LLM), a prototype system that integrates brain signals to infer listener attention. AAD-LLM extends an auditory LLM by incorporating intracranial electroencephalography (iEEG) recordings to decode which speaker a listener is attending to and refine responses accordingly. The model first predicts the attended speaker from neural activity, then conditions response generation on this inferred attentional state. We evaluate AAD-LLM on speaker description, speech transcription and extraction, and question answering in multitalker scenarios, with both objective and subjective ratings showing improved alignment with listener intention. By taking a first step toward intention-aware auditory AI, this work explores a new paradigm where listener perception informs machine listening, paving the way for future listener-centered auditory systems. Demo and code available: this https URL.

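As a rough illustration of the two-stage pipeline the abstract describes (first decode the attended speaker from neural activity, then condition response generation on that inferred attentional state), here is a minimal Python sketch. It is not the authors' implementation: the correlation-based decoder, the prompt-marking scheme, and all function names and array shapes are hypothetical placeholders.

# Conceptual sketch of an attention-informed auditory pipeline.
# Stage 1: infer the attended speaker from iEEG features.
# Stage 2: mark the attended stream before handing the scene to an
# auditory LLM. Everything here is a simplified placeholder.

import numpy as np

def decode_attended_speaker(ieeg_features: np.ndarray,
                            speaker_envelopes: list[np.ndarray]) -> int:
    """Pick the speaker whose speech envelope correlates most strongly
    with a crude neural summary (stand-in for the paper's decoder)."""
    neural = ieeg_features.mean(axis=0)  # average over electrodes
    scores = []
    for env in speaker_envelopes:
        n = min(len(neural), len(env))
        scores.append(np.corrcoef(neural[:n], env[:n])[0, 1])
    return int(np.argmax(scores))

def build_attention_conditioned_prompt(question: str,
                                       speaker_streams: list[str],
                                       attended_idx: int) -> str:
    """Condition generation on the attentional state by tagging the
    attended stream in the prompt (hypothetical conditioning scheme)."""
    lines = [f"[attended] {s}" if i == attended_idx else f"[ignored] {s}"
             for i, s in enumerate(speaker_streams)]
    return "\n".join(lines) + f"\nQuestion: {question}"

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ieeg = rng.standard_normal((16, 1000))           # electrodes x time
    envs = [rng.standard_normal(1000) for _ in range(2)]
    idx = decode_attended_speaker(ieeg, envs)
    # In the real system this prompt would be passed to the auditory LLM.
    print(build_attention_conditioned_prompt(
        "What did the speaker say?", ["stream A text", "stream B text"], idx))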
@article{jiang2025_2502.16794,
  title={AAD-LLM: Neural Attention-Driven Auditory Scene Understanding},
  author={Xilin Jiang and Sukru Samet Dindar and Vishal Choudhari and Stephan Bickel and Ashesh Mehta and Guy M McKhann and Daniel Friedman and Adeen Flinker and Nima Mesgarani},
  journal={arXiv preprint arXiv:2502.16794},
  year={2025}
}