ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2105.07031
  4. Cited By
The Benefit Of Temporally-Strong Labels In Audio Event Classification

The Benefit Of Temporally-Strong Labels In Audio Event Classification

14 May 2021
Shawn Hershey
D. Ellis
Eduardo Fonseca
A. Jansen
Caroline Liu
Channing Moore
Manoj Plakal
ArXivPDFHTML

Papers citing "The Benefit Of Temporally-Strong Labels In Audio Event Classification"

28 / 28 papers shown
Title
TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
Paul Primus
Florian Schmid
Gerhard Widmer
CLIP
AI4TS
VLM
36
0
0
12 May 2025
FLAM: Frame-Wise Language-Audio Modeling
FLAM: Frame-Wise Language-Audio Modeling
Yusong Wu
Christos Tsirigotis
Ke Chen
Cheng-Zhi Anna Huang
Rameswar Panda
Oriol Nieto
Prem Seetharaman
Justin Salamon
50
0
0
08 May 2025
Formula-Supervised Sound Event Detection: Pre-Training Without Real Data
Formula-Supervised Sound Event Detection: Pre-Training Without Real Data
Yuto Shibata
Keitaro Tanaka
Yoshiaki Bando
Keisuke Imoto
Hirokatsu Kataoka
Yoshimitsu Aoki
31
0
0
06 Apr 2025
AudioX: Diffusion Transformer for Anything-to-Audio Generation
AudioX: Diffusion Transformer for Anything-to-Audio Generation
Zeyue Tian
Yizhu Jin
Zhaoyang Liu
Ruibin Yuan
Xu Tan
Qifeng Chen
Wei Xue
Yu Guo
67
3
0
13 Mar 2025
Audio-Language Datasets of Scenes and Events: A Survey
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
81
2
0
10 Jan 2025
Gotta Hear Them All: Sound Source Aware Vision to Audio Generation
Gotta Hear Them All: Sound Source Aware Vision to Audio Generation
Wei Guo
Heng Wang
Jianbo Ma
Weidong Cai
DiffM
93
3
0
23 Nov 2024
AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models
AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models
Kim Sung-Bin
Oh Hyun-Bin
JungMok Lee
Arda Senocak
Joon Son Chung
Tae-Hyun Oh
MLLM
VLM
48
3
0
23 Oct 2024
Do Audio-Language Models Understand Linguistic Variations?
Do Audio-Language Models Understand Linguistic Variations?
Ramaneswaran Selvakumar
Sonal Kumar
Hemant Kumar Giri
Nishit Anand
Ashish Seth
Sreyan Ghosh
Dinesh Manocha
AuLLM
VLM
55
1
0
21 Oct 2024
AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions
AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions
Yishuo Wang
Hangting Chen
Dongchao Yang
Zhiyong Wu
Xixin Wu
DiffM
45
2
0
19 Sep 2024
Effective Pre-Training of Audio Transformers for Sound Event Detection
Effective Pre-Training of Audio Transformers for Sound Event Detection
Florian Schmid
T. Morocutti
Francesco Foscarin
Jan Schluter
Paul Primus
Gerhard Widmer
ViT
33
2
0
14 Sep 2024
Improving Audio Spectrogram Transformers for Sound Event Detection
  Through Multi-Stage Training
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
Florian Schmid
Paul Primus
T. Morocutti
Jonathan Greif
Gerhard Widmer
34
5
0
17 Jul 2024
AudioSetMix: Enhancing Audio-Language Datasets with LLM-Assisted
  Augmentations
AudioSetMix: Enhancing Audio-Language Datasets with LLM-Assisted Augmentations
David Xu
25
2
0
17 May 2024
A-JEPA: Joint-Embedding Predictive Architecture Can Listen
A-JEPA: Joint-Embedding Predictive Architecture Can Listen
Zhengcong Fei
Mingyuan Fan
Junshi Huang
25
17
0
27 Nov 2023
CompA: Addressing the Gap in Compositional Reasoning in Audio-Language
  Models
CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
Sreyan Ghosh
Ashish Seth
Sonal Kumar
Utkarsh Tyagi
Chandra Kiran Reddy Evuru
S. Ramaneswaran
S. Sakshi
Oriol Nieto
R. Duraiswami
Dinesh Manocha
AuLLM
VLM
CoGe
43
23
0
12 Oct 2023
Joint Audio and Speech Understanding
Joint Audio and Speech Understanding
Yuan Gong
Alexander H. Liu
Hongyin Luo
Leonid Karlinsky
James R. Glass
AuLLM
28
69
0
25 Sep 2023
Advancing Natural-Language Based Audio Retrieval with PaSST and Large
  Audio-Caption Data Sets
Advancing Natural-Language Based Audio Retrieval with PaSST and Large Audio-Caption Data Sets
Paul Primus
Khaled Koutini
Gerhard Widmer
32
13
0
08 Aug 2023
AnuraSet: A dataset for benchmarking Neotropical anuran calls
  identification in passive acoustic monitoring
AnuraSet: A dataset for benchmarking Neotropical anuran calls identification in passive acoustic monitoring
Juan Sebastián Canas
Maria Paula Toro-Gómez
L. S. M. Sugai
H. Benítez-Restrepo
J. Rudas
...
José Luiz Massao Moreira Sugai
Carolina Emília dos Santos
R. Bastos
Diego Llusia
J. Ulloa
36
18
0
11 Jul 2023
Streaming Audio Transformers for Online Audio Tagging
Streaming Audio Transformers for Online Audio Tagging
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
Bin Wang
34
4
0
29 May 2023
Training sound event detection with soft labels from crowdsourced
  annotations
Training sound event detection with soft labels from crowdsourced annotations
Irene Martín-Morató
Manu Harju
Paul Ahokas
A. Mesaros
15
16
0
28 Feb 2023
Epic-Sounds: A Large-scale Dataset of Actions That Sound
Epic-Sounds: A Large-scale Dataset of Actions That Sound
Jaesung Huh
Jacob Chalk
Evangelos Kazakos
Dima Damen
Andrew Zisserman
EgoV
21
41
0
01 Feb 2023
UniKW-AT: Unified Keyword Spotting and Audio Tagging
UniKW-AT: Unified Keyword Spotting and Audio Tagging
Heinrich Dinkel
Yongqing Wang
Zhiyong Yan
Junbo Zhang
Yujun Wang
45
3
0
23 Sep 2022
Masked Autoencoders that Listen
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
21
268
0
13 Jul 2022
RaDur: A Reference-aware and Duration-robust Network for Target Sound
  Detection
RaDur: A Reference-aware and Duration-robust Network for Target Sound Detection
Dongchao Yang
Helin Wang
Zhongjie Ye
Yuexian Zou
Wenwu Wang
28
0
0
05 Apr 2022
A Mixed supervised Learning Framework for Target Sound Detection
A Mixed supervised Learning Framework for Target Sound Detection
Dongchao Yang
Helin Wang
Yuexian Zou
Wenwu Wang
22
0
0
05 Apr 2022
AudioTagging Done Right: 2nd comparison of deep learning methods for
  environmental sound classification
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification
Juncheng Billy Li
Shuhui Qu
Po-Yao (Bernie) Huang
Florian Metze
VLM
36
9
0
25 Mar 2022
A benchmark of state-of-the-art sound event detection systems evaluated
  on synthetic soundscapes
A benchmark of state-of-the-art sound event detection systems evaluated on synthetic soundscapes
Francesca Ronchini
Romain Serizel
40
14
0
03 Feb 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound
  Classification and Detection
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
127
263
0
02 Feb 2022
FSD50K: An Open Dataset of Human-Labeled Sound Events
FSD50K: An Open Dataset of Human-Labeled Sound Events
Eduardo Fonseca
Xavier Favory
Jordi Pons
F. Font
Xavier Serra
24
436
0
01 Oct 2020
1