Learning Temporal Resolution in Spectrogram for Audio Classification

4 October 2022
Haohe Liu
Xubo Liu
Qiuqiang Kong
Wenwu Wang
Mark D. Plumbley

Papers cited by "Learning Temporal Resolution in Spectrogram for Audio Classification"

37 / 37 papers shown
Title
Simple Pooling Front-ends For Efficient Audio Classification
Xubo Liu
Haohe Liu
Qiuqiang Kong
Xinhao Mei
Mark D. Plumbley
Wenwu Wang
62
16
0
03 Oct 2022
Segment-level Metric Learning for Few-shot Bioacoustic Event Detection
Haohe Liu
Xubo Liu
Xinhao Mei
Qiuqiang Kong
Wenwu Wang
Mark D. Plumbley
58
8
0
15 Jul 2022
Real time spectrogram inversion on mobile phone
Oleg Rybakov
Marco Tagliasacchi
Yunpeng Li
Liyang Jiang
Xia Zhang
Fadi Biadsy
79
4
0
01 Mar 2022
Learning strides in convolutional neural networks
Rachid Riad
O. Teboul
David Grangier
Neil Zeghidour
56
42
0
03 Feb 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
136
269
0
02 Feb 2022
SSAST: Self-Supervised Audio Spectrogram Transformer
Yuan Gong
Cheng-I Jeff Lai
Yu-An Chung
James R. Glass
ViT
65
273
0
19 Oct 2021
Efficient Training of Audio Transformers with Patchout
Khaled Koutini
Jan Schlüter
Hamid Eghbalzadeh
Gerhard Widmer
ViT
117
257
0
11 Oct 2021
Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation
Qiuqiang Kong
Yin Cao
Haohe Liu
Keunwoo Choi
Yuxuan Wang
141
100
0
12 Sep 2021
Learning Fast Sample Re-weighting Without Reward Data
Zizhao Zhang
Tomas Pfister
65
75
0
07 Sep 2021
Joint Echo Cancellation and Noise Suppression based on Cascaded Magnitude and Complex Mask Estimation
Xiaofeng Shu
Yehang Zhu
Yanjie Chen
Li Chen
Haohe Liu
Chuanzeng Huang
Yuxuan Wang
30
11
0
20 Jul 2021
Codified audio language modeling learns useful representations for music information retrieval
Rodrigo Castellon
Chris Donahue
Percy Liang
101
88
0
12 Jul 2021
Broadcasted Residual Learning for Efficient Keyword Spotting
Byeonggeun Kim
Simyung Chang
Jinkyu Lee
Dooyong Sung
51
122
0
08 Jun 2021
Slow-Fast Auditory Streams For Audio Recognition
Evangelos Kazakos
Arsha Nagrani
Andrew Zisserman
Dima Damen
69
67
0
05 Mar 2021
Speech enhancement with weakly labelled data from AudioSet
Qiuqiang Kong
Haohe Liu
Xingjian Du
Li Chen
Rui Xia
Yuxuan Wang
58
18
0
19 Feb 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
152
147
0
02 Feb 2021
LEAF: A Learnable Frontend for Audio Classification
Neil Zeghidour
O. Teboul
Félix de Chaumont Quitry
Marco Tagliasacchi
VLM
AAML
98
146
0
21 Jan 2021
FSD50K: An Open Dataset of Human-Labeled Sound Events
Eduardo Fonseca
Xavier Favory
Jordi Pons
F. Font
Xavier Serra
67
453
0
01 Oct 2020
Channel-wise Subband Input for Better Voice and Accompaniment Separation on High Resolution Music
Haohe Liu
Lei Xie
Jian Wu
Geng Yang
34
31
0
12 Aug 2020
VGGSound: A Large-scale Audio-Visual Dataset
Honglie Chen
Weidi Xie
Andrea Vedaldi
Andrew Zisserman
70
573
0
29 Apr 2020
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
160
1,074
0
21 Dec 2019
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan
Quoc V. Le
3DV
MedIm
129
18,058
0
28 May 2019
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
159
3,451
0
18 Apr 2019
Interpretable Convolutional Filters with SincNet
Mirco Ravanelli
Yoshua Bengio
47
105
0
23 Nov 2018
Speaker Recognition from Raw Waveform with SincNet
Mirco Ravanelli
Yoshua Bengio
130
712
0
29 Jul 2018
Norm-Preservation: Why Residual Networks Can Become Extremely Deep?
Alireza Zaeemzadeh
Nazanin Rahnavard
M. Shah
55
70
0
18 May 2018
Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition
Pete Warden
74
1,615
0
09 Apr 2018
Raw Waveform-based Audio Classification Using Sample-level CNN Architectures
Jongpil Lee
Taejun Kim
Jiyoung Park
Juhan Nam
38
67
0
04 Dec 2017
Learning Filterbanks from Raw Speech for Phone Recognition
Neil Zeghidour
Nicolas Usunier
Iasonas Kokkinos
Thomas Schatz
Gabriel Synnaeve
Emmanuel Dupoux
64
119
0
03 Nov 2017
mixup: Beyond Empirical Risk Minimization
Hongyi Zhang
Moustapha Cissé
Yann N. Dauphin
David Lopez-Paz
NoLa
269
9,743
0
25 Oct 2017
Implicit Regularization in Deep Learning
Behnam Neyshabur
50
146
0
06 Sep 2017
Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks
M. Huzaifah
AI4TS
54
148
0
22 Jun 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
624
130,942
0
12 Jun 2017
Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
Jesse Engel
Cinjon Resnick
Adam Roberts
Sander Dieleman
Douglas Eck
Karen Simonyan
Mohammad Norouzi
106
623
0
05 Apr 2017
Opening the Black Box of Deep Neural Networks via Information
Ravid Shwartz-Ziv
Naftali Tishby
AI4CE
98
1,407
0
02 Mar 2017
Trainable Frontend For Robust and Far-Field Keyword Spotting
Yuxuan Wang
Pascal Getreuer
Thad Hughes
R. Lyon
Rif A. Saurous
74
142
0
19 Jul 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
1.9K
193,426
0
10 Dec 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
415
43,234
0
11 Feb 2015