ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2207.09145
  4. Cited By
GAFX: A General Audio Feature eXtractor

GAFX: A General Audio Feature eXtractor

19 July 2022
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
ArXivPDFHTML

Papers citing "GAFX: A General Audio Feature eXtractor"

28 / 28 papers shown
Title
Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training
Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training
Dading Chong
Helin Wang
Peilin Zhou
Qingcheng Zeng
64
67
0
27 Apr 2022
Masked Spectrogram Modeling using Masked Autoencoders for Learning
  General-purpose Audio Representation
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
72
68
0
26 Apr 2022
Masked Autoencoders Are Scalable Vision Learners
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
451
7,739
0
11 Nov 2021
Hybrid Spectrogram and Waveform Source Separation
Hybrid Spectrogram and Waveform Source Separation
Alexandre Défossez
54
172
0
05 Nov 2021
Codified audio language modeling learns useful representations for music
  information retrieval
Codified audio language modeling learns useful representations for music information retrieval
Rodrigo Castellon
Chris Donahue
Percy Liang
101
89
0
12 Jul 2021
AST: Audio Spectrogram Transformer
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
104
863
0
05 Apr 2021
LEAF: A Learnable Frontend for Audio Classification
LEAF: A Learnable Frontend for Audio Classification
Neil Zeghidour
O. Teboul
Félix de Chaumont Quitry
Marco Tagliasacchi
VLM
AAML
100
147
0
21 Jan 2021
Training data-efficient image transformers & distillation through
  attention
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
377
6,762
0
23 Dec 2020
Rethinking CNN Models for Audio Classification
Rethinking CNN Models for Audio Classification
Kamalesh Palanisamy
Dipika Singhania
Angela Yao
SSL
59
144
0
22 Jul 2020
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio
  Representation
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation
Po-Han Chi
Pei-Hung Chung
Tsung-Han Wu
Chun-Cheng Hsieh
Yen-Hao Chen
Shang-Wen Li
Hung-yi Lee
SSL
36
147
0
18 May 2020
Jukebox: A Generative Model for Music
Jukebox: A Generative Model for Music
Prafulla Dhariwal
Heewoo Jun
Christine Payne
Jong Wook Kim
Alec Radford
Ilya Sutskever
VLM
107
746
0
30 Apr 2020
ESResNet: Environmental Sound Classification Based on Visual Domain
  Models
ESResNet: Environmental Sound Classification Based on Visual Domain Models
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
VLM
104
92
0
15 Apr 2020
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern
  Recognition
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
184
1,076
0
21 Dec 2019
Audiovisual Transformer Architectures for Large-Scale Classification and
  Synchronization of Weakly Labeled Audio Events
Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events
Wim Boes
Hugo Van hamme
46
17
0
02 Dec 2019
Music Source Separation in the Waveform Domain
Music Source Separation in the Waveform Domain
Alexandre Défossez
Nicolas Usunier
Léon Bottou
Francis R. Bach
114
272
0
27 Nov 2019
Demucs: Deep Extractor for Music Sources with extra unlabeled data
  remixed
Demucs: Deep Extractor for Music Sources with extra unlabeled data remixed
Alexandre Défossez
Nicolas Usunier
Léon Bottou
Francis R. Bach
54
84
0
03 Sep 2019
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan
Quoc V. Le
3DV
MedIm
137
18,115
0
28 May 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language
  Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.7K
94,770
0
11 Oct 2018
Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source
  Separation
Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation
Daniel Stoller
Sebastian Ewert
S. Dixon
AI4TS
128
595
0
08 Jun 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler
Andrew G. Howard
Menglong Zhu
A. Zhmoginov
Liang-Chieh Chen
178
19,271
0
13 Jan 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram
  Predictions
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
77
2,697
0
16 Dec 2017
Attention Is All You Need
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
687
131,526
0
12 Jun 2017
CNN Architectures for Large-Scale Audio Classification
CNN Architectures for Large-Scale Audio Classification
Shawn Hershey
Sourish Chaudhuri
D. Ellis
J. Gemmeke
A. Jansen
...
Rif A. Saurous
Bryan Seybold
M. Slaney
Ron J. Weiss
K. Wilson
120
2,498
0
29 Sep 2016
WaveNet: A Generative Model for Raw Audio
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
401
7,391
0
12 Sep 2016
Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoders
Aaron van den Oord
Nal Kalchbrenner
Oriol Vinyals
L. Espeholt
Alex Graves
Koray Kavukcuoglu
VLM
202
2,509
0
16 Jun 2016
Rethinking the Inception Architecture for Computer Vision
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
875
27,358
0
02 Dec 2015
U-Net: Convolutional Networks for Biomedical Image Segmentation
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
1.8K
77,133
0
18 May 2015
Very Deep Convolutional Networks for Large-Scale Image Recognition
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.6K
100,348
0
04 Sep 2014
1