2106.01621
Cited By
ERANNs: Efficient Residual Audio Neural Networks for Audio Pattern Recognition
3 June 2021
S. Verbitskiy
Vladimir Berikov
Viacheslav Vyshegorodtsev
Papers citing
"ERANNs: Efficient Residual Audio Neural Networks for Audio Pattern Recognition"
32 / 32 papers shown
Title
Exploring Performance-Complexity Trade-Offs in Sound Event Detection
T. Morocutti
Florian Schmid
Jonathan Greif
Francesco Foscarin
Gerhard Widmer
43
0
0
14 Mar 2025
Multimodal Emotion Recognition using Audio-Video Transformer Fusion with Cross Attention
Joe Dhanith
Shravan Venkatraman
Modigari Narendra
Vigya Sharma
Santhosh Malarvannan
84
0
0
20 Feb 2025
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Vinan
75
0
0
24 Nov 2024
OneEncoder: A Lightweight Framework for Progressive Alignment of Modalities
Bilal Faye
Hanane Azzag
M. Lebbah
ObjD
41
0
0
17 Sep 2024
Audio-based Step-count Estimation for Running -- Windowing and Neural Network Baselines
Philipp Wagner
Andreas Triantafyllopoulos
Alexander Gebhard
Björn Schuller
37
0
0
10 Jun 2024
MultiMAE-DER: Multimodal Masked Autoencoder for Dynamic Emotion Recognition
Peihao Xiang
Chaohao Lin
Kaida Wu
Ou Bai
34
3
0
28 Apr 2024
AudioRepInceptionNeXt: A lightweight single-stream architecture for efficient audio recognition
Kin Wai Lau
Yasar Abbas Ur Rehman
L. Po
44
1
0
21 Apr 2024
Mixer is more than just a model
Qingfeng Ji
Yuxin Wang
Letong Sun
40
0
0
28 Feb 2024
Masked Audio Modeling with CLAP and Multi-Objective Learning
Yifei Xin
Xiulian Peng
Yan Lu
52
8
0
29 Jan 2024
HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition
Guoying Zhao
Zheng Lian
Bin Liu
Jianhua Tao
53
29
0
11 Jan 2024
VI-PANN: Harnessing Transfer Learning and Uncertainty-Aware Variational Inference for Improved Generalization in Audio Pattern Recognition
John Fischer
Marko Orescanin
Eric Eckstrand
UQCV
BDL
24
4
0
10 Jan 2024
A-JEPA: Joint-Embedding Predictive Architecture Can Listen
Zhengcong Fei
Mingyuan Fan
Junshi Huang
25
17
0
27 Nov 2023
TACNET: Temporal Audio Source Counting Network
Amirreza Ahmadnejad
Ahmad Mahmmodian Darviishani
Mohmmad Mehrdad Asadi
Sajjad Saffariyeh
Pedram Yousef
Emad Fatemizadeh
34
2
0
04 Nov 2023
Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models
Florian Schmid
Khaled Koutini
Gerhard Widmer
18
11
0
24 Oct 2023
Audio classification with Dilated Convolution with Learnable Spacings
Ismail Khalfaoui-Hassani
T. Masquelier
Thomas Pellegrini
20
1
0
25 Sep 2023
Multimodal Fish Feeding Intensity Assessment in Aquaculture
Meng Cui
Xubo Liu
Haohe Liu
Zhuangzhuang Du
Tao Chen
Guoping Lian
Daoliang Li
Wenwu Wang
28
5
0
10 Sep 2023
E-PANNs: Sound Recognition Using Efficient Pre-trained Audio Neural Networks
Arshdeep Singh
Haohe Liu
Mark D. Plumbley
VLM
19
4
0
30 May 2023
Zorro: the masked multimodal transformer
Adrià Recasens
Jason Lin
João Carreira
Drew Jaegle
Luyu Wang
...
Pauline Luc
Antoine Miech
Lucas Smaira
Ross Hemsley
Andrew Zisserman
39
20
0
23 Jan 2023
e-Inu: Simulating A Quadruped Robot With Emotional Sentience
Abhirup Chakravarty
Jatin Karthik Tripathy
S. Chakkaravarthy
Aswani Kumar Cherukuri
S. Anitha
Firuz Kamalov
Annapurna Jonnalagadda
14
0
0
03 Jan 2023
BEATs: Audio Pre-Training with Acoustic Tokenizers
Sanyuan Chen
Yu-Huan Wu
Chengyi Wang
Shujie Liu
Daniel C. Tompkins
Zhuo Chen
Furu Wei
41
255
0
18 Dec 2022
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
Florian Schmid
Khaled Koutini
Gerhard Widmer
ViT
25
58
0
09 Nov 2022
Effective Audio Classification Network Based on Paired Inverse Pyramid Structure and Dense MLP Block
Yunhao Chen
Yunjie Zhu
Zihui Yan
Yifan Huang
Zhen Ren
Jianlu Shen
Lifang Chen
28
9
0
05 Nov 2022
An investigation on selecting audio pre-trained models for audio captioning
Peiran Yan
Sheng-Wei Li
26
0
0
12 Aug 2022
CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net for the Single-Corpus and Cross-Corpus Speech Emotion Recognition
Xin-Cheng Wen
Jiaxin Ye
Yan Luo
Yong-mei Xu
Xinyu Wang
Changqing Wu
Kun Liu
27
30
0
18 Jul 2022
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
21
268
0
13 Jul 2022
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification
Juncheng Billy Li
Shuhui Qu
Po-Yao (Bernie) Huang
Florian Metze
VLM
36
9
0
25 Mar 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
121
264
0
02 Feb 2022
A cross-modal fusion network based on self-attention and residual structure for multimodal emotion recognition
Ziwang Fu
Feng Liu
Hanyang Wang
Jiayin Qi
Xiangling Fu
Aimin Zhou
Zhibin Li
17
30
0
03 Nov 2021
Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks
Sangeeta Srivastava
Yun Wang
Andros Tjandra
Anurag Kumar
Chunxi Liu
Kritika Singh
Yatharth Saraf
SSL
33
24
0
14 Oct 2021
AudioCLIP: Extending CLIP to Image, Text and Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
CLIP
VLM
25
360
0
24 Jun 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
104
144
0
02 Feb 2021
Meta Pseudo Labels
Hieu H. Pham
Zihang Dai
Qizhe Xie
Minh-Thang Luong
Quoc V. Le
VLM
262
656
0
23 Mar 2020