2209.14078
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
28 September 2022
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
Papers citing
"MeWEHV: Mel and Wave Embeddings for Human Voice Tasks"
37 / 37 papers shown
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation
Dan Berrebbi
Jiatong Shi
Brian Yan
Osbel López-Francisco
Jonathan D. Amith
Shinji Watanabe
56
27
0
05 Apr 2022
BERT-LID: Leveraging BERT to Improve Spoken Language Identification
Yuting Nie
Junhong Zhao
Weiqiang Zhang
Jinfeng Bai
VLM
61
5
0
01 Mar 2022
Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech
Quan Wang
Yang Yu
Jason W. Pelecanos
Yiling Huang
Ignacio López Moreno
54
14
0
24 Feb 2022
Emotional Speaker Identification using a Novel Capsule Nets Model
Ali Bou Nassif
I. Shahin
A. Elnagar
Divya Velayudhan
A. Alhudhaif
K. Polat
57
28
0
09 Jan 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
265
1,905
0
26 Oct 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
184
3,003
0
14 Jun 2021
SUPERB: Speech processing Universal PERformance Benchmark
Shu-Wen Yang
Po-Han Chi
Yung-Sung Chuang
Cheng-I Jeff Lai
Kushal Lakhotia
...
Shuyan Dong
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
SSL
108
942
0
03 May 2021
The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods
Xian Shi
Fan Yu
Yizhou Lu
Yuhao Liang
Qiangze Feng
Daliang Wang
Y. Qian
Lei Xie
55
67
0
20 Feb 2021
AISPEECH-SJTU accent identification system for the Accented English Speech Recognition Challenge
Houjun Huang
Xu Xiang
Yexin Yang
Rao Ma
Y. Qian
70
25
0
19 Feb 2021
CASA-Based Speaker Identification Using Cascaded GMM-CNN Classifier in Noisy and Emotional Talking Conditions
Ali Bou Nassif
I. Shahin
Shibani Hamsa
Nawel Nemmour
K. Hirose
46
58
0
11 Feb 2021
MLS: A Large-Scale Multilingual Dataset for Speech Research
Vineel Pratap
Qiantong Xu
Anuroop Sriram
Gabriel Synnaeve
R. Collobert
AuLLM
99
511
0
07 Dec 2020
Audio Tagging by Cross Filtering Noisy Labels
Boqing Zhu
Kele Xu
Qiuqiang Kong
Huaimin Wang
Yuxing Peng
NoLa
298
16
0
16 Jul 2020
NISP: A Multi-lingual Multi-accent Dataset for Speaker Profiling
Shareef Babu Kalluri
Deepu Vijayasenan
Sriram Ganapathy
M. RageshRajan
Prashant Krishnan
44
18
0
12 Jul 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
154
782
0
24 Jun 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
299
5,849
0
20 Jun 2020
AP20-OLR Challenge: Three Tasks and Their Baselines
Zheng Li
Miao Zhao
Q. Hong
Lin Li
Zhiyuan Tang
Dong Wang
Liming Song
Cheng Yang
67
34
0
04 Jun 2020
AccentDB: A Database of Non-Native English Accents to Assist Neural Speech Recognition
Afroz Ahamad
Ankit Anand
Pranesh Bhargava
34
23
0
16 May 2020
Towards Learning a Universal Non-Semantic Representation of Speech
Joel Shor
A. Jansen
Ronnie Maor
Oran Lang
Omry Tuval
Félix de Chaumont Quitry
Marco Tagliasacchi
Ira Shavitt
Dotan Emanuel
Yinnon A. Haviv
SSL
130
158
0
25 Feb 2020
Multi-Representation Knowledge Distillation For Audio Classification
Liang Gao
Kele Xu
Huaimin Wang
Yuxing Peng
110
26
0
22 Feb 2020
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
194
1,084
0
21 Dec 2019
Libri-Light: A Benchmark for ASR with Limited or No Supervision
Jacob Kahn
M. Rivière
Weiyi Zheng
Evgeny Kharitonov
Qiantong Xu
...
Tatiana Likhomanenko
Gabriel Synnaeve
Armand Joulin
Abdel-rahman Mohamed
Emmanuel Dupoux
AuLLM
75
673
0
17 Dec 2019
Common Voice: A Massively-Multilingual Speech Corpus
Rosana Ardila
Megan Branson
Kelly Davis
Michael Henretty
M. Kohler
Josh Meyer
Reuben Morais
Lindsay Saunders
Francis M. Tyers
Gregor Weber
VLM
93
1,620
0
13 Dec 2019
A Comprehensive Survey on Transfer Learning
Fuzhen Zhuang
Zhiyuan Qi
Keyu Duan
Dongbo Xi
Yongchun Zhu
Hengshu Zhu
Hui Xiong
Qing He
188
4,474
0
07 Nov 2019
Spoken Language Identification using ConvNets
Sarthak
Shikhar Shukla
Govind Mittal
39
28
0
09 Oct 2019
MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible
Marcely Zanon Boito
William N. Havard
Mahault Garnerin
Éric Le Ferrand
Laurent Besacier
80
47
0
30 Jul 2019
Speech Model Pre-training for End-to-End Spoken Language Understanding
Loren Lugosch
Mirco Ravanelli
Patrick Ignoto
Vikrant Singh Tomar
Yoshua Bengio
SyDa
AuLLM
70
355
0
07 Apr 2019
A Survey of the Recent Architectures of Deep Convolutional Neural Networks
Asifullah Khan
A. Sohail
Umme Zahoora
Aqsa Saeed Qureshi
OOD
114
2,310
0
17 Jan 2019
Novel Cascaded Gaussian Mixture Model-Deep Neural Network Classifier for Speaker Identification in Emotional Talking Environments
I. Shahin
Ali Bou Nassif
Shibani Hamsa
27
49
0
11 Oct 2018
General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline
Eduardo Fonseca
Manoj Plakal
F. Font
D. Ellis
Xavier Favory
Jordi Pons
Xavier Serra
91
149
0
26 Jul 2018
A multi-device dataset for urban acoustic scene classification
A. Mesaros
Toni Heittola
Tuomas Virtanen
35
381
0
25 Jul 2018
From Word to Sense Embeddings: A Survey on Vector Representations of Meaning
Jose Camacho-Collados
Mohammad Taher Pilehvar
80
341
0
10 May 2018
Negative Log Likelihood Ratio Loss for Deep Neural Network Classification
Donglai Zhu
Hengshuai Yao
Bei Jiang
Yu Peng
54
78
0
27 Apr 2018
Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition
Pete Warden
97
1,627
0
09 Apr 2018
VoxCeleb: a large-scale speaker identification dataset
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
127
2,283
0
26 Jun 2017
Learning without Forgetting
Zhizhong Li
Derek Hoiem
CLL
OOD
SSL
308
4,432
0
29 Jun 2016
THCHS-30 : A Free Chinese Speech Corpus
Dong Wang
Xuewei Zhang
84
233
0
07 Dec 2015
Semi-Supervised Learning with Deep Generative Models
Diederik P. Kingma
Danilo Jimenez Rezende
S. Mohamed
Max Welling
GAN
SSL
BDL
100
2,742
0
20 Jun 2014