1603.00982
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder
3 March 2016
Yu-An Chung
Chao-Chung Wu
Chia-Hao Shen
Hung-yi Lee
Lin-Shan Lee
AI4TS
Papers citing
"Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder"
50 / 94 papers shown
Title
Visually Grounded Speech Models have a Mutual Exclusivity Bias
Leanne Nortje
Dan Oneaţă
Yevgen Matusevych
Herman Kamper
SSL
91
1
0
20 Mar 2024
Acoustic models of Brazilian Portuguese Speech based on Neural Transformers
M. Gauy
Marcelo Finger
45
2
0
14 Dec 2023
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors
Shuyue Stella Li
Beining Xu
Xiangyu Zhang
Hexin Liu
Wen-Han Chao
Leibny Paola García
SSL
64
4
0
27 Nov 2023
Spoken Word2Vec: Learning Skipgram Embeddings from Speech
Mohammad Amaan Sayeed
Hanan Aldarmaki
57
0
0
15 Nov 2023
Matching Latent Encoding for Audio-Text based Keyword Spotting
K. Nishu
Minsik Cho
Devang Naik
86
16
0
08 Jun 2023
Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili
C. Jacobs
Nathanaël Carraz Rakotonirina
E. Chimoto
Bruce A. Bassett
Herman Kamper
75
5
0
01 Jun 2023
Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss
Hiroshi Sato
Ryo Masumura
Tsubasa Ochiai
Marc Delcroix
Takafumi Moriya
...
Kentaro Shinayama
Saki Mizuno
Mana Ihori
Tomohiro Tanaka
Nobukatsu Hojo
81
5
0
24 May 2023
Exploring How Generative Adversarial Networks Learn Phonological Representations
Jing Chen
Micha Elsner
GAN
65
4
0
21 May 2023
A Survey on Time-Series Pre-Trained Models
Qianli Ma
Ziqiang Liu
Zhenjing Zheng
Ziyang Huang
Siying Zhu
Zhongzhong Yu
James T. Kwok
AI4TS
103
56
0
18 May 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schlüter
Shinji Watanabe
VLM
94
172
0
03 Mar 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages
Sreepratha Ram
Hanan Aldarmaki
SSL
71
3
0
03 Jan 2023
TESSP: Text-Enhanced Self-Supervised Speech Pre-training
Zhuoyuan Yao
Shuo Ren
Sanyuan Chen
Ziyang Ma
Pengcheng Guo
Linfu Xie
93
5
0
24 Nov 2022
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach
Xulong Zhang
Jianzong Wang
Ning Cheng
Kexin Zhu
Jing Xiao
65
1
0
25 Oct 2022
TVLT: Textless Vision-Language Transformer
Zineng Tang
Jaemin Cho
Yixin Nie
Joey Tianyi Zhou
VLM
137
31
0
28 Sep 2022
Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network
Da-Rong Liu
Po-Chun Hsu
Yi-Chen Chen
Sung-Feng Huang
Shun-Po Chuang
Da-Yi Wu
Hung-yi Lee
GAN
74
7
0
29 Jul 2022
Towards Proper Contrastive Self-supervised Learning Strategies For Music Audio Representation
Jeong-Eun Choi
Seongwon Jang
Hyunsouk Cho
Sehee Chung
SSL
48
6
0
10 Jul 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
293
368
0
21 May 2022
Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning
Robin Algayres
Adel Nabli
Benoît Sagot
Emmanuel Dupoux
SSL
79
8
0
11 Apr 2022
Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition
N. J. Wang
Zongfeng Quan
Shaojun Wang
Jing Xiao
48
1
0
08 Apr 2022
Asymmetric Proxy Loss for Multi-View Acoustic Word Embeddings
Myunghun Jung
Hoirin Kim
81
4
0
30 Mar 2022
Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data
Gašper Beguš
Alan Zhou
SSL
124
5
0
22 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
101
11
0
01 Mar 2022
On Training Targets and Activation Functions for Deep Representation Learning in Text-Dependent Speaker Verification
A. Sarkar
Zheng-Hua Tan
56
2
0
17 Jan 2022
Deep Spoken Keyword Spotting: An Overview
Iván López-Espejo
Zheng-Hua Tan
John H. L. Hansen
Jesper Jensen
89
107
0
20 Nov 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Ankur Bapna
Yu-An Chung
Na Wu
Anmol Gulati
Ye Jia
J. Clark
Melvin Johnson
Jason Riesa
Alexis Conneau
Yu Zhang
VLM
139
96
0
20 Oct 2021
Interpreting intermediate convolutional layers in unsupervised acoustic word classification
Gašper Beguš
Alan Zhou
FAtt
SSL
75
5
0
05 Oct 2021
Modeling Dynamics of Facial Behavior for Mental Health Assessment
Minh Tran
Ellen R. Bradley
Michelle Matvey
J. Woolley
M. Soleymani
CVBM
45
3
0
23 Aug 2021
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation
Jian Luo
Jianzong Wang
Ning Cheng
Jing Xiao
SSL
79
6
0
09 Jul 2021
Unsupervised Automatic Speech Recognition: A Review
Hanan Aldarmaki
Asad Ullah
Nazar Zaki
VLM
SSL
57
59
0
09 Jun 2021
A Novel Semi-supervised Framework for Call Center Agent Malpractice Detection via Neural Feature Learning
Şükrü Ozan
Leonardo O. Iheme
39
4
0
04 Jun 2021
Unsupervised Discriminative Learning of Sounds for Audio Event Classification
Sascha Hornauer
Ke Li
Stella X. Yu
Shabnam Ghaffarzadegan
Liu Ren
SSL
69
5
0
19 May 2021
Interpreting intermediate convolutional layers of generative CNNs trained on waveforms
Gašper Beguš
Alan Zhou
77
7
0
19 Apr 2021
Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales
Jacob Andreas
Gašper Beguš
M. Bronstein
R. Diamant
Denley Delaney
...
D. Tchernov
P. Tønnesen
Antonio Torralba
Daniel M. Vogt
Robert J. Wood
60
10
0
17 Apr 2021
Utilizing Self-supervised Representations for MOS Prediction
Wei-Cheng Tseng
Chien-yu Huang
Wei-Tsung Kao
Yist Y. Lin
Hung-yi Lee
SSL
117
65
0
07 Apr 2021
Auto-KWS 2021 Challenge: Task, Datasets, and Baselines
Jingsong Wang
Yuxuan He
Chunyu Zhao
Qijie Shao
Wei-Wei Tu
Tom Ko
Hung-yi Lee
Lei Xie
66
4
0
31 Mar 2021
Broad-UNet: Multi-scale feature learning for nowcasting tasks
Jesús García Fernández
S. Mehrkanoon
70
70
0
12 Feb 2021
A comparison of self-supervised speech representations as input features for unsupervised acoustic word embeddings
Lisa van Staden
Herman Kamper
SSL
67
16
0
14 Dec 2020
Acoustic span embeddings for multilingual query-by-example search
Yushi Hu
Shane Settle
Karen Livescu
RALM
74
8
0
24 Nov 2020
Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training
Sung-Feng Huang
Shun-Po Chuang
Da-Rong Liu
Yi-Chen Chen
Gene-Ping Yang
Hung-yi Lee
SSL
92
14
0
29 Oct 2020
Probing Acoustic Representations for Phonetic Properties
Danni Ma
Neville Ryant
M. Liberman
110
45
0
25 Oct 2020
Contrastive Learning of General-Purpose Audio Representations
Aaqib Saeed
David Grangier
Neil Zeghidour
VLM
SSL
91
272
0
21 Oct 2020
Identity-Based Patterns in Deep Convolutional Networks: Generative Adversarial Phonology and Reduplication
Gašper Beguš
GAN
SSL
59
16
0
13 Sep 2020
Automatic Detection of Phonological Errors in Child Speech Using Siamese Recurrent Autoencoder
Si-Ioi Ng
Tan Lee
42
7
0
07 Aug 2020
Evaluating computational models of infant phonetic learning across languages
Yevgen Matusevych
Thomas Schatz
Herman Kamper
Naomi H Feldman
Sharon Goldwater
57
14
0
06 Aug 2020
Evaluating the reliability of acoustic speech embeddings
Robin Algayres
Mohamed Salah Zaiem
Benoît Sagot
Emmanuel Dupoux
94
29
0
27 Jul 2020
Whole-Word Segmental Speech Recognition with Acoustic Word Embeddings
Bowen Shi
Shane Settle
Karen Livescu
74
4
0
01 Jul 2020
CiwGAN and fiwGAN: Encoding information in acoustic data to model lexical learning with Generative Adversarial Networks
Gašper Beguš
GAN
72
35
0
04 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder
Kazi Nazmul Haque
R. Rana
Björn W Schuller
DRL
100
12
0
01 Jun 2020
Improved Speech Representations with Multi-Target Autoregressive Predictive Coding
Yu-An Chung
James R. Glass
SSL
90
56
0
11 Apr 2020
Analyzing autoencoder-based acoustic word embeddings
Yevgen Matusevych
Herman Kamper
Sharon Goldwater
59
12
0
03 Apr 2020