Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder

3 March 2016

Papers citing "Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder"

50 / 94 papers shown

Visually Grounded Speech Models have a Mutual Exclusivity Bias Leanne Nortje Dan Oneaţă Yevgen Matusevych Herman Kamper SSL 91 1 0 20 Mar 2024
Acoustic models of Brazilian Portuguese Speech based on Neural Transformers M. Gauy Marcelo Finger 45 2 0 14 Dec 2023
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors Shuyue Stella Li Beining Xu Xiangyu Zhang Hexin Liu Wen-Han Chao Leibny Paola García SSL 64 4 0 27 Nov 2023
Spoken Word2Vec: Learning Skipgram Embeddings from Speech Mohammad Amaan Sayeed Hanan Aldarmaki 57 0 0 15 Nov 2023
Matching Latent Encoding for Audio-Text based Keyword Spotting K. Nishu Minsik Cho Devang Naik 86 16 0 08 Jun 2023
Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili C. Jacobs Nathanaël Carraz Rakotonirina E. Chimoto Bruce A. Bassett Herman Kamper 75 5 0 01 Jun 2023
Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss Hiroshi Sato Ryo Masumura Tsubasa Ochiai Marc Delcroix Takafumi Moriya ... Kentaro Shinayama Saki Mizuno Mana Ihori Tomohiro Tanaka Nobukatsu Hojo 81 5 0 24 May 2023
Exploring How Generative Adversarial Networks Learn Phonological Representations Jing Chen Micha Elsner GAN 65 4 0 21 May 2023
A Survey on Time-Series Pre-Trained Models Qianli Ma Ziqiang Liu Zhenjing Zheng Ziyang Huang Siying Zhu Zhongzhong Yu James T. Kwok AI4TS 103 56 0 18 May 2023
End-to-End Speech Recognition: A Survey Rohit Prabhavalkar Takaaki Hori Tara N. Sainath Ralf Schlüter Shinji Watanabe VLM 94 172 0 03 Mar 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages Sreepratha Ram Hanan Aldarmaki SSL 71 3 0 03 Jan 2023
TESSP: Text-Enhanced Self-Supervised Speech Pre-training Zhuoyuan Yao Shuo Ren Sanyuan Chen Ziyang Ma Pengcheng Guo Linfu Xie 93 5 0 24 Nov 2022
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach Xulong Zhang Jianzong Wang Ning Cheng Kexin Zhu Jing Xiao 65 1 0 25 Oct 2022
TVLT: Textless Vision-Language Transformer Zineng Tang Jaemin Cho Yixin Nie Joey Tianyi Zhou VLM 137 31 0 28 Sep 2022
Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network Da-Rong Liu Po-Chun Hsu Yi-Chen Chen Sung-Feng Huang Shun-Po Chuang Da-Yi Wu Hung-yi Lee GAN 74 7 0 29 Jul 2022
Towards Proper Contrastive Self-supervised Learning Strategies For Music Audio Representation Jeong-Eun Choi Seongwon Jang Hyunsouk Cho Sehee Chung SSL 48 6 0 10 Jul 2022
Self-Supervised Speech Representation Learning: A Review Abdel-rahman Mohamed Hung-yi Lee Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin ... Shang-Wen Li Karen Livescu Lars Maaløe Tara N. Sainath Shinji Watanabe SSL AI4TS 293 368 0 21 May 2022
Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning Robin Algayres Adel Nabli Benoît Sagot Emmanuel Dupoux SSL 79 8 0 11 Apr 2022
Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition N. J. Wang Zongfeng Quan Shaojun Wang Jing Xiao 48 1 0 08 Apr 2022
Asymmetric Proxy Loss for Multi-View Acoustic Word Embeddings Myunghun Jung Hoirin Kim 81 4 0 30 Mar 2022
Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data Gašper Beguš Alan Zhou SSL 124 5 0 22 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin Lars Maaløe Christian Igel BDL AI4TS SSL 101 11 0 01 Mar 2022
On Training Targets and Activation Functions for Deep Representation Learning in Text-Dependent Speaker Verification A. Sarkar Zheng-Hua Tan 56 2 0 17 Jan 2022
Deep Spoken Keyword Spotting: An Overview Iván López-Espejo Zheng-Hua Tan John H. L. Hansen Jesper Jensen 89 107 0 20 Nov 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training Ankur Bapna Yu-An Chung Na Wu Anmol Gulati Ye Jia J. Clark Melvin Johnson Jason Riesa Alexis Conneau Yu Zhang VLM 139 96 0 20 Oct 2021
Interpreting intermediate convolutional layers in unsupervised acoustic word classification Gašper Beguš Alan Zhou FAtt SSL 75 5 0 05 Oct 2021
Modeling Dynamics of Facial Behavior for Mental Health Assessment Minh Tran Ellen R. Bradley Michelle Matvey J. Woolley M. Soleymani CVBM 45 3 0 23 Aug 2021
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation Jian Luo Jianzong Wang Ning Cheng Jing Xiao SSL 79 6 0 09 Jul 2021
Unsupervised Automatic Speech Recognition: A Review Hanan Aldarmaki Asad Ullah Nazar Zaki VLM SSL 57 59 0 09 Jun 2021
A Novel Semi-supervised Framework for Call Center Agent Malpractice Detection via Neural Feature Learning Şükrü Ozan Leonardo O. Iheme 39 4 0 04 Jun 2021
Unsupervised Discriminative Learning of Sounds for Audio Event Classification Sascha Hornauer Ke Li Stella X. Yu Shabnam Ghaffarzadegan Liu Ren SSL 69 5 0 19 May 2021
Interpreting intermediate convolutional layers of generative CNNs trained on waveforms Gašper Beguš Alan Zhou 77 7 0 19 Apr 2021
Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales Jacob Andreas Gašper Beguš M. Bronstein R. Diamant Denley Delaney ... D. Tchernov P. Tønnesen Antonio Torralba Daniel M. Vogt Robert J. Wood 60 10 0 17 Apr 2021
Utilizing Self-supervised Representations for MOS Prediction Wei-Cheng Tseng Chien-yu Huang Wei-Tsung Kao Yist Y. Lin Hung-yi Lee SSL 117 65 0 07 Apr 2021
Auto-KWS 2021 Challenge: Task, Datasets, and Baselines Jingsong Wang Yuxuan He Chunyu Zhao Qijie Shao Wei-Wei Tu Tom Ko Hung-yi Lee Lei Xie 66 4 0 31 Mar 2021
Broad-UNet: Multi-scale feature learning for nowcasting tasks Jesús García Fernández S. Mehrkanoon 70 70 0 12 Feb 2021
A comparison of self-supervised speech representations as input features for unsupervised acoustic word embeddings Lisa van Staden Herman Kamper SSL 67 16 0 14 Dec 2020
Acoustic span embeddings for multilingual query-by-example search Yushi Hu Shane Settle Karen Livescu RALM 74 8 0 24 Nov 2020
Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training Sung-Feng Huang Shun-Po Chuang Da-Rong Liu Yi-Chen Chen Gene-Ping Yang Hung-yi Lee SSL 92 14 0 29 Oct 2020
Probing Acoustic Representations for Phonetic Properties Danni Ma Neville Ryant M. Liberman 110 45 0 25 Oct 2020
Contrastive Learning of General-Purpose Audio Representations Aaqib Saeed David Grangier Neil Zeghidour VLM SSL 91 272 0 21 Oct 2020
Identity-Based Patterns in Deep Convolutional Networks: Generative Adversarial Phonology and Reduplication Gašper Beguš GAN SSL 59 16 0 13 Sep 2020
Automatic Detection of Phonological Errors in Child Speech Using Siamese Recurrent Autoencoder Si-Ioi Ng Tan Lee 42 7 0 07 Aug 2020
Evaluating computational models of infant phonetic learning across languages Yevgen Matusevych Thomas Schatz Herman Kamper Naomi H Feldman Sharon Goldwater 57 14 0 06 Aug 2020
Evaluating the reliability of acoustic speech embeddings Robin Algayres Mohamed Salah Zaiem Benoît Sagot Emmanuel Dupoux 94 29 0 27 Jul 2020
Whole-Word Segmental Speech Recognition with Acoustic Word Embeddings Bowen Shi Shane Settle Karen Livescu 74 4 0 01 Jul 2020
CiwGAN and fiwGAN: Encoding information in acoustic data to model lexical learning with Generative Adversarial Networks Gašper Beguš GAN 72 35 0 04 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder Kazi Nazmul Haque R. Rana Björn W Schuller DRL 100 12 0 01 Jun 2020
Improved Speech Representations with Multi-Target Autoregressive Predictive Coding Yu-An Chung James R. Glass SSL 90 56 0 11 Apr 2020
Analyzing autoencoder-based acoustic word embeddings Yevgen Matusevych Herman Kamper Sharon Goldwater 59 12 0 03 Apr 2020