Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces

18 May 2018

Papers citing "Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces"

Speechless: Speech Instruction Training Without Speech for Low Resource Languages Alan Dao Dinh Bach Vu Huy Hoang Ha Tuan Le Duc Anh Shreyas Gopal Yue Heng Yeo Warren Keng Hoong Low Eng Siong Chng J. Yip SyDa 24 0 0 23 May 2025
Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only Yi-Chen Chen Chia-Hao Shen Sung-Feng Huang Hung-yi Lee 27 19 0 29 Mar 2018
Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech Yu-An Chung James R. Glass 3DV 49 184 0 23 Mar 2018
Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation A. Kocabiyikoglu Laurent Besacier Olivier Kraif 35 104 0 09 Feb 2018
Learning Word Embeddings from Speech Yu-An Chung James R. Glass SSL 58 19 0 05 Nov 2017
Unsupervised Machine Translation Using Monolingual Corpora Only Guillaume Lample Alexis Conneau Ludovic Denoyer Marc'Aurelio Ranzato SSL 86 1,091 0 31 Oct 2017
Unsupervised Neural Machine Translation Mikel Artetxe Gorka Labaka Eneko Agirre Kyunghyun Cho 62 772 0 30 Oct 2017
Word Translation Without Parallel Data Alexis Conneau Guillaume Lample Marc'Aurelio Ranzato Ludovic Denoyer Hervé Jégou 253 1,646 0 11 Oct 2017
An embedded segmental K-means model for unsupervised segmentation and clustering of speech Herman Kamper Karen Livescu Sharon Goldwater 32 95 0 23 Mar 2017
Offline bilingual word vectors, orthogonal transformations and the inverted softmax Samuel L. Smith David H. P. Turban Steven Hamblin Nils Y. Hammerla OffRL 42 536 0 13 Feb 2017
Multi-view Recurrent Neural Acoustic Word Embeddings Wanjia He Weiran Wang Karen Livescu 35 85 0 14 Nov 2016
Discriminative Acoustic Word Embeddings: Recurrent Neural Network-Based Approaches Shane Settle Karen Livescu 31 87 0 08 Nov 2016
Enriching Word Vectors with Subword Information Piotr Bojanowski Edouard Grave Armand Joulin Tomas Mikolov NAI SSL VLM 179 9,924 0 15 Jul 2016
Learning Crosslingual Word Embeddings without Bilingual Corpora Long Duong H. Kanayama Tengfei Ma Steven Bird Trevor Cohn 43 115 0 30 Jun 2016
A segmental framework for fully-unsupervised large-vocabulary speech recognition Herman Kamper A. Jansen Sharon Goldwater 38 103 0 22 Jun 2016
Unsupervised word segmentation and lexicon discovery using acoustic word embeddings Herman Kamper A. Jansen Sharon Goldwater SSL 21 74 0 09 Mar 2016
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder Yu-An Chung Chao-Chung Wu Chia-Hao Shen Hung-yi Lee Lin-Shan Lee AI4TS 50 182 0 03 Mar 2016
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin Dario Amodei Rishita Anubhai Eric Battenberg Carl Case Jared Casper ... Chong-Jun Wang Bo Xiao Dani Yogatama J. Zhan Zhenyao Zhu 87 2,965 0 08 Dec 2015
Deep convolutional acoustic word embeddings using word-pair side information Herman Kamper Weiran Wang Karen Livescu SSL 26 171 0 05 Oct 2015
Listen, Attend and Spell William Chan Navdeep Jaitly Quoc V. Le Oriol Vinyals RALM 121 2,257 0 05 Aug 2015
Attention-Based Models for Speech Recognition J. Chorowski Dzmitry Bahdanau Dmitriy Serdyuk Kyunghyun Cho Yoshua Bengio 81 2,602 0 24 Jun 2015
Improving zero-shot learning by mitigating the hubness problem Georgiana Dinu Angeliki Lazaridou Marco Baroni VLM 43 379 0 20 Dec 2014
Sequence to Sequence Learning with Neural Networks Ilya Sutskever Oriol Vinyals Quoc V. Le AIMat 242 20,467 0 10 Sep 2014
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation Kyunghyun Cho B. V. Merrienboer Çağlar Gülçehre Dzmitry Bahdanau Fethi Bougares Holger Schwenk Yoshua Bengio AIMat 517 23,235 0 03 Jun 2014
Distributed Representations of Words and Phrases and their Compositionality Tomas Mikolov Ilya Sutskever Kai Chen G. Corrado J. Dean NAI OCL 239 33,445 0 16 Oct 2013
Exploiting Similarities among Languages for Machine Translation Tomas Mikolov Quoc V. Le Ilya Sutskever 42 1,594 0 17 Sep 2013
Speech Recognition with Deep Recurrent Neural Networks Alex Graves Abdel-rahman Mohamed Geoffrey E. Hinton 106 8,503 0 22 Mar 2013