2311.08323
The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language
14 November 2023
Jian Zhu
Changbing Yang
Farhan Samir
Jahurul Islam
Papers citing
"The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language"
21 / 21 papers shown
PyThaiNLP: Thai Natural Language Processing in Python
Wannaphong Phatthiyaphaibun
Korakot Chaovavanich
Charin Polpanumas
Arthit Suriyawongkul
Lalita Lowphansirikul
Pattarawat Chormai
Peerat Limkonchotiwat
Thanathip Suntorntip
Can Udomcharoenchaikit
76
90
0
07 Dec 2023
PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords
Yong-Hyeok Lee
Namhyun Cho
46
21
0
31 Aug 2023
Allophant: Cross-lingual Phoneme Recognition with Articulatory Attributes
Kevin Glocker
Aaricia Herygers
Munir Georges
60
6
0
07 Jun 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
Andrew Rouditchenko
Sameer Khurana
Samuel Thomas
Rogerio Feris
Leonid Karlinsky
Hilde Kuehne
David Harwath
Brian Kingsbury
James R. Glass
VLM
77
22
0
21 May 2023
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
248
1,200
0
27 Mar 2023
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
203
3,732
0
06 Dec 2022
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
Yusong Wu
Kai Chen
Tianyu Zhang
Yuchen Hui
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
CLIP
129
537
0
12 Nov 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
Jian Zhu
Zuoyu Tian
Yadong Liu
Cong Zhang
Chia-wen Lo
SSL
64
2
0
23 Oct 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
138
328
0
25 May 2022
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation
Sameer Khurana
Antoine Laurent
James R. Glass
61
37
0
17 May 2022
ByT5 model for massively multilingual grapheme-to-phoneme conversion
Jian Zhu
Cong Zhang
David Jurgens
42
41
0
06 Apr 2022
g2pW: A Conditional Weighted Softmax BERT for Polyphone Disambiguation in Mandarin
Yi-Chang Chen
Yu-Chuan Chang
Yenling Chang
Yi-Ren Yeh
54
15
0
20 Mar 2022
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
110
708
0
17 Nov 2021
Simple and Effective Zero-shot Cross-lingual Phoneme Recognition
Qiantong Xu
Alexei Baevski
Michael Auli
VLM
133
90
0
23 Sep 2021
Keyword Transformer: A Self-Attention Model for Keyword Spotting
Axel Berg
Mark O'Connor
M. T. Cruz
71
136
0
01 Apr 2021
fugashi, a Tool for Tokenizing Japanese in Python
Paul O'Leary McCann
AI4TS
34
26
0
14 Oct 2020
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation
Felix Kreuk
Joseph Keshet
Yossi Adi
SSL
63
79
0
27 Jul 2020
A Corpus for Large-Scale Phonetic Typology
Elizabeth Salesky
Eleanor Chodroff
Tiago Pimentel
Sanjeev Khudanpur
Ryan Cotterell
A. Black
Jason Eisner
49
28
0
28 May 2020
Streaming keyword spotting on mobile devices
Oleg Rybakov
Natasha Kononenko
Niranjan A. Subrahmanya
Mirkó Visontai
Stella Laurenzo
AI4TS
106
112
0
14 May 2020
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Taku Kudo
226
1,173
0
29 Apr 2018
Deep Residual Learning for Small-Footprint Keyword Spotting
Raphael Tang
Jimmy J. Lin
76
237
0
28 Oct 2017