Data Augmenting Contrastive Learning of Speech Representations in the Time Domain

2 July 2020
Eugene Kharitonov
M. Rivière
Gabriel Synnaeve
Lior Wolf
Pierre-Emmanuel Mazaré
Matthijs Douze
Emmanuel Dupoux

Papers citing "Data Augmenting Contrastive Learning of Speech Representations in the Time Domain"

37 / 37 papers shown
Title
Towards Attention-based Contrastive Learning for Audio Spoof Detection
C. Goel
Surya Koppisetti
Ben Colman
Ali Shahriyari
Gaurav Bharaj
60
5
0
03 Jul 2024
MAD Speech: Measures of Acoustic Diversity of Speech
Matthieu Futeral
A. Agostinelli
Marco Tagliasacchi
Neil Zeghidour
Eugene Kharitonov
56
1
0
16 Apr 2024
PhasePerturbation: Speech Data Augmentation via Phase Perturbation for Automatic Speech Recognition
Chengxi Lei
Satwinder Singh
Feng Hou
Xiaoyun Jia
Ruili Wang
25
1
0
13 Dec 2023
XLS-R fine-tuning on noisy word boundaries for unsupervised speech segmentation into words
Robin Algayres
Pablo Diego-Simon
Benoît Sagot
Emmanuel Dupoux
44
1
0
08 Oct 2023
Tagged End-to-End Simultaneous Speech Translation Training using Simultaneous Interpretation Data
Yuka Ko
Ryo Fukuda
Yuta Nishikawa
Yasumasa Kano
Katsuhito Sudoh
Satoshi Nakamura
29
6
0
14 Jun 2023
Inter-connection: Effective Connection between Pre-trained Encoder and Decoder for Speech Translation
Yuta Nishikawa
Satoshi Nakamura
38
4
0
26 May 2023
Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge
Ewan Dunbar
Nicolas Hamilakis
Emmanuel Dupoux
SSL
34
30
0
27 Oct 2022
Improving generalizability of distilled self-supervised speech processing models under distorted settings
Kuan-Po Huang
Yu-Kuan Fu
Tsung-Yuan Hsu
Fabian Ritter-Gutierrez
Fan Wang
Liang-Hsuan Tseng
Yu Zhang
Hung-yi Lee
32
14
0
14 Oct 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
73
570
0
07 Sep 2022
Towards Proper Contrastive Self-supervised Learning Strategies For Music Audio Representation
Jeong-Eun Choi
Seongwon Jang
Hyunsouk Cho
Sehee Chung
SSL
16
6
0
10 Jul 2022
Wav2Vec-Aug: Improved self-supervised training with limited data
Anuroop Sriram
Michael Auli
Alexei Baevski
SSL
VLM
22
15
0
27 Jun 2022
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis
Yihan Wu
Xi Wang
S. Zhang
Lei He
Ruihua Song
J. Nie
42
15
0
25 Jun 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
30
110
0
20 Apr 2022
Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning
Robin Algayres
Adel Nabli
Benoît Sagot
Emmanuel Dupoux
SSL
23
8
0
11 Apr 2022
Auditory-Based Data Augmentation for End-to-End Automatic Speech Recognition
Zehai Tu
Jack Deadman
Ning Ma
Jon Barker
32
4
0
08 Apr 2022
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022
Takaaki Saeki
Detai Xin
Wataru Nakata
Tomoki Koriyama
Shinnosuke Takamichi
Hiroshi Saruwatari
39
177
0
05 Apr 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
40
106
0
02 Mar 2022
AugLy: Data Augmentations for Robustness
Zoe Papakipos
Joanna Bitton
AAML
33
53
0
17 Jan 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Bowen Shi
Wei-Ning Hsu
Kushal Lakhotia
Abdel-rahman Mohamed
SSL
46
306
0
05 Jan 2022
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
26
28
0
16 Dec 2021
From Good to Best: Two-Stage Training for Cross-lingual Machine Reading Comprehension
Nuo Chen
Linjun Shou
Ming Gong
Jian Pei
Daxin Jiang
32
16
0
09 Dec 2021
Textless Speech Emotion Conversion using Discrete and Decomposed Representations
Felix Kreuk
Adam Polyak
Jade Copet
Eugene Kharitonov
Tu Nguyen
M. Rivière
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
Yossi Adi
25
29
0
14 Nov 2021
RawBoost: A Raw Data Boosting and Augmentation Method applied to Automatic Speaker Verification Anti-Spoofing
Hemlata Tak
Madhu R. Kamble
J. Patino
Massimiliano Todisco
Nicholas W. D. Evans
66
103
0
08 Nov 2021
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning
Shijun Wang
Dimche Kostadinov
Damian Borth
29
11
0
27 Oct 2021
Multi-view Contrastive Self-Supervised Learning of Accounting Data Representations for Downstream Audit Tasks
Marco Schreyer
Timur Sattarov
Damian Borth
MLAU
32
15
0
23 Sep 2021
Self-supervised Contrastive Cross-Modality Representation Learning for Spoken Question Answering
Chenyu You
Nuo Chen
Yuexian Zou
SSL
27
63
0
08 Sep 2021
Text-Free Prosody-Aware Generative Spoken Language Modeling
Eugene Kharitonov
Ann Lee
Adam Polyak
Yossi Adi
Jade Copet
...
Tu Nguyen
M. Rivière
Abdel-rahman Mohamed
Emmanuel Dupoux
Wei-Ning Hsu
35
117
0
07 Sep 2021
Supervised Contrastive Learning for Accented Speech Recognition
Tao Han
Hantao Huang
Ziang Yang
Wei Han
49
15
0
02 Jul 2021
Contrastive Learning of Musical Representations
Janne Spijkervet
J. Burgoyne
36
111
0
17 Mar 2021
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation
Changhan Wang
M. Rivière
Ann Lee
Anne Wu
Chaitanya Talnikar
Daniel Haziza
Mary Williamson
J. Pino
Emmanuel Dupoux
SSL
25
462
0
02 Jan 2021
Unsupervised Contrastive Learning of Sound Event Representations
Eduardo Fonseca
Diego Ortego
Kevin McGuinness
Noel E. O'Connor
Xavier Serra
SSL
27
65
0
15 Nov 2020
Towards Semi-Supervised Semantics Understanding from Speech
Cheng-I Jeff Lai
Jin Cao
S. Bodapati
Shang-Wen Li
SSL
22
7
0
11 Nov 2020
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning
Dongwei Jiang
Wubo Li
Miao Cao
Wei Zou
Xiangang Li
SSL
21
65
0
27 Oct 2020
Contrastive Learning of General-Purpose Audio Representations
Aaqib Saeed
David Grangier
Neil Zeghidour
VLM
SSL
24
262
0
21 Oct 2020
Viewmaker Networks: Learning Views for Unsupervised Representation Learning
Alex Tamkin
Mike Wu
Noah D. Goodman
SSL
28
64
0
14 Oct 2020
Multi-task self-supervised learning for Robust Speech Recognition
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
189
288
0
25 Jan 2020
pyannote.audio: neural building blocks for speaker diarization
H. Bredin
Ruiqing Yin
Juan Manuel Coria
G. Gelly
Pavel Korshunov
Marvin Lavechin
D. Fustes
Hadrien Titeux
Wassim Bouaziz
Marie-Philippe Gill
202
313
0
04 Nov 2019