Towards Learning a Universal Non-Semantic Representation of Speech


25 February 2020
Joel Shor
A. Jansen
Ronnie Maor
Oran Lang
Omry Tuval
Félix de Chaumont Quitry
Marco Tagliasacchi
Ira Shavitt
Dotan Emanuel
Yinnon A. Haviv
    SSL

Papers citing "Towards Learning a Universal Non-Semantic Representation of Speech"

50 / 105 papers shown
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models
Junyi Peng
Takanori Ashihara
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
ELM
26
0
0
10 May 2025
The order in speech disorder: a scoping review of state of the art machine learning methods for clinical speech classification
Birger Moëll
Fredrik Sand Aronsson
Per Östberg
Jonas Beskow
39
0
0
03 Mar 2025
Evaluation of Deep Audio Representations for Hearables
Fabian Gröger
Pascal Baumann
L. Amruthalingam
Laurent Simon
Ruksana Giurda
Simone Lionetti
88
0
0
10 Feb 2025
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling
Jakob Poncelet
Hugo Van hamme
69
0
0
05 Feb 2025
The Unreliability of Acoustic Systems in Alzheimer's Speech Datasets with Heterogeneous Recording Conditions
L. Gauder
Pablo Riera
A. Slachevsky
G. Forno
Adolfo M. Garcia
Luciana Ferrer
30
1
0
11 Sep 2024
STAB: Speech Tokenizer Assessment Benchmark
Shikhar Vashishth
Harman Singh
Shikhar Bharadwaj
Sriram Ganapathy
Chulayuth Asawaroengchai
Kartik Audhkhasi
Andrew Rosenberg
Ankur Bapna
Bhuvana Ramabhadran
52
0
0
04 Sep 2024
ICSD: An Open-source Dataset for Infant Cry and Snoring Detection
Qingyu Liu
Longfei Song
Dongxing Xu
Yanhua Long
39
0
0
20 Aug 2024
Predicting Heart Activity from Speech using Data-driven and Knowledge-based features
Gasser Elbanna
Z. Mostaani
Mathew Magimai.-Doss
SSL
47
0
0
10 Jun 2024
MAD Speech: Measures of Acoustic Diversity of Speech
Matthieu Futeral
A. Agostinelli
Marco Tagliasacchi
Neil Zeghidour
Eugene Kharitonov
51
1
0
16 Apr 2024
Exploring the Task-agnostic Trait of Self-supervised Learning in the Context of Detecting Mental Disorders
Rohan Kumar Gupta
Rohit Sinha
47
0
0
22 Mar 2024
Predicting Generalization of AI Colonoscopy Models to Unseen Data
Joel Shor
C. McNeil
Yotam Intrator
Joe Ledsam
H. Yamano
...
Masaaki Miyo
Eiji Oki
Ichiro Takemasa
Ehud Rivlin
Roman Goldenberg
41
0
0
14 Mar 2024
HeAR -- Health Acoustic Representations
Sebastien Baur
Zaid Nabulsi
Wei-Hung Weng
Jake Garrison
Louis Blankemeier
...
Shwetak N. Patel
S. Shetty
Shruthi Prabhakara
Monde Muyoyeta
Diego Ardila
LM&MA
24
10
0
04 Mar 2024
Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data
Hamza Mahdi
Eptehal Nashnoush
Rami Saab
Arjun Balachandar
Rishit Dagli
Lucas X. Perri
H. Khosravani
21
1
0
07 Feb 2024
Relationship between auditory and semantic entrainment using Deep Neural Networks (DNN)
Jay Kejriwal
Štefan Beňuš
30
4
0
27 Dec 2023
The unreasonable effectiveness of AI CADe polyp detectors to generalize to new countries
Joel Shor
H. Yamano
Daisuke Tsurumaru
Yotam Intrator
Hiroki Kayama
...
Kaho Kobayashi
Eiji Oki
Roman Goldenberg
Ehud Rivlin
Ichiro Takemasa
CML
17
0
0
11 Dec 2023
Reformulating NLP tasks to Capture Longitudinal Manifestation of Language Disorders in People with Dementia
Dimitris Gkoumas
Matthew Purver
M. Liakata
23
2
0
15 Oct 2023
A Digital Language Coherence Marker for Monitoring Dementia
Dimitris Gkoumas
Adam Tsakalidis
M. Liakata
11
1
0
14 Oct 2023
Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis
Ben Maman
Johannes Zeitler
Meinard Müller
Amit H. Bermano
DiffM
14
4
0
21 Sep 2023
Beyond Accuracy: Measuring Representation Capacity of Embeddings to Preserve Structural and Contextual Information
Sarwan Ali
29
0
0
20 Sep 2023
Crowdotic: A Privacy-Preserving Hospital Waiting Room Crowd Density Estimation with Non-speech Audio
Forsad Al Hossain
Tanjid Hasan Tonmoy
A. Lover
George A. Corey
Mohammad Arif Ul Alam
Tauhidur Rahman
14
1
0
19 Sep 2023
EnCodecMAE: Leveraging neural codecs for universal audio representation learning
L. Pepino
Pablo Riera
Luciana Ferrer
35
4
0
14 Sep 2023
Optimizing Audio Augmentations for Contrastive Learning of Health-Related Acoustic Signals
Louis Blankemeier
Sebastien Baur
Wei-Hung Weng
Jake Garrison
Yossi Matias
Shruthi Prabhakara
Diego Ardila
Zaid Nabulsi
34
0
0
11 Sep 2023
PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords
Yong-Hyeok Lee
Namhyun Cho
24
18
0
31 Aug 2023
MASR: Multi-label Aware Speech Representation
Anjali Raj
Shikhar Bharadwaj
Sriram Ganapathy
Min Ma
Shikhar Vashishth
SSL
18
0
0
20 Jul 2023
Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications
Varun Krishna
T. Sai
Sriram Ganapathy
SSL
29
2
0
14 Jul 2023
Speech-based Age and Gender Prediction with Transformers
Felix Burkhardt
Johannes Wagner
H. Wierstorf
F. Eyben
Björn Schuller
6
13
0
29 Jun 2023
Female mosquito detection by means of AI techniques inside release containers in the context of a Sterile Insect Technique program
Javier Naranjo-Alcazar
Jordi Grau-Haro
D. Almenar
P. Zuccarello
16
0
0
19 Jun 2023
SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
Yusuke Ijima
Taichi Asami
Marc Delcroix
Yukinori Honma
SSL
ELM
27
11
0
14 Jun 2023
Label Aware Speech Representation Learning For Language Identification
Shikhar Vashishth
Shikhar Bharadwaj
Sriram Ganapathy
Ankur Bapna
Min Ma
Wei Han
Vera Axelrod
Partha P. Talukdar
SSL
23
4
0
07 Jun 2023
Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks
Xian Li
Nian Shao
Xiaofei Li
ViT
CLIP
21
25
0
07 Jun 2023
Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations
Salah Zaiem
Titouan Parcollet
S. Essid
33
2
0
01 Jun 2023
The Tunnel Effect: Building Data Representations in Deep Neural Networks
Wojciech Masarczyk
M. Ostaszewski
Ehsan Imani
Razvan Pascanu
Piotr Miłoś
Tomasz Trzciński
28
18
0
31 May 2023
Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition
Wangyou Zhang
Y. Qian
38
10
0
25 May 2023
Happy or Evil Laughter? Analysing a Database of Natural Audio Samples
Aljoscha Dusterhoft
Felix Burkhardt
Björn W. Schuller
14
2
0
23 May 2023
Pengi: An Audio Language Model for Audio Tasks
Soham Deshmukh
Benjamin Elizalde
Rita Singh
Huaming Wang
MLLM
AuLLM
34
157
0
19 May 2023
Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations
Wei-wei Lin
Chenhang He
Man-Wai Mak
Youzhi Tu
19
5
0
14 May 2023
V2Meow: Meowing to the Visual Beat via Video-to-Music Generation
Kun Su
Judith Yue Li
Qingqing Huang
Dima Kuzmin
Joonseok Lee
...
Fei Sha
A. Jansen
Yu Wang
Mauro Verzetti
Timo I. Denk
VGen
31
12
0
11 May 2023
Emolysis: A Multimodal Open-Source Group Emotion Analysis and Visualization Toolkit
Shreya Ghosh
Zhixi Cai
Parul Gupta
Garima Sharma
Abhinav Dhall
Munawar Hayat
Tom Gedeon
24
2
0
09 May 2023
Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Nikhil Singh
Chih-Wei Wu
Iroro Orife
Mahdi M. Kalayeh
23
2
0
12 Apr 2023
Designing and Evaluating Speech Emotion Recognition Systems: A reality check case study with IEMOCAP
Nikolaos Antoniou
Athanasios Katsamanis
Theodoros Giannakopoulos
Shrikanth Narayanan
19
17
0
03 Apr 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
42
47
0
21 Mar 2023
Enhancing Unsupervised Audio Representation Learning via Adversarial Sample Generation
Yulin Pan
Xiangteng He
Biao Gong
Yuxin Peng
Yiliang Lv
SSL
21
0
0
15 Mar 2023
Speech Intelligibility Classifiers from 550k Disordered Speech Samples
Subhashini Venugopalan
Jimmy Tobin
Samuel J. Yang
Katie Seaver
Richard Cave
P. Jiang
Neil Zeghidour
Rus Heywood
Jordan R. Green
Michael P. Brenner
29
9
0
13 Mar 2023
Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings
Joel Shor
R. Bi
Subhashini Venugopalan
Steven Ibara
Roman Goldenberg
Ehud Rivlin
AI4MH
23
4
0
10 Mar 2023
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
SSL
42
3
0
07 Mar 2023
Noise2Music: Text-conditioned Music Generation with Diffusion Models
Qingqing Huang
Daniel S. Park
Tao Wang
Timo I. Denk
Andy Ly
...
Jesse Engel
Quoc V. Le
William Chan
Zhifeng Chen
Wei Han
MGen
DiffM
36
190
0
08 Feb 2023
MusicLM: Generating Music From Text
A. Agostinelli
Timo I. Denk
Zalan Borsos
Jesse Engel
Mauro Verzetti
...
Adam Roberts
Marco Tagliasacchi
Matthew Sharifi
Neil Zeghidour
Christian Frank
MGen
44
417
0
26 Jan 2023
Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
Huimin Wu
Chenyang Lei
Xiao Sun
Pengju Wang
Qifeng Chen
Kwang-Ting Cheng
Stephen Lin
Zhirong Wu
MQ
32
5
0
19 Dec 2022
Multimodal Vision Transformers with Forced Attention for Behavior Analysis
Tanay Agrawal
Michal Balazia
Philippe Muller
François Brémond
ViT
23
9
0
07 Dec 2022
Privacy against Real-Time Speech Emotion Detection via Acoustic Adversarial Evasion of Machine Learning
Brian Testa
Yi Xiao
Harshit Sharma
Avery Gump
Asif Salekin
AAML
19
7
0
17 Nov 2022