What all do audio transformer models hear? Probing Acoustic Representations for Language Delivery and its Structure

2 January 2021
Jui Shah
Yaman Kumar Singla
Changyou Chen
R. Shah

Papers citing "What all do audio transformer models hear? Probing Acoustic Representations for Language Delivery and its Structure"

50 / 54 papers shown
Towards Smarter Hiring: Are Zero-Shot and Few-Shot Pre-trained LLMs Ready for HR Spoken Interview Transcript Analysis?
Subhankar Maity
Aniket Deroy
Sudeshna Sarkar
23
0
0
08 Apr 2025
Rethinking Few-Shot Medical Image Segmentation by SAM2: A Training-Free Framework with Augmentative Prompting and Dynamic Matching
Haiyue Zu
Jun Ge
Heting Xiao
Jile Xie
Zhangzhe Zhou
...
Jiayi Ni
Junjie Niu
Linlin Zhang
Li Ni
Huilin Yang
MedIm
VLM
54
0
0
05 Mar 2025
Enhancing and Exploring Mild Cognitive Impairment Detection with W2V-BERT-2.0
Yueguan Wang
Tatsunari Matsushima
Soichiro Matsushima
Toshimitsu Sakai
40
0
0
28 Jan 2025
Investigating large language models for their competence in extracting grammatically sound sentences from transcribed noisy utterances
Alina Wróblewska
33
0
0
07 Oct 2024
ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks
Nakamasa Inoue
Shinta Otake
Takumi Hirose
Masanari Ohi
Rei Kawakami
45
1
0
28 Jul 2024
SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection
Yi Zhu
Surya Koppisetti
Trang Tran
Gaurav Bharaj
54
9
0
26 Jul 2024
Speech Representation Analysis based on Inter- and Intra-Model Similarities
Yassine El Kheir
Ahmed M. Ali
Shammur A. Chowdhury
SSL
43
2
0
23 Jun 2024
Impact of Speech Mode in Automatic Pathological Speech Detection
S. A. Sheikh
Ina Kodrasi
34
3
0
14 Jun 2024
Deep Learning for Assessment of Oral Reading Fluency
Mithilesh Vaidya
Binaya Kumar Sahoo
Preeti Rao
26
0
0
29 May 2024
Exploring the Capabilities of Prompted Large Language Models in Educational and Assessment Applications
Subhankar Maity
Aniket Deroy
Sudeshna Sarkar
AI4Ed
ELM
34
11
0
19 May 2024
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion
A. R. Bargum
Stefania Serafin
Cumhur Erkut
26
3
0
14 Nov 2023
Self-Supervised Models of Speech Infer Universal Articulatory Kinematics
Cheol Jun Cho
Abdelrahman Mohamed
Alan W. Black
Gopala K. Anumanchipalli
SSL
24
10
0
16 Oct 2023
Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers
Hosein Mohebbi
Grzegorz Chrupała
Willem H. Zuidema
A. Alishahi
36
12
0
15 Oct 2023
Do self-supervised speech and language models extract similar representations as human brain?
Peili Chen
Linyang He
Li Fu
Lu Fan
Edward F. Chang
Yuanning Li
SSL
24
2
0
07 Oct 2023
Decoding Emotions: A comprehensive Multilingual Study of Speech Models for Speech Emotion Recognition
Anant Singh
Akshat Gupta
31
4
0
17 Aug 2023
Unsupervised Out-of-Distribution Dialect Detection with Mahalanobis Distance
Sourya Dipta Das
Yash Vadi
Abhishek Unnam
Kuldeep Yadav
28
1
0
09 Aug 2023
Vesper: A Compact and Effective Pretrained Model for Speech Emotion Recognition
Weidong Chen
Xiaofen Xing
Peihao Chen
Xiangmin Xu
VLM
38
35
0
20 Jul 2023
What Do Self-Supervised Speech Models Know About Words?
Ankita Pasad
C. Chien
Shane Settle
Karen Livescu
SSL
51
26
0
30 Jun 2023
GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech
Yahuan Cong
Haoyu Zhang
Hao-Ping Lin
Shichao Liu
Chunfeng Wang
Yi Ren
Xiang Yin
Zejun Ma
33
1
0
27 Jun 2023
Make-A-Voice: Unified Voice Synthesis With Discrete Representation
Rongjie Huang
Chunlei Zhang
Yongqiang Wang
Dongchao Yang
Lu Liu
Zhenhui Ye
Ziyue Jiang
Chao Weng
Zhou Zhao
Dong Yu
DiffM
39
26
0
30 May 2023
Investigating Pre-trained Audio Encoders in the Low-Resource Condition
Haomiao Yang
Jinming Zhao
Gholamreza Haffari
Ehsan Shareghi
32
6
0
28 May 2023
SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing
Weidong Chen
Xiaofen Xing
Xiangmin Xu
Jianxin Pang
Lan Du
30
38
0
27 Feb 2023
Phone and speaker spatial organization in self-supervised speech representations
Pablo Riera
M. Cerdeiro
L. Pepino
Luciana Ferrer
SSL
26
1
0
24 Feb 2023
A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One
Lingwei Meng
Jiawen Kang
Mingyu Cui
Yuejiao Wang
Xixin Wu
Helen M. Meng
25
17
0
20 Feb 2023
Don't Be So Sure! Boosting ASR Decoding via Confidence Relaxation
Tomer Wullach
Shlomo E. Chazan
30
1
0
27 Dec 2022
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Shinta Otake
Rei Kawakami
Nakamasa Inoue
24
16
0
06 Dec 2022
L2 proficiency assessment using self-supervised speech representations
Stefano Bannò
Kate Knill
M. Matassoni
Vyas Raina
Mark Gales
SSL
34
7
0
16 Nov 2022
Comparative layer-wise analysis of self-supervised speech models
Ankita Pasad
Bowen Shi
Karen Livescu
SSL
37
109
0
08 Nov 2022
Probing Statistical Representations For End-To-End ASR
A. Ollerenshaw
Md. Asif Jalal
Thomas Hain
35
2
0
03 Nov 2022
Proficiency assessment of L2 spoken English using wav2vec 2.0
Stefano Bannò
M. Matassoni
20
22
0
24 Oct 2022
Evidence of Vocal Tract Articulation in Self-Supervised Learning of Speech
Cheol Jun Cho
Peter Wu
Abdel-rahman Mohamed
Gopala K. Anumanchipalli
29
29
0
21 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen-An Li
Hung-yi Lee
Nigel G. Ward
23
48
0
13 Oct 2022
A Comparison of Transformer, Convolutional, and Recurrent Neural Networks on Phoneme Recognition
Kyuhong Shim
Wonyong Sung
27
2
0
01 Oct 2022
End-to-End and Self-Supervised Learning for ComParE 2022 Stuttering Sub-Challenge
S. A. Sheikh
Md. Sahidullah
F. Hirsch
Slim Ouni
SSL
24
9
0
20 Jul 2022
Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
30
28
0
14 Jul 2022
COVYT: Introducing the Coronavirus YouTube and TikTok speech dataset featuring the same speakers with and without infection
Andreas Triantafyllopoulos
A. Semertzidou
Meishu Song
Florian B. Pokorny
Björn W. Schuller
52
2
0
20 Jun 2022
Automatic Pronunciation Assessment using Self-Supervised Speech Representation Learning
Eesung Kim
J. Jeon
Hyeji Seo
Ho-Young Kim
SSL
23
37
0
08 Apr 2022
Probing Speech Emotion Recognition Transformers for Linguistic Knowledge
Andreas Triantafyllopoulos
Johannes Wagner
H. Wierstorf
Maximilian Schmitt
U. Reichel
F. Eyben
Felix Burkhardt
Björn W. Schuller
29
25
0
01 Apr 2022
Analyzing the factors affecting usefulness of Self-Supervised Pre-trained Representations for Speech Recognition
Ashish Seth
L. D. Prasad
Sreyan Ghosh
S. Umesh
33
3
0
31 Mar 2022
Span Classification with Structured Information for Disfluency Detection in Spoken Utterances
Sreyan Ghosh
Sonal Kumar
Yaman Kumar Singla
R. Shah
S. Umesh
33
6
0
30 Mar 2022
The MSXF TTS System for ICASSP 2022 ADD Challenge
Chunyong Yang
Pengfei Liu
Yanli Chen
Hongbin Wang
Min Liu
13
0
0
27 Jan 2022
Automated Speech Scoring System Under The Lens: Evaluating and interpreting the linguistic cues for language proficiency
P. Bamdev
Manraj Singh Grover
Yaman Kumar Singla
Payman Vafaee
Mika Hama
R. Shah
31
12
0
30 Nov 2021
Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees
Yaman Kumar Singla
Sriram Krishna
R. Shah
Changyou Chen
18
6
0
17 Nov 2021
Investigating self-supervised front ends for speech spoofing countermeasures
Xin Wang
Junichi Yamagishi
AAML
24
123
0
15 Nov 2021
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
Hyeong-Seok Choi
Juheon Lee
W. Kim
Jie Hwan Lee
Hoon Heo
Kyogu Lee
42
151
0
27 Oct 2021
DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances
Sreyan Ghosh
Samden Lepcha
S. Sakshi
R. Shah
S. Umesh
31
14
0
14 Oct 2021
AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses
Yaman Kumar Singla
Swapnil Parekh
Somesh Singh
Junjie Li
R. Shah
Changyou Chen
AAML
41
14
0
24 Sep 2021
Speaker-Conditioned Hierarchical Modeling for Automated Speech Scoring
Yaman Kumar Singla
Avykat Gupta
Shaurya Bagga
Changyou Chen
Balaji Krishnamurthy
R. Shah
32
12
0
30 Aug 2021
Layer-wise Analysis of a Self-supervised Speech Representation Model
Ankita Pasad
Ju-Chieh Chou
Karen Livescu
SSL
26
291
0
10 Jul 2021
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
Shammur A. Chowdhury
Nadir Durrani
Ahmed M. Ali
49
12
0
01 Jul 2021