2303.00733
Cited By
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
1 March 2023
Kai-Wei Chang
Yu-Kai Wang
Hua Shen
Iu-thing Kang
Wei-Cheng Tseng
Shang-Wen Li
Hung-yi Lee
VLM
Papers citing "SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks"
35 / 35 papers shown
Title
Safe Gradient Flow for Bilevel Optimization
Sina Sharifi
Nazanin Abolfazli
E. Y. Hamedani
Mahyar Fazlyab
36
0
0
27 Jan 2025
Unified Pathological Speech Analysis with Prompt Tuning
Fei Yang
Xuenan Xu
Mengyue Wu
Kai Yu
28
0
0
05 Nov 2024
Efficient Streaming LLM for Speech Recognition
J. Jia
Gil Keren
Wei Zhou
Egor Lakomkin
Xiaohui Zhang
Chunyang Wu
Frank Seide
Jay Mahadeokar
Ozlem Kalinli
AuLLM
32
0
0
02 Oct 2024
NapTune: Efficient Model Tuning for Mood Classification using Previous Night's Sleep Measures along with Wearable Time-series
Debaditya Shome
Nasim Montazeri Ghahjaverestan
Ali Etemad
MLAU
27
0
0
07 Sep 2024
Developing an End-to-End Framework for Predicting the Social Communication Severity Scores of Children with Autism Spectrum Disorder
Jihyun Mun
Sunhee Kim
Minhwa Chung
25
0
0
30 Aug 2024
TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR
Shashi Kumar
S. Madikeri
Juan Zuluaga-Gomez
Iuliia Nigmatulina
Esaú Villatoro-Tello
Sergio Burdisso
P. Motlícek
Karthik Pandia
A. Ganapathiraju
46
0
0
05 Jul 2024
Decoder-only Architecture for Streaming End-to-end Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
RALM
AuLLM
36
6
0
23 Jun 2024
Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model
Hayato Futami
Siddhant Arora
Yosuke Kashiwagi
E. Tsunoo
Shinji Watanabe
42
0
0
18 Jun 2024
Towards audio language modeling -- an overview
Haibin Wu
Xuanjun Chen
Yi-Cheng Lin
Kai-Wei Chang
Ho-Lam Chung
Alexander H. Liu
Hung-yi Lee
AuLLM
38
28
0
20 Feb 2024
SpiRit-LM: Interleaved Spoken and Written Language Model
Tu Nguyen
Benjamin Muller
Bokai Yu
Marta R. Costa-jussá
Maha Elbayad
...
Itai Gat
Gabriel Synnaeve
Juan Pino
Benoît Sagot
Emmanuel Dupoux
AuLLM
VLM
51
34
0
08 Feb 2024
Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks
Kevin Everson
Yile Gu
Huck Yang
Prashanth Gurunath Shivakumar
Guan-Ting Lin
...
Shalini Ghosh
Wael Hamza
Hung-yi Lee
Ariya Rastrow
A. Stolcke
20
5
0
05 Jan 2024
Extending Whisper with prompt tuning to target-speaker ASR
Hao Ma
Zhiyuan Peng
Mingjie Shao
Jing Li
Ju Liu
VLM
35
12
0
13 Dec 2023
Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition
Yukiya Hono
Koh Mitsuda
Tianyu Zhao
Kentaro Mitsui
Toshiaki Wakatsuki
Kei Sawada
AuLLM
44
8
0
06 Dec 2023
Exploring In-Context Learning of Textless Speech Language Model for Speech Classification Tasks
Ming-Hao Hsu
Kai-Wei Chang
Shang-Wen Li
Hung-yi Lee
34
8
0
19 Oct 2023
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
S. Radhakrishnan
Chao-Han Huck Yang
S. Khan
Rohit Kumar
N. Kiani
D. Gómez-Cabrero
Jesper N. Tegnér
38
47
0
10 Oct 2023
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions
Siddhant Arora
Hayato Futami
Jee-weon Jung
Yifan Peng
Roshan S. Sharma
Yosuke Kashiwagi
E. Tsunoo
Karen Livescu
Shinji Watanabe
ELM
27
7
0
04 Oct 2023
Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model
Kai-Wei Chang
Ming-Hsin Chen
Yun-Ping Lin
Jing Neng Hsu
Paul Kuo-Ming Huang
Chien-yu Huang
Shang-Wen Li
Hung-yi Lee
23
6
0
04 Oct 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models
Cheng Chen
Yuchen Hu
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Pin-Yu Chen
E. Chng
32
42
0
27 Sep 2023
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition
Yu Yu
Chao-Han Huck Yang
J. Kolehmainen
Prashanth Gurunath Shivakumar
Yile Gu
...
Denis Filimonov
Shalini Ghosh
A. Stolcke
Ariya Rastrow
I. Bulyko
37
8
0
26 Sep 2023
Joint Audio and Speech Understanding
Yuan Gong
Alexander H. Liu
Hongyin Luo
Leonid Karlinsky
James R. Glass
AuLLM
28
69
0
25 Sep 2023
Instruction-Following Speech Recognition
Cheng-I Jeff Lai
Zhiyun Lu
Liangliang Cao
Ruoming Pang
AuLLM
24
6
0
18 Sep 2023
Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech
Chien-yu Huang
Ke-Han Lu
Shi Wang
Chi-Yuan Hsiao
Chun-Yi Kuan
...
Roshan S. Sharma
Shinji Watanabe
Bhiksha Ramakrishnan
Shady Shehata
Hung-yi Lee
AuLLM
34
51
0
18 Sep 2023
Are Soft Prompts Good Zero-shot Learners for Speech Recognition?
Dianwen Ng
Chong Zhang
Ruixi Zhang
Yukun Ma
Fabian Ritter Gutierrez
Trung Hieu Nguyen
Chongjia Ni
Shengkui Zhao
E. Chng
B. Ma
VLM
40
1
0
18 Sep 2023
Decoder-only Architecture for Speech Recognition with CTC Prompts and Text Data Augmentation
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
VLM
AuLLM
RALM
40
9
0
16 Sep 2023
CPPF: A contextual and post-processing-free model for automatic speech recognition
Lei Zhang
Zhengkun Tian
Xiang Chen
Jiaming Sun
Hongyu Xiang
Ke Ding
Guanglu Wan
34
0
0
14 Sep 2023
SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts
Haibin Wu
Kai-Wei Chang
Yuan-Kuei Wu
Hung-yi Lee
33
22
0
03 Jun 2023
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding
Mutian He
Philip N. Garner
ELM
AI4MH
LRM
48
21
0
22 May 2023
Differentially Private Adapters for Parameter Efficient Acoustic Modeling
Chun-Wei Ho
Chao-Han Huck Yang
Sabato Marco Siniscalchi
26
1
0
19 May 2023
A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model
S. Radhakrishnan
Chao-Han Huck Yang
S. Khan
N. Kiani
D. Gómez-Cabrero
Jesper N. Tegnér
28
12
0
18 May 2023
Listen, Think, and Understand
Yuan Gong
Hongyin Luo
Alexander H. Liu
Leonid Karlinsky
James R. Glass
ELM
MLLM
LRM
40
137
0
18 May 2023
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
134
350
0
21 May 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
127
264
0
02 Feb 2022
ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech
A. Nautsch
Xin Wang
Nicholas W. D. Evans
Tomi Kinnunen
Ville Vestman
Massimiliano Todisco
Héctor Delgado
Md. Sahidullah
Junichi Yamagishi
Kong Aik Lee
119
144
0
11 Feb 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
191
337
0
01 Feb 2021
Learning Efficient Representations for Keyword Spotting with Triplet Loss
R. Vygon
N. Mikhaylovskiy
DML
SSL
60
64
0
12 Jan 2021