ResearchTrend.AI
Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook

24 October 2022
Baihan Lin
    OffRL
    AI4TS

Papers citing "Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook"

50 / 93 papers shown
Title
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan Z. Sheng
Julian McAuley
Lina Yao
SyDa
92
11
0
28 Aug 2023
Towards Healthy AI: Large Language Models Need Therapists Too
Baihan Lin
Djallel Bouneffouf
Guillermo Cecchi
Kush R. Varshney
AI4MH
49
19
0
02 Apr 2023
Psychotherapy AI Companion with Reinforcement Learning Recommendations and Interpretable Policy Dynamics
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
OffRL
AI4TS
AI4MH
69
11
0
16 Mar 2023
A Reinforcement Learning Framework for Online Speaker Diarization
Baihan Lin
Xinxin Zhang
OffRL
63
2
0
21 Feb 2023
A Survey on Compositional Generalization in Applications
Baihan Lin
Djallel Bouneffouf
Irina Rish
AI4CE
61
13
0
02 Feb 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
148
1,552
0
15 Dec 2022
Working Alliance Transformer for Psychotherapy Dialogue Classification
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
38
13
0
27 Oct 2022
Computational Inference in Cognitive Science: Operational, Societal and Ethical Considerations
Baihan Lin
AI4CE
52
8
0
24 Oct 2022
SupervisorBot: NLP-Annotated Real-Time Recommendations of Psychotherapy Treatment Strategies with Deep Reinforcement Learning
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
OffRL
55
12
0
27 Aug 2022
Knowledge Management System with NLP-Assisted Annotations: A Brief Survey and Outlook
Baihan Lin
60
11
0
15 Jun 2022
Evolutionary Multi-Armed Bandits with Genetic Thompson Sampling
Baihan Lin
26
4
0
26 Apr 2022
Neural Topic Modeling of Psychotherapy Sessions
Baihan Lin
Djallel Bouneffouf
Guillermo Cecchi
Ravi Tejwani
BDL
68
15
0
13 Apr 2022
Deep Annotation of Therapeutic Working Alliance in Psychotherapy
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
53
15
0
12 Apr 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
314
6,132
0
05 Apr 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
677
12,525
0
04 Mar 2022
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
134
30,021
0
01 Mar 2022
Red Teaming Language Models with Language Models
Ethan Perez
Saffron Huang
Francis Song
Trevor Cai
Roman Ring
John Aslanides
Amelia Glaese
Nat McAleese
G. Irving
AAML
36
627
0
07 Feb 2022
Deep Reinforcement Learning in Computer Vision: A Comprehensive Survey
Ngan Le
V. Rathour
Kashu Yamazaki
Khoa Luu
Marios Savvides
VLM
AI4TS
26
161
0
25 Aug 2021
A Survey of Uncertainty in Deep Neural Networks
J. Gawlikowski
Cedrique Rovile Njieutcheu Tassi
Mohsin Ali
Jongseo Lee
Matthias Humt
...
R. Roscher
Muhammad Shahzad
Wen Yang
R. Bamler
Xiaoxiang Zhu
BDL
UQCV
OOD
160
1,129
0
07 Jul 2021
Optimal Epidemic Control as a Contextual Combinatorial Bandit with Budget
Baihan Lin
Djallel Bouneffouf
24
8
0
30 Jun 2021
Survey on reinforcement learning for language processing
Víctor Uc Cetina
Nicolás Navarro-Guerrero
A. Martín-González
C. Weber
S. Wermter
OffRL
44
103
0
12 Apr 2021
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability
Rui Liu
Berrak Sisman
Haizhou Li
41
32
0
03 Apr 2021
A bandit approach to curriculum generation for automatic speech recognition
Anastasia Kuznetsova
Anurag Kumar
Francis M. Tyers
29
1
0
06 Feb 2021
Advances and Challenges in Conversational Recommender Systems: A Survey
Chongming Gao
Wenqiang Lei
Xiangnan He
Maarten de Rijke
Tat-Seng Chua
181
279
0
23 Jan 2021
Deep Reinforcement Learning for On-line Dialogue State Tracking
Zhi Chen
Lu Chen
Xiang Zhou
Kai Yu
OffRL
8
5
0
22 Sep 2020
Online Semi-Supervised Learning in Contextual Bandits with Episodic Reward
Baihan Lin
OffRL
29
14
0
17 Sep 2020
Learning to summarize from human feedback
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
172
2,071
0
02 Sep 2020
Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning
D. Mohan
R. Lenain
Lorenzo Foglianti
Tian Huey Teh
Marlene Staib
Alexandra Torresquintero
Jiameng Gao
AI4TS
19
11
0
07 Aug 2020
Deep Bayesian Bandits: Exploring in Online Personalized Recommendations
Dalin Guo
S. Ktena
Ferenc Huszár
Pranay K. Myana
Wenzhe Shi
Alykhan Tejani
OffRL
49
40
0
03 Aug 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
73
601
0
16 Jun 2020
Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior
Baihan Lin
Djallel Bouneffouf
Guillermo Cecchi
36
23
0
09 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
91
1,780
0
08 Jun 2020
Speaker Diarization as a Fully Online Learning Problem in MiniVox
Baihan Lin
Xinxin Zhang
55
16
0
08 Jun 2020
Predicting Goal-directed Human Attention Using Inverse Reinforcement Learning
Zhibo Yang
Lihan Huang
Yupei Chen
Zijun Wei
Seoyoung Ahn
G. Zelinsky
Dimitris Samaras
Minh Hoai
47
96
0
28 May 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
461
41,106
0
28 May 2020
Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users
Shijun Li
Wenqiang Lei
Qingyun Wu
Xiangnan He
Peng Jiang
Tat-Seng Chua
38
121
0
23 May 2020
MOReL : Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
64
662
0
12 May 2020
Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
Jenna M. Reinen
Irina Rish
OffRL
20
24
0
10 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
473
1,994
0
04 May 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GP
OffRL
174
1,338
0
15 Apr 2020
Reinforcement Learning in Economics and Finance
Arthur Charpentier
Romuald Elie
Carl Remlinger
OffRL
44
152
0
22 Mar 2020
RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning
Nan Jiang
Sheng Jin
Z. Duan
Changshui Zhang
OffRL
62
49
0
08 Feb 2020
Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits
Jack Parker-Holder
Vu Nguyen
Stephen J. Roberts
OffRL
81
83
0
06 Feb 2020
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Junaid Qadir
Björn W. Schuller
AI4TS
65
82
0
02 Jan 2020
Pre-training in Deep Reinforcement Learning for Automatic Speech Recognition
Thejan Rajapakshe
R. Rana
S. Latif
Sara Khalifa
Björn W. Schuller
VLM
OffRL
34
8
0
24 Oct 2019
Reinforcement Learning in Healthcare: A Survey
Chao Yu
Jiming Liu
S. Nemati
LM&MA
OffRL
102
557
0
22 Aug 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
76
338
0
30 Jun 2019
A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
Jenna M. Reinen
Irina Rish
OffRL
27
36
0
21 Jun 2019
Split Q Learning: Reinforcement Learning with Two-Stream Rewards
Baihan Lin
Djallel Bouneffouf
Guillermo Cecchi
OffRL
24
22
0
21 Jun 2019
A Survey of Reinforcement Learning Informed by Natural Language
Jelena Luketina
Nantas Nardelli
Gregory Farquhar
Jakob N. Foerster
Jacob Andreas
Edward Grefenstette
Shimon Whiteson
Tim Rocktäschel
LM&Ro
KELM
OffRL
LRM
64
279
0
10 Jun 2019