2212.09710
Continual Learning for Instruction Following from Realtime Feedback
19 December 2022
Alane Suhr
Yoav Artzi
Papers citing "Continual Learning for Instruction Following from Realtime Feedback"
18 / 18 papers shown
Title
Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model
Divyanshu Aggarwal
Sankarshan Damle
Navin Goyal
Satya Lokam
Sunayana Sitaram
CLL
23
0
0
21 Oct 2024
Retrospective Learning from Interactions
Zizhao Chen
Mustafa Omer Gul
Yiwei Chen
Gloria Geng
Anne Wu
Yoav Artzi
LRM
28
1
0
17 Oct 2024
Grounding Language in Multi-Perspective Referential Communication
Zineng Tang
Lingjun Mao
Alane Suhr
21
2
0
04 Oct 2024
CoGen: Learning from Feedback with Coupled Comprehension and Generation
Mustafa Omer Gul
Yoav Artzi
23
3
0
28 Aug 2024
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhao Cao
3DV
84
46
0
23 Apr 2024
Improving Dialogue Agents by Decomposing One Global Explicit Annotation with Local Implicit Multimodal Feedback
Dong Won Lee
Hae Won Park
Yoon Kim
C. Breazeal
Louis-Philippe Morency
32
0
0
17 Mar 2024
NARRATE: Versatile Language Architecture for Optimal Control in Robotics
Seif Ismail
Antonio Arbues
Ryan Cotterell
René Zurbrügg
Carmen Amo Alonso
LM&Ro
29
4
0
16 Mar 2024
Learning Communication Policies for Different Follower Behaviors in a Collaborative Reference Game
P. Sadler
Sherzod Hakimov
David Schlangen
29
1
0
07 Feb 2024
The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback
Nathan Lambert
Roberto Calandra
ALM
29
31
0
31 Oct 2023
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
Joel Jang
Seungone Kim
Bill Yuchen Lin
Yizhong Wang
Jack Hessel
Luke Zettlemoyer
Hannaneh Hajishirzi
Yejin Choi
Prithviraj Ammanabrolu
MoMe
48
132
0
17 Oct 2023
Design Principles of Robust Multi-Armed Bandit Framework in Video Recommendations
Belhassen Bayar
Phanideep Gampa
Ainur Yessenalina
Zhen Wen
AAML
21
0
0
24 Sep 2023
Yes, this Way! Learning to Ground Referring Expressions into Actions with Intra-episodic Feedback from Supportive Teachers
P. Sadler
Sherzod Hakimov
David Schlangen
41
1
0
22 May 2023
Continually Improving Extractive QA via Human Feedback
Ge Gao
Hung-Ting Chen
Yoav Artzi
Eunsol Choi
26
12
0
21 May 2023
I2I: Initializing Adapters with Improvised Knowledge
Tejas Srinivasan
Furong Jia
Mohammad Rostami
Jesse Thomason
CLL
32
6
0
04 Apr 2023
CB2: Collaborative Natural Language Interaction Research Platform
Jacob Sharf
Mustafa Omer Gul
Yoav Artzi
LLMAG
35
1
0
14 Mar 2023
Analysis of Language Change in Collaborative Instruction Following
Anna Effenberger
Eva Yan
Rhia Singh
Alane Suhr
Yoav Artzi
39
13
0
09 Sep 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
286
1,595
0
18 Sep 2019
Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback
Carolin (Haas) Lawrence
Stefan Riezler
OffRL
173
56
0
03 May 2018