arXiv:2207.03091
Online SuBmodular + SuPermodular (BP) Maximization with Bandit Feedback
7 July 2022
Adhyyan Narang
Omid Sadeghi
Lillian J. Ratliff
Maryam Fazel
J. Bilmes
OffRL
Papers citing
"Online SuBmodular + SuPermodular (BP) Maximization with Bandit Feedback"
3 / 3 papers shown
Title
Initializing Services in Interactive ML Systems for Diverse Users
Avinandan Bose
Mihaela Curmei
Daniel L. Jiang
Jamie Morgenstern
Sarah Dean
Lillian J. Ratliff
Maryam Fazel
21
5
0
19 Dec 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
372
12,081
0
04 Mar 2022
Matroid Bandits: Fast Combinatorial Optimization with Learning
B. Kveton
Zheng Wen
Azin Ashkan
Hoda Eydgahi
Brian Eriksson
46
119
0
20 Mar 2014