1707.09118
Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation
28 July 2017
Carolin (Haas) Lawrence
Artem Sokolov
Stefan Riezler
OffRL
Papers citing
"Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation"
6 / 6 papers shown
Title
Loss Functions for Discrete Contextual Pricing with Observational Data
Max Biggs
Ruijiang Gao
Wei-Ju Sun
36
10
0
18 Nov 2021
Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
Noriyuki Kojima
Alane Suhr
Yoav Artzi
30
24
0
10 Aug 2021
Machine Translation System Selection from Bandit Feedback
Jason Naradowsky
Xuan Zhang
Kevin Duh
OffRL
21
8
0
22 Feb 2020
Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning
Julia Kreutzer
Joshua Uyheng
Stefan Riezler
33
85
0
27 May 2018
Can Neural Machine Translation be Improved with User Feedback?
Julia Kreutzer
Shahram Khadivi
E. Matusov
Stefan Riezler
19
93
0
16 Apr 2018
A Shared Task on Bandit Learning for Machine Translation
Artem Sokolov
Julia Kreutzer
Kellen Sunderland
Pavel Danchenko
Witold Szymaniak
Hagen Fürstenau
Stefan Riezler
43
16
0
27 Jul 2017