ResearchTrend.AI
DiCE: The Infinitely Differentiable Monte-Carlo Estimator

14 February 2018
Jakob N. Foerster
Gregory Farquhar
Maruan Al-Shedivat
Tim Rocktäschel
Eric Xing
Shimon Whiteson
ArXiv (abs) | PDF | HTML | GitHub (148★)

Papers citing "DiCE: The Infinitely Differentiable Monte-Carlo Estimator"

18 / 18 papers shown
Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization
Zishun Yu
Tengyu Xu
Di Jin
Karthik Abinav Sankararaman
Yun He
...
Eryk Helenowski
Chen Zhu
Sinong Wang
Hao Ma
Han Fang
LRM
195
9
0
29 Jan 2025
Imperative Learning: A Self-supervised Neuro-Symbolic Learning Framework for Robot Autonomy
Chen Wang
Kaiyi Ji
Junyi Geng
Zhongqiang Ren
Taimeng Fu
...
Yi Du
Qihang Li
Yue Yang
Xiao Lin
Zhipeng Zhao
SSL
149
10
0
28 Jan 2025
Advantage Alignment Algorithms
Juan Agustin Duque
Milad Aghajohari
Tim Cooijmans
Tianyu Zhang
Rameswar Panda
Gauthier Gidel
Aaron Courville
58
2
0
20 Jun 2024
Decision-Making with Auto-Encoding Variational Bayes
Romain Lopez
Pierre Boyeau
Nir Yosef
Michael I. Jordan
Jeffrey Regier
BDL
427
10,591
0
17 Feb 2020
Some Considerations on Learning to Explore via Meta-Reinforcement Learning
Bradly C. Stadie
Ge Yang
Rein Houthooft
Xi Chen
Yan Duan
Yuhuai Wu
Pieter Abbeel
Ilya Sutskever
LRM
82
115
0
03 Mar 2018
Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments
Maruan Al-Shedivat
Trapit Bansal
Yuri Burda
Ilya Sutskever
Igor Mordatch
Pieter Abbeel
CLL
66
354
0
10 Oct 2017
Learning with Opponent-Learning Awareness
Jakob N. Foerster
Richard Y. Chen
Maruan Al-Shedivat
Shimon Whiteson
Pieter Abbeel
Igor Mordatch
98
539
0
13 Sep 2017
Meta-SGD: Learning to Learn Quickly for Few-Shot Learning
Zhenguo Li
Fengwei Zhou
Fei Chen
Hang Li
96
1,119
0
31 Jul 2017
Equivalence Between Policy Gradients and Soft Q-Learning
John Schulman
Xi Chen
Pieter Abbeel
OffRL
95
346
0
21 Apr 2017
REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
George Tucker
A. Mnih
Chris J. Maddison
John Lawson
Jascha Narain Sohl-Dickstein
BDL
224
282
0
21 Mar 2017
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
823
11,909
0
09 Mar 2017
TensorFlow: A system for large-scale machine learning
Martín Abadi
P. Barham
Jianmin Chen
Zhiwen Chen
Andy Davis
...
Vijay Vasudevan
Pete Warden
Martin Wicke
Yuan Yu
Xiaoqiang Zhang
GNN, AI4CE
433
18,361
0
27 May 2016
Gradient Estimation Using Stochastic Computation Graphs
John Schulman
N. Heess
T. Weber
Pieter Abbeel
OffRL
144
393
0
17 Jun 2015
Automatic differentiation in machine learning: a survey
A. G. Baydin
Barak A. Pearlmutter
Alexey Radul
J. Siskind
PINN, AI4CE, ODL
166
2,808
0
20 Feb 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,776
0
19 Feb 2015
Black Box Variational Inference
Rajesh Ranganath
S. Gerrish
David M. Blei
DRL, BDL
142
1,167
0
31 Dec 2013
The Optimal Reward Baseline for Gradient-Based Reinforcement Learning
Lex Weaver
Nigel Tao
119
247
0
10 Jan 2013
Automated Variational Inference in Probabilistic Programming
David Wingate
T. Weber
BDL, TPM
91
138
0
07 Jan 2013