ResearchTrend.AI
Gradient Estimation with Stochastic Softmax Tricks

15 June 2020
Max B. Paulus
Dami Choi
Daniel Tarlow
Andreas Krause
Chris J. Maddison
    BDL

Papers citing "Gradient Estimation with Stochastic Softmax Tricks"

43 / 43 papers shown
Large (Vision) Language Models are Unsupervised In-Context Learners
Artyom Gadetsky
Andrei Atanov
Yulun Jiang
Zhitong Gao
Ghazal Hosseini Mighan
Amir Zamir
Maria Brbić
VLM
MLLM
LRM
202
0
0
03 Apr 2025
Soft Condorcet Optimization for Ranking of General Agents
Marc Lanctot
Kate Larson
Michael Kaisers
Quentin Berthet
I. Gemp
Manfred Diaz
Roberto-Rafael Maura-Rivero
Yoram Bachrach
Anna Koop
Doina Precup
177
0
0
31 Oct 2024
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang
Dong Shen
Chaoxiang Cai
Fan Yang
Size Li
Tingting Gao
Xi Li
MoE
97
2
0
28 Jun 2024
Fast Differentiable Sorting and Ranking
Mathieu Blondel
O. Teboul
Quentin Berthet
Josip Djolonga
147
231
0
20 Feb 2020
Learning with Differentiable Perturbed Optimizers
Quentin Berthet
Mathieu Blondel
O. Teboul
Marco Cuturi
Jean-Philippe Vert
Francis R. Bach
56
109
0
20 Feb 2020
Decision-Making with Auto-Encoding Variational Bayes
Romain Lopez
Pierre Boyeau
Nir Yosef
Michael I. Jordan
Jeffrey Regier
BDL
317
10,591
0
17 Feb 2020
Estimating Gradients for Discrete Random Variables by Sampling without Replacement
W. Kool
H. V. Hoof
Max Welling
BDL
105
50
0
14 Feb 2020
Torch-Struct: Deep Structured Prediction Library
Alexander M. Rush
52
63
0
03 Feb 2020
Differentiable Convex Optimization Layers
Akshay Agrawal
Brandon Amos
Shane T. Barratt
Stephen P. Boyd
Steven Diamond
Zico Kolter
81
653
0
28 Oct 2019
Structured Prediction with Projection Oracles
Mathieu Blondel
68
33
0
24 Oct 2019
Monte Carlo Gradient Estimation in Machine Learning
S. Mohamed
Mihaela Rosca
Michael Figurnov
A. Mnih
67
408
0
25 Jun 2019
The Limited Multi-Label Projection Layer
Brandon Amos
V. Koltun
J. Zico Kolter
50
36
0
20 Jun 2019
Stochastic Optimization of Sorting Networks via Continuous Relaxations
Aditya Grover
Eric Wang
Aaron Zweig
Stefano Ermon
53
173
0
21 Mar 2019
Reparameterizable Subset Sampling via Continuous Relaxations
Sang Michael Xie
Stefano Ermon
BDL
48
99
0
29 Jan 2019
Learning with Fenchel-Young Losses
Mathieu Blondel
André F. T. Martins
Vlad Niculae
123
133
0
08 Jan 2019
Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder
Caio Corro
Ivan Titov
BDL
40
56
0
25 Jul 2018
Reparameterization Gradient for Non-differentiable Models
Wonyeol Lee
Hangyeol Yu
Hongseok Yang
DRL
59
31
0
01 Jun 2018
ListOps: A Diagnostic Dataset for Latent Tree Learning
Nikita Nangia
Samuel R. Bowman
45
137
0
17 Apr 2018
Learning Latent Permutations with Gumbel-Sinkhorn Networks
Gonzalo E. Mena
David Belanger
Scott W. Linderman
Jasper Snoek
72
270
0
23 Feb 2018
Learning to Explain: An Information-Theoretic Perspective on Model Interpretation
Jianbo Chen
Le Song
Martin J. Wainwright
Michael I. Jordan
MLT
FAtt
129
572
0
21 Feb 2018
SparseMAP: Differentiable Sparse Structured Inference
Vlad Niculae
André F. T. Martins
Mathieu Blondel
Claire Cardie
43
122
0
12 Feb 2018
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl
Dami Choi
Yuhuai Wu
Geoffrey Roeder
David Duvenaud
89
300
0
31 Oct 2017
REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
George Tucker
A. Mnih
Chris J. Maddison
John Lawson
Jascha Narain Sohl-Dickstein
BDL
191
282
0
21 Mar 2017
OptNet: Differentiable Optimization as a Layer in Neural Networks
Brandon Amos
J. Zico Kolter
150
958
0
01 Mar 2017
Categorical Reparameterization with Gumbel-Softmax
Eric Jang
S. Gu
Ben Poole
BDL
279
5,360
0
03 Nov 2016
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
Chris J. Maddison
A. Mnih
Yee Whye Teh
BDL
155
2,529
0
02 Nov 2016
The Generalized Reparameterization Gradient
Francisco J. R. Ruiz
Michalis K. Titsias
David M. Blei
BDL
58
169
0
07 Oct 2016
Rationalizing Neural Predictions
Tao Lei
Regina Barzilay
Tommi Jaakkola
108
811
0
13 Jun 2016
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Martín Abadi
Ashish Agarwal
P. Barham
E. Brevdo
Zhiwen Chen
...
Pete Warden
Martin Wattenberg
Martin Wicke
Yuan Yu
Xiaoqiang Zheng
240
11,145
0
14 Mar 2016
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification
André F. T. Martins
Ramón Fernández Astudillo
156
719
0
05 Feb 2016
MuProp: Unbiased Backpropagation for Stochastic Neural Networks
S. Gu
Sergey Levine
Ilya Sutskever
A. Mnih
BDL
46
143
0
16 Nov 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.4K
149,842
0
22 Dec 2014
A* Sampling
Chris J. Maddison
Daniel Tarlow
T. Minka
75
393
0
31 Oct 2014
Neural Turing Machines
Alex Graves
Greg Wayne
Ivo Danihelka
95
2,325
0
20 Oct 2014
Neural Variational Inference and Learning in Belief Networks
A. Mnih
Karol Gregor
BDL
151
729
0
31 Jan 2014
On Sampling from the Gibbs Distribution with Random Maximum A-Posteriori Perturbations
Tamir Hazan
Subhransu Maji
Tommi Jaakkola
50
56
0
29 Sep 2013
Tighter Linear Program Relaxations for High Order Graphical Models
Elad Mezuman
Daniel Tarlow
Amir Globerson
Yair Weiss
65
14
0
26 Sep 2013
Learning Graphical Model Parameters with Approximate Marginal Inference
Justin Domke
TPM
73
187
0
15 Jan 2013
Fast Exact Inference for Recursive Cardinality Models
Daniel Tarlow
Kevin Swersky
R. Zemel
Ryan P. Adams
B. Frey
TPM
59
59
0
16 Oct 2012
Learning Attitudes and Attributes from Multi-Aspect Reviews
Julian McAuley
J. Leskovec
Dan Jurafsky
242
298
0
15 Oct 2012
On the Partition Function and Random Maximum A-Posteriori Perturbations
Tamir Hazan
Tommi Jaakkola
76
93
0
27 Jun 2012
Sum-Product Networks: A New Deep Architecture
Hoifung Poon
Pedro M. Domingos
TPM
74
758
0
14 Feb 2012
Ranking via Sinkhorn Propagation
Ryan P. Adams
R. Zemel
88
147
0
09 Jun 2011