A New One-Point Residual-Feedback Oracle For Black-Box Learning and Control

18 June 2020

Michael M. Zavlanos

Papers citing "A New One-Point Residual-Feedback Oracle For Black-Box Learning and Control"

Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer Yanjun Zhao Sizhe Dang Haishan Ye Guang Dai Yi Qian Ivor W. Tsang 88 9 0 23 Feb 2024
Cooperative Multi-Agent Reinforcement Learning with Partial Observations Yan Zhang Michael M. Zavlanos OffRL 39 22 0 18 Jun 2020
Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits A. Akhavan Massimiliano Pontil Alexandre B. Tsybakov 26 40 0 14 Jun 2020
Socially-Aware Robot Planning via Bandit Human Feedback Xusheng Luo Yan Zhang Michael M. Zavlanos 38 17 0 02 Mar 2020
Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems Dhruv Malik A. Pananjady Kush S. Bhatia K. Khamaru Peter L. Bartlett Martin J. Wainwright 37 198 0 20 Dec 2018
Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator Maryam Fazel Rong Ge Sham Kakade M. Mesbahi 62 597 0 15 Jan 2018
ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models Pin-Yu Chen Huan Zhang Yash Sharma Jinfeng Yi Cho-Jui Hsieh AAML 46 1,864 0 14 Aug 2017
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments Ryan J. Lowe Yi Wu Aviv Tamar J. Harb Pieter Abbeel Igor Mordatch 113 4,441 0 07 Jun 2017
(Bandit) Convex Optimization with Biased Noisy Gradient Oracles Xiaowei Hu Prashanth L. A. András György Csaba Szepesvári 107 66 0 22 Sep 2016
Highly-Smooth Zero-th Order Online Optimization Francis R. Bach Vianney Perchet 63 85 0 26 May 2016
An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback Ohad Shamir 43 259 0 31 Jul 2015
Optimal rates for zero-order convex optimization: the power of two function evaluations John C. Duchi Michael I. Jordan Martin J. Wainwright Andre Wibisono 49 480 0 07 Dec 2013
Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming Saeed Ghadimi Guanghui Lan ODL 54 1,538 0 22 Sep 2013
On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization Ohad Shamir 165 191 0 11 Sep 2012