arXiv:1602.01783
Asynchronous Methods for Deep Reinforcement Learning
4 February 2016
Volodymyr Mnih
Adria Puigdomenech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
Papers citing "Asynchronous Methods for Deep Reinforcement Learning"
50 / 1,499 papers shown
Title
Q-Policy: Quantum-Enhanced Policy Evaluation for Scalable Reinforcement Learning
Kalyan Cherukuri
Aarav Lala
Yash Yardi
4
0
0
17 May 2025
SAINT: Attention-Based Modeling of Sub-Action Dependencies in Multi-Action Policies
Matthew Landers
Taylor W. Killian
Thomas Hartvigsen
Afsaneh Doryab
14
0
0
17 May 2025
Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation
Donghoon Lee
Tung M. Luu
Younghwan Lee
Chang D. Yoo
OffRL
VLM
14
0
0
16 May 2025
Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning
Xinyue Wang
Zhen Zhang
OffRL
CML
32
0
0
13 May 2025
Differentiable Quantum Architecture Search in Quantum-Enhanced Neural Network Parameter Generation
Samuel Yen-Chi Chen
Chen-Yu Liu
Kuan-Cheng Chen
Wei-Jia Huang
Yen-Jui Chang
Wei-Hao Huang
31
0
0
13 May 2025
Deep Reinforcement Learning for Power Grid Multi-Stage Cascading Failure Mitigation
Bo Meng
Chenghao Xu
Yongli Zhu
AI4CE
14
0
0
13 May 2025
Combining Bayesian Inference and Reinforcement Learning for Agent Decision Making: A Review
Chengmin Zhou
Ville Kyrki
Pasi Fränti
Laura Ruotsalainen
BDL
AI4CE
42
0
0
12 May 2025
DARLR: Dual-Agent Offline Reinforcement Learning for Recommender Systems with Dynamic Reward
Yi Zhang
Ruihong Qiu
Xuwei Xu
Jiajun Liu
Sen Wang
OffRL
34
0
0
12 May 2025
SEM: Reinforcement Learning for Search-Efficient Large Language Models
Zeyang Sha
Shiwen Cui
Weiqiang Wang
KELM
OffRL
LRM
31
0
0
12 May 2025
Towards Human-Centric Autonomous Driving: A Fast-Slow Architecture Integrating Large Language Model Guidance with Reinforcement Learning
Chengkai Xu
Jiaqi Liu
Yicheng Guo
Wenjie Qu
Peng Hang
Jian Sun
36
0
0
11 May 2025
LineFlow: A Framework to Learn Active Control of Production Lines
Kai Müller
Martin Wenzel
Tobias Windisch
AI4CE
26
0
0
10 May 2025
Bi-LSTM based Multi-Agent DRL with Computation-aware Pruning for Agent Twins Migration in Vehicular Embodied AI Networks
Yuxiang Wei
Zhuoqi Zeng
Yue Zhong
Jiawen Kang
R. W. Liu
M. S. Hossain
31
0
0
09 May 2025
Enhancing Cooperative Multi-Agent Reinforcement Learning with State Modelling and Adversarial Exploration
Andreas Kontogiannis
Konstantinos Papathanasiou
Yi Shen
Giorgos Stamou
Michael M. Zavlanos
G. Vouros
38
0
0
08 May 2025
A critical assessment of reinforcement learning methods for microswimmer navigation in complex flows
Selim Mecanna
Aurore Loisy
Christophe Eloy
34
0
0
08 May 2025
Fight Fire with Fire: Defending Against Malicious RL Fine-Tuning via Reward Neutralization
Wenjun Cao
AAML
44
0
0
07 May 2025
DYSTIL: Dynamic Strategy Induction with Large Language Models for Reinforcement Learning
Borui Wang
Kathleen McKeown
Rex Ying
OffRL
39
0
0
06 May 2025
Aerodynamic and structural airfoil shape optimisation via Transfer Learning-enhanced Deep Reinforcement Learning
David Ramos
Lucas Lacasa
E. Valero
G. Rubio
AI4CE
32
0
0
05 May 2025
Global Optimality of Single-Timescale Actor-Critic under Continuous State-Action Space: A Study on Linear Quadratic Regulator
Xuyang Chen
Jingliang Duan
Lin Zhao
62
1
0
02 May 2025
Multi-Agent Reinforcement Learning for Resources Allocation Optimization: A Survey
Mohamad Abdul Hady
Siyi Hu
Mahardhika Pratama
Jimmy Cao
Ryszard Kowalczyk
26
0
0
29 Apr 2025
KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation
Jiabin Fan
Guoqing Luo
Michael Bowling
Lili Mou
OffRL
68
0
0
26 Apr 2025
CaRL: Learning Scalable Planning Policies with Simple Rewards
Bernhard Jaeger
D. Dauner
Jens Beißwenger
Simon Gerstenecker
Kashyap Chitta
Andreas Geiger
60
1
0
24 Apr 2025
Multimodal Perception for Goal-oriented Navigation: A Survey
I-Tak Ieong
Hao Tang
LM&Ro
LRM
33
0
0
22 Apr 2025
Autonomous Control of Redundant Hydraulic Manipulator Using Reinforcement Learning with Action Feedback
Rohit Dhakate
Christian Brommer
C. Böhm
Stephan Weiss
J. Steinbrener
36
5
0
22 Apr 2025
MARFT: Multi-Agent Reinforcement Fine-Tuning
Junwei Liao
Muning Wen
Jun Wang
Weixi Zhang
OffRL
31
0
0
21 Apr 2025
Learning to Reason under Off-Policy Guidance
Jianhao Yan
Yafu Li
Zican Hu
Zhi Wang
Ganqu Cui
Xiaoye Qu
Yu Cheng
Yue Zhang
OffRL
LRM
44
1
0
21 Apr 2025
HF4Rec: Human-Like Feedback-Driven Optimization Framework for Explainable Recommendation
Jiakai Tang
Jingsen Zhang
Zihang Tian
Xueyang Feng
Lei Wang
Xu Chen
OffRL
207
0
0
19 Apr 2025
Evolutionary Policy Optimization
Zelal Su "Lain" Mustafaoglu
Keshav Pingali
Risto Miikkulainen
36
0
0
17 Apr 2025
pix2pockets: Shot Suggestions in 8-Ball Pool from a Single Image in the Wild
Jonas Myhre Schiøtt
Viktor Sebastian Petersen
Dimitrios P. Papadopoulos
VLM
35
0
0
16 Apr 2025
A Rollout-Based Algorithm and Reward Function for Efficient Resource Allocation in Business Processes
Jeroen Middelhuis
Z. Bukhsh
Ivo Adan
R. Dijkman
29
0
0
15 Apr 2025
Moderate Actor-Critic Methods: Controlling Overestimation Bias via Expectile Loss
Ukjo Hwang
Songnam Hong
OffRL
41
0
0
14 Apr 2025
Pay Attention to What and Where? Interpretable Feature Extractor in Vision-based Deep Reinforcement Learning
Tien Pham
Angelo Cangelosi
36
1
0
14 Apr 2025
Human-like compositional learning of visually-grounded concepts using synthetic environments
Zijun Lin
M Ganesh Kumar
Cheston Tan
OCL
CoGe
75
0
0
09 Apr 2025
GRAIN: Multi-Granular and Implicit Information Aggregation Graph Neural Network for Heterophilous Graphs
Songwei Zhao
Yuan Jiang
Zijing Zhang
Yang Yu
Hechang Chen
23
0
0
09 Apr 2025
Momentum Boosted Episodic Memory for Improving Learning in Long-Tailed RL Environments
Dolton Fernandes
Pramod Kaushik
Harsh Shukla
Bapi Raju Surampudi
21
0
0
08 Apr 2025
Exploration-Driven Generative Interactive Environments
N. Savov
Naser Kazemi
Mohammad Mahdi
Danda Pani Paudel
Xi Wang
Luc Van Gool
VGen
3DV
43
0
0
03 Apr 2025
FastFlow: Early Yet Robust Network Flow Classification using the Minimal Number of Time-Series Packets
Rushi Jayeshkumar Babaria
Minzhao Lyu
Gustavo E. A. P. A. Batista
V. Sivaraman
AI4TS
48
0
0
02 Apr 2025
On the Mistaken Assumption of Interchangeable Deep Reinforcement Learning Implementations
Rajdeep Singh Hundal
Yan Xiao
Xiaochun Cao
Jin Song Dong
Manuel Rigger
51
0
0
28 Mar 2025
Analysis of On-policy Policy Gradient Methods under the Distribution Mismatch
Weizhen Wang
Jianping He
Xiaoming Duan
39
0
0
28 Mar 2025
CRLLK: Constrained Reinforcement Learning for Lane Keeping in Autonomous Driving
Xinwei Gao
Arambam James Singh
Gangadhar Royyuru
Michael Yuhas
Arvind Easwaran
OffRL
35
0
0
28 Mar 2025
Optimizing Language Models for Inference Time Objectives using Reinforcement Learning
Yunhao Tang
Kunhao Zheng
Gabriel Synnaeve
Rémi Munos
41
2
0
25 Mar 2025
Thinking agents for zero-shot generalization to qualitatively novel tasks
Thomas Miconi
Kevin L McKee
Yicong Zheng
Jed McCaleb
LRM
AI4CE
51
0
0
25 Mar 2025
Evolutionary Policy Optimization
Jianren Wang
Yifan Su
Abhinav Gupta
Deepak Pathak
50
0
0
24 Mar 2025
AED: Automatic Discovery of Effective and Diverse Vulnerabilities for Autonomous Driving Policy with Large Language Models
Le Qiu
Zelai Xu
Qixin Tan
Wenhao Tang
Chao Yu
Yu Wang
AAML
43
0
0
24 Mar 2025
Physical Plausibility-aware Trajectory Prediction via Locomotion Embodiment
Hiromu Taketsugu
Takeru Oba
Takahiro Maeda
Shohei Nobuhara
Norimichi Ukita
54
1
0
21 Mar 2025
Optimizing 2D+1 Packing in Constrained Environments Using Deep Reinforcement Learning
Victor Ulisses Pugliese
Oséias F. de A. Ferreira
Fabio A. Faria
OffRL
51
0
0
21 Mar 2025
Reinforcement Learning-based Heuristics to Guide Domain-Independent Dynamic Programming
Minori Narita
Ryo Kuroiwa
J. Christopher Beck
54
0
0
20 Mar 2025
Design of Reward Function on Reinforcement Learning for Automated Driving
Takeru Goto
Yuki Kizumi
Shun Iwasaki
41
4
0
20 Mar 2025
Nonparametric Bellman Mappings for Value Iteration in Distributed Reinforcement Learning
Yuki Akiyama
Konstantinos Slavakis
39
0
0
20 Mar 2025
Optimizing Decomposition for Optimal Claim Verification
Yining Lu
Noah Ziems
Hy Dang
Meng Jiang
61
0
0
19 Mar 2025
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs
Nicolas Le Roux
Marc G. Bellemare
Jonathan Lebensold
Arnaud Bergeron
Joshua Greaves
Alex Fréchette
Carolyne Pelletier
Eric Thibodeau-Laufer
Sándor Toth
Sam Work
OffRL
91
4
0
18 Mar 2025