ResearchTrend.AI
Asynchronous Methods for Deep Reinforcement Learning
v1, v2 (latest)


4 February 2016
Volodymyr Mnih
Adrià Puigdomènech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu

Papers citing "Asynchronous Methods for Deep Reinforcement Learning"

50 / 3,591 papers shown
Neural Polar Decoders for DNA Data Storage
Ziv Aharoni
Henry D. Pfister
15
0
0
20 Jun 2025
CAWR: Corruption-Averse Advantage-Weighted Regression for Robust Policy Optimization
Ranting Hu
OffRL
29
0
0
18 Jun 2025
Sequential Policy Gradient for Adaptive Hyperparameter Optimization
Zheng Li
Jerry Q. Cheng
Huanying Gu
OffRL
26
0
0
18 Jun 2025
Active Adversarial Noise Suppression for Image Forgery Localization
Rongxuan Peng
Shunquan Tan
Xianbo Mo
Alex C. Kot
Jiwu Huang
AAML
26
0
0
15 Jun 2025
Resolve Highway Conflict in Multi-Autonomous Vehicle Controls with Local State Attention
Xuan Duy Ta
Bang Giang Le
Thanh Ha Le
Viet-Cuong Ta
15
0
0
13 Jun 2025
ContextBuddy: AI-Enhanced Contextual Insights for Security Alert Investigation (Applied to Intrusion Detection)
Ronal Singh
Mohan Baruwal Chhetri
Surya Nepal
Cécile Paris
60
0
0
11 Jun 2025
TooBadRL: Trigger Optimization to Boost Effectiveness of Backdoor Attacks on Deep Reinforcement Learning
Songze Li
Mingxuan Zhang
Kang Wei
Shouling Ji
AAML
90
0
0
11 Jun 2025
Reinforcement Learning Teachers of Test Time Scaling
Edoardo Cetin
Tianyu Zhao
Yujin Tang
OffRL, ReLM, LRM
55
0
0
10 Jun 2025
TGRPO: Fine-tuning Vision-Language-Action Model via Trajectory-wise Group Relative Policy Optimization
Zengjue Chen
Runliang Niu
He Kong
Qi Wang
66
0
0
10 Jun 2025
Causal Graph Recovery in Neuroimaging through Answer Set Programming
Mohammadsajad Abavisani
Kseniya Solovyeva
David Danks
Vince D. Calhoun
Sergey Plis
CML
34
0
0
10 Jun 2025
Collaborative Learning in Agentic Systems: A Collective AI is Greater Than the Sum of Its Parts
Saptarshi Nath
Christos Peridis
Eseoghene Benjamin
Xinran Liu
Soheil Kolouri
Peter Kinnell
Zexin Li
Cong Liu
Shirin Dora
Andrea Soltoggio
38
0
0
05 Jun 2025
Simple, Good, Fast: Self-Supervised World Models Free of Baggage
Jan Robine
Marc Höftmann
Stefan Harmeling
DRL, OCL
69
1
0
03 Jun 2025
NetPress: Dynamically Generated LLM Benchmarks for Network Applications
Yajie Zhou
Jiajun Ruan
Eric S. Wang
Sadjad Fouladi
Francis Y. Yan
Kevin Hsieh
Zaoxing Liu
34
0
0
03 Jun 2025
Bidirectional Soft Actor-Critic: Leveraging Forward and Reverse KL Divergence for Efficient Reinforcement Learning
Yixian Zhang
Huaze Tang
Changxu Wei
Wenbo Ding
59
0
0
02 Jun 2025
A Hierarchical Bin Packing Framework with Dual Manipulators via Heuristic Search and Deep Reinforcement Learning
Beomjoon Lee
Changjoo Nam
OffRL
41
0
0
02 Jun 2025
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
S. Wang
Le Yu
Chang Gao
Chujie Zheng
Shixuan Liu
...
Yang Yue
S. Song
Bowen Yu
Gao Huang
Junyang Lin
LRM
70
9
0
02 Jun 2025
Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn
Hongyao Tang
J. Obando-Ceron
Pablo Samuel Castro
Aaron Courville
Glen Berseth
38
0
0
31 May 2025
Adaptive Plane Reformatting for 4D Flow MRI using Deep Reinforcement Learning
Javier Bisbal
Julio Sotelo
Maria I Valdés
Pablo Irarrazaval
Marcelo Andía
Julio García
José Rodriguez-Palomarez
Francesca Raimondi
C. Tejos
Sergio Uribe
OOD
36
0
0
31 May 2025
Causal-aware Large Language Models: Enhancing Decision-Making Through Learning, Adapting and Acting
Wei Chen
Jiahao Zhang
Haipeng Zhu
Boyan Xu
Zijian Li
Keli Zhang
Junjian Ye
Ruichu Cai
41
1
0
30 May 2025
AMOR: Adaptive Character Control through Multi-Objective Reinforcement Learning
Lucas N. Alegre
Agon Serifi
Ruben Grandia
David Müller
Espen Knoop
Moritz Bächer
56
0
0
29 May 2025
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
Michal Nauman
Marek Cygan
Carmelo Sferrazza
Aviral Kumar
Pieter Abbeel
OffRL
96
0
0
29 May 2025
ROTATE: Regret-driven Open-ended Training for Ad Hoc Teamwork
Caroline Wang
Arrasy Rahman
Jiaxun Cui
Yoonchang Sung
Peter Stone
66
0
0
29 May 2025
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
Ganqu Cui
Yuchen Zhang
Jiacheng Chen
Lifan Yuan
Zhi Wang
...
Lei Bai
Wanli Ouyang
Yu Cheng
Bowen Zhou
Ning Ding
LRM
73
5
0
28 May 2025
A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment
Brett Bissey
Kyle Gatesman
Walker Dimon
Mohammad Alam
Luis Robaina
Joseph Weissman
AAML
45
0
0
27 May 2025
Multi-level Certified Defense Against Poisoning Attacks in Offline Reinforcement Learning
Shijie Liu
Andrew C. Cullen
Paul Montague
S. Erfani
Benjamin I. P. Rubinstein
OffRL, AAML
46
1
0
27 May 2025
Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning
Minheng Ni
Zhengyuan Yang
Linjie Li
Chung-Ching Lin
Kevin Qinghong Lin
W. Zuo
Lijuan Wang
ReLM, LRM
85
1
0
26 May 2025
Surrogate-Assisted Evolutionary Reinforcement Learning Based on Autoencoder and Hyperbolic Neural Network
Bingdong Li
Mei Jiang
Hong Qian
K. Tang
W. Hong
Peng Yang
141
0
0
26 May 2025
A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning
Yuzheng Hu
Fan Wu
Haotian Ye
David A. Forsyth
James Y. Zou
Nan Jiang
Jiaqi W. Ma
Han Zhao
OffRL
74
0
0
25 May 2025
Reduce Computational Cost In Deep Reinforcement Learning Via Randomized Policy Learning
Zhuochen Liu
Rahul Jain
Quan Nguyen
44
0
0
25 May 2025
Improving Value Estimation Critically Enhances Vanilla Policy Gradient
Tao Wang
Ruipeng Zhang
Sicun Gao
OffRL
53
0
0
25 May 2025
CiRL: Open-Source Environments for Reinforcement Learning in Circular Economy and Net Zero
Federico Zocco
Andrea Corti
Monica Malvezzi
AI4CE
35
0
0
24 May 2025
Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs
Mengqi Liao
Xiangyu Xi
Ruinian Chen
Jia Leng
Yangen Hu
Ke Zeng
Shuai Liu
Huaiyu Wan
LRM
48
0
0
24 May 2025
Hybrid Latent Reasoning via Reinforcement Learning
Zhenrui Yue
Bowen Jin
Huimin Zeng
Honglei Zhuang
Zhen Qin
Jinsung Yoon
Lanyu Shang
Jiawei Han
Dong Wang
OffRL, BDL, LRM
70
0
0
24 May 2025
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models
Haoyuan Sun
Jiaqi Wu
Bo Xia
Yifu Luo
Yifei Zhao
Kai Qin
Xufei Lv
Tiantian Zhang
Yongzhe Chang
Xueqian Wang
OffRL, LRM
209
0
0
24 May 2025
Rethinking Agent Design: From Top-Down Workflows to Bottom-Up Skill Evolution
Jiawei Du
Jinlong Wu
Yuzheng Chen
Yucheng Hu
Bing Li
Joey Tianyi Zhou
253
0
0
23 May 2025
Bootstrapping your behavior: a new pretraining strategy for user behavior sequence data
Weichang Wu
Xiaolu Zhang
Jun Zhou
Yuchen Li
Wenwen Xia
22
0
0
22 May 2025
Sequential Monte Carlo for Policy Optimization in Continuous POMDPs
Hany Abdulsamad
Sahel Iqbal
Simo Särkkä
72
0
0
22 May 2025
A Temporal Difference Method for Stochastic Continuous Dynamics
Haruki Settai
Naoya Takeishi
Takehisa Yairi
156
0
0
21 May 2025
Building spatial world models from sparse transitional episodic memories
Zizhan He
Maxime Daigle
Pouya Bashivan
KELM
56
0
0
19 May 2025
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization
Gang Li
Ming Lin
Tomer Galanti
Zhengzhong Tu
Tianbao Yang
111
1
0
18 May 2025
Q-Policy: Quantum-Enhanced Policy Evaluation for Scalable Reinforcement Learning
Kalyan Cherukuri
Aarav Lala
Yash Yardi
50
0
0
17 May 2025
SAINT: Attention-Based Modeling of Sub-Action Dependencies in Multi-Action Policies
Matthew Landers
Taylor W. Killian
Thomas Hartvigsen
Afsaneh Doryab
61
0
0
17 May 2025
Zero-Shot Visual Generalization in Robot Manipulation
Sumeet Batra
Gaurav Sukhatme
77
0
0
16 May 2025
Scalability of Reinforcement Learning Methods for Dispatching in Semiconductor Frontend Fabs: A Comparison of Open-Source Models with Real Industry Datasets
Patrick Stöckermann
Henning Südfeld
Alessandro Immordino
Thomas Altenmüller
Marc Wegmann
Martin Gebser
Konstantin Schekotihin
Georg Seidel
Chew Wye Chan
Fei Fei Zhang
OffRL
36
0
0
16 May 2025
Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation
Donghoon Lee
Tung M. Luu
Younghwan Lee
Chang D. Yoo
OffRL, VLM
72
0
0
16 May 2025
Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning
Xinyue Wang
Zhen Zhang
OffRL, CML
68
0
0
13 May 2025
Deep Reinforcement Learning for Power Grid Multi-Stage Cascading Failure Mitigation
Bo Meng
Chenghao Xu
Yongli Zhu
AI4CE
40
0
0
13 May 2025
Differentiable Quantum Architecture Search in Quantum-Enhanced Neural Network Parameter Generation
Samuel Yen-Chi Chen
Chen-Yu Liu
Kuan-Cheng Chen
Wei-Jia Huang
Yen-Jui Chang
Wei-Hao Huang
63
1
0
13 May 2025
Combining Bayesian Inference and Reinforcement Learning for Agent Decision Making: A Review
Chengmin Zhou
Ville Kyrki
Pasi Fränti
Laura Ruotsalainen
BDL, AI4CE
121
0
0
12 May 2025
DARLR: Dual-Agent Offline Reinforcement Learning for Recommender Systems with Dynamic Reward
Yi Zhang
Ruihong Qiu
Xuwei Xu
Jiajun Liu
Sen Wang
OffRL
74
0
0
12 May 2025