How to Fine-tune the Model: Unified Model Shift and Model Bias Policy Optimization

22 September 2023
Hai Zhang
Hang Yu
Junqiao Zhao
Di Zhang
Chang Huang
Hongtu Zhou
Xiao Zhang
Chen Ye

Papers citing "How to Fine-tune the Model: Unified Model Shift and Model Bias Policy Optimization"

10 / 10 papers shown
DPMambaIR: All-in-One Image Restoration via Degradation-Aware Prompt State Space Model
Ziqiang Liu
Shuigeng Zhou
Yuchao Dai
Yang Wang
Yisheng An
Xiangmo Zhao
39
0
0
24 Apr 2025
Pixel to Gaussian: Ultra-Fast Continuous Super-Resolution with 2D Gaussian Modeling
Long Peng
Anran Wu
W. J. Li
Peizhe Xia
Xueyuan Dai
...
Haoze Sun
Renjing Pei
Yang Wang
Yang Cao
Zheng-jun Zha
59
2
0
09 Mar 2025
Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning
Haozhe Ma
Zhengding Luo
Thanh Vinh Vo
Kuankuan Sima
Tze-Yun Leong
31
5
0
06 Aug 2024
Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning
Hai Zhang
Boyuan Zheng
Anqi Guo
Tianying Ji
Junqiao Zhao
Lanqing Li
OffRL
39
0
0
20 May 2024
Towards an Information Theoretic Framework of Context-Based Offline Meta-Reinforcement Learning
Lanqing Li
Hai Zhang
Xinyu Zhang
Shatong Zhu
Junqiao Zhao
Pheng-Ann Heng
OffRL
43
7
0
04 Feb 2024
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Tianying Ji
Yuping Luo
Gang Hua
Xianyuan Zhan
Jianwei Zhang
Huazhe Xu
OffRL
OnRL
37
14
0
05 Jun 2023
Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective
Raj Ghugare
Homanga Bharadhwaj
Benjamin Eysenbach
Sergey Levine
Ruslan Salakhutdinov
OffRL
45
25
0
18 Sep 2022
On-Policy Model Errors in Reinforcement Learning
Lukas P. Fröhlich
Maksym Lefarov
M. Zeilinger
Felix Berkenkamp
OnRL
57
6
0
15 Oct 2021
Value Penalized Q-Learning for Recommender Systems
Chengqian Gao
Ke Xu
Kuangqi Zhou
Lanqing Li
Xueqian Wang
Bo Yuan
P. Zhao
OffRL
50
20
0
15 Oct 2021
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
276
5,661
0
05 Dec 2016