ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.23150
  4. Cited By
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners

Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners

29 May 2025
Michal Nauman
Marek Cygan
Carmelo Sferrazza
Aviral Kumar
Pieter Abbeel
    OffRL
ArXivPDFHTML

Papers citing "Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners"

50 / 87 papers shown
Title
OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
Huang Huang
Fangchen Liu
Letian Fu
Tingfan Wu
Mustafa Mukadam
Jitendra Malik
Ken Goldberg
Pieter Abbeel
LM&Ro
VLM
103
8
0
05 Mar 2025
Hyperspherical Normalization for Scalable Deep Reinforcement Learning
Hyperspherical Normalization for Scalable Deep Reinforcement Learning
Hojoon Lee
Youngdo Lee
Takuma Seno
Donghu Kim
Peter Stone
Jaegul Choo
115
2
0
21 Feb 2025
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class
  and Backbone
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone
Max Sobol Mark
Tian Gao
Georgia Gabriela Sampaio
Mohan Kumar Srirama
Archit Sharma
Chelsea Finn
Aviral Kumar
OffRL
OnRL
138
8
0
09 Dec 2024
Latent Action Pretraining from Videos
Latent Action Pretraining from Videos
Seonghyeon Ye
Joel Jang
Byeongguk Jeon
Sejune Joo
Jianwei Yang
...
Kimin Lee
J. Gao
Luke Zettlemoyer
Dieter Fox
Minjoon Seo
56
34
0
15 Oct 2024
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning
Hojoon Lee
Dongyoon Hwang
Donghu Kim
Hyunseung Kim
Jun Jet Tai
K. Subramanian
Peter R. Wurman
Jaegul Choo
Peter Stone
Takuma Seno
OffRL
107
14
0
13 Oct 2024
Is Value Learning Really the Main Bottleneck in Offline RL?
Is Value Learning Really the Main Bottleneck in Offline RL?
Seohong Park
Kevin Frans
Sergey Levine
Aviral Kumar
OffRL
70
13
0
13 Jun 2024
OpenVLA: An Open-Source Vision-Language-Action Model
OpenVLA: An Open-Source Vision-Language-Action Model
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
...
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&Ro
VLM
138
425
0
13 Jun 2024
Bigger, Regularized, Optimistic: scaling for compute and
  sample-efficient continuous control
Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control
Michal Nauman
M. Ostaszewski
Krzysztof Jankowski
Piotr Milo's
Marek Cygan
OffRL
65
24
0
25 May 2024
Octo: An Open-Source Generalist Robot Policy
Octo: An Open-Source Generalist Robot Policy
Octo Model Team
Dibya Ghosh
Homer Walke
Karl Pertsch
Kevin Black
...
Quan Vuong
Ted Xiao
Dorsa Sadigh
Chelsea Finn
Sergey Levine
126
397
0
20 May 2024
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion
  and Manipulation
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation
Carmelo Sferrazza
Dun-Ming Huang
Xingyu Lin
Youngwoon Lee
Pieter Abbeel
88
42
0
15 Mar 2024
Stop Regressing: Training Value Functions via Classification for
  Scalable Deep RL
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
Jesse Farebrother
Jordi Orbay
Q. Vuong
Adrien Ali Taïga
Yevgen Chebotar
...
Sergey Levine
Pablo Samuel Castro
Aleksandra Faust
Aviral Kumar
Rishabh Agarwal
OffRL
77
60
0
06 Mar 2024
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter
  Lesson of Reinforcement Learning
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
Michal Nauman
Michal Bortkiewicz
Piotr Milo's
Tomasz Trzciñski
M. Ostaszewski
Marek Cygan
OffRL
47
21
0
01 Mar 2024
Scaling Laws for Fine-Grained Mixture of Experts
Scaling Laws for Fine-Grained Mixture of Experts
Jakub Krajewski
Jan Ludziejewski
Kamil Adamczewski
Maciej Pióro
Michal Krutul
...
Krystian Król
Tomasz Odrzygó'zd'z
Piotr Sankowski
Marek Cygan
Sebastian Jaszczur
MoE
63
57
0
12 Feb 2024
Offline Actor-Critic Reinforcement Learning Scales to Large Models
Offline Actor-Critic Reinforcement Learning Scales to Large Models
Jost Tobias Springenberg
A. Abdolmaleki
Jingwei Zhang
Oliver Groth
Michael Bloesch
...
Sarah Bechtle
Steven Kapturowski
Roland Hafner
N. Heess
Martin Riedmiller
OffRL
LRM
60
14
0
08 Feb 2024
On the Theory of Risk-Aware Agents: Bridging Actor-Critic and Economics
On the Theory of Risk-Aware Agents: Bridging Actor-Critic and Economics
Michal Nauman
Marek Cygan
55
3
0
30 Oct 2023
TD-MPC2: Scalable, Robust World Models for Continuous Control
TD-MPC2: Scalable, Robust World Models for Continuous Control
Nicklas Hansen
Hao Su
Xiaolong Wang
MU
87
141
0
25 Oct 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic
  Control
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Xi Chen
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
LRM
86
1,172
0
28 Jul 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
206
11,636
0
18 Jul 2023
Polybot: Training One Policy Across Robots While Embracing Variability
Polybot: Training One Policy Across Robots While Embracing Variability
Jonathan Yang
Dorsa Sadigh
Chelsea Finn
35
38
0
07 Jul 2023
For SALE: State-Action Representation Learning for Deep Reinforcement
  Learning
For SALE: State-Action Representation Learning for Deep Reinforcement Learning
Scott Fujimoto
Wei-Di Chang
Edward James Smith
S. Gu
Doina Precup
David Meger
OffRL
49
49
0
04 Jun 2023
Bigger, Better, Faster: Human-level Atari with human-level efficiency
Bigger, Better, Faster: Human-level Atari with human-level efficiency
Max Schwarzer
J. Obando-Ceron
Rameswar Panda
Marc G. Bellemare
Rishabh Agarwal
Pablo Samuel Castro
OffRL
65
92
0
30 May 2023
DINOv2: Learning Robust Visual Features without Supervision
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
249
3,205
0
14 Apr 2023
Segment Anything
Segment Anything
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
264
7,047
0
05 Apr 2023
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via
  Geometry-aware Curriculum and Iterative Generalist-Specialist Learning
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning
Weikang Wan
Haoran Geng
Yun-Hai Liu
Zikang Shan
Yaodong Yang
Li Yi
He Wang
70
98
0
02 Apr 2023
GPT-4 Technical Report
GPT-4 Technical Report
OpenAI OpenAI
OpenAI Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
647
13,788
0
15 Mar 2023
PaLM-E: An Embodied Multimodal Language Model
PaLM-E: An Embodied Multimodal Language Model
Danny Driess
F. Xia
Mehdi S. M. Sajjadi
Corey Lynch
Aakanksha Chowdhery
...
Marc Toussaint
Klaus Greff
Andy Zeng
Igor Mordatch
Peter R. Florence
LM&Ro
37
1,594
0
06 Mar 2023
UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse
  Proposal Generation and Goal-Conditioned Policy
UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy
Yinzhen Xu
Weikang Wan
Jialiang Zhang
Haoran Liu
Zikang Shan
...
Yijia Weng
Jiayi Chen
Tengyu Liu
Li Yi
He Wang
124
117
0
02 Mar 2023
Efficient Online Reinforcement Learning with Offline Data
Efficient Online Reinforcement Learning with Offline Data
Philip J. Ball
Laura M. Smith
Ilya Kostrikov
Sergey Levine
OffRL
OnRL
75
177
0
06 Feb 2023
Mastering Diverse Domains through World Models
Mastering Diverse Domains through World Models
Danijar Hafner
J. Pašukonis
Jimmy Ba
Timothy Lillicrap
54
583
0
10 Jan 2023
RT-1: Robotics Transformer for Real-World Control at Scale
RT-1: Robotics Transformer for Real-World Control at Scale
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Joseph Dabis
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
60
1,068
0
13 Dec 2022
Offline Q-Learning on Diverse Multi-Task Data Both Scales And
  Generalizes
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes
Aviral Kumar
Rishabh Agarwal
Xinyang Geng
George Tucker
Sergey Levine
OffRL
70
51
0
28 Nov 2022
AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task
  Learning
AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning
Enneng Yang
Junwei Pan
Ximei Wang
Haibin Yu
Li Shen
Xihua Chen
Lei Xiao
Jie Jiang
G. Guo
68
44
0
28 Nov 2022
PaCo: Parameter-Compositional Multi-Task Reinforcement Learning
PaCo: Parameter-Compositional Multi-Task Reinforcement Learning
Lingfeng Sun
Haichao Zhang
Wei Xu
Masayoshi Tomizuka
MoE
63
39
0
21 Oct 2022
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online
  Videos
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
Bowen Baker
Ilge Akkaya
Peter Zhokhov
Joost Huizinga
Jie Tang
Adrien Ecoffet
Brandon Houghton
Raul Sampedro
Jeff Clune
OffRL
84
297
0
23 Jun 2022
Contrastive Learning as Goal-Conditioned Reinforcement Learning
Contrastive Learning as Goal-Conditioned Reinforcement Learning
Benjamin Eysenbach
Tianjun Zhang
Ruslan Salakhutdinov
Sergey Levine
SSL
OffRL
56
147
0
15 Jun 2022
Provable Benefits of Representational Transfer in Reinforcement Learning
Provable Benefits of Representational Transfer in Reinforcement Learning
Alekh Agarwal
Yuda Song
Wen Sun
Kaiwen Wang
Mengdi Wang
Xuezhou Zhang
OffRL
77
35
0
29 May 2022
A Generalist Agent
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
159
801
0
12 May 2022
Training Compute-Optimal Large Language Models
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
123
1,915
0
29 Mar 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
694
12,525
0
04 Mar 2022
DR3: Value-Based Deep Reinforcement Learning Requires Explicit
  Regularization
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization
Aviral Kumar
Rishabh Agarwal
Tengyu Ma
Aaron Courville
George Tucker
Sergey Levine
OffRL
51
67
0
09 Dec 2021
Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task
  Learning
Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning
Wenlong Huang
Igor Mordatch
Pieter Abbeel
Deepak Pathak
63
64
0
04 Nov 2021
A System for General In-Hand Object Re-Orientation
A System for General In-Hand Object Re-Orientation
Tao Chen
Jie Xu
Pulkit Agrawal
66
252
0
04 Nov 2021
Conflict-Averse Gradient Descent for Multi-task Learning
Conflict-Averse Gradient Descent for Multi-task Learning
Bo Liu
Xingchao Liu
Xiaojie Jin
Peter Stone
Qiang Liu
69
304
0
26 Oct 2021
Deep Reinforcement Learning at the Edge of the Statistical Precipice
Deep Reinforcement Learning at the Edge of the Statistical Precipice
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Aaron Courville
Marc G. Bellemare
OffRL
72
652
0
30 Aug 2021
RMA: Rapid Motor Adaptation for Legged Robots
RMA: Rapid Motor Adaptation for Legged Robots
Ashish Kumar
Zipeng Fu
Deepak Pathak
Jitendra Malik
119
564
0
08 Jul 2021
A Minimalist Approach to Offline Reinforcement Learning
A Minimalist Approach to Offline Reinforcement Learning
Scott Fujimoto
S. Gu
OffRL
98
804
0
12 Jun 2021
Decision Transformer: Reinforcement Learning via Sequence Modeling
Decision Transformer: Reinforcement Learning via Sequence Modeling
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
A. Srinivas
Igor Mordatch
OffRL
77
1,608
0
02 Jun 2021
Learning Transferable Visual Models From Natural Language Supervision
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
681
28,659
0
26 Feb 2021
Multi-Task Reinforcement Learning with Context-based Representations
Multi-Task Reinforcement Learning with Context-based Representations
Shagun Sodhani
Amy Zhang
Joelle Pineau
53
185
0
11 Feb 2021
Tactical Optimism and Pessimism for Deep Reinforcement Learning
Tactical Optimism and Pessimism for Deep Reinforcement Learning
Theodore H. Moskovitz
Jack Parker-Holder
Aldo Pacchiano
Michael Arbel
Michael I. Jordan
44
56
0
07 Feb 2021
12
Next