2505.09901
Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Tasks
15 May 2025
Ziyuan Zhang
Darcy Wang
Ningyuan Chen
Rodrigo Mansur
Vahid Sarhangian
Papers citing
"Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Tasks"
12 / 12 papers shown
A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems
Zixuan Ke
Fangkai Jiao
Yifei Ming
Xuan-Phi Nguyen
Austin Xu
...
Chengwei Qin
Peifeng Wang
Siyang Song
Caiming Xiong
Shafiq Joty
LRM
88
15
0
12 Apr 2025
Should You Use Your Large Language Model to Explore or Exploit?
Keegan Harris
Aleksandrs Slivkins
45
2
0
31 Jan 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
318
1,611
0
22 Jan 2025
Generalization to New Sequential Decision Making Tasks with In-Context Learning
Sharath Chandra Raparthy
Eric Hambro
Robert Kirk
Mikael Henaff
Roberta Raileanu
OffRL
155
22
0
06 Dec 2023
Reasoning with Language Model is Planning with World Model
Shibo Hao
Yi Gu
Haodi Ma
Joshua Jiahua Hong
Zhen Wang
D. Wang
Zhiting Hu
ReLM
LRM
LLMAG
123
571
0
24 May 2023
Automatic Chain of Thought Prompting in Large Language Models
Zhuosheng Zhang
Aston Zhang
Mu Li
Alexander J. Smola
ReLM
LRM
141
618
0
07 Oct 2022
Out of One, Many: Using Language Models to Simulate Human Samples
Lisa P. Argyle
Ethan C. Busby
Nancy Fulda
Joshua R Gubler
Christopher Rytting
David Wingate
SyDa
77
588
0
14 Sep 2022
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
Gati Aher
Rosa I. Arriaga
Adam Tauman Kalai
100
390
0
18 Aug 2022
Using cognitive psychology to understand GPT-3
Marcel Binz
Eric Schulz
ELM
LLMAG
320
474
0
21 Jun 2022
Pre-Trained Language Models for Interactive Decision-Making
Shuang Li
Xavier Puig
Chris Paxton
Yilun Du
Clinton Jia Wang
...
Anima Anandkumar
Jacob Andreas
Igor Mordatch
Antonio Torralba
Yuke Zhu
LM&Ro
93
257
0
03 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
740
9,267
0
28 Jan 2022
Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC
Aki Vehtari
Andrew Gelman
Jonah Gabry
106
4,044
0
16 Jul 2015