2501.12599
Cited By
Kimi k1.5: Scaling Reinforcement Learning with LLMs
22 January 2025
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
Cheng Chen
Cheng Li
Chenjun Xiao
C. Du
Chonghua Liao
C. Tang
C. Wang
Dehao Zhang
Enming Yuan
Enzhe Lu
Fengxiang Tang
Flood Sung
Guangda Wei
Guokun Lai
Haiqing Guo
Han Zhu
Hao Ding
Hao Hu
Hao Yang
Hao Zhang
Haotian Yao
Haotian Zhao
Haoyu Lu
Yiming Li
Haozhen Yu
Hongcheng Gao
Huabin Zheng
Huan Yuan
Jia-Yu Chen
Jianhang Guo
Jianlin Su
J. Wang
J. Zhao
Jin Zhang
Jiaheng Liu
Junjie Yan
J. Wu
Lidong Shi
Ling Ye
Le Yu
Mengnan Dong
N. Zhang
Ningchen Ma
Qiwei Pan
Qucheng Gong
Shixuan Liu
Shengling Ma
Shupeng Wei
Sihan Cao
S. Huang
Tao Jiang
W. Gao
Weimin Xiong
Weiran He
Yifan Jiang
Wei Wu
Wenyang He
Xianghui Wei
Xianqing Jia
Xingzhe Wu
Xinran Xu
Xinxing Zu
Xinyu Zhou
Xuehai Pan
Y. Charles
Yang Li
Yihan Hu
Yi Liu
Yuxiao Chen
Yejie Wang
Yibo Liu
Yidao Qin
Y. Liu
Yiran Yang
Yiping Bao
Yulun Du
Yuxin Wu
Yuzhi Wang
Zaida Zhou
Zihan Wang
Zhu Li
Zhen Zhu
Zheng Zhang
Zhexu Wang
Zhilin Yang
Zhiqi Huang
Zihao Huang
Ziyao Xu
Zhengyuan Yang
VLM
ALM
OffRL
AI4TS
LRM
Papers citing
"Kimi k1.5: Scaling Reinforcement Learning with LLMs"
12 / 112 papers shown
FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving
Guizhen Chen
Weiwen Xu
Hao Zhang
Hou Pong Chan
Chaoqun Liu
Lidong Bing
Deli Zhao
Anh Tuan Luu
Yu Rong
ReLM
LRM
61
3
0
27 Feb 2025
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs
Jiarui Zhang
Mahyar Khayatkhoei
P. Chhikara
Filip Ilievski
LRM
39
6
0
24 Feb 2025
Theoretical Physics Benchmark (TPBench) -- a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics
Daniel J.H. Chung
Zhiqi Gao
Yurii Kvasiuk
Tianyi Li
Moritz Münchmeyer
Maja Rudolph
Frederic Sala
Sai Chaitanya Tadepalli
AIMat
52
3
0
19 Feb 2025
Atom of Thoughts for Markov LLM Test-Time Scaling
Fengwei Teng
Zhaoyang Yu
Quan Shi
Jiayi Zhang
Chenglin Wu
Yuyu Luo
MU
LRM
56
14
0
17 Feb 2025
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Dongzhi Jiang
Renrui Zhang
Ziyu Guo
Yanwei Li
Yu Qi
...
Shen Yan
Bo Zhang
Chaoyou Fu
Peng Gao
Hongsheng Li
MLLM
LRM
93
21
0
13 Feb 2025
Brief analysis of DeepSeek R1 and its implications for Generative AI
Sarah Mercer
Samuel Spillard
Daniel P. Martin
76
13
0
04 Feb 2025
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search
Zongyu Lin
Yao Tang
Xingcheng Yao
Da Yin
Ziniu Hu
Ningyu Zhang
Kai-Wei Chang
LRM
50
3
0
04 Feb 2025
Process Reinforcement through Implicit Rewards
Ganqu Cui
Lifan Yuan
Zihan Wang
Hanbin Wang
Wendi Li
...
Yu Cheng
Zhiyuan Liu
Maosong Sun
Bowen Zhou
Ning Ding
OffRL
LRM
75
57
0
03 Feb 2025
GuardReasoner: Towards Reasoning-based LLM Safeguards
Yue Liu
Hongcheng Gao
Shengfang Zhai
Jun Xia
Tianyi Wu
Zhiwei Xue
Yuxiao Chen
Kenji Kawaguchi
Jiaheng Zhang
Bryan Hooi
AI4TS
LRM
131
14
0
30 Jan 2025
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
Xingyu Chen
Jiahao Xu
Tian Liang
Zhiwei He
Jianhui Pang
...
Zhenru Zhang
Rui Wang
Zhaopeng Tu
Haitao Mi
Dong Yu
LRM
ReLM
58
101
0
30 Dec 2024
Understanding Layer Significance in LLM Alignment
Guangyuan Shi
Zexin Lu
Xiaoyu Dong
Wenlong Zhang
Xuanyu Zhang
Yujie Feng
Xiao-Ming Wu
58
2
0
23 Oct 2024
Compositional Hardness of Code in Large Language Models -- A Probabilistic Perspective
Yotam Wolf
Binyamin Rothberg
Dorin Shteyman
Amnon Shashua
25
0
0
26 Sep 2024