2505.17447
Cited By
LeTS: Learning to Think-and-Search via Process-and-Outcome Reward Hybridization
23 May 2025
Qi Zhang
Shouqing Yang
Lirong Gao
Hao Chen
Xiaomeng Hu
Jinglei Chen
Jiexiang Wang
Sheng Guo
Bo Zheng
Haobo Wang
Junbo Zhao
LRM
Papers citing
"LeTS: Learning to Think-and-Search via Process-and-Outcome Reward Hybridization"
13 / 13 papers shown
Title
A Survey of Large Language Model Agents for Question Answering
Murong Yue
LLMAG
LM&MA
ELM
106
5
0
24 Mar 2025
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Bowen Jin
Hansi Zeng
Zhenrui Yue
Jinsung Yoon
Sercan O. Arik
Dong Wang
Hamed Zamani
Jiawei Han
RALM
ReLM
KELM
OffRL
AI4TS
LRM
207
122
0
12 Mar 2025
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
169
1,288
0
05 Feb 2024
Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts?
Hexiang Tan
Fei Sun
Wanli Yang
Yuanzhuo Wang
Qi Cao
Xueqi Cheng
126
21
0
22 Jan 2024
Understanding Retrieval Augmentation for Long-Form Question Answering
Hung-Ting Chen
Fangyuan Xu
Shane Arora
Eunsol Choi
RALM
47
38
0
18 Oct 2023
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo
Qingfeng Sun
Can Xu
Pu Zhao
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
LRM
OSLM
267
467
0
18 Aug 2023
Influence of External Information on Large Language Models Mirrors Social Cognitive Patterns
Ning Bian
Hongyu Lin
Peilin Liu
Yaojie Lu
Chunkang Zhang
Xianpei Han
Le Sun
53
14
0
08 May 2023
Measuring and Narrowing the Compositionality Gap in Language Models
Ofir Press
Muru Zhang
Sewon Min
Ludwig Schmidt
Noah A. Smith
M. Lewis
ReLM
KELM
LRM
202
643
0
07 Oct 2022
Retrieval Augmentation Reduces Hallucination in Conversation
Kurt Shuster
Spencer Poff
Moya Chen
Douwe Kiela
Jason Weston
HILM
97
746
0
15 Apr 2021
REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu
Kenton Lee
Zora Tung
Panupong Pasupat
Ming-Wei Chang
RALM
147
2,121
0
10 Feb 2020
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Zhilin Yang
Peng Qi
Saizheng Zhang
Yoshua Bengio
William W. Cohen
Ruslan Salakhutdinov
Christopher D. Manning
RALM
215
2,703
0
25 Sep 2018
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
Mandar Joshi
Eunsol Choi
Daniel S. Weld
Luke Zettlemoyer
RALM
242
2,692
0
09 May 2017
Sinkhorn Distances: Lightspeed Computation of Optimal Transportation Distances
Marco Cuturi
OT
222
4,294
0
04 Jun 2013