2501.10642
Iterative Tree Analysis for Medical Critics
18 January 2025
Zenan Huang
Mingwei Li
Zheng Zhou
Youxin Jiang
Papers citing
"Iterative Tree Analysis for Medical Critics"
30 / 30 papers shown
RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models
Hieu Tran
Zonghai Yao
Junda Wang
Yifan Zhang
Zhichao Yang
Hong-ye Yu
LRM
184
7
0
03 Dec 2024
From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond
Harsha Nori
Naoto Usuyama
Nicholas King
S. McKinney
Xavier Fernandes
Sheng Zhang
Eric Horvitz
LRM
LM&MA
ELM
VLM
109
13
0
06 Nov 2024
FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation
Farima Fatahi Bayat
Lechen Zhang
Sheza Munir
Lu Wang
HILM
116
4
0
29 Oct 2024
Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval
Ingeol Baek
Hwan Chang
Byeongjeong Kim
Jimin Lee
Hwanhee Lee
RALM
162
5
0
17 Oct 2024
Accelerating Inference of Networks in the Frequency Domain
Chenqiu Zhao
Guanfang Dong
Anup Basu
124
20
0
06 Oct 2024
Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach
Zhouyu Jiang
Mengshu Sun
Lei Liang
Qing Cui
RALM
155
14
0
18 Jul 2024
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation
Wen Luo
Tianshu Shen
Wei Li
Guangyue Peng
Richeng Xuan
Houfeng Wang
Xi Yang
HILM
111
12
0
11 Jun 2024
MedFuzz: Exploring the Robustness of Large Language Models in Medical Question Answering
Robert Osazuwa Ness
Katie Matton
Hayden Helm
Sheng Zhang
Junaid Bajwa
Carey E. Priebe
Eric Horvitz
ELM
64
13
0
03 Jun 2024
RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models
Xiangkun Hu
Dongyu Ru
Lin Qiu
Qipeng Guo
Tianhang Zhang
Yang Xu
Yun Luo
Pengfei Liu
Yue Zhang
Zheng Zhang
HILM
LRM
98
9
0
23 May 2024
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments
Samuel Schmidgall
Rojin Ziaei
Carl Harris
Eduardo Reis
Jeffrey Jopling
Michael Moor
238
55
0
13 May 2024
Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity
Soyeong Jeong
Jinheon Baek
Sukmin Cho
Sung Ju Hwang
Jong C. Park
RALM
129
187
0
21 Mar 2024
DRAGIN: Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models
Weihang Su
Yichen Tang
Qingyao Ai
Zhijing Wu
Yiqun Liu
3DV
RALM
AI4TS
SyDa
93
21
0
15 Mar 2024
RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering
Zihan Zhang
Meng Fang
Ling-Hao Chen
RALM
96
14
0
26 Feb 2024
Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models
Hanxing Ding
Liang Pang
Zihao Wei
Huawei Shen
Xueqi Cheng
HILM
RALM
146
18
0
16 Feb 2024
AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator
Zhihao Fan
Jialong Tang
Wei Chen
Siyuan Wang
Zhongyu Wei
Jun Xi
Fei Huang
Jingren Zhou
LM&MA
148
30
0
15 Feb 2024
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
Dongping Chen
Ruoxi Chen
Shilin Zhang
Yinuo Liu
Yaochen Wang
Huichi Zhou
Qihui Zhang
Yao Wan
Pan Zhou
Lichao Sun
ELM
66
123
0
07 Feb 2024
Evaluating Hallucinations in Chinese Large Language Models
Qinyuan Cheng
Tianxiang Sun
Wenwei Zhang
Siyin Wang
Xiangyang Liu
...
Junliang He
Mianqiu Huang
Zhangyue Yin
Kai Chen
Xipeng Qiu
HILM
ELM
96
27
0
05 Oct 2023
FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation
Tu Vu
Mohit Iyyer
Xuezhi Wang
Noah Constant
Jerry W. Wei
...
Chris Tar
Yun-hsuan Sung
Denny Zhou
Quoc Le
Thang Luong
KELM
HILM
LRM
137
219
0
05 Oct 2023
FELM: Benchmarking Factuality Evaluation of Large Language Models
Shiqi Chen
Yiran Zhao
Jinghan Zhang
Ethan Chern
Siyang Gao
Pengfei Liu
Junxian He
HILM
131
41
0
01 Oct 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
631
4,460
0
09 Jun 2023
Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy
Zhihong Shao
Yeyun Gong
Yelong Shen
Minlie Huang
Nan Duan
Weizhu Chen
RALM
LRM
KELM
130
263
0
24 May 2023
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM
ALM
259
705
0
23 May 2023
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models
Junyi Li
Xiaoxue Cheng
Wayne Xin Zhao
J. Nie
Ji-Rong Wen
HILM
VLM
123
254
0
19 May 2023
Active Retrieval Augmented Generation
Zhengbao Jiang
Frank F. Xu
Luyu Gao
Zhiqing Sun
Qian Liu
Jane Dwivedi-Yu
Yiming Yang
Jamie Callan
Graham Neubig
RALM
109
295
0
11 May 2023
Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks
Shicheng Xu
Liang Pang
Huawei Shen
Xueqi Cheng
Tat-Seng Chua
RALM
KELM
LRM
197
48
0
28 Apr 2023
Larger language models do in-context learning differently
Jerry W. Wei
Jason W. Wei
Yi Tay
Dustin Tran
Albert Webson
...
Xinyun Chen
Hanxiao Liu
Da Huang
Denny Zhou
Tengyu Ma
ReLM
LRM
125
374
0
07 Mar 2023
Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions
H. Trivedi
Niranjan Balasubramanian
Tushar Khot
Ashish Sabharwal
KELM
RALM
LRM
168
476
0
20 Dec 2022
SciFact-Open: Towards open-domain scientific claim verification
David Wadden
Kyle Lo
Bailey Kuehl
Arman Cohan
Iz Beltagy
Lucy Lu Wang
Hannaneh Hajishirzi
LRM
99
63
0
25 Oct 2022
What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams
Di Jin
Eileen Pan
Nassim Oufattole
W. Weng
Hanyi Fang
Peter Szolovits
FaML
ELM
LM&MA
146
820
0
28 Sep 2020
A Question-Entailment Approach to Question Answering
Asma Ben Abacha
Dina Demner-Fushman
88
196
0
23 Jan 2019