arXiv: 1909.06146
PubMedQA: A Dataset for Biomedical Research Question Answering


13 September 2019
Qiao Jin
Bhuwan Dhingra
Zhengping Liu
William W. Cohen
Xinghua Lu

Papers citing "PubMedQA: A Dataset for Biomedical Research Question Answering"

50 / 521 papers shown
WixQA: A Multi-Dataset Benchmark for Enterprise Retrieval-Augmented Generation
Dvir Cohen
Lin Burg
Sviatoslav Pykhnivskyi
Hagit Gur
Stanislav Kovynov
Olga Atzmon
Gilad Barkan
RALM
21
0
0
13 May 2025
HealthBench: Evaluating Large Language Models Towards Improved Human Health
Rahul Arora
Jason W. Wei
Rebecca Soskin Hicks
Preston Bowman
Joaquin Quiñonero Candela
...
Meghan Shah
Andrea Vallone
Alex Beutel
Johannes Heidecke
K. Singhal
LM&MA
AI4MH
ELM
47
0
0
13 May 2025
CellVerse: Do Large Language Models Really Understand Cell Biology?
Fan Zhang
Tianyu Liu
Zhihong Zhu
Hao Wu
H. Wang
Donghao Zhou
Yefeng Zheng
Kun Wang
X. Wu
Pheng-Ann Heng
ELM
36
0
0
09 May 2025
Query-driven Document-level Scientific Evidence Extraction from Biomedical Studies
Massimiliano Pronesti
Joao Bettencourt-Silva
Paul Flanagan
Alessandra Pascale
Oisin Redmond
Anya Belz
Yufang Hou
36
0
0
09 May 2025
LLMs Outperform Experts on Challenging Biology Benchmarks
Lennart Justen
ELM
23
0
0
09 May 2025
ChemRxivQuest: A Curated Chemistry Question-Answer Database Extracted from ChemRxiv Preprints
Mahmoud Amiri
Thomas Bocklitz
48
0
0
08 May 2025
HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights
Ozan Gokdemir
Carlo Siebenschuh
Alexander Brace
Azton Wells
Brian Hsu
...
A. Anandkumar
Ian Foster
R. Stevens
V. Vishwanath
A. Ramanathan
VLM
37
0
0
07 May 2025
MIMIC-IV-Ext-22MCTS: A 22 Millions-Event Temporal Clinical Time-Series Dataset with Relative Timestamp for Risk Prediction
J. Wang
Xing Niu
Juyong Kim
Jie Shen
Tong Zhang
Jeremy C Weiss
29
0
0
01 May 2025
OET: Optimization-based prompt injection Evaluation Toolkit
Jinsheng Pan
Xiaogeng Liu
Chaowei Xiao
AAML
69
0
0
01 May 2025
EnronQA: Towards Personalized RAG over Private Documents
Michael J. Ryan
Danmei Xu
Chris Nivera
Daniel Campos
SILM
67
0
0
01 May 2025
HalluMix: A Task-Agnostic, Multi-Domain Benchmark for Real-World Hallucination Detection
Deanna Emery
Michael Goitia
Freddie Vargus
Iulia Neagu
HILM
VLM
56
0
0
01 May 2025
Talk Before You Retrieve: Agent-Led Discussions for Better RAG in Medical QA
Xuanzhao Dong
Wenhui Zhu
Hao Wang
Xiwen Chen
Peijie Qiu
Rui Yin
Yi Su
Y. Wang
RALM
MedIm
52
0
0
30 Apr 2025
Multimodal Large Language Models for Medicine: A Comprehensive Survey
Jiarui Ye
Hao Tang
LM&MA
86
0
0
29 Apr 2025
BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text
Jiageng Wu
Bowen Gu
Ren Zhou
Kevin Xie
Doug Snyder
...
S.
Jonathan H. Chen
Santiago Romero-Brufau
K. J. Lin
Jie Yang
LM&MA
ELM
92
0
0
28 Apr 2025
m-KAILIN: Knowledge-Driven Agentic Scientific Corpus Distillation Framework for Biomedical Large Language Models Training
Meng Xiao
Xunxin Cai
Chengrui Wang
Yuanchun Zhou
48
0
0
28 Apr 2025
Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization
Wataru Kawakami
Keita Suzuki
Junichiro Iwasawa
LRM
68
0
0
25 Apr 2025
The Rise of Small Language Models in Healthcare: A Comprehensive Survey
Muskan Garg
Shaina Raza
Shebuti Rayana
Xingyi Liu
Sunghwan Sohn
LM&MA
AILaw
87
0
0
23 Apr 2025
Exploring How LLMs Capture and Represent Domain-Specific Knowledge
Mirian Hipolito Garcia
Camille Couturier
Daniel Madrigal Diaz
Ankur Mallick
Anastasios Kyrillidis
Robert Sim
Victor Rühle
Saravan Rajmohan
30
0
0
23 Apr 2025
Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark
Jasper Götting
Pedro Medeiros
Jon G Sanders
Nathaniel Li
Long Phan
Karam Elabd
Lennart Justen
Dan Hendrycks
Seth Donoughe
ELM
49
2
0
21 Apr 2025
Med-CoDE: Medical Critique based Disagreement Evaluation Framework
Mohit Gupta
Akiko Aizawa
R. Shah
LM&MA
ELM
30
0
0
21 Apr 2025
How Well Can General Vision-Language Models Learn Medicine By Watching Public Educational Videos?
Rahul Thapa
Andrew Li
Qingyang Wu
B. He
Yuki Sahashi
...
Angela Zhang
Ben Athiwaratkun
S. Song
David Ouyang
James Y. Zou
LM&MA
45
0
0
19 Apr 2025
CPG-EVAL: A Multi-Tiered Benchmark for Evaluating the Chinese Pedagogical Grammar Competence of Large Language Models
Dong Wang
ELM
28
0
0
17 Apr 2025
A Scoping Review of Natural Language Processing in Addressing Medically Inaccurate Information: Errors, Misinformation, and Hallucination
Zhaoyi Sun
Wen-wai Yim
Özlem Uzuner
Fei Xia
Meliha Yetisgen
40
0
0
16 Apr 2025
Cancer-Myth: Evaluating AI Chatbot on Patient Questions with False Presuppositions
Wang Zhu
Tianqi Chen
Ching Ying Lin
Jade Law
Mazen Jizzini
Jorge J. Nieva
Ruishan Liu
Robin Jia
34
0
0
15 Apr 2025
Benchmarking Biopharmaceuticals Retrieval-Augmented Generation Evaluation
Hanmeng Zhong
Linqing Chen
Weilei Wang
Wentao Wu
28
0
0
15 Apr 2025
SilVar-Med: A Speech-Driven Visual Language Model for Explainable Abnormality Detection in Medical Imaging
Tan-Hanh Pham
Chris Ngo
Trong-Duong Bui
Minh Luu Quang
Tan-Huong Pham
Truong Son-Hy
27
0
0
14 Apr 2025
NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding
Aniket Pal
Sanket Biswas
Alloy Das
Ayush Lodh
Priyanka Banerjee
Soumitri Chattopadhyay
Dimosthenis Karatzas
Josep Lladós
C. V. Jawahar
VLM
32
0
0
12 Apr 2025
A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions
Chengyu Wang
Taolin Zhang
Richang Hong
Jun Huang
ReLM
LRM
37
1
0
12 Apr 2025
Efficient Tuning of Large Language Models for Knowledge-Grounded Dialogue Generation
Bo Zhang
Hui Ma
Dailin Li
Jian Ding
Jian Wang
Bo Xu
Hongfei Lin
KELM
42
0
0
10 Apr 2025
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
Dongyang Fan
Vinko Sabolčec
Matin Ansaripour
Ayush Kumar Tarun
Martin Jaggi
Antoine Bosselut
Imanol Schlag
34
0
0
08 Apr 2025
Cognitive Debiasing Large Language Models for Decision-Making
Yougang Lyu
Shijie Ren
Yue Feng
Zihan Wang
Z. Chen
Z. Z. Ren
Maarten de Rijke
36
0
0
05 Apr 2025
Biomedical Question Answering via Multi-Level Summarization on a Local Knowledge Graph
Lingxiao Guan
Y. Huang
Jie Liu
43
0
0
02 Apr 2025
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding
Sakhinana Sagar Srinivas
Venkataramana Runkana
OffRL
45
1
0
02 Apr 2025
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
Jian Zhao
Runze Liu
Kaiyan Zhang
Zhimu Zhou
Junqi Gao
...
Jiafei Lyu
Zhouyi Qian
Biqing Qi
Xiu Li
Bowen Zhou
OffRL
LRM
37
2
0
01 Apr 2025
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
Juncheng Wu
Wenlong Deng
X. Li
Sheng Liu
Taomian Mi
...
Yihan Cao
Hui Ren
X. Li
Xiaoxiao Li
Yuyin Zhou
AI4MH
LRM
61
2
0
01 Apr 2025
RECKON: Large-scale Reference-based Efficient Knowledge Evaluation for Large Language Model
Lin Zhang
Zhouhong Gu
Xiaoran Shi
Hongwei Feng
Yanghua Xiao
41
0
0
01 Apr 2025
WHERE and WHICH: Iterative Debate for Biomedical Synthetic Data Augmentation
Zhengyi Zhao
Shubo Zhang
Bin Liang
Binyang Li
Kam-Fai Wong
SyDa
49
0
0
31 Mar 2025
Efficient Inference for Large Reasoning Models: A Survey
Y. Liu
Jiaying Wu
Yufei He
Hongcheng Gao
Hongyu Chen
Baolong Bi
Jiaheng Zhang
Zhiqi Huang
Bryan Hooi
LLMAG
LRM
65
7
0
29 Mar 2025
Patience is all you need! An agentic system for performing scientific literature review
David Brett
Anniek Myatt
25
0
0
28 Mar 2025
3MDBench: Medical Multimodal Multi-agent Dialogue Benchmark
Ivan Sviridov
Amina Miftakhova
Artemiy Tereshchenko
Galina Zubkova
Pavel Blinov
Andrey Savchenko
LM&MA
29
0
0
26 Mar 2025
TempTest: Local Normalization Distortion and the Detection of Machine-generated Text
Tom Kempton
Stuart Burrell
Connor Cheverall
DeLMO
111
0
0
26 Mar 2025
Experience Retrieval-Augmentation with Electronic Health Records Enables Accurate Discharge QA
Justice Ou
Tinglin Huang
Yilun Zhao
Ziyang Yu
Peiqing Lu
Rex Ying
RALM
53
0
0
23 Mar 2025
Understanding the Effects of RLHF on the Quality and Detectability of LLM-Generated Texts
Beining Xu
Arkaitz Zubiaga
DeLMO
68
0
0
23 Mar 2025
Evaluating Clinical Competencies of Large Language Models with a General Practice Benchmark
Z. Li
Yiying Yang
Jiping Lang
Wenhao Jiang
Yuhang Zhao
...
Yuhua Bi
Xiaofei Zeng
Yixian Chen
Junrong Chen
Lin Yao
AI4MH
LM&MA
ELM
41
0
0
22 Mar 2025
SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging
Aladin Djuhera
S. Kadhe
Farhan Ahmed
Syed Zawad
Holger Boche
MoMe
49
0
0
21 Mar 2025
Advancing Problem-Based Learning in Biomedical Engineering in the Era of Generative AI
Micky C. Nnamdi
J. Ben Tamo
Wenqi Shi
M. D. Wang
AI4CE
43
0
0
20 Mar 2025
CARE: A QLoRA-Fine Tuned Multi-Domain Chatbot With Fast Learning On Minimal Hardware
Ankit Dutta
Nabarup Ghosh
Ankush Chatterjee
53
0
0
18 Mar 2025
MDTeamGPT: A Self-Evolving LLM-based Multi-Agent Framework for Multi-Disciplinary Team Medical Consultation
Kai-xiang Chen
X. Li
Tianpei Yang
Hewei Wang
Wei Dong
Yang Gao
LLMAG
LM&MA
74
2
0
18 Mar 2025
A Survey on the Optimization of Large Language Model-based Agents
Shangheng Du
Jiabao Zhao
Jinxin Shi
Zhentao Xie
Xin Jiang
Yanhong Bai
Liang He
LLMAG
LM&Ro
LM&MA
200
0
0
16 Mar 2025
Fragile Mastery: Are Domain-Specific Trade-Offs Undermining On-Device Language Models?
Basab Jha
Firoj Paudel
37
0
0
16 Mar 2025