arXiv: 2405.14766
Evaluating Large Language Models for Public Health Classification and Extraction Tasks


20 February 2025
Joshua Harris
Timothy Laurence
Leo Loman
Fan Grayson
Toby Nonnenmacher
Harry Long
Loes WalsGriffith
Amy Douglas
Holly Fountain
Stelios Georgiou
Jo Hardstaff
Kathryn Hopkins
Y-Ling Chi
G. Kuyumdzhieva
Lesley Larkin
Samuel Collins
Hamish Mohammed
Thomas Finnie
Luke Hounsome
Michael Borowitz
Steven Riley
    LM&MA, AI4MH

Papers citing "Evaluating Large Language Models for Public Health Classification and Extraction Tasks"

36 / 36 papers shown
Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information
Joshua Harris
Fan Grayson
Felix Feldman
Timothy Laurence
Toby Nonnenmacher
...
Leo Loman
Selina Patel
Thomas Finnie
Samuel Collins
Michael Borowitz
AI4MH, LM&MA, ELM
105
0
0
09 May 2025
An empirical study of LLaMA3 quantization: from LLMs to MLLMs
Wei Huang
Xingyu Zheng
Xudong Ma
Haotong Qin
Chengtao Lv
Hong Chen
Jie Luo
Xiaojuan Qi
Xianglong Liu
Michele Magno
MQ
110
42
0
22 Apr 2024
Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data
Yuting Guo
Anthony Ovadje
M. Al-garadi
Abeed Sarker
AI4MH
88
9
0
27 Mar 2024
Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions
Hanjie Chen
Zhouxiang Fang
Yash Singla
Mark Dredze
ELM, AI4MH
117
43
0
28 Feb 2024
FinBen: A Holistic Financial Benchmark for Large Language Models
Qianqian Xie
Weiguang Han
Zhengyu Chen
Ruoyu Xiang
Xiao Zhang
...
Yanzhao Lai
Hao Wang
Min Peng
Sophia Ananiadou
Jimin Huang
AIFin
90
48
0
20 Feb 2024
RareBench: Can LLMs Serve as Rare Diseases Specialists?
Xuanzhong Chen
Xiaohao Mao
Qihan Guo
Lun Wang
Shuyang Zhang
Ting Chen
ELM, LM&MA, AI4MH
88
25
0
09 Feb 2024
Evaluating and Enhancing Large Language Models Performance in Domain-specific Medicine: Osteoarthritis Management with DocOA
Xi Chen
M. You
Li Wang
Weizhi Liu
Yu Fu
Jie Xu
Shaoting Zhang
Gang Chen
Kang Li
Jian Li
ELM, LM&MA
29
4
0
20 Jan 2024
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Xiang Yue
Yuansheng Ni
Kai Zhang
Tianyu Zheng
Ruoqi Liu
...
Yibo Liu
Wenhao Huang
Huan Sun
Yu-Chuan Su
Wenhu Chen
OSLM, ELM, VLM
266
959
0
27 Nov 2023
LawBench: Benchmarking Legal Knowledge of Large Language Models
Zhiwei Fei
Xiaoyu Shen
D. Zhu
Fengzhe Zhou
Zhuo Han
Songyang Zhang
Kai-xiang Chen
Zongwen Shen
Jidong Ge
ELM, AILaw
110
47
0
28 Sep 2023
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon
Zhuohan Li
Siyuan Zhuang
Ying Sheng
Lianmin Zheng
Cody Hao Yu
Joseph E. Gonzalez
Haotong Zhang
Ion Stoica
VLM
196
2,322
0
12 Sep 2023
Bias and Fairness in Large Language Models: A Survey
Isabel O. Gallegos
Ryan Rossi
Joe Barrow
Md Mehrab Tanjim
Sungchul Kim
Franck Dernoncourt
Tong Yu
Ruiyi Zhang
Nesreen Ahmed
AILaw
114
594
0
02 Sep 2023
Challenges and Applications of Large Language Models
Jean Kaddour
J. Harris
Maximilian Mozes
Herbie Bradley
Roberta Raileanu
R. McHardy
UQCV, ALM, AAML
80
313
0
19 Jul 2023
A Survey on Evaluation of Large Language Models
Yu-Chu Chang
Xu Wang
Jindong Wang
Yuanyi Wu
Linyi Yang
...
Yue Zhang
Yi-Ju Chang
Philip S. Yu
Qian Yang
Xingxu Xie
ELM, LM&MA, ALM
156
1,723
0
06 Jul 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM, OSLM, ELM
441
4,444
0
09 Jun 2023
Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models
Aleksa Bisercic
Mladen Nikolic
M. Schaar
Boris Delibasic
Pietro Lio
Andrija Petrović
80
17
0
08 Jun 2023
Can LLMs like GPT-4 outperform traditional AI tools in dementia diagnosis? Maybe, but not today
Zhuo Wang
R. Li
Bowen Dong
Jie Wang
Xiuxing Li
...
C. Mao
Wei Zhang
L. Dong
Jing Gao
Jianyong Wang
LM&MA, ELM, AI4MH
72
20
0
02 Jun 2023
What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks
Taicheng Guo
Kehan Guo
B. Nan
Zhengwen Liang
Zhichun Guo
Nitesh Chawla
Olaf Wiest
Xiangliang Zhang
ELM
139
141
0
27 May 2023
BioDEX: Large-Scale Biomedical Adverse Drug Event Extraction for Real-World Pharmacovigilance
Karel D'Oosterlinck
François Remy
Johannes Deleu
Thomas Demeester
Chris Develder
Klim Zaporojets
Aneiss Ghodsi
Simon Ellershaw
Jack R. Collins
Christopher Potts
82
11
0
22 May 2023
Summarizing, Simplifying, and Synthesizing Medical Evidence Using GPT-3 (with Varying Success)
Chantal Shaib
Millicent Li
Sebastian Antony Joseph
Iain J. Marshall
Junyi Jessy Li
Byron C. Wallace
LM&MA, ELM
68
67
0
10 May 2023
Can Large Language Models Be an Alternative to Human Evaluations?
Cheng-Han Chiang
Hung-yi Lee
ALM, LM&MA
280
631
0
03 May 2023
GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models
Tyna Eloundou
Sam Manning
Pamela Mishkin
Daniel Rock
ELM
66
401
0
17 Mar 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG, MLLM
1.5K
14,761
0
15 Mar 2023
Can GPT-3 Perform Statutory Reasoning?
Andrew Blair-Stanek
Nils Holzenberger
Benjamin Van Durme
ELM, LRM
109
100
0
13 Feb 2023
GPT as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities
Jillian Bommarito
M. Bommarito
Daniel Martin Katz
Jessica Katz
ELM
52
54
0
11 Jan 2023
GPT Takes the Bar Exam
M. Bommarito
Daniel Martin Katz
ELM
77
155
0
29 Dec 2022
Legal Prompting: Teaching a Language Model to Think Like a Lawyer
Fang Yu
Lee Quartey
Frank Schilder
ELM, LRM
42
68
0
02 Dec 2022
Galactica: A Large Language Model for Science
Ross Taylor
Marcin Kardas
Guillem Cucurull
Thomas Scialom
Anthony Hartshorn
Elvis Saravia
Andrew Poulton
Viktor Kerkez
Robert Stojnic
ELM, ReLM
117
778
0
16 Nov 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM, LRM
231
3,158
0
20 Oct 2022
News Summarization and Evaluation in the Era of GPT-3
Tanya Goyal
Junyi Jessy Li
Greg Durrett
ELM
110
411
0
26 Sep 2022
Can large language models reason about medical questions?
Valentin Liévin
C. Hother
Andreas Geert Motzfeldt
Ole Winther
ELM, LM&MA, AI4MH, LRM
101
314
0
17 Jul 2022
Large Language Models are Few-Shot Clinical Information Extractors
Monica Agrawal
S. Hegselmann
Hunter Lang
Yoon Kim
David Sontag
BDL, LM&MA
241
347
0
25 May 2022
BBQ: A Hand-Built Bias Benchmark for Question Answering
Alicia Parrish
Angelica Chen
Nikita Nangia
Vishakh Padmakumar
Jason Phang
Jana Thompson
Phu Mon Htut
Sam Bowman
270
425
0
15 Oct 2021
The RareDis corpus: a corpus annotated with rare diseases, their signs and symptoms
Claudia Martínez-de Miguel
Isabel Segura-Bedmar
E. Chacón-Solano
S. Guerrero-Aspizua
25
20
0
02 Aug 2021
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM, RALM
187
4,572
0
07 Sep 2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
490
20,342
0
23 Oct 2019
PubMedQA: A Dataset for Biomedical Research Question Answering
Qiao Jin
Bhuwan Dhingra
Zhengping Liu
William W. Cohen
Xinghua Lu
398
913
0
13 Sep 2019