Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?

2 July 2024
Nishant Balepur
Rachel Rudinger
arXiv: 2407.01992

Papers citing "Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?"

8 / 8 papers shown
What the HellaSwag? On the Validity of Common-Sense Reasoning Benchmarks
Pavel Chizhov, Mattia Nee, Pierre-Carl Langlais, Ivan P. Yamshchikov
ReLM, ELM, LRM
39
1
0
10 Apr 2025
It is Too Many Options: Pitfalls of Multiple-Choice Questions in Generative AI and Medical Education
Shrutika Singh, Anton Alyakin, Daniel Alber, Jaden Stryker, Ai Phuong S Tong, ..., Mathew de la Paz, Miguel Hernandez-Rovira, Ki Yun Park, Eric Leuthardt, E. Oermann
AI4MH, AI4Ed, ELM
64
1
0
13 Mar 2025
AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science
Chenyue Li, Wen Deng, Mengqian Lu, Binhang Yuan
ELM, AI4Cl, LRM
90
0
0
03 Feb 2025
TQA-Bench: Evaluating LLMs for Multi-Table Question Answering with Scalable Context and Symbolic Extension
Zipeng Qiu, You Peng, Guangxin He, Binhang Yuan, Chen Wang
LMTD
106
2
0
29 Nov 2024
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
Vipul Gupta, Candace Ross, David Pantoja, R. Passonneau, Megan Ung, Adina Williams
76
1
0
26 Oct 2024
Plausibly Problematic Questions in Multiple-Choice Benchmarks for Commonsense Reasoning
Shramay Palta, Nishant Balepur, Peter Rankel, Sarah Wiegreffe, Marine Carpuat, Rachel Rudinger
ELM
31
4
0
06 Oct 2024
Leveraging Large Language Models for Multiple Choice Question Answering
Joshua Robinson, Christopher Rytting, David Wingate
ELM
143
186
0
22 Oct 2022
Better Distractions: Transformer-based Distractor Generation and Multiple Choice Question Filtering
J. Offerijns, Suzan Verberne, Tessa Verhoef
18
26
0
19 Oct 2020