ResearchTrend.AI
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

2 November 2018
Alon Talmor
Jonathan Herzig
Nicholas Lourie
Jonathan Berant
    RALM

Papers citing "CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge"

50 / 377 papers shown
Title
CogLM: Tracking Cognitive Development of Large Language Models
Xinglin Wang
Peiwen Yuan
Shaoxiong Feng
Yiwei Li
Boyuan Pan
Heda Wang
Yao Hu
Kan Li
ELM
67
0
0
17 Aug 2024
Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions
Chenming Tang
Zhixiang Wang
Yunfang Wu
LRM
34
0
0
16 Aug 2024
NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models
Pranshu Pandya
Agney S Talwarr
Vatsal Gupta
Tushar Kataria
Dan Roth
Vivek Gupta
LRM
67
2
0
15 Jul 2024
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Catherine Tony
Nicolás E. Díaz Ferreyra
Markus Mutas
Salem Dhiff
Riccardo Scandariato
SILM
79
9
0
09 Jul 2024
Retrieved In-Context Principles from Previous Mistakes
Hao Sun
Yong-jia Jiang
Bo Wang
Yingyan Hou
Yan Zhang
Pengjun Xie
Fei Huang
63
1
0
08 Jul 2024
Progress or Regress? Self-Improvement Reversal in Post-training
Ting Wu
Xuefeng Li
Pengfei Liu
LRM
33
11
0
06 Jul 2024
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Song Wang
Peng Wang
Tong Zhou
Yushun Dong
Zhen Tan
Jundong Li
CoGe
63
7
0
02 Jul 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu
Xiaosen Zheng
Niklas Muennighoff
Guangtao Zeng
Longxu Dou
Tianyu Pang
Jing Jiang
Min Lin
MoE
74
41
1
01 Jul 2024
CPT: Consistent Proxy Tuning for Black-box Optimization
Yuanyang He
Zitong Huang
Xinxing Xu
Rick Siow Mong Goh
Salman Khan
W. Zuo
Yong Liu
Chun-Mei Feng
48
0
0
01 Jul 2024
VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation
Kun Qian
Shunji Wan
Claudia Tang
Youzhi Wang
Xuanming Zhang
Maximillian Chen
Zhou Yu
AAML
47
8
0
25 Jun 2024
Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs
Yi Fang
Moxin Li
Wenjie Wang
Hui Lin
Fuli Feng
LRM
65
5
0
17 Jun 2024
Demonstration Notebook: Finding the Most Suited In-Context Learning Example from Interactions
Yiming Tang
Bin Dong
38
0
0
16 Jun 2024
GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning
Zhen Xiang
Linzhi Zheng
Yanjie Li
Junyuan Hong
Qinbin Li
...
Zidi Xiong
Chulin Xie
Carl Yang
Dawn Song
Bo Li
LLMAG
45
24
0
13 Jun 2024
OLMES: A Standard for Language Model Evaluations
Yuling Gu
Oyvind Tafjord
Bailey Kuehl
Dany Haddad
Jesse Dodge
Hannaneh Hajishirzi
ELM
40
14
0
12 Jun 2024
Paraphrasing in Affirmative Terms Improves Negation Understanding
MohammadHossein Rezaei
Eduardo Blanco
44
1
0
11 Jun 2024
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Teun van der Weij
Felix Hofstätter
Ollie Jaffe
Samuel F. Brown
Francis Rhys Ward
ELM
47
22
0
11 Jun 2024
Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Yipeng Zhang
Haitao Mi
Helen Meng
CLL
KELM
81
5
0
10 Jun 2024
Benchmark Data Contamination of Large Language Models: A Survey
Cheng Xu
Shuhao Guan
Derek Greene
Mohand-Tahar Kechadi
ELM
ALM
38
39
0
06 Jun 2024
Every Answer Matters: Evaluating Commonsense with Probabilistic Measures
Qi Cheng
Michael Boratko
Pranay Kumar Yelugam
T. O’Gorman
Nalini Singh
Andrew McCallum
X. Li
ELM
LRM
40
3
0
06 Jun 2024
Cycles of Thought: Measuring LLM Confidence through Stable Explanations
Evan Becker
Stefano Soatto
45
6
0
05 Jun 2024
Break the Chain: Large Language Models Can be Shortcut Reasoners
Mengru Ding
Hanmeng Liu
Zhizhang Fu
Jian Song
Wenbo Xie
Yue Zhang
KELM
LRM
36
7
0
04 Jun 2024
ACCORD: Closing the Commonsense Measurability Gap
François Roewer-Després
Jinyue Feng
Zining Zhu
Frank Rudzicz
LRM
48
0
0
04 Jun 2024
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models
Yutao Zhu
Zhaoheng Huang
Zhicheng Dou
Ji-Rong Wen
RALM
56
5
0
30 May 2024
GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment
Yao Yao
Z. Li
Hai Zhao
29
5
0
30 May 2024
LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design
Rui Kong
Qiyang Li
Xinyu Fang
Qingtian Feng
Qingfeng He
Yazhu Dong
Weijun Wang
Yuanchun Li
Linghe Kong
Yunxin Liu
MoE
40
4
0
28 May 2024
Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
Cong Lu
Shengran Hu
Jeff Clune
LLMAG
47
10
0
24 May 2024
FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research
Jiajie Jin
Yutao Zhu
Xinyu Yang
Chenghao Zhang
Tong Zhao
Zhao Yang
Zhicheng Dou
Ji-Rong Wen
VLM
85
51
0
22 May 2024
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models
Zhangyue Yin
Qiushi Sun
Qipeng Guo
Zhiyuan Zeng
Xiaonan Li
...
Qinyuan Cheng
Ding Wang
Xiaofeng Mou
Xipeng Qiu
XuanJing Huang
LRM
46
4
0
21 May 2024
Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs
Siyu Lou
Yuntian Chen
Xiaodan Liang
Liang Lin
Quanshi Zhang
42
2
0
20 May 2024
DaVinci at SemEval-2024 Task 9: Few-shot prompting GPT-3.5 for Unconventional Reasoning
Suyash Vardhan Mathur
Akshett Rai Jindal
Manish Shrivastava
LRM
39
1
0
19 May 2024
Chain of Thoughtlessness? An Analysis of CoT in Planning
Kaya Stechly
Karthik Valmeekam
Subbarao Kambhampati
LRM
LM&Ro
75
43
0
08 May 2024
Lifelong Knowledge Editing for LLMs with Retrieval-Augmented Continuous Prompt Learning
Qizhou Chen
Taolin Zhang
Xiaofeng He
Dongyang Li
Chengyu Wang
Longtao Huang
Hui Xue
CLL
KELM
51
10
0
06 May 2024
General Purpose Verification for Chain of Thought Prompting
Robert Vacareanu
Anurag Pratik
Evangelia Spiliopoulou
Zheng Qi
Giovanni Paolini
Neha Ann John
Jie Ma
Yassine Benajiba
Miguel Ballesteros
LRM
35
8
0
30 Apr 2024
Small Language Models Need Strong Verifiers to Self-Correct Reasoning
Yunxiang Zhang
Muhammad Khalifa
Lajanugen Logeswaran
Jaekyeom Kim
Moontae Lee
Honglak Lee
Lu Wang
LRM
KELM
ReLM
31
31
0
26 Apr 2024
Examining the robustness of LLM evaluation to the distributional assumptions of benchmarks
Melissa Ailem
Katerina Marazopoulou
Charlotte Siska
James Bono
59
15
0
25 Apr 2024
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
Mihir Parmar
Nisarg Patel
Neeraj Varshney
Mutsumi Nakamura
Man Luo
Santosh Mashetty
Arindam Mitra
Chitta Baral
LRM
ReLM
ELM
38
24
0
23 Apr 2024
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems
Qihuang Zhong
Kang Wang
Ziyang Xu
Juhua Liu
Liang Ding
Bo Du
LRM
AIMat
63
3
0
23 Apr 2024
Towards smaller, faster decoder-only transformers: Architectural variants and their implications
Sathya Krishnan Suresh
P. Shunmugapriya
24
0
0
22 Apr 2024
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski
Boris Shaposhnikov
Alexey Malakhov
Nikita Surnachev
Yaroslav Aksenov
Ian Maksimov
Nikita Balagansky
Daniil Gavrilov
OffRL
54
28
0
15 Apr 2024
Distilling Reasoning Ability from Large Language Models with Adaptive Thinking
Xiao Chen
Sihang Zhou
K. Liang
Xinwang Liu
ReLM
LRM
37
4
0
14 Apr 2024
A Survey on the Integration of Generative AI for Critical Thinking in Mobile Networks
Athanasios Karapantelakis
Alexandros Nikou
Ajay Kattepur
Jean Martins
Leonid Mokrushin
S. Mohalik
Marin Orlic
Aneta Vulgarakis Feljan
29
1
0
10 Apr 2024
Calibrating the Confidence of Large Language Models by Eliciting Fidelity
Mozhi Zhang
Mianqiu Huang
Rundong Shi
Linsen Guo
Chong Peng
Peng Yan
Yaqian Zhou
Xipeng Qiu
29
10
0
03 Apr 2024
uTeBC-NLP at SemEval-2024 Task 9: Can LLMs be Lateral Thinkers?
Pouya Sadeghi
Amirhossein Abaskohi
Yadollah Yaghoobzadeh
LRM
ReLM
39
1
0
03 Apr 2024
HyperCLOVA X Technical Report
Kang Min Yoo
Jaegeun Han
Sookyo In
Heewon Jeon
Jisu Jeong
...
Hyunkyung Noh
Se-Eun Choi
Sang-Woo Lee
Jung Hwa Lim
Nako Sung
VLM
37
8
0
02 Apr 2024
Meta-Cognitive Analysis: Evaluating Declarative and Procedural Knowledge in Datasets and Large Language Models
Zhuoqun Li
Hongyu Lin
Yaojie Lu
Hao Xiang
Xianpei Han
Le Sun
41
1
0
14 Mar 2024
Yi: Open Foundation Models by 01.AI
01.AI
Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLM
LRM
150
511
0
07 Mar 2024
Learning to Maximize Mutual Information for Chain-of-Thought Distillation
Xin Chen
Hanxian Huang
Yanjun Gao
Yi Wang
Jishen Zhao
Ke Ding
35
12
0
05 Mar 2024
Right for Right Reasons: Large Language Models for Verifiable Commonsense Knowledge Graph Question Answering
Armin Toroghi
Willis Guo
Mohammad Mahdi Torabi pour
Scott Sanner
LRM
31
8
0
03 Mar 2024
Prediction-Powered Ranking of Large Language Models
Ivi Chatzi
Eleni Straitouri
Suhas Thejaswi
Manuel Gomez Rodriguez
ALM
29
5
0
27 Feb 2024
Automating Dataset Updates Towards Reliable and Timely Evaluation of Large Language Models
Jiahao Ying
Yixin Cao
Yushi Bai
Qianru Sun
Bo Wang
Wei Tang
Zhaojun Ding
Yizhe Yang
Xuanjing Huang
Shuicheng Yan
KELM
26
6
0
19 Feb 2024