Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2201.05320
Cited By
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification
14 January 2022
Alon Talmor
Ori Yoran
Ronan Le Bras
Chandrasekhar Bhagavatula
Yoav Goldberg
Yejin Choi
Jonathan Berant
ELM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CommonsenseQA 2.0: Exposing the Limits of AI through Gamification"
50 / 105 papers shown
Title
Causal Parrots: Large Language Models May Talk Causality But Are Not Causal
Matej Zečević
Moritz Willig
Devendra Singh Dhami
Kristian Kersting
LRM
30
102
0
24 Aug 2023
Teaching Smaller Language Models To Generalise To Unseen Compositional Questions
Tim Hartill
N. Tan
Michael Witbrock
Patricia J. Riddle
ReLM
KELM
LRM
34
2
0
02 Aug 2023
CommonsenseVIS: Visualizing and Understanding Commonsense Reasoning Capabilities of Natural Language Models
Xingbo Wang
Renfei Huang
Zhihua Jin
Tianqing Fang
Huamin Qu
VLM
ReLM
LRM
40
1
0
23 Jul 2023
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
Seonghyeon Ye
Doyoung Kim
Sungdong Kim
Hyeonbin Hwang
Seungone Kim
Yongrae Jo
James Thorne
Juho Kim
Minjoon Seo
ALM
46
99
0
20 Jul 2023
Toward Grounded Commonsense Reasoning
Minae Kwon
Hengyuan Hu
Vivek Myers
Siddharth Karamcheti
Anca Dragan
Dorsa Sadigh
LM&Ro
ReLM
LRM
42
9
0
14 Jun 2023
Fighting Bias with Bias: Promoting Model Robustness by Amplifying Dataset Biases
Yuval Reif
Roy Schwartz
28
7
0
30 May 2023
UFO: Unified Fact Obtaining for Commonsense Question Answering
Zhifeng Li
Yifan Fan
Bowei Zou
Yu Hong
HILM
LRM
32
1
0
25 May 2023
Getting MoRE out of Mixture of Language Model Reasoning Experts
Chenglei Si
Weijia Shi
Chen Zhao
Luke Zettlemoyer
Jordan L. Boyd-Graber
LRM
26
24
0
24 May 2023
Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate
Boshi Wang
Xiang Yue
Huan Sun
ELM
LRM
46
60
0
22 May 2023
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
Oana Ignat
Zhijing Jin
Artem Abzaliev
Laura Biester
Santiago Castro
...
Verónica Pérez-Rosas
Siqi Shen
Zekun Wang
Winston Wu
Rada Mihalcea
LRM
41
6
0
21 May 2023
Evaluation of medium-large Language Models at zero-shot closed book generative question answering
René Peinl
Johannes Wirth
ELM
26
7
0
19 May 2023
KEPR: Knowledge Enhancement and Plausibility Ranking for Generative Commonsense Question Answering
Zhifeng Li
Bowei Zou
Yifan Fan
Yu Hong
24
3
0
15 May 2023
Are Machine Rationales (Not) Useful to Humans? Measuring and Improving Human Utility of Free-Text Rationales
Brihi Joshi
Ziyi Liu
Sahana Ramnath
Aaron Chan
Zhewei Tong
Shaoliang Nie
Qifan Wang
Yejin Choi
Xiang Ren
HAI
LRM
34
29
0
11 May 2023
Decker: Double Check with Heterogeneous Knowledge for Commonsense Fact Verification
Anni Zou
Zhuosheng Zhang
Hai Zhao
HILM
37
6
0
10 May 2023
CAT: A Contextualized Conceptualization and Instantiation Framework for Commonsense Reasoning
Weiqi Wang
Tianqing Fang
Baixuan Xu
Chun Yi Louis Bo
Yangqiu Song
Lei Chen
ReLM
LRM
25
34
0
08 May 2023
Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements
Jiacheng Liu
Wenya Wang
Dianzhuo Wang
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
VLM
43
48
0
05 May 2023
Chain-of-Skills: A Configurable Model for Open-domain Question Answering
Kaixin Ma
Hao Cheng
Yu Zhang
Xiaodong Liu
Eric Nyberg
Jianfeng Gao
LRM
28
16
0
04 May 2023
Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs
Jinyang Li
Binyuan Hui
Ge Qu
Jiaxi Yang
Binhua Li
...
Guoliang Li
Kevin C. C. Chang
Fei Huang
Reynold Cheng
Yongbin Li
LMTD
59
363
0
04 May 2023
A Neural Divide-and-Conquer Reasoning Framework for Image Retrieval from Linguistically Complex Text
Yunxin Li
Baotian Hu
Yuxin Ding
Lin Ma
Hao Fei
25
5
0
03 May 2023
LINGO : Visually Debiasing Natural Language Instructions to Support Task Diversity
Anjana Arunkumar
Shubham Sharma
Rakhi Agrawal
Sriramakrishnan Chandrasekaran
Chris Bryan
34
0
0
12 Apr 2023
Natural Language Reasoning, A Survey
Fei Yu
Hongbo Zhang
Prayag Tiwari
Benyou Wang
ReLM
LRM
49
53
0
26 Mar 2023
Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images
Nitzan Bitton-Guetta
Yonatan Bitton
Jack Hessel
Ludwig Schmidt
Yuval Elovici
Gabriel Stanovsky
Roy Schwartz
VLM
121
66
0
13 Mar 2023
Commonsense Reasoning for Conversational AI: A Survey of the State of the Art
Christopher Richardson
Larry Heck
LRM
30
8
0
15 Feb 2023
Benchmarks for Automated Commonsense Reasoning: A Survey
E. Davis
ELM
LRM
24
58
0
09 Feb 2023
Real-Time Visual Feedback to Guide Benchmark Creation: A Human-and-Metric-in-the-Loop Workflow
Anjana Arunkumar
Swaroop Mishra
Bhavdeep Singh Sachdeva
Chitta Baral
Chris Bryan
30
0
0
09 Feb 2023
LaMPP: Language Models as Probabilistic Priors for Perception and Action
Belinda Z. Li
William Chen
Pratyusha Sharma
Jacob Andreas
24
15
0
03 Feb 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
99
35
0
01 Jan 2023
GPT-Neo for commonsense reasoning -- a theoretical and practical lens
Rohan Kashyap
Vivek Kashyap
Narendra C.P
ReLM
ELM
LRM
38
7
0
28 Nov 2022
A Simple, Yet Effective Approach to Finding Biases in Code Generation
Spyridon Mouselinos
Mateusz Malinowski
Henryk Michalewski
18
7
0
31 Oct 2022
Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering
Deepanway Ghosal
Navonil Majumder
Rada Mihalcea
Soujanya Poria
58
10
0
29 Oct 2022
Retrieval Augmentation for Commonsense Reasoning: A Unified Approach
W. Yu
Chenguang Zhu
Zhihan Zhang
Shuohang Wang
Zhuosheng Zhang
Yuwei Fang
Meng Jiang
LRM
ReLM
15
19
0
23 Oct 2022
Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge
Kaixin Ma
Hao Cheng
Xiaodong Liu
Eric Nyberg
Jianfeng Gao
LRM
144
15
0
22 Oct 2022
Task Compass: Scaling Multi-task Pre-training with Task Prefix
Zhuosheng Zhang
Shuohang Wang
Yichong Xu
Yuwei Fang
W. Yu
Yang Liu
Han Zhao
Chenguang Zhu
Michael Zeng
SSL
LRM
25
16
0
12 Oct 2022
Mind's Eye: Grounded Language Model Reasoning through Simulation
Ruibo Liu
Jason W. Wei
S. Gu
Te-Yen Wu
Soroush Vosoughi
Claire Cui
Denny Zhou
Andrew M. Dai
ReLM
LRM
118
79
0
11 Oct 2022
Unpacking Large Language Models with Conceptual Consistency
Pritish Sahu
Michael Cogswell
Yunye Gong
Ajay Divakaran
LRM
87
16
0
29 Sep 2022
Elaboration-Generating Commonsense Question Answering at Scale
Wenya Wang
Vivek Srikumar
Hannaneh Hajishirzi
Noah A. Smith
ELM
LRM
37
15
0
02 Sep 2022
On Reality and the Limits of Language Data: Aligning LLMs with Human Norms
Nigel Collier
Fangyu Liu
Ehsan Shareghi
16
3
0
25 Aug 2022
Evaluate Confidence Instead of Perplexity for Zero-shot Commonsense Reasoning
Letian Peng
Z. Li
Hai Zhao
ReLM
LRM
18
1
0
23 Aug 2022
WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models
Yonatan Bitton
Nitzan Bitton-Guetta
Ron Yosef
Yuval Elovici
Joey Tianyi Zhou
Gabriel Stanovsky
Roy Schwartz
25
19
0
25 Jul 2022
Hidden Schema Networks
Ramses J. Sanchez
L. Conrads
Pascal Welke
K. Cvejoski
C. Ojeda
NAI
MILM
19
3
0
08 Jul 2022
DFM: Dialogue Foundation Model for Universal Large-Scale Dialogue-Oriented Task Learning
Zhi Chen
Jijia Bao
Lu Chen
Yuncong Liu
Da Ma
...
Xinhsuai Dong
Fujiang Ge
Qingliang Miao
Jian-Guang Lou
Kai Yu
ALM
AI4CE
48
3
0
25 May 2022
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Jaehun Jung
Lianhui Qin
Sean Welleck
Faeze Brahman
Chandra Bhagavatula
Ronan Le Bras
Yejin Choi
ReLM
LRM
229
190
0
24 May 2022
UL2: Unifying Language Learning Paradigms
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
Xavier Garcia
Jason W. Wei
...
Tal Schuster
H. Zheng
Denny Zhou
N. Houlsby
Donald Metzler
AI4CE
59
297
0
10 May 2022
Great Truths are Always Simple: A Rather Simple Knowledge Encoder for Enhancing the Commonsense Reasoning Capacity of Pre-Trained Models
Jinhao Jiang
Kun Zhou
Wayne Xin Zhao
Ji-Rong Wen
24
15
0
04 May 2022
Inferring Implicit Relations in Complex Questions with Language Models
Uri Katz
Mor Geva
Jonathan Berant
ReLM
LRM
24
11
0
28 Apr 2022
Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning
Swarnadeep Saha
Prateek Yadav
Joey Tianyi Zhou
11
9
0
11 Apr 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
398
8,559
0
28 Jan 2022
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
Alisa Liu
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
79
212
0
16 Jan 2022
CRASS: A Novel Data Set and Benchmark to Test Counterfactual Reasoning of Large Language Models
Jorg Frohberg
Frank Binder
SLR
6
27
0
22 Dec 2021
Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
Yichong Xu
Chenguang Zhu
Shuohang Wang
Siqi Sun
Hao Cheng
Xiaodong Liu
Jianfeng Gao
Pengcheng He
Michael Zeng
Xuedong Huang
LRM
254
55
0
06 Dec 2021
Previous
1
2
3
Next