CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text

16 August 2019
Koustuv Sinha
Shagun Sodhani
Jin Dong
Joelle Pineau
William L. Hamilton

Papers citing "CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text"

50 / 52 papers shown
SATBench: Benchmarking LLMs' Logical Reasoning via Automated Puzzle Generation from SAT Formulas
Anjiang Wei
Yuheng Wu
Yingjia Wan
Tarun Suresh
Huanmi Tan
Zhanke Zhou
Sanmi Koyejo
Ke Wang
Alex Aiken
ReLM
LRM
7
0
0
20 May 2025
Improve Rule Retrieval and Reasoning with Self-Induction and Relevance ReEstimate
Ziyang Huang
Wangtao Sun
Jun Zhao
Kang Liu
LRM
17
0
0
16 May 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
91
2
0
26 Apr 2025
Do Large Language Models know who did what to whom?
Joseph M. Denning
Xiaohan
Bryor Snefjella
Idan A. Blank
67
1
0
23 Apr 2025
Generative Evaluation of Complex Reasoning in Large Language Models
Haowei Lin
Xinbing Wang
Ruilin Yan
Baizhou Huang
Haotian Ye
Jianhua Zhu
Zihao Wang
James Zou
Jianzhu Ma
Yitao Liang
ReLM
ELM
LRM
210
0
0
03 Apr 2025
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
José P. Pombal
Nuno M. Guerreiro
Ricardo Rei
André F. T. Martins
ALM
75
0
0
01 Apr 2025
Benchmarking Systematic Relational Reasoning with Large Language and Reasoning Models
Irtaza Khalid
Amir Masoud Nourollah
Steven Schockaert
LRM
52
0
0
30 Mar 2025
MastermindEval: A Simple But Scalable Reasoning Benchmark
Jonas Golde
Patrick Haller
Fabio Barth
Alan Akbik
LRM
ReLM
ELM
58
2
0
07 Mar 2025
Reasoning Bias of Next Token Prediction Training
Pengxiao Lin
Zhongwang Zhang
Zhi-Qin John Xu
LRM
94
2
0
21 Feb 2025
Strassen Attention: Unlocking Compositional Abilities in Transformers Based on a New Lower Bound Method
Alexander Kozachinskiy
Felipe Urrutia
Hector Jimenez
Tomasz Steifer
Germán Pizarro
Matías Fuentes
Francisco Meza
Cristian Buc
Cristóbal Rojas
57
1
0
31 Jan 2025
FLARE: Faithful Logic-Aided Reasoning and Exploration
Erik Arakelyan
Pasquale Minervini
Pat Verga
Patrick Lewis
Isabelle Augenstein
ReLM
LRM
69
2
0
14 Oct 2024
Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
Rushang Karia
Daniel Bramblett
D. Dobhal
Siddharth Srivastava
ELM
LRM
35
0
0
11 Oct 2024
Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning
Aaditya Naik
Jason Liu
Claire Wang
Saikat Dutta
Mayur Naik
Eric Wong
37
1
0
04 Oct 2024
COOL: Efficient and Reliable Chain-Oriented Objective Logic with Neural Networks Feedback Control for Program Synthesis
Jipeng Han
39
0
0
02 Oct 2024
Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data
Jiaming Zhou
Abbas Ghaddar
Ge Zhang
Liheng Ma
Yaochen Hu
Soumyasundar Pal
Mark J. Coates
Bin Wang
Yingxue Zhang
Jianye Hao
ReLM
LRM
39
4
0
19 Sep 2024
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
Jin Jiang
Yuchen Yan
Yang Liu
Yonggang Jin
Shuai Peng
Hao Fei
Xunliang Cai
Yixin Cao
Liangcai Gao
Zhi Tang
LRM
52
3
0
19 Sep 2024
The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More
O. Kitouni
Niklas Nolte
Diane Bouchacourt
Adina Williams
Mike Rabbat
Mark Ibrahim
LRM
CLL
54
12
0
07 Jun 2024
LoFiT: Localized Fine-tuning on LLM Representations
Fangcong Yin
Xi Ye
Greg Durrett
38
13
0
03 Jun 2024
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
Mihir Parmar
Nisarg Patel
Neeraj Varshney
Mutsumi Nakamura
Man Luo
Santosh Mashetty
Arindam Mitra
Chitta Baral
LRM
ReLM
ELM
38
24
0
23 Apr 2024
Calibrating Large Language Models with Sample Consistency
Qing Lyu
Kumar Shridhar
Chaitanya Malaviya
Li Zhang
Yanai Elazar
Niket Tandon
Marianna Apidianaki
Mrinmaya Sachan
Chris Callison-Burch
51
23
0
21 Feb 2024
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
Siyuan Wang
Zhongyu Wei
Yejin Choi
Xiang Ren
ReLM
ELM
LRM
16
21
0
18 Feb 2024
Large Language Models can Learn Rules
Zhaocheng Zhu
Yuan Xue
Xinyun Chen
Denny Zhou
Jian Tang
Dale Schuurmans
Hanjun Dai
LRM
ReLM
41
63
0
10 Oct 2023
Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text
Zhun Yang
Adam Ishay
Joohyung Lee
LRM
ELM
36
52
0
15 Jul 2023
SkillQG: Learning to Generate Question for Reading Comprehension Assessment
Xiaoqiang Wang
Bang Liu
Siliang Tang
Lingfei Wu
25
3
0
08 May 2023
Scallop: A Language for Neurosymbolic Programming
Ziyang Li
Jiani Huang
Mayur Naik
ReLM
LRM
NAI
24
30
0
10 Apr 2023
Natural Language Reasoning, A Survey
Fei Yu
Hongbo Zhang
Prayag Tiwari
Benyou Wang
ReLM
LRM
49
53
0
26 Mar 2023
STREET: A Multi-Task Structured Reasoning and Explanation Benchmark
D. Ribeiro
Shen Wang
Xiaofei Ma
He Zhu
Rui Dong
...
William Yang Wang
Zhiheng Huang
George Karypis
Bing Xiang
Dan Roth
LRM
ReLM
28
23
0
13 Feb 2023
Analyzing the Effectiveness of the Underlying Reasoning Tasks in Multi-hop Question Answering
Xanh Ho
A. Nguyen
Saku Sugawara
Akiko Aizawa
LRM
44
7
0
12 Feb 2023
Large Language Models Can Be Easily Distracted by Irrelevant Context
Freda Shi
Xinyun Chen
Kanishka Misra
Nathan Scales
David Dohan
Ed H. Chi
Nathanael Scharli
Denny Zhou
ReLM
RALM
LRM
33
537
0
31 Jan 2023
Faithful Chain-of-Thought Reasoning
Qing Lyu
Shreya Havaldar
Adam Stein
Li Zhang
D. Rao
Eric Wong
Marianna Apidianaki
Chris Callison-Burch
ReLM
LRM
41
208
0
31 Jan 2023
Reasoning with Language Model Prompting: A Survey
Shuofei Qiao
Yixin Ou
Ningyu Zhang
Xiang Chen
Yunzhi Yao
Shumin Deng
Chuanqi Tan
Fei Huang
Huajun Chen
ReLM
ELM
LRM
71
311
0
19 Dec 2022
Evaluating Step-by-Step Reasoning through Symbolic Verification
Yi-Fan Zhang
Hanlin Zhang
Li Erran Li
Eric P. Xing
ReLM
LRM
19
8
0
16 Dec 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
127
94
0
06 Oct 2022
ChemAlgebra: Algebraic Reasoning on Chemical Reactions
Andrea Valenti
D. Bacciu
Antonio Vergari
OOD
LRM
35
0
0
05 Oct 2022
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
Andrew M. Saxe
Shagun Sodhani
Sam Lewallen
AI4CE
30
34
0
21 Jul 2022
On the Paradox of Learning to Reason from Data
Honghua Zhang
Liunian Harold Li
Tao Meng
Kai-Wei Chang
Mathias Niepert
NAI
ReLM
OOD
LRM
140
104
0
23 May 2022
FaiRR: Faithful and Robust Deductive Reasoning over Natural Language
Soumya Sanyal
Harman Singh
Xiang Ren
ReLM
LRM
32
45
0
19 Mar 2022
Does Entity Abstraction Help Generative Transformers Reason?
Nicolas Angelard-Gontier
Siva Reddy
C. Pal
34
5
0
05 Jan 2022
Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability
Kyle Richardson
Ashish Sabharwal
ReLM
LRM
30
24
0
16 Dec 2021
Systematic Generalization with Edge Transformers
Leon Bergen
Timothy J. O'Donnell
Dzmitry Bahdanau
10
46
0
01 Dec 2021
Dyna-bAbI: unlocking bAbI's potential with dynamic synthetic benchmarking
Ronen Tamari
Kyle Richardson
Aviad Sar-Shalom
Noam Kahlon
Nelson F. Liu
Reut Tsarfaty
Dafna Shahaf
43
5
0
30 Nov 2021
On Semantic Cognition, Inductive Generalization, and Language Models
Kanishka Misra
LRM
AI4CE
24
3
0
04 Nov 2021
Hey AI, Can You Solve Complex Tasks by Talking to Agents?
Tushar Khot
Kyle Richardson
Daniel Khashabi
Ashish Sabharwal
RALM
LRM
13
14
0
16 Oct 2021
Interactive Machine Comprehension with Dynamic Knowledge Graphs
Xingdi Yuan
34
3
0
31 Aug 2021
Improving Coherence and Consistency in Neural Sequence Models with Dual-System, Neuro-Symbolic Reasoning
Maxwell Nye
Michael Henry Tessler
J. Tenenbaum
Brenden M. Lake
33
117
0
06 Jul 2021
SyGNS: A Systematic Generalization Testbed Based on Natural Language Semantics
Hitomi Yanaka
K. Mineshima
Kentaro Inui
NAI
AI4CE
38
11
0
02 Jun 2021
Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2
Gregor Betz
Kyle Richardson
Christian Voigt
ReLM
LRM
24
30
0
24 Mar 2021
Critical Thinking for Language Models
Gregor Betz
Christian Voigt
Kyle Richardson
SyDa
ReLM
LRM
AI4CE
23
35
0
15 Sep 2020
Compositional Generalization in Semantic Parsing: Pre-training vs. Specialized Architectures
Daniel Furrer
Marc van Zee
Nathan Scales
Nathanael Scharli
CoGe
26
113
0
17 Jul 2020
LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning
Jian Liu
Leyang Cui
Hanmeng Liu
Dandan Huang
Yile Wang
Yue Zhang
RALM
16
335
0
16 Jul 2020