arXiv: 1502.05698
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks


19 February 2015
Jason Weston
Antoine Bordes
S. Chopra
Alexander M. Rush
Bart van Merriënboer
Armand Joulin
Tomáš Mikolov
    LRM
    ELM

Papers citing "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks"

50 / 600 papers shown
Code Pretraining Improves Entity Tracking Abilities of Language Models
Najoung Kim
Sebastian Schuster
Shubham Toshniwal
38
14
0
31 May 2024
Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning
Xinlu Zhang
Zhi Chen
Xi Ye
Xianjun Yang
Lichang Chen
William Y. Wang
Linda R. Petzold
LRM
66
11
0
30 May 2024
Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning
Fangjun Li
David C. Hogg
Anthony G. Cohn
LRM
37
6
0
23 May 2024
Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models
Anna A. Ivanova
Aalok Sathe
Benjamin Lipkin
Unnathi Kumar
S. Radkani
...
Leshem Choshen
Roger Levy
Evelina Fedorenko
Josh Tenenbaum
Jacob Andreas
46
24
0
15 May 2024
Quantifying the Capabilities of LLMs across Scale and Precision
Sher Badshah
Hassan Sajjad
40
12
0
06 May 2024
Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection
Guillem Ramírez
Alexandra Birch
Ivan Titov
40
8
0
03 May 2024
UQA: Corpus for Urdu Question Answering
Samee Arif
Sualeha Farid
Awais Athar
Agha Ali Raza
42
4
0
02 May 2024
Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory
Hung Le
D. Nguyen
Kien Do
Svetha Venkatesh
T. Tran
36
0
0
18 Apr 2024
Mind's Eye of LLMs: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models
Wenshan Wu
Shaoguang Mao
Yadong Zhang
Yan Xia
Li Dong
Lei Cui
Furu Wei
LRM
56
20
0
04 Apr 2024
Exploring the Limitations of Large Language Models in Compositional Relation Reasoning
Jinman Zhao
Xueyan Zhang
BDL
LRM
38
4
0
05 Mar 2024
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
Mosh Levy
Alon Jacoby
Yoav Goldberg
48
70
0
19 Feb 2024
Can Deception Detection Go Deeper? Dataset, Evaluation, and Benchmark for Deception Reasoning
Kang Chen
Zheng Lian
Haiyang Sun
Bin Liu
Jianhua Tao
42
0
0
18 Feb 2024
In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Dmitry Sorokin
Artyom Sorokin
Andrey Kravchenko
RALM
119
33
0
16 Feb 2024
BDIQA: A New Dataset for Video Question Answering to Explore Cognitive Reasoning through Theory of Mind
Yuanyuan Mao
Xin Lin
Qin Ni
Liang He
29
3
0
12 Feb 2024
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models
Hainiu Xu
Runcong Zhao
Lixing Zhu
Bin Liang
Yulan He
84
20
0
08 Feb 2024
Are self-explanations from Large Language Models faithful?
Andreas Madsen
Sarath Chandar
Siva Reddy
LRM
30
25
0
15 Jan 2024
Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark
Fangjun Li
David C. Hogg
Anthony G. Cohn
LRM
43
26
0
08 Jan 2024
PIXAR: Auto-Regressive Language Modeling in Pixel Space
Yintao Tai
Xiyang Liao
Alessandro Suglia
Antonio Vergari
MLLM
26
7
0
06 Jan 2024
IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions
Zhebin Zhang
Xinyu Zhang
Yuanhang Ren
Saijiang Shi
Meng Han
Yongkang Wu
Ruofei Lai
Bo Zhao
RALM
LRM
27
15
0
30 Nov 2023
WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models
Youssef Benchekroun
Megi Dervishi
Mark Ibrahim
Jean-Baptiste Gaya
Xavier Martinet
Grégoire Mialon
Thomas Scialom
Emmanuel Dupoux
Dieuwke Hupkes
Pascal Vincent
LRM
22
6
0
27 Nov 2023
LogicNet: A Logical Consistency Embedded Face Attribute Learning Network
Haiyu Wu
Sicong Tian
Huayu Li
Kevin W. Bowyer
32
2
0
19 Nov 2023
Interpreting User Requests in the Context of Natural Language Standing Instructions
Nikita Moghe
Patrick Xia
Jacob Andreas
J. Eisner
Benjamin Van Durme
Harsh Jhamtani
29
3
0
16 Nov 2023
Transformers in the Service of Description Logic-based Contexts
Angelos Poulis
Eleni Tsalapati
Manolis Koubarakis
LRM
ReLM
28
0
0
15 Nov 2023
Comparing Generalization in Learning with Limited Numbers of Exemplars: Transformer vs. RNN in Attractor Dynamics
Rui Fukushima
Jun Tani
9
0
0
15 Nov 2023
Enabling High-Level Machine Reasoning with Cognitive Neuro-Symbolic Systems
A. Oltramari
NAI
LRM
26
4
0
13 Nov 2023
Are LLMs Rigorous Logical Reasoner? Empowering Natural Language Proof Generation with Contrastive Stepwise Decoding
Ying Su
Xiaojin Fu
Mingwen Liu
Zhijiang Guo
LRM
41
3
0
12 Nov 2023
The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI
Shayne Longpre
Robert Mahari
Anthony Chen
Naana Obeng-Marnu
Damien Sileo
...
K. Bollacker
Tongshuang Wu
Luis Villa
Sandy Pentland
Sara Hooker
32
56
0
25 Oct 2023
CAD -- Contextual Multi-modal Alignment for Dynamic AVQA
Asmar Nadeem
Adrian Hilton
R. Dawes
Graham A. Thomas
A. Mustafa
33
9
0
25 Oct 2023
Can You Follow Me? Testing Situational Understanding in ChatGPT
Chenghao Yang
Allyson Ettinger
LRM
LLMAG
ELM
120
4
0
24 Oct 2023
DepWiGNN: A Depth-wise Graph Neural Network for Multi-hop Spatial Reasoning in Text
Shuaiyi Li
Yang Deng
Wai Lam
32
2
0
19 Oct 2023
Faithfulness Measurable Masked Language Models
Andreas Madsen
Siva Reddy
Sarath Chandar
46
3
0
11 Oct 2023
Large Language Models can Learn Rules
Zhaocheng Zhu
Yuan Xue
Xinyun Chen
Denny Zhou
Jian Tang
Dale Schuurmans
Hanjun Dai
LRM
ReLM
41
63
0
10 Oct 2023
What's the Magic Word? A Control Theory of LLM Prompting
Aman Bhargava
Cameron Witkowski
Manav Shah
Matt W. Thomson
LLMAG
61
30
0
02 Oct 2023
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models
Man Luo
Shrinidhi Kumbhar
Ming Shen
Mihir Parmar
Neeraj Varshney
Pratyay Banerjee
Somak Aditya
Chitta Baral
ReLM
ELM
LRM
45
27
0
02 Oct 2023
A Framework for Inference Inspired by Human Memory Mechanisms
Xiangyu Zeng
Jie Lin
Piao Hu
Ruizheng Huang
Zhicheng Zhang
27
2
0
01 Oct 2023
Legal Question-Answering in the Indian Context: Efficacy, Challenges, and Potential of Modern AI Models
S. Nigam
Shubham Kumar Mishra
Ayush Kumar Mishra
Noel Shallum
Arnab Bhattacharya
AILaw
ELM
19
7
0
26 Sep 2023
In-context Interference in Chat-based Large Language Models
Eric Nuertey Coleman
J. Hurtado
Vincenzo Lomonaco
KELM
28
1
0
22 Sep 2023
Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI
Mahyar Abbasian
Elahe Khatibi
Iman Azimi
David Oniani
Zahra Shakeri Hossein Abad
...
Bryant Lin
Olivier Gevaert
Li-Jia Li
Ramesh C. Jain
Amir M. Rahmani
LM&MA
ELM
AI4MH
43
66
0
21 Sep 2023
A Data Source for Reasoning Embodied Agents
Jack Lanchantin
Sainbayar Sukhbaatar
Gabriel Synnaeve
Yuxuan Sun
Kavya Srinet
Arthur Szlam
LM&Ro
LRM
30
5
0
14 Sep 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
Li Du
Bowen Qin
Zheng-Wei Zhang
Aixin Sun
Yequan Wang
60
21
0
07 Sep 2023
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Lucas Bandarkar
Davis Liang
Benjamin Muller
Mikel Artetxe
Satya Narayan Shukla
Don Husa
Naman Goyal
Abhinandan Krishnan
Luke Zettlemoyer
Madian Khabsa
30
133
0
31 Aug 2023
CALM : A Multi-task Benchmark for Comprehensive Assessment of Language Model Bias
Vipul Gupta
Pranav Narayanan Venkit
Hugo Laurençon
Shomir Wilson
R. Passonneau
48
12
0
24 Aug 2023
MDDial: A Multi-turn Differential Diagnosis Dialogue Dataset with Reliability Evaluation
Srija Macherla
Man Luo
Mihir Parmar
Chitta Baral
41
4
0
16 Aug 2023
Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic
Terufumi Morishita
Gaku Morio
Atsuki Yamaguchi
Yasuhiro Sogawa
ReLM
LRM
AI4CE
ELM
35
23
0
11 Aug 2023
Universal Recurrent Event Memories for Streaming Data
Ran Dou
José C. Príncipe
AI4TS
13
2
0
28 Jul 2023
COLLIE: Systematic Construction of Constrained Text Generation Tasks
Shunyu Yao
Howard Chen
Austin W. Hanjie
Runzhe Yang
Karthik R. Narasimhan
47
32
0
17 Jul 2023
Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text
Zhun Yang
Adam Ishay
Joohyung Lee
LRM
ELM
38
52
0
15 Jul 2023
Dialogue Agents 101: A Beginner's Guide to Critical Ingredients for Designing Effective Conversational Systems
Shivani Kumar
S. Bhatia
Milan Aggarwal
Tanmoy Chakraborty
27
1
0
14 Jul 2023
IntelliGraphs: Datasets for Benchmarking Knowledge Graph Generation
Thiviyan Thanapalasingam
Emile van Krieken
Peter Bloem
Paul T. Groth
26
1
0
13 Jul 2023
Empowering Cross-lingual Behavioral Testing of NLP Models with Typological Features
Ester Hlavnova
Sebastian Ruder
35
5
0
11 Jul 2023