Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks


19 February 2015
Jason Weston
Antoine Bordes
S. Chopra
Alexander M. Rush
Bart van Merriënboer
Armand Joulin
Tomáš Mikolov
    LRM
    ELM

Papers citing "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks"

50 / 599 papers shown
Can LLMs reason over extended multilingual contexts? Towards long-context evaluation beyond retrieval and haystacks
Amey Hengle
Prasoon Bajpai
Soham Dan
Tanmoy Chakraborty
LRM
33
0
0
17 Apr 2025
Optimizing Quantum Circuits via ZX Diagrams using Reinforcement Learning and Graph Neural Networks
Alexander Mattick
Maniraman Periyasamy
Christian Ufrecht
Abhishek Y. Dubey
Christopher Mutschler
Axel Plinge
Daniel D. Scherer
41
0
0
04 Apr 2025
Multi-Token Attention
O. Yu. Golovneva
Tianlu Wang
Jason Weston
Sainbayar Sukhbaatar
56
1
0
01 Apr 2025
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
José P. Pombal
Nuno M. Guerreiro
Ricardo Rei
André F. T. Martins
ALM
75
0
0
01 Apr 2025
Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models
Mengsong Wu
Tong Zhu
Han Han
Xiang Zhang
Wenbiao Shao
Wenliang Chen
LRM
50
1
0
21 Mar 2025
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios
Weidong Zhan
Yansen Wang
Nan Hu
Liming Xiao
Jingyuan Ma
...
Wenhan Ma
Rui Li
Weilin Luo
Qun Liu
Zhifang Sui
LRM
65
1
0
08 Mar 2025
PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation
Albert Gong
Kamilė Stankevičiūtė
Chao-gang Wan
Anmol Kabra
Raphael Thesmar
Johann Lee
Julius Klenke
Carla P. Gomes
Kilian Q. Weinberger
RALM
LRM
62
0
0
27 Feb 2025
BIG-Bench Extra Hard
Mehran Kazemi
Bahare Fatemi
Hritik Bansal
John Palowitch
Chrysovalantis Anastasiou
...
Kate Olszewska
Yi Tay
Vinh Q. Tran
Quoc V. Le
Orhan Firat
ELM
LRM
122
6
0
26 Feb 2025
Logic Haystacks: Probing LLMs Long-Context Logical Reasoning (Without Easily Identifiable Unrelated Padding)
Damien Sileo
RALM
LRM
43
0
0
24 Feb 2025
Reasoning Bias of Next Token Prediction Training
Pengxiao Lin
Zhongwang Zhang
Zhi-Qin John Xu
LRM
94
2
0
21 Feb 2025
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
Anton Razzhigaev
Matvey Mikhalchuk
Temurbek Rahmatullaev
Elizaveta Goncharova
Polina Druzhinina
Ivan Oseledets
Andrey Kuznetsov
69
2
0
20 Feb 2025
Multilingual Non-Factoid Question Answering with Answer Paragraph Selection
Ritwik Mishra
Sreeram Vennam
R. Shah
Ponnurangam Kumaraguru
95
0
0
20 Feb 2025
LM2: Large Memory Models
Jikun Kang
Wenqi Wu
Filippos Christianos
Alex J. Chan
Fraser Greenlee
George Thomas
Marvin Purtorab
Andy Toulis
KELM
97
0
0
09 Feb 2025
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
Alexis Huet
Zied Ben-Houidi
Dario Rossi
LLMAG
59
0
0
21 Jan 2025
Think or Remember? Detecting and Directing LLMs Towards Memorization or Generalization
Yi-Fu Fu
Yu-Chieh Tu
Tzu-Ling Cheng
Cheng-Yu Lin
Yi-Ting Yang
Heng-Yi Liu
Keng-Te Liao
Da-Cheng Juan
Shou-de Lin
49
0
0
24 Dec 2024
Systematic Evaluation of Long-Context LLMs on Financial Concepts
Lavanya Gupta
Saket Sharma
Yiyun Zhao
73
2
0
19 Dec 2024
Can Large Language Models Reason about the Region Connection Calculus?
Anthony G Cohn
Robert E Blackwell
LRM
71
2
0
29 Nov 2024
Unstructured Text Enhanced Open-domain Dialogue System: A Systematic Survey
Longxuan Ma
Mingda Li
Weinan Zhang
Jiapeng Li
Ting Liu
48
16
0
14 Nov 2024
ClevrSkills: Compositional Language and Visual Reasoning in Robotics
Sanjay Haresh
Daniel Dijkman
Apratim Bhattacharyya
Roland Memisevic
CoGe
LRM
42
1
0
13 Nov 2024
Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning
Yu Fu
Zefan Cai
Abedelkadir Asi
Wayne Xiong
Yue Dong
Wen Xiao
44
15
0
25 Oct 2024
VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks
Shailaja Keyur Sampat
Mutsumi Nakamura
Shankar Kailas
Kartik Aggarwal
Mandy Zhou
Yezhou Yang
Chitta Baral
MLLM
CoGe
ReLM
VLM
LRM
37
0
0
17 Oct 2024
A Little Human Data Goes A Long Way
Dhananjay Ashok
Jonathan May
SyDa
41
2
0
17 Oct 2024
Transformer-based Language Models for Reasoning in the Description Logic ALCQ
Angelos Poulis
Eleni Tsalapati
Manolis Koubarakis
ReLM
LRM
29
1
0
12 Oct 2024
P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains
Simeng Han
Aaron Yu
Rui Shen
Zhenting Qi
Martin Riddell
...
Yingbo Zhou
Caiming Xiong
Dragomir R. Radev
Rex Ying
Arman Cohan
LRM
43
3
0
11 Oct 2024
Mars: Situated Inductive Reasoning in an Open-World Environment
Xiaojuan Tang
Jiaqi Li
Yitao Liang
Song-chun Zhu
Muhan Zhang
Zilong Zheng
LM&Ro
LRM
LLMAG
34
1
0
10 Oct 2024
Which Programming Language and What Features at Pre-training Stage Affect Downstream Logical Inference Performance?
Fumiya Uchiyama
Takeshi Kojima
Andrew Gambardella
Qi Cao
Yusuke Iwasawa
Yutaka Matsuo
LRM
ReLM
39
3
0
09 Oct 2024
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community
Shachar Don-Yehiya
Leshem Choshen
Omri Abend
31
2
0
15 Aug 2024
Quantum Algorithms for Compositional Text Processing
Tuomas Laakkonen
K. Meichanetzidis
Bob Coecke
CoGe
48
1
0
12 Aug 2024
Long Input Benchmark for Russian Analysis
I. Churin
Murat Apishev
Maria Tikhonova
Denis Shevelev
Aydar Bulatov
Yuri Kuratov
Sergej Averkiev
Alena Fenogenova
38
0
0
05 Aug 2024
Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack
Xiaoyue Xu
Qinyuan Ye
Xiang Ren
53
6
0
23 Jul 2024
On the Design and Analysis of LLM-Based Algorithms
Yanxi Chen
Yaliang Li
Bolin Ding
Jingren Zhou
51
5
0
20 Jul 2024
Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation
Damien Sileo
LRM
RALM
34
0
0
18 Jul 2024
A LLM Benchmark based on the Minecraft Builder Dialog Agent Task
Chris Madge
Massimo Poesio
LLMAG
30
2
0
17 Jul 2024
Case2Code: Scalable Synthetic Data for Code Generation
Yunfan Shao
Linyang Li
Yichuan Ma
Peiji Li
Demin Song
...
Qipeng Guo
Hang Yan
Xipeng Qiu
Xuanjing Huang
Dahua Lin
LRM
34
2
0
17 Jul 2024
MalAlgoQA: A Pedagogical Approach for Evaluating Counterfactual Reasoning Abilities
Naiming Liu
Shashank Sonkar
Myco Le
Richard Baraniuk
LRM
21
2
0
01 Jul 2024
Belief Revision: The Adaptability of Large Language Models Reasoning
Bryan Wilie
Samuel Cahyawijaya
Etsuko Ishii
Junxian He
Pascale Fung
KELM
LRM
39
1
0
28 Jun 2024
Spiking Convolutional Neural Networks for Text Classification
Changze Lv
Jianhan Xu
Xiaoqing Zheng
56
28
0
27 Jun 2024
LiveBench: A Challenging, Contamination-Limited LLM Benchmark
Colin White
Samuel Dooley
Manley Roberts
Arka Pal
Ben Feuer
...
Tom Goldstein
Willie Neiswanger
Micah Goldblum
ELM
50
7
0
27 Jun 2024
Evaluating the Ability of Large Language Models to Reason about Cardinal Directions
Anthony G Cohn
Robert E Blackwell
42
6
0
24 Jun 2024
Instruction Pre-Training: Language Models are Supervised Multitask Learners
Daixuan Cheng
Yuxian Gu
Shaohan Huang
Junyu Bi
Minlie Huang
Furu Wei
SyDa
65
20
0
20 Jun 2024
Neuro-symbolic Training for Reasoning over Spatial Language
Tanawan Premsri
Parisa Kordjamshidi
NAI
LRM
48
6
0
19 Jun 2024
LLMs Are Prone to Fallacies in Causal Inference
Nitish Joshi
Abulhair Saparov
Yixin Wang
He He
50
10
0
18 Jun 2024
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Ivan Rodkin
Dmitry Sorokin
Artyom Sorokin
Andrey Kravchenko
RALM
ALM
LRM
ReLM
ELM
51
61
0
14 Jun 2024
Needle In A Multimodal Haystack
Weiyun Wang
Shuibo Zhang
Yiming Ren
Yuchen Duan
Tiantong Li
...
Ping Luo
Yu Qiao
Jifeng Dai
Wenqi Shao
Wenhai Wang
VLM
59
17
0
11 Jun 2024
Discrete Dictionary-based Decomposition Layer for Structured Representation Learning
Taewon Park
Hyun-Chul Kim
Minho Lee
44
0
0
11 Jun 2024
SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models
Md Imbesat Hassan Rizvi
Xiaodan Zhu
Iryna Gurevych
LRM
29
1
0
07 Jun 2024
TopViewRS: Vision-Language Models as Top-View Spatial Reasoners
Chengzu Li
Caiqi Zhang
Han Zhou
Nigel Collier
Anna Korhonen
Ivan Vulić
LRM
46
16
0
04 Jun 2024
Attention-based Iterative Decomposition for Tensor Product Representation
Taewon Park
Inchul Choi
Minho Lee
40
1
0
03 Jun 2024
A Survey of Useful LLM Evaluation
Ji-Lun Peng
Sijia Cheng
Egil Diau
Yung-Yu Shih
Po-Heng Chen
Yen-Ting Lin
Yun-Nung Chen
LLMAG
ELM
34
12
0
03 Jun 2024
Code Pretraining Improves Entity Tracking Abilities of Language Models
Najoung Kim
Sebastian Schuster
Shubham Toshniwal
38
14
0
31 May 2024