arXiv:2004.02393

Learning to Recover Reasoning Chains for Multi-Hop Question Answering via Cooperative Games

6 April 2020
Yufei Feng
Mo Yu
Wenhan Xiong
Xiaoxiao Guo
Junjie Huang
Shiyu Chang
Murray Campbell
Michael A. Greenspan
Xiao-Dan Zhu
    OffRL
    LRM
Abstract

We introduce the problem of learning to recover reasoning chains from weakly supervised signals, i.e., question-answer pairs. We address it with a cooperative game in which two models, one handling how evidence passages are selected and the other handling how the selected passages are connected, cooperate to select the most confident chains from a large set of candidates obtained by distant supervision. For evaluation, we created benchmarks based on two multi-hop QA datasets, HotpotQA and MedHop, and hand-labeled reasoning chains for the latter. The experimental results demonstrate the effectiveness of the proposed approach.
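
The chain-selection step described in the abstract can be pictured with a toy sketch. This is not the authors' implementation: the function name `pick_chain`, the additive way the two scores are combined, and all passage names and score values below are assumptions made up for illustration. It only shows the general idea of two scorers, one for passages and one for links between passages, jointly choosing a chain from a candidate set.

```python
from typing import Callable, Sequence, Tuple

Chain = Tuple[str, ...]  # a candidate reasoning chain: an ordered tuple of passage ids


def pick_chain(
    candidates: Sequence[Chain],
    passage_score: Callable[[str], float],
    link_score: Callable[[str, str], float],
) -> Chain:
    """Return the candidate chain with the highest joint score.

    Assumption: the joint score is the sum of per-passage scores and
    per-link scores. The paper may combine the two models differently.
    """

    def joint(chain: Chain) -> float:
        select = sum(passage_score(p) for p in chain)
        connect = sum(link_score(a, b) for a, b in zip(chain, chain[1:]))
        return select + connect

    return max(candidates, key=joint)


# Hypothetical scores standing in for the outputs of two trained models.
passage_scores = {"A": 0.9, "B": 0.2, "C": 0.8}
link_scores = {("A", "B"): 0.1, ("A", "C"): 0.7, ("B", "C"): 0.3}

# Hypothetical candidate chains, e.g. produced by distant supervision.
candidates = [("A", "B"), ("A", "C"), ("B", "C")]

best = pick_chain(
    candidates,
    passage_score=lambda p: passage_scores[p],
    link_score=lambda a, b: link_scores.get((a, b), 0.0),
)
```

In this toy setup the chain `("A", "C")` wins because both of its passages and the link between them score highly. How the two models are trained from question-answer pairs is not covered by this sketch.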
