ResearchTrend.AI
AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph

15 November 2023
Zhaowei Wang
Haochen Shi
Weiqi Wang
Tianqing Fang
Hongming Zhang
Sehyun Choi
Xin Liu
Yangqiu Song
Papers citing "AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph"

12 / 12 papers shown
Title
Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions
Yiming Du
Wenyu Huang
Danna Zheng
Zhaowei Wang
Sébastien Montella
Mirella Lapata
Kam-Fai Wong
Jeff Z. Pan
KELM
MU
86
2
0
01 May 2025
ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty
Qing Zong
Zhaowei Wang
Tianshi Zheng
Xiyu Ren
Yangqiu Song
64
1
0
31 Dec 2024
Intention Knowledge Graph Construction for User Intention Relation Modeling
Jiaxin Bai
Zhaowei Wang
Junfei Cheng
Dan Yu
Zerui Huang
...
Xin Liu
Chen Luo
Yanming Zhu
Bo Li
Yangqiu Song
87
1
0
16 Dec 2024
KNOWCOMP POKEMON Team at DialAM-2024: A Two-Stage Pipeline for Detecting Relations in Dialogical Argument Mining
Zihao Zheng
Zhaowei Wang
Qing Zong
Yangqiu Song
LRM
48
1
0
29 Jul 2024
MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset
Weiqi Wang
Yangqiu Song
LRM
35
8
0
04 Jun 2024
Editing Conceptual Knowledge for Large Language Models
Xiaohan Wang
Shengyu Mao
Ningyu Zhang
Shumin Deng
Yunzhi Yao
Yue Shen
Lei Liang
Jinjie Gu
Huajun Chen
KELM
34
13
0
10 Mar 2024
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,077
0
24 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
366
12,003
0
04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
215
1,661
0
15 Oct 2021
Incorporating Temporal Information in Entailment Graph Mining
Liane Guillou
Sander Bijl de Vroe
Mohammad Javad Hosseini
Mark Johnson
Mark Steedman
102
10
0
20 Sep 2021
Explaining Answers with Entailment Trees
Bhavana Dalvi
Peter Alexander Jansen
Oyvind Tafjord
Zhengnan Xie
Hannah Smith
Leighanna Pipatanangkura
Peter Clark
ReLM
FAtt
LRM
248
184
0
17 Apr 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018