ResearchTrend.AI
DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data

16 November 2023
Yilun Zhao
Yitao Long
Hongjun Liu
Linyong Nan
Lyuhao Chen
Ryo Kamoi
Yixin Liu
Xiangru Tang
Rui Zhang
Arman Cohan
ArXiv (abs) | PDF | HTML | GitHub (23★)

Papers citing "DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data"

10 / 10 papers shown
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
Ethan Callanan
A. Mbakwe
Antony Papadimitriou
Yulong Pei
Mathieu Sibue
Xiaodan Zhu
Zhiqiang Ma
Xiaomo Liu
Sameena Shah
ELM
59
15
0
12 Oct 2023
Lemur: Harmonizing Natural Language and Code for Language Agents
Yiheng Xu
Hongjin Su
Chen Xing
Boyu Mi
Qian Liu
...
Siheng Zhao
Lingpeng Kong
Bailin Wang
Caiming Xiong
Tao Yu
91
71
0
10 Oct 2023
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo
Qingfeng Sun
Can Xu
Pu Zhao
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
LRM, OSLM
217
463
0
18 Aug 2023
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
Ziyang Luo
Can Xu
Pu Zhao
Qingfeng Sun
Xiubo Geng
Wenxiang Hu
Chongyang Tao
Jing Ma
Qingwei Lin
Daxin Jiang
ELM, SyDa, ALM
95
687
0
14 Jun 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM, OSLM, ELM
391
4,388
0
09 Jun 2023
Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning
Pan Lu
Liang Qiu
Kai-Wei Chang
Ying Nian Wu
Song-Chun Zhu
Tanmay Rajpurohit
Peter Clark
Ashwin Kalyan
ReLM, LRM
159
294
0
29 Sep 2022
Unsupervised Dense Information Retrieval with Contrastive Learning
Gautier Izacard
Mathilde Caron
Lucas Hosseini
Sebastian Riedel
Piotr Bojanowski
Armand Joulin
Edouard Grave
RALM
195
919
0
16 Dec 2021
A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers
Shen-Yun Miao
Chao-Chun Liang
Keh-Yih Su
68
341
0
30 Jun 2021
TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance
Fengbin Zhu
Wenqiang Lei
Youcheng Huang
Chao Wang
Shuo Zhang
Jiancheng Lv
Fuli Feng
Tat-Seng Chua
AIMat
118
303
0
17 May 2021
Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems
Wang Ling
Dani Yogatama
Chris Dyer
Phil Blunsom
AIMat
97
734
0
11 May 2017