1903.00161
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
1 March 2019
Dheeru Dua
Yizhong Wang
Pradeep Dasigi
Gabriel Stanovsky
Sameer Singh
Matt Gardner
AIMat
Papers citing
"DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs"
50 / 376 papers shown
Title
SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains
Ran Xu
Hui Liu
Sreyashi Nag
Zhenwei Dai
Yaochen Xie
...
Chen Luo
Yang Li
Joyce C. Ho
Carl Yang
Qi He
RALM
181
11
0
28 Jan 2025
Decentralized Low-Rank Fine-Tuning of Large Language Models
Sajjad Ghiasvand
Mahnoosh Alizadeh
Ramtin Pedarsani
ALM
154
2
0
26 Jan 2025
RAMQA: A Unified Framework for Retrieval-Augmented Multi-Modal Question Answering
Yang Bai
Christan Earl Grant
Daisy Zhe Wang
RALM
126
1
0
23 Jan 2025
Mathematical Language Models: A Survey
Wen Liu
Hanglei Hu
Jie Zhou
Yuyang Ding
Junsong Li
...
Mengliang He
Qin Chen
Bo Jiang
Aimin Zhou
Liang He
LRM
237
14
0
03 Jan 2025
Unstructured Text Enhanced Open-domain Dialogue System: A Systematic Survey
Longxuan Ma
Mingda Li
Weinan Zhang
Jiapeng Li
Ting Liu
128
17
0
14 Nov 2024
Number Cookbook: Number Understanding of Language Models and How to Improve It
Haotong Yang
Yi Hu
Shijia Kang
Zhouchen Lin
Muhan Zhang
LRM
115
8
0
06 Nov 2024
On the Loss of Context-awareness in General Instruction Fine-tuning
Yihan Wang
Andrew Bai
Nanyun Peng
Cho-Jui Hsieh
384
2
0
05 Nov 2024
Denial-of-Service Poisoning Attacks against Large Language Models
Kuofeng Gao
Tianyu Pang
Chao Du
Yong Yang
Shu-Tao Xia
Min Lin
SILM
AAML
179
6
0
14 Oct 2024
LoRTA: Low Rank Tensor Adaptation of Large Language Models
Ignacio Hounie
Charilaos I. Kanatsoulis
Arnuv Tandon
Alejandro Ribeiro
187
0
0
05 Oct 2024
Agent-Oriented Planning in Multi-Agent Systems
Ao Li
Yuexiang Xie
Songze Li
Fugee Tsung
Bolin Ding
Yaliang Li
AIFin
412
10
0
03 Oct 2024
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
Yifei Ming
Senthil Purushwalkam
Shrey Pandit
Zixuan Ke
Xuan-Phi Nguyen
Caiming Xiong
Shafiq Joty
HILM
293
24
0
30 Sep 2024
Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models
Xin Sky Li
Weize Chen
Qizhi Chu
Haopeng Li
Zhaojun Sun
...
Yiwei Wei
Zhiyuan Liu
Chuan Shi
Maosong Sun
Cheng Yang
125
6
0
29 Sep 2024
Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely
Siyun Zhao
Yuqing Yang
Zilong Wang
Zhiyuan He
Luna Qiu
Lili Qiu
SyDa
RALM
3DV
120
42
0
23 Sep 2024
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
Jin Jiang
Yuchen Yan
Yang Liu
Yonggang Jin
Shuai Peng
Hao Fei
Xunliang Cai
Yixin Cao
Liangcai Gao
Zhi Tang
LRM
141
7
0
19 Sep 2024
Automated Design of Agentic Systems
Shengran Hu
Cong Lu
Jeff Clune
AI4CE
148
62
0
15 Aug 2024
Cool-Fusion: Fuse Large Language Models without Training
Cong Liu
Xiaojun Quan
Yan Pan
Liangzhi Li
Weigang Wu
Xu Chen
MoMe
VLM
135
5
0
29 Jul 2024
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Catherine Tony
Nicolás E. Díaz Ferreyra
Markus Mutas
Salem Dhiff
Riccardo Scandariato
SILM
157
14
0
09 Jul 2024
On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards
Zhimin Zhao
A. A. Bangash
F. Côgo
Bram Adams
Ahmed E. Hassan
210
1
0
04 Jul 2024
AgentInstruct: Toward Generative Teaching with Agentic Flows
Arindam Mitra
Luciano Del Corro
Guoqing Zheng
Shweti Mahajan
Dany Rouhana
...
Corby Rosset
Fillipe Silva
Hamed Khanpour
Yash Lara
Ahmed Awadallah
SyDa
116
35
0
03 Jul 2024
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks
A. Bavaresco
Raffaella Bernardi
Leonardo Bertolazzi
Desmond Elliott
Raquel Fernández
...
David Schlangen
Alessandro Suglia
Aditya K Surikuchi
Ece Takmaz
A. Testoni
ALM
ELM
193
88
0
26 Jun 2024
Paraphrasing in Affirmative Terms Improves Negation Understanding
MohammadHossein Rezaei
Eduardo Blanco
79
2
0
11 Jun 2024
BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation
Chengxing Jia
Pengyuan Wang
Ziniu Li
Yi-Chen Li
Zhilong Zhang
Nan Tang
Yang Yu
OffRL
64
2
0
27 May 2024
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
Mihir Parmar
Nisarg Patel
Neeraj Varshney
Mutsumi Nakamura
Man Luo
Santosh Mashetty
Arindam Mitra
Chitta Baral
LRM
ReLM
ELM
215
31
0
23 Apr 2024
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski
Boris Shaposhnikov
Alexey Malakhov
Nikita Surnachev
Yaroslav Aksenov
Ian Maksimov
Nikita Balagansky
Daniil Gavrilov
OffRL
133
35
0
15 Apr 2024
SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety
Paul Röttger
Fabio Pernisi
Bertie Vidgen
Dirk Hovy
ELM
KELM
169
39
0
08 Apr 2024
PATCH! Psychometrics-AssisTed BenCHmarking of Large Language Models against Human Populations: A Case Study of Proficiency in 8th Grade Mathematics
Qixiang Fang
Daniel L. Oberski
Dong Nguyen
104
3
0
02 Apr 2024
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap
Saurabh Srivastava
Annarose M B
Anto P V
Shashank Menon
Ajay Sukumar
Adwaith Samod T
Alan Philipose
Stevin Prince
Sooraj Thomas
ELM
ReLM
LRM
79
56
0
29 Feb 2024
MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning
Fajie Yuan
Chengshun Shi
Shiguang Wu
Mengqi Zhang
Zhaochun Ren
Maarten de Rijke
Zhumin Chen
Jiahuan Pei
MoE
213
13
0
27 Feb 2024
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs
Cem Uluoglakci
T. Taşkaya-Temizel
HILM
69
3
0
25 Feb 2024
FormulaReasoning: A Dataset for Formula-Based Numerical Reasoning
Xiao Li
Bolin Zhu
Kaiwen Shi
Sichen Liu
Yin Zhu
Yiwei Liu
Gong Cheng
AIMat
92
1
0
20 Feb 2024
OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning
Rui Ye
Wenhao Wang
Jingyi Chai
Dihan Li
Zexi Li
Yinda Xu
Yaxin Du
Yanfeng Wang
Siheng Chen
ALM
FedML
AIFin
101
98
0
10 Feb 2024
An Information-Theoretic Approach to Analyze NLP Classification Tasks
Luran Wang
Mark Gales
Vatsal Raina
50
1
0
01 Feb 2024
Demystifying Chains, Trees, and Graphs of Thoughts
Maciej Besta
Florim Memedi
Zhenyu Zhang
Robert Gerstenberger
Guangyuan Piao
...
Aleš Kubíček
H. Niewiadomski
Aidan O'Mahony
Onur Mutlu
Torsten Hoefler
AI4CE
LRM
388
33
0
25 Jan 2024
FinLLMs: A Framework for Financial Reasoning Dataset Generation with Large Language Models
Ziqiang Yuan
Kaiyuan Wang
Shoutai Zhu
Ye Yuan
Jingya Zhou
Yanlin Zhu
Wenqi Wei
76
9
0
19 Jan 2024
Private Fine-tuning of Large Language Models with Zeroth-order Optimization
Xinyu Tang
Ashwinee Panda
Milad Nasr
Saeed Mahloujifar
Prateek Mittal
218
26
0
09 Jan 2024
TinyLlama: An Open-Source Small Language Model
Peiyuan Zhang
Guangtao Zeng
Tianduo Wang
Wei Lu
ALM
LRM
229
409
0
04 Jan 2024
ComplexityNet: Increasing LLM Inference Efficiency by Learning Task Complexity
Henry Bae
Aghyad Deeb
Alex Fleury
Kehang Zhu
47
3
0
12 Dec 2023
Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
O. Ovadia
Menachem Brief
Moshik Mishaeli
Oren Elisha
RALM
116
153
0
10 Dec 2023
CLAMP: Contrastive LAnguage Model Prompt-tuning
Piotr Teterwak
Ximeng Sun
Bryan A. Plummer
Kate Saenko
Ser-Nam Lim
MLLM
VLM
82
1
0
04 Dec 2023
Do Smaller Language Models Answer Contextualised Questions Through Memorisation Or Generalisation?
Tim Hartill
Joshua Bensemann
Michael Witbrock
Patricia Riddle
KELM
67
0
0
21 Nov 2023
On the Robustness of Question Rewriting Systems to Questions of Varying Hardness
Hai Ye
Hwee Tou Ng
Wenjuan Han
90
3
0
12 Nov 2023
A Comprehensive Evaluation of Tool-Assisted Generation Strategies
Alon Jacovi
Avi Caciularu
Jonathan Herzig
Roee Aharoni
Bernd Bohnet
Mor Geva
ELM
126
7
0
16 Oct 2023
GLoRE: Evaluating Logical Reasoning of Large Language Models
Hanmeng Liu
Zhiyang Teng
Ruoxi Ning
Jian Liu
Qiji Zhou
Yuexin Zhang
Yue Zhang
ReLM
ELM
LRM
167
8
0
13 Oct 2023
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Wei Ping
Ming-Yu Liu
Lawrence C. McAfee
Peng Xu
Bo Li
Mohammad Shoeybi
Bryan Catanzaro
RALM
118
54
0
11 Oct 2023
SEER: A Knapsack approach to Exemplar Selection for In-Context HybridQA
Jonathan Tonglet
Manon Reusens
Philipp Borchert
Bart Baesens
85
6
0
10 Oct 2023
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
L. Yu
Weisen Jiang
Han Shi
Jincheng Yu
Zhengying Liu
Yu Zhang
James T. Kwok
Zheng Li
Adrian Weller
Weiyang Liu
OSLM
LRM
124
395
0
21 Sep 2023
EchoPrompt: Instructing the Model to Rephrase Queries for Improved In-context Learning
Rajasekhar Reddy Mekala
Yasaman Razeghi
Sameer Singh
LRM
91
11
0
16 Sep 2023
Teaching Smaller Language Models To Generalise To Unseen Compositional Questions
Tim Hartill
N. Tan
Michael Witbrock
Patricia J. Riddle
ReLM
KELM
LRM
86
2
0
02 Aug 2023
Around the GLOBE: Numerical Aggregation Question-Answering on Heterogeneous Genealogical Knowledge Graphs with Deep Neural Networks
Omri Suissa
M. Zhitomirsky-Geffet
Avshalom Elmalech
63
1
0
30 Jul 2023
Empowering Cross-lingual Behavioral Testing of NLP Models with Typological Features
Ester Hlavnova
Sebastian Ruder
84
5
0
11 Jul 2023