1905.00537
Cited By
v1
v2
v3 (latest)
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
2 May 2019
Alex Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
Papers citing
"SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems"
50 / 1,500 papers shown
Title
Examining the Effect of Pre-training on Time Series Classification
Jiashu Pu
Shiwei Zhao
Ling Cheng
Yongzhu Chang
Runze Wu
Tangjie Lv
Rongsheng Zhang
AI4TS
107
0
0
11 Sep 2023
Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis
Li Du
Yequan Wang
Xingrun Xing
Yiqun Yao
Xiang Li
Xin Jiang
Xuezhi Fang
HILM
45
13
0
11 Sep 2023
DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
Zhengxiang Shi
Aldo Lipani
VLM
124
34
0
11 Sep 2023
Encoding Multi-Domain Scientific Papers by Ensembling Multiple CLS Tokens
Ronald Seoh
Haw-Shiuan Chang
Andrew McCallum
63
1
0
08 Sep 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
Li Du
Bowen Qin
Zheng Zhang
Aixin Sun
Yequan Wang
147
22
0
07 Sep 2023
HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models
Guijin Son
Hanwool Albert Lee
Suwan Kim
Huiseo Kim
Jaecheol Lee
Je Won Yeom
Jihyu Jung
Jung Woo Kim
Songseong Kim
RALM
ELM
123
24
0
06 Sep 2023
AGIBench: A Multi-granularity, Multimodal, Human-referenced, Auto-scoring Benchmark for Large Language Models
Fei Tang
Wanling Gao
Luzhou Peng
Jianfeng Zhan
ELM
47
2
0
05 Sep 2023
Benchmarking Large Language Models in Retrieval-Augmented Generation
Jiawei Chen
Hongyu Lin
Xianpei Han
Le Sun
3DV
RALM
105
312
0
04 Sep 2023
Studying the impacts of pre-training using ChatGPT-generated text on downstream tasks
Sarthak Anand
56
0
0
02 Sep 2023
When Do Discourse Markers Affect Computational Sentence Understanding?
Ruiqi Li
Liesbeth Allein
Damien Sileo
Marie-Francine Moens
42
1
0
01 Sep 2023
ToddlerBERTa: Exploiting BabyBERTa for Grammar Learning and Language Understanding
Omer Veysel Cagatan
72
2
0
30 Aug 2023
ZhuJiu: A Multi-dimensional, Multi-faceted Chinese Benchmark for Large Language Models
Baolin Zhang
Hai-Yong Xie
Pengfan Du
Junhao Chen
Pengfei Cao
Yubo Chen
Shengping Liu
Kang Liu
Jun Zhao
ELM
ALM
46
2
0
28 Aug 2023
Detecting Language Model Attacks with Perplexity
Gabriel Alon
Michael Kamfonas
AAML
120
229
0
27 Aug 2023
Leveraging Knowledge and Reinforcement Learning for Enhanced Reliability of Language Models
Nancy Tyagi
Surjodeep Sarkar
Manas Gaur
KELM
53
1
0
25 Aug 2023
Bayesian Low-rank Adaptation for Large Language Models
Adam X. Yang
Maxime Robeyns
Xi Wang
Laurence Aitchison
AI4CE
BDL
163
55
0
24 Aug 2023
CALM : A Multi-task Benchmark for Comprehensive Assessment of Language Model Bias
Vipul Gupta
Pranav Narayanan Venkit
Hugo Laurençon
Shomir Wilson
R. Passonneau
110
14
0
24 Aug 2023
D4: Improving LLM Pretraining via Document De-Duplication and Diversification
Kushal Tirumala
Daniel Simig
Armen Aghajanyan
Ari S. Morcos
SyDa
66
115
0
23 Aug 2023
Using language models in the implicit automated assessment of mathematical short answer items
Christopher M. Ormerod
48
0
0
21 Aug 2023
LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models
Neel Guha
Julian Nyarko
Daniel E. Ho
Christopher Ré
Adam Chilton
...
Spencer Williams
Sunny G. Gandhi
Tomer Zur
Varun J. Iyer
Zehua Li
AILaw
LRM
ELM
82
182
0
20 Aug 2023
GameEval: Evaluating LLMs on Conversational Games
Dan Qiao
Chenfei Wu
Yaobo Liang
Juntao Li
Nan Duan
ELM
LLMAG
74
25
0
19 Aug 2023
A Methodology for Generative Spelling Correction via Natural Spelling Errors Emulation across Multiple Domains and Languages
Nikita Martynov
Mark Baushenko
Anastasia Kozlova
Katerina Kolomeytseva
Aleksandr Abramov
Alena Fenogenova
66
4
0
18 Aug 2023
Time Travel in LLMs: Tracing Data Contamination in Large Language Models
Shahriar Golchin
Mihai Surdeanu
169
108
0
16 Aug 2023
Challenges and Opportunities of Using Transformer-Based Multi-Task Learning in NLP Through ML Lifecycle: A Survey
Lovre Torbarina
Tin Ferkovic
Lukasz Roguski
Velimir Mihelčić
Bruno Šarlija
Z. Kraljevic
67
5
0
16 Aug 2023
Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
Ziyu Zhuang
Qiguang Chen
Longxuan Ma
Mingda Li
Yi Han
Yushan Qian
Haopeng Bai
Zixian Feng
Weinan Zhang
Ting Liu
ELM
80
13
0
15 Aug 2023
Correct and Optimal: the Regular Expression Inference Challenge
Mojtaba Valizadeh
P. Gorinski
Ignacio Iacobacci
Martin Berger
33
0
0
15 Aug 2023
A Survey on Model Compression for Large Language Models
Xunyu Zhu
Jian Li
Yong Liu
Can Ma
Weiping Wang
139
233
0
15 Aug 2023
OctoPack: Instruction Tuning Code Large Language Models
Niklas Muennighoff
Qian Liu
A. Zebaze
Qinkai Zheng
Binyuan Hui
Terry Yue Zhuo
Swayam Singh
Xiangru Tang
Leandro von Werra
Shayne Longpre
VLM
ALM
146
140
0
14 Aug 2023
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Anna Rogers
A. Luccioni
155
21
0
14 Aug 2023
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Yonatan Bitton
Hritik Bansal
Jack Hessel
Rulin Shao
Wanrong Zhu
Anas Awadalla
Josh Gardner
Rohan Taori
L. Schmidt
VLM
129
82
0
12 Aug 2023
Metacognitive Prompting Improves Understanding in Large Language Models
Yuqing Wang
Yun Zhao
ReLM
LRM
95
34
0
10 Aug 2023
Answering Unseen Questions With Smaller Language Models Using Rationale Generation and Dense Retrieval
Tim Hartill
Diana Benavides-Prado
Michael Witbrock
Patricia J. Riddle
ReLM
LRM
57
2
0
09 Aug 2023
Simple synthetic data reduces sycophancy in large language models
Jerry W. Wei
Da Huang
Yifeng Lu
Denny Zhou
Quoc V. Le
114
74
0
07 Aug 2023
AgentBench: Evaluating LLMs as Agents
Xiao Liu
Hao Yu
Hanchen Zhang
Yifan Xu
Xuanyu Lei
...
Yu-Chuan Su
Huan Sun
Minlie Huang
Yuxiao Dong
Jie Tang
ELM
LLMAG
152
315
0
07 Aug 2023
Analysis of the Evolution of Advanced Transformer-Based Language Models: Experiments on Opinion Mining
Nour Eddine Zekaoui
Siham Yousfi
Maryem Rhanoui
M. Mikram
49
3
0
07 Aug 2023
Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text
Nandana Mihindukulasooriya
Sanju Tiwari
Carlos F. Enguix
K. Lata
85
62
0
04 Aug 2023
Explaining Relation Classification Models with Semantic Extents
Lars Klöser
André Büsgen
Philipp Kohl
Bodo Kraft
Albert Zündorf
30
0
0
04 Aug 2023
Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty
I. Timiryasov
J. Tastet
87
53
0
03 Aug 2023
Exploiting the Potential of Seq2Seq Models as Robust Few-Shot Learners
Jihyeon Janel Lee
Dain Kim
Doohae Jung
Boseop Kim
Kyoung-Woon On
50
0
0
27 Jul 2023
ArcGPT: A Large Language Model Tailored for Real-world Archival Applications
Shitou Zhang
Jingrui Hou
Siyuan Peng
Z. Li
Qibiao Hu
Peijie Wang
KELM
RALM
LLMAG
74
3
0
27 Jul 2023
ARB: Advanced Reasoning Benchmark for Large Language Models
Tomohiro Sawada
Daniel Paleka
Alexander Havrilla
Pranav Tadepalli
Paula Vidas
Alexander Kranias
John J. Nay
Kshitij Gupta
Aran Komatsuzaki
ELM
LRM
81
39
0
25 Jul 2023
Making Pre-trained Language Models both Task-solvers and Self-calibrators
Yangyi Chen
Xingyao Wang
Heng Ji
54
0
0
21 Jul 2023
A Dataset and Strong Baselines for Classification of Czech News Texts
Hynek Kydlíček
Jindřich Libovický
42
1
0
20 Jul 2023
Instruction-following Evaluation through Verbalizer Manipulation
Shiyang Li
Jun Yan
Hai Wang
Zheng Tang
Xiang Ren
Vijay Srinivasan
Hongxia Jin
102
27
0
20 Jul 2023
Integrating a Heterogeneous Graph with Entity-aware Self-attention using Relative Position Labels for Reading Comprehension Model
Shima Foolad
Kourosh Kiani
38
1
0
19 Jul 2023
Retentive Network: A Successor to Transformer for Large Language Models
Yutao Sun
Li Dong
Shaohan Huang
Shuming Ma
Yuqing Xia
Jilong Xue
Jianyong Wang
Furu Wei
LRM
187
347
0
17 Jul 2023
Soft Prompt Tuning for Augmenting Dense Retrieval with Large Language Models
Zhiyuan Peng
Xuyang Wu
Qifan Wang
Yihan Fang
VLM
RALM
94
12
0
17 Jul 2023
Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling
Longyue Wang
Zefeng Du
Donghua Liu
Cai Deng
Dian Yu
Haiyun Jiang
Yan Wang
Leyang Cui
Shuming Shi
Zhaopeng Tu
CoGe
95
6
0
16 Jul 2023
MorphPiece : A Linguistic Tokenizer for Large Language Models
Jeffrey Hsu
61
4
0
14 Jul 2023
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
107
45
0
12 Jul 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Mian
OffRL
253
621
0
12 Jul 2023