1905.00537
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
2 May 2019
Alex Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
Papers citing
"SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems"
50 / 1,500 papers shown
Title
RUPBench: Benchmarking Reasoning Under Perturbations for Robustness Evaluation in Large Language Models
Yuqing Wang
Yun Zhao
LRM
AAML
ELM
86
2
0
16 Jun 2024
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training
David Brandfonbrener
Hanlin Zhang
Andreas Kirsch
Jonathan Richard Schwarz
Sham Kakade
108
7
0
15 Jun 2024
A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations
Jinqiang Wang
Huansheng Ning
Yi Peng
Qikai Wei
Daniel Tesfai
Wenwei Mao
Tao Zhu
Runhe Huang
LM&MA
AI4MH
ELM
141
8
0
14 Jun 2024
ReMI: A Dataset for Reasoning with Multiple Images
Mehran Kazemi
Nishanth Dikkala
Ankit Anand
Petar Dević
Ishita Dasgupta
...
Bahare Fatemi
Pranjal Awasthi
Dee Guo
Sreenivas Gollapudi
Ahmed Qureshi
LRM
VLM
110
17
0
13 Jun 2024
ECBD: Evidence-Centered Benchmark Design for NLP
Yu Lu Liu
Su Lin Blodgett
Jackie Chi Kit Cheung
Q. Vera Liao
Alexandra Olteanu
Ziang Xiao
91
12
0
13 Jun 2024
Paraphrasing in Affirmative Terms Improves Negation Understanding
MohammadHossein Rezaei
Eduardo Blanco
72
2
0
11 Jun 2024
CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence
Md Tanvirul Alam
Dipkamal Bhusal
Le Nguyen
Nidhi Rastogi
ELM
54
21
0
11 Jun 2024
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Haoran You
Yichao Fu
Zheng Wang
Amir Yazdanbakhsh
Yingyan Celine Lin
131
4
0
11 Jun 2024
Towards Lifelong Learning of Large Language Models: A Survey
Junhao Zheng
Shengjie Qiu
Chengming Shi
Qianli Ma
KELM
CLL
83
28
0
10 Jun 2024
Symmetric Dot-Product Attention for Efficient Training of BERT Language Models
Martin Courtois
Malte Ostendorff
Leonhard Hennig
Georg Rehm
81
2
0
10 Jun 2024
Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models
Kalyan Nakka
Jimmy Dani
Nitesh Saxena
171
1
0
08 Jun 2024
SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings
MohammadAli SadraeiJavaeri
Ehsaneddin Asgari
A. Mchardy
Hamid R. Rabiee
VLM
AAML
68
0
0
07 Jun 2024
Revisiting Catastrophic Forgetting in Large Language Model Tuning
Hongyu Li
Liang Ding
Meng Fang
Dacheng Tao
CLL
KELM
84
19
0
07 Jun 2024
BERTs are Generative In-Context Learners
David Samuel
85
8
0
07 Jun 2024
Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
Naibin Gu
Peng Fu
Xiyu Liu
Bowen Shen
Zheng Lin
Weiping Wang
69
10
0
06 Jun 2024
Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective
Xinhao Yao
Xiaolin Hu
Shenzhi Yang
Yong Liu
89
2
0
06 Jun 2024
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits
Tim Franzmeyer
Aleksandar Shtedritski
Samuel Albanie
Philip Torr
João F. Henriques
Jakob N. Foerster
54
1
0
05 Jun 2024
The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding
Kenneth Enevoldsen
Márton Kardos
Niklas Muennighoff
Kristoffer Nielbo
98
11
0
04 Jun 2024
FedMKT: Federated Mutual Knowledge Transfer for Large and Small Language Models
Tao Fan
Guoqiang Ma
Yan Kang
Hanlin Gu
Yuanfeng Song
Lixin Fan
Kai Chen
Qiang Yang
106
12
0
04 Jun 2024
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
Yubo Wang
Xueguang Ma
Ge Zhang
Yuansheng Ni
Abhranil Chandra
...
Kai Wang
Alex Zhuang
Rongqi Fan
Xiang Yue
Wenhu Chen
LRM
ELM
156
465
0
03 Jun 2024
QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
Zhuo Chen
Rumen Dangovski
Charlotte Loh
Owen Dugan
Di Luo
Marin Soljacic
MQ
85
9
0
31 May 2024
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark
Chanjun Park
Hyeonwoo Kim
Dahyun Kim
Seonghwan Cho
Sanghoon Kim
Sukyung Lee
Yungi Kim
Hwalsuk Lee
ELM
ALM
93
16
0
31 May 2024
A Survey Study on the State of the Art of Programming Exercise Generation using Large Language Models
Eduard Frankford
Ingo Höhn
Clemens Sauerwein
Ruth Breu
ELM
83
2
0
30 May 2024
From Symbolic Tasks to Code Generation: Diversification Yields Better Task Performers
Dylan Zhang
Justin Wang
Francois Charton
40
0
0
30 May 2024
Cascade-Aware Training of Language Models
Congchao Wang
Sean Augenstein
Keith Rush
Wittawat Jitkrittum
Harikrishna Narasimhan
A. S. Rawat
A. Menon
Alec Go
81
4
0
29 May 2024
ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models
Aparna Elangovan
Ling Liu
Lei Xu
S. Bodapati
Dan Roth
ELM
100
10
0
28 May 2024
Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning
Phakphum Artkaew
LRM
58
0
0
28 May 2024
IAPT: Instruction-Aware Prompt Tuning for Large Language Models
Wei-wei Zhu
Aaron Xuxiang Tian
Congrui Yin
Yuan Ni
Xiaoling Wang
Guotong Xie
85
0
0
28 May 2024
Understanding Linear Probing then Fine-tuning Language Models from NTK Perspective
Akiyoshi Tomihari
Issei Sato
70
4
0
27 May 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
Mohammad Shoeybi
Bryan Catanzaro
Ming-Yu Liu
RALM
300
205
0
27 May 2024
Deep-PE: A Learning-Based Pose Evaluator for Point Cloud Registration
Junjie Gao
Chongjian Wang
Zhongjun Ding
Shuangmin Chen
Shiqing Xin
Changhe Tu
Wenping Wang
3DPC
89
1
0
25 May 2024
Sparse Spectral Training and Inference on Euclidean and Hyperbolic Neural Networks
Jialin Zhao
Yingtao Zhang
Xinghang Li
Huaping Liu
C. Cannistraci
59
1
0
24 May 2024
Lessons from the Trenches on Reproducible Evaluation of Language Models
Stella Biderman
Hailey Schoelkopf
Lintang Sutawika
Leo Gao
J. Tow
...
Xiangru Tang
Kevin A. Wang
Genta Indra Winata
François Yvon
Andy Zou
ELM
ALM
196
63
3
23 May 2024
Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations
Ziqiao Ma
Zekun Wang
Joyce Chai
134
4
0
22 May 2024
A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus
Eduard Poesina
Cornelia Caragea
Radu Tudor Ionescu
78
6
0
20 May 2024
Your Transformer is Secretly Linear
Anton Razzhigaev
Matvey Mikhalchuk
Elizaveta Goncharova
Nikolai Gerasimenko
Ivan Oseledets
Denis Dimitrov
Andrey Kuznetsov
81
6
0
19 May 2024
Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion
Pengxiang Lan
Enneng Yang
Yuting Liu
Guibing Guo
Linying Jiang
Jianzhe Zhao
Xingwei Wang
VLM
AAML
76
1
0
19 May 2024
Large Language Models Lack Understanding of Character Composition of Words
Andrew Shin
Kunitake Kaneko
100
11
0
18 May 2024
Surgical Feature-Space Decomposition of LLMs: Why, When and How?
Arnav Chavan
Nahush Lele
Deepak Gupta
70
2
0
17 May 2024
Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset
Jie Zhu
Junhui Li
Yalong Wen
Lifan Guo
ELM
ALM
77
8
0
17 May 2024
CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations
Jiahao Zhao
Jingwei Zhu
Minghuan Tan
Min Yang
Di Yang
Chenhao Zhang
Guancheng Ye
Chengming Li
Xiping Hu
ELM
117
0
0
16 May 2024
PL-MTEB: Polish Massive Text Embedding Benchmark
Rafał Poświata
Slawomir Dadas
Michal Perelkiewicz
60
8
0
16 May 2024
αVIL: Learning to Leverage Auxiliary Tasks for Multitask Learning
Rafael Kourdis
Gabriel Gordon-Hall
P. Gorinski
37
0
0
13 May 2024
LlamaTurk: Adapting Open-Source Generative Large Language Models for Low-Resource Language
Cagri Toraman
VLM
112
5
0
13 May 2024
Evaluation of Retrieval-Augmented Generation: A Survey
Hao Yu
Aoran Gan
Kai Zhang
Shiwei Tong
Qi Liu
Zhaofeng Liu
3DV
136
100
0
13 May 2024
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions
Polina Tsvilodub
Paul Marty
Sonia Ramotowska
Jacopo Romoli
Michael Franke
60
2
0
09 May 2024
Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias
Shan Chen
Jack Gallifant
Mingye Gao
Pedro Moreira
Nikolaj Munch
...
Hugo J. W. L. Aerts
Brian Anthony
Leo Anthony Celi
William G. La Cava
Danielle S. Bitterman
80
12
0
09 May 2024
Zero-shot LLM-guided Counterfactual Generation for Text
Amrita Bhattacharjee
Raha Moraffah
Joshua Garland
Huan Liu
91
7
0
08 May 2024
TREC iKAT 2023: A Test Collection for Evaluating Conversational and Interactive Knowledge Assistants
Mohammad Aliannejadi
Zahra Abbasiantaeb
Shubham Chatterjee
Jeffery Dalton
Leif Azzopardi
63
15
0
04 May 2024
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Jing Xu
Jingzhao Zhang
102
7
0
04 May 2024