ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1905.00537
  4. Cited By
SuperGLUE: A Stickier Benchmark for General-Purpose Language
  Understanding Systems
v1v2v3 (latest)

SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

2 May 2019
Alex Jinpeng Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
    ELM
ArXiv (abs)PDFHTML

Papers citing "SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems"

50 / 1,500 papers shown
Title
RUPBench: Benchmarking Reasoning Under Perturbations for Robustness
  Evaluation in Large Language Models
RUPBench: Benchmarking Reasoning Under Perturbations for Robustness Evaluation in Large Language Models
Yuqing Wang
Yun Zhao
LRMAAMLELM
86
2
0
16 Jun 2024
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language
  Model Pre-training
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training
David Brandfonbrener
Hanlin Zhang
Andreas Kirsch
Jonathan Richard Schwarz
Sham Kakade
108
7
0
15 Jun 2024
A Survey on Large Language Models from General Purpose to Medical
  Applications: Datasets, Methodologies, and Evaluations
A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations
Jinqiang Wang
Huansheng Ning
Yi Peng
Qikai Wei
Daniel Tesfai
Wenwei Mao
Tao Zhu
Runhe Huang
LM&MAAI4MHELM
141
8
0
14 Jun 2024
ReMI: A Dataset for Reasoning with Multiple Images
ReMI: A Dataset for Reasoning with Multiple Images
Mehran Kazemi
Nishanth Dikkala
Ankit Anand
Petar Dević
Ishita Dasgupta
...
Bahare Fatemi
Pranjal Awasthi
Dee Guo
Sreenivas Gollapudi
Ahmed Qureshi
LRMVLM
110
17
0
13 Jun 2024
ECBD: Evidence-Centered Benchmark Design for NLP
ECBD: Evidence-Centered Benchmark Design for NLP
Yu Lu Liu
Su Lin Blodgett
Jackie Chi Kit Cheung
Q. Vera Liao
Alexandra Olteanu
Ziang Xiao
91
12
0
13 Jun 2024
Paraphrasing in Affirmative Terms Improves Negation Understanding
Paraphrasing in Affirmative Terms Improves Negation Understanding
MohammadHossein Rezaei
Eduardo Blanco
72
2
0
11 Jun 2024
CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence
CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence
Md Tanvirul Alam
Dipkamal Bhusal
Le Nguyen
Nidhi Rastogi
ELM
54
21
0
11 Jun 2024
When Linear Attention Meets Autoregressive Decoding: Towards More
  Effective and Efficient Linearized Large Language Models
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Haoran You
Yichao Fu
Zheng Wang
Amir Yazdanbakhsh
Yingyan Celine Lin
131
4
0
11 Jun 2024
Towards Lifelong Learning of Large Language Models: A Survey
Towards Lifelong Learning of Large Language Models: A Survey
Junhao Zheng
Shengjie Qiu
Chengming Shi
Qianli Ma
KELMCLL
83
28
0
10 Jun 2024
Symmetric Dot-Product Attention for Efficient Training of BERT Language
  Models
Symmetric Dot-Product Attention for Efficient Training of BERT Language Models
Martin Courtois
Malte Ostendorff
Leonhard Hennig
Georg Rehm
81
2
0
10 Jun 2024
Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models
Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models
Kalyan Nakka
Jimmy Dani
Nitesh Saxena
171
1
0
08 Jun 2024
SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with
  Superposition of Multi Token Embeddings
SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings
MohammadAli SadraeiJavaeri
Ehsaneddin Asgari
A. Mchardy
Hamid R. Rabiee
VLMAAML
68
0
0
07 Jun 2024
Revisiting Catastrophic Forgetting in Large Language Model Tuning
Revisiting Catastrophic Forgetting in Large Language Model Tuning
Hongyu Li
Liang Ding
Meng Fang
Dacheng Tao
CLLKELM
84
19
0
07 Jun 2024
BERTs are Generative In-Context Learners
BERTs are Generative In-Context Learners
David Samuel
85
8
0
07 Jun 2024
Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
Naibin Gu
Peng Fu
Xiyu Liu
Bowen Shen
Zheng Lin
Weiping Wang
69
10
0
06 Jun 2024
Enhancing In-Context Learning Performance with just SVD-Based Weight
  Pruning: A Theoretical Perspective
Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective
Xinhao Yao
Xiaolin Hu
Shenzhi Yang
Yong Liu
89
2
0
06 Jun 2024
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial
  Actions across X Community Notes and Wikipedia edits
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits
Tim Franzmeyer
Aleksandar Shtedritski
Samuel Albanie
Philip Torr
João F. Henriques
Jakob N. Foerster
54
1
0
05 Jun 2024
The Scandinavian Embedding Benchmarks: Comprehensive Assessment of
  Multilingual and Monolingual Text Embedding
The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding
Kenneth Enevoldsen
Márton Kardos
Niklas Muennighoff
Kristoffer Nielbo
98
11
0
04 Jun 2024
FedMKT: Federated Mutual Knowledge Transfer for Large and Small Language
  Models
FedMKT: Federated Mutual Knowledge Transfer for Large and Small Language Models
Tao Fan
Guoqiang Ma
Yan Kang
Hanlin Gu
Yuanfeng Song
Lixin Fan
Kai Chen
Qiang Yang
106
12
0
04 Jun 2024
MMLU-Pro: A More Robust and Challenging Multi-Task Language
  Understanding Benchmark
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
Yubo Wang
Xueguang Ma
Ge Zhang
Yuansheng Ni
Abhranil Chandra
...
Kai Wang
Alex Zhuang
Rongqi Fan
Xiang Yue
Wenhu Chen
LRMELM
156
465
0
03 Jun 2024
QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed
  Tensor Adaptation
QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
Zhuo Chen
Rumen Dangovski
Charlotte Loh
Owen Dugan
Di Luo
Marin Soljacic
MQ
85
9
0
31 May 2024
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with
  Ko-H5 Benchmark
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark
Chanjun Park
Hyeonwoo Kim
Dahyun Kim
Seonghwan Cho
Sanghoon Kim
Sukyung Lee
Yungi Kim
Hwalsuk Lee
ELMALM
93
16
0
31 May 2024
A Survey Study on the State of the Art of Programming Exercise
  Generation using Large Language Models
A Survey Study on the State of the Art of Programming Exercise Generation using Large Language Models
Eduard Frankford
Ingo Höhn
Clemens Sauerwein
Ruth Breu
ELM
83
2
0
30 May 2024
From Symbolic Tasks to Code Generation: Diversification Yields Better
  Task Performers
From Symbolic Tasks to Code Generation: Diversification Yields Better Task Performers
Dylan Zhang
Justin Wang
Francois Charton
40
0
0
30 May 2024
Cascade-Aware Training of Language Models
Cascade-Aware Training of Language Models
Congchao Wang
Sean Augenstein
Keith Rush
Wittawat Jitkrittum
Harikrishna Narasimhan
A. S. Rawat
A. Menon
Alec Go
81
4
0
29 May 2024
ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation
  for Generative Large Language Models
ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models
Aparna Elangovan
Ling Liu
Lei Xu
S. Bodapati
Dan Roth
ELM
100
10
0
28 May 2024
Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning
Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning
Phakphum Artkaew
LRM
58
0
0
28 May 2024
IAPT: Instruction-Aware Prompt Tuning for Large Language Models
IAPT: Instruction-Aware Prompt Tuning for Large Language Models
Wei-wei Zhu
Aaron Xuxiang Tian
Congrui Yin
Yuan Ni
Xiaoling Wang
Guotong Xie
85
0
0
28 May 2024
Understanding Linear Probing then Fine-tuning Language Models from NTK
  Perspective
Understanding Linear Probing then Fine-tuning Language Models from NTK Perspective
Akiyoshi Tomihari
Issei Sato
70
4
0
27 May 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
Mohammad Shoeybi
Bryan Catanzaro
Ming-Yu Liu
RALM
300
205
0
27 May 2024
Deep-PE: A Learning-Based Pose Evaluator for Point Cloud Registration
Deep-PE: A Learning-Based Pose Evaluator for Point Cloud Registration
Junjie Gao
Chongjian Wang
Zhongjun Ding
Shuangmin Chen
Shiqing Xin
Changhe Tu
Wenping Wang
3DPC
89
1
0
25 May 2024
Sparse Spectral Training and Inference on Euclidean and Hyperbolic Neural Networks
Sparse Spectral Training and Inference on Euclidean and Hyperbolic Neural Networks
Jialin Zhao
Yingtao Zhang
Xinghang Li
Huaping Liu
C. Cannistraci
59
1
0
24 May 2024
Lessons from the Trenches on Reproducible Evaluation of Language Models
Lessons from the Trenches on Reproducible Evaluation of Language Models
Stella Biderman
Hailey Schoelkopf
Lintang Sutawika
Leo Gao
J. Tow
...
Xiangru Tang
Kevin A. Wang
Genta Indra Winata
Franccois Yvon
Andy Zou
ELMALM
196
63
3
23 May 2024
Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations
Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations
Ziqiao Ma
Zekun Wang
Joyce Chai
134
4
0
22 May 2024
A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI:
  The First Romanian Natural Language Inference Corpus
A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus
Eduard Poesina
Cornelia Caragea
Radu Tudor Ionescu
78
6
0
20 May 2024
Your Transformer is Secretly Linear
Your Transformer is Secretly Linear
Anton Razzhigaev
Matvey Mikhalchuk
Elizaveta Goncharova
Nikolai Gerasimenko
Ivan Oseledets
Denis Dimitrov
Andrey Kuznetsov
81
6
0
19 May 2024
Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion
Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion
Pengxiang Lan
Enneng Yang
Yuting Liu
Guibing Guo
Linying Jiang
Jianzhe Zhao
Xingwei Wang
VLMAAML
76
1
0
19 May 2024
Large Language Models Lack Understanding of Character Composition of
  Words
Large Language Models Lack Understanding of Character Composition of Words
Andrew Shin
Kunitake Kaneko
100
11
0
18 May 2024
Surgical Feature-Space Decomposition of LLMs: Why, When and How?
Surgical Feature-Space Decomposition of LLMs: Why, When and How?
Arnav Chavan
Nahush Lele
Deepak Gupta
70
2
0
17 May 2024
Benchmarking Large Language Models on CFLUE -- A Chinese Financial
  Language Understanding Evaluation Dataset
Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset
Jie Zhu
Junhui Li
Yalong Wen
Lifan Guo
ELMALM
77
8
0
17 May 2024
CPsyExam: A Chinese Benchmark for Evaluating Psychology using
  Examinations
CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations
Jiahao Zhao
Jingwei Zhu
Minghuan Tan
Min Yang
Di Yang
Chenhao Zhang
Guancheng Ye
Chengming Li
Xiping Hu
ELM
117
0
0
16 May 2024
PL-MTEB: Polish Massive Text Embedding Benchmark
PL-MTEB: Polish Massive Text Embedding Benchmark
Rafal Po'swiata
Slawomir Dadas
Michal Perelkiewicz
60
8
0
16 May 2024
$α$VIL: Learning to Leverage Auxiliary Tasks for Multitask Learning
αααVIL: Learning to Leverage Auxiliary Tasks for Multitask Learning
Rafael Kourdis
Gabriel Gordon-Hall
P. Gorinski
37
0
0
13 May 2024
LlamaTurk: Adapting Open-Source Generative Large Language Models for
  Low-Resource Language
LlamaTurk: Adapting Open-Source Generative Large Language Models for Low-Resource Language
Cagri Toraman
VLM
112
5
0
13 May 2024
Evaluation of Retrieval-Augmented Generation: A Survey
Evaluation of Retrieval-Augmented Generation: A Survey
Hao Yu
Aoran Gan
Kai Zhang
Shiwei Tong
Qi Liu
Zhaofeng Liu
3DV
136
100
0
13 May 2024
Experimental Pragmatics with Machines: Testing LLM Predictions for the
  Inferences of Plain and Embedded Disjunctions
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions
Polina Tsvilodub
Paul Marty
Sonia Ramotowska
Jacopo Romoli
Michael Franke
60
2
0
09 May 2024
Cross-Care: Assessing the Healthcare Implications of Pre-training Data
  on Language Model Bias
Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias
Shan Chen
Jack Gallifant
Mingye Gao
Pedro Moreira
Nikolaj Munch
...
Hugo J. W. L. Aerts
Brian Anthony
Leo Anthony Celi
William G. La Cava
Danielle S. Bitterman
80
12
0
09 May 2024
Zero-shot LLM-guided Counterfactual Generation for Text
Zero-shot LLM-guided Counterfactual Generation for Text
Amrita Bhattacharjee
Raha Moraffah
Joshua Garland
Huan Liu
91
7
0
08 May 2024
TREC iKAT 2023: A Test Collection for Evaluating Conversational and
  Interactive Knowledge Assistants
TREC iKAT 2023: A Test Collection for Evaluating Conversational and Interactive Knowledge Assistants
Mohammad Aliannejadi
Zahra Abbasiantaeb
Shubham Chatterjee
Jeffery Dalton
Leif Azzopardi
63
15
0
04 May 2024
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Jing Xu
Jingzhao Zhang
102
7
0
04 May 2024
Previous
123...567...282930
Next