SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

2 May 2019
Alex Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
    ELM

Papers citing "SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems"

50 / 1,500 papers shown
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
Yiming Wang
Yu Lin
Xiaodong Zeng
Guannan Zhang
MoMe
137
21
0
20 Nov 2023
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study
Maike Zufle
Verna Dankers
Ivan Titov
89
0
0
16 Nov 2023
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems
Jon Saad-Falcon
Omar Khattab
Christopher Potts
Matei A. Zaharia
RALM
108
120
0
16 Nov 2023
Memory Augmented Language Models through Mixture of Word Experts
Cicero Nogueira dos Santos
James Lee-Thorp
Isaac Noble
Chung-Ching Chang
David C. Uthus
MoE
105
8
0
15 Nov 2023
CLEAN-EVAL: Clean Evaluation on Contaminated Large Language Models
Wenhong Zhu
Hong-ping Hao
Zhiwei He
Yun-Ze Song
Yumeng Zhang
Hanxu Hu
Yiran Wei
Rui Wang
Hongyuan Lu
AAML ELM
52
12
0
15 Nov 2023
MELA: Multilingual Evaluation of Linguistic Acceptability
Ziyin Zhang
Yikang Liu
Wei-Ping Huang
Junyu Mao
Rui Wang
Hai Hu
74
3
0
15 Nov 2023
CLIMB: Curriculum Learning for Infant-inspired Model Building
Richard Diehl Martinez
Zébulon Goriely
Hope McGovern
Christopher Davis
Andrew Caines
P. Buttery
Lisa Beinborn
79
13
0
15 Nov 2023
Low-Rank Adaptation for Multilingual Summarization: An Empirical Study
Chenxi Whitehouse
Fantine Huot
Jasmijn Bastings
Mostafa Dehghani
Chu-Cheng Lin
Mirella Lapata
62
8
0
14 Nov 2023
Do large language models and humans have similar behaviors in causal inference with script knowledge?
Xudong Hong
Margarita Ryzhova
Daniel Adrian Biondi
Ram Sarkar
77
5
0
13 Nov 2023
How are Prompts Different in Terms of Sensitivity?
Sheng Lu
Hendrik Schuff
Iryna Gurevych
87
19
0
13 Nov 2023
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models
İlker Kesen
Andrea Pedrotti
Mustafa Dogan
Michele Cafagna
Emre Can Acikgoz
...
Iacer Calixto
Anette Frank
Albert Gatt
Aykut Erdem
Erkut Erdem
94
19
0
13 Nov 2023
L3 Ensembles: Lifelong Learning Approach for Ensemble of Foundational Language Models
Aidin Shiri
Kaushik Roy
Amit P. Sheth
Manas Gaur
KELM
52
4
0
11 Nov 2023
Argumentation Element Annotation Modeling using XLNet
Christopher M. Ormerod
Amy Burkhardt
Mackenzie Young
Susan Lottridge
41
4
0
10 Nov 2023
Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models
Shahriar Golchin
Mihai Surdeanu
77
26
0
10 Nov 2023
Efficiently Adapting Pretrained Language Models To New Languages
Zoltan Csaki
Pian Pawakapan
Urmish Thakker
Qiantong Xu
CLL
97
18
0
09 Nov 2023
TencentLLMEval: A Hierarchical Evaluation of Real-World Capabilities for Human-Aligned LLMs
Shuyi Xie
Wenlin Yao
Yong Dai
Shaobo Wang
Donlin Zhou
...
Zhichao Hu
Dong Yu
Zhengyou Zhang
Jing Nie
Yuhong Liu
ELM ALM
98
4
0
09 Nov 2023
You Only Forward Once: Prediction and Rationalization in A Single Forward Pass
Han Jiang
Junwen Duan
Zhe Qu
Jianxin Wang
95
2
0
04 Nov 2023
Not all layers are equally as important: Every Layer Counts BERT
Lucas Georges Gabriel Charpentier
David Samuel
89
18
0
03 Nov 2023
Post Turing: Mapping the landscape of LLM Evaluation
Alexey Tikhonov
Ivan P. Yamshchikov
ELM
95
4
0
03 Nov 2023
The language of prompting: What linguistic properties make a prompt successful?
Alina Leidinger
R. Rooij
Ekaterina Shutova
96
44
0
03 Nov 2023
Too Much Information: Keeping Training Simple for BabyLMs
Lukas Edman
Lisa Bylinina
74
4
0
03 Nov 2023
Comparing Optimization Targets for Contrast-Consistent Search
Hugo Fry
S. Fallows
Ian Fan
Jamie Wright
Nandi Schoots
29
2
0
01 Nov 2023
Defining a New NLP Playground
Sha Li
Chi Han
Pengfei Yu
Carl Edwards
Manling Li
...
Yi R. Fung
Charles Yu
Joel R. Tetreault
Eduard H. Hovy
Heng Ji
117
5
0
31 Oct 2023
Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building
Omar Momen
David Arps
Laura Kallmeyer
AI4CE
74
2
0
31 Oct 2023
Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models
Tian Liang
Zhiwei He
Jen-tse Huang
Wenxuan Wang
Wenxiang Jiao
Rui Wang
Yujiu Yang
Zhaopeng Tu
Shuming Shi
Xing Wang
LLMAG
127
5
0
31 Oct 2023
Does GPT-4 pass the Turing test?
Cameron R. Jones
Benjamin K. Bergen
ELM
121
37
0
31 Oct 2023
MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks
Allen Nie
Yuhui Zhang
Atharva Amdekar
Chris Piech
Tatsunori Hashimoto
Tobias Gerstenberg
80
40
0
30 Oct 2023
Improving Input-label Mapping with Demonstration Replay for In-context Learning
Zhuocheng Gong
Jiahao Liu
Qifan Wang
Jingang Wang
Xunliang Cai
Dongyan Zhao
Rui Yan
70
2
0
30 Oct 2023
Mean BERTs make erratic language teachers: the effectiveness of latent bootstrapping in low-resource settings
David Samuel
54
4
0
30 Oct 2023
SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models
Zhixu Du
Shiyu Li
Yuhao Wu
Xiangyu Jiang
Jingwei Sun
Qilin Zheng
Yongkai Wu
Ang Li
Hai Helen Li
Yiran Chen
MoE
102
14
0
29 Oct 2023
On General Language Understanding
David Schlangen
105
1
0
27 Oct 2023
NLP Evaluation in trouble: On the Need to Measure LLM Data Contamination for each Benchmark
Oscar Sainz
Jon Ander Campos
Iker García-Ferrero
Julen Etxaniz
Oier López de Lacalle
Eneko Agirre
80
185
0
27 Oct 2023
TarGEN: Targeted Data Generation with Large Language Models
Himanshu Gupta
Kevin Scaria
Ujjwala Anantheswaran
Shreyas Verma
Mihir Parmar
Saurabh Arjun Sawant
Chitta Baral
Swaroop Mishra
SyDa
70
9
0
27 Oct 2023
Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways
Venkata S Govindarajan
Juan Diego Rodriguez
Kaj Bostrom
Kyle Mahowald
65
1
0
26 Oct 2023
Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Ahmed Alajrami
Katerina Margatina
Nikolaos Aletras
AAML
65
1
0
26 Oct 2023
BabyStories: Can Reinforcement Learning Teach Baby Language Models to Write Better Stories?
Xingmeng Zhao
Tongnian Wang
Sheri Osborn
Anthony Rios
53
6
0
25 Oct 2023
Do Stochastic Parrots have Feelings Too? Improving Neural Detection of Synthetic Text via Emotion Recognition
Alan Cowap
Yvette Graham
Jennifer Foster
DeLMO
56
0
0
24 Oct 2023
Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression
Jiduan Liu
Jiahao Liu
Qifan Wang
Jingang Wang
Xunliang Cai
Dongyan Zhao
Ran Wang
Rui Yan
61
4
0
24 Oct 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
Kaiyan Zhang
Ning Ding
Biqing Qi
Xuekai Zhu
Xinwei Long
Bowen Zhou
95
5
0
24 Oct 2023
CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks
Mete Ismayilzada
Debjit Paul
Syrielle Montariol
Mor Geva
Antoine Bosselut
LRM
91
5
0
23 Oct 2023
Unveiling A Core Linguistic Region in Large Language Models
Jun Zhao
Zhihao Zhang
Yide Ma
Qi Zhang
Tao Gui
Luhui Gao
Xuanjing Huang
119
6
0
23 Oct 2023
ALCUNA: Large Language Models Meet New Knowledge
Xunjian Yin
Baizhou Huang
Xiaojun Wan
98
27
0
23 Oct 2023
SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research
Dimosthenis Antypas
Asahi Ushio
Francesco Barbieri
Leonardo Neves
Kiamehr Rezaee
Luis Espinosa-Anke
Jiaxin Pei
Jose Camacho-Collados
66
10
0
23 Oct 2023
An International Consortium for Evaluations of Societal-Scale Risks from Advanced AI
Ross Gruetzemacher
Alan Chan
Kevin Frazier
Christy Manning
Stepán Los
...
Clíodhna Ní Ghuidhir
Mark M. Bailey
Daniel Eth
Toby D. Pilditch
Kyle A. Kilian
51
6
0
22 Oct 2023
Orthogonal Subspace Learning for Language Model Continual Learning
Xiao Wang
Tianze Chen
Qiming Ge
Han Xia
Rong Bao
Rui Zheng
Qi Zhang
Tao Gui
Xuanjing Huang
CLL
155
114
0
22 Oct 2023
PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain
Wei-wei Zhu
Xiaoling Wang
Huanran Zheng
Mosha Chen
Buzhou Tang
ELM LM&MA
69
36
0
22 Oct 2023
Foundation Model's Embedded Representations May Detect Distribution Shift
Max Vargas
Adam Tsou
A. Engel
Tony Chiang
70
1
0
20 Oct 2023
Explainable Depression Symptom Detection in Social Media
Eliseo Bao Souto
Anxo Perez
Javier Parapar
79
8
0
20 Oct 2023
Bridging Information-Theoretic and Geometric Compression in Language Models
Emily Cheng
Corentin Kervadec
Marco Baroni
78
21
0
20 Oct 2023
Uncertainty-aware Parameter-Efficient Self-training for Semi-supervised Language Understanding
Jianing Wang
Qiushi Sun
Nuo Chen
Chengyu Wang
Jun Huang
Ming Gao
Xiang Li
UQLM
66
4
0
19 Oct 2023