ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1905.00537
  4. Cited By
SuperGLUE: A Stickier Benchmark for General-Purpose Language
  Understanding Systems

SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

2 May 2019
Alex Jinpeng Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
    ELM
ArXivPDFHTML

Papers citing "SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems"

50 / 512 papers shown
Title
Setting the Record Straight on Transformer Oversmoothing
Setting the Record Straight on Transformer Oversmoothing
G. Dovonon
M. Bronstein
Matt J. Kusner
28
5
0
09 Jan 2024
The Butterfly Effect of Altering Prompts: How Small Changes and
  Jailbreaks Affect Large Language Model Performance
The Butterfly Effect of Altering Prompts: How Small Changes and Jailbreaks Affect Large Language Model Performance
A. Salinas
Fred Morstatter
45
49
0
08 Jan 2024
D3Former: Jointly Learning Repeatable Dense Detectors and
  Feature-enhanced Descriptors via Saliency-guided Transformer
D3Former: Jointly Learning Repeatable Dense Detectors and Feature-enhanced Descriptors via Saliency-guided Transformer
Junjie Gao
Pengfei Wang
Qiujie Dong
Qiong Zeng
Shiqing Xin
Caiming Zhang
19
0
0
20 Dec 2023
ALMANACS: A Simulatability Benchmark for Language Model Explainability
ALMANACS: A Simulatability Benchmark for Language Model Explainability
Edmund Mills
Shiye Su
Stuart J. Russell
Scott Emmons
54
7
0
20 Dec 2023
GSQA: An End-to-End Model for Generative Spoken Question Answering
GSQA: An End-to-End Model for Generative Spoken Question Answering
Min-Han Shih
Ho-Lam Chung
Yu-Chi Pai
Ming-Hao Hsu
Guan-Ting Lin
Shang-Wen Li
Hung-yi Lee
ELM
AuLLM
33
2
0
15 Dec 2023
Labels Need Prompts Too: Mask Matching for Natural Language
  Understanding Tasks
Labels Need Prompts Too: Mask Matching for Natural Language Understanding Tasks
Bo Li
Wei Ye
Quan-ding Wang
Wen Zhao
Shikun Zhang
VLM
35
1
0
14 Dec 2023
Mutual Enhancement of Large and Small Language Models with Cross-Silo
  Knowledge Transfer
Mutual Enhancement of Large and Small Language Models with Cross-Silo Knowledge Transfer
Yongheng Deng
Ziqing Qiao
Ju Ren
Yang Liu
Yaoxue Zhang
23
11
0
10 Dec 2023
CLadder: Assessing Causal Reasoning in Language Models
CLadder: Assessing Causal Reasoning in Language Models
Zhijing Jin
Yuen Chen
Felix Leeb
Luigi Gresele
Ojasv Kamal
...
Kevin Blin
Fernando Gonzalez Adauto
Max Kleiman-Weiner
Mrinmaya Sachan
Bernhard Schölkopf
ReLM
ELM
LRM
48
64
0
07 Dec 2023
ArcMMLU: A Library and Information Science Benchmark for Large Language
  Models
ArcMMLU: A Library and Information Science Benchmark for Large Language Models
Shitou Zhang
Zuchao Li
Xingshen Liu
Liming Yang
Ping Wang
ELM
19
0
0
30 Nov 2023
Power Hungry Processing: Watts Driving the Cost of AI Deployment?
Power Hungry Processing: Watts Driving the Cost of AI Deployment?
Sasha Luccioni
Yacine Jernite
Emma Strubell
44
161
0
28 Nov 2023
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A
  Hate Speech Detection Case Study
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study
Maike Zufle
Verna Dankers
Ivan Titov
45
0
0
16 Nov 2023
MELA: Multilingual Evaluation of Linguistic Acceptability
MELA: Multilingual Evaluation of Linguistic Acceptability
Ziyin Zhang
Yikang Liu
Wei Huang
Junyu Mao
Rui Wang
Hai Hu
27
3
0
15 Nov 2023
Argumentation Element Annotation Modeling using XLNet
Argumentation Element Annotation Modeling using XLNet
Christopher M. Ormerod
Amy Burkhardt
Mackenzie Young
Susan Lottridge
28
2
0
10 Nov 2023
Understanding the Role of Input Token Characters in Language Models: How
  Does Information Loss Affect Performance?
Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Ahmed Alajrami
Katerina Margatina
Nikolaos Aletras
AAML
19
1
0
26 Oct 2023
PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain
PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain
Wei-wei Zhu
Xiaoling Wang
Huanran Zheng
Mosha Chen
Buzhou Tang
ELM
LM&MA
21
33
0
22 Oct 2023
Explainable Depression Symptom Detection in Social Media
Explainable Depression Symptom Detection in Social Media
Eliseo Bao Souto
Anxo Perez
Javier Parapar
30
5
0
20 Oct 2023
Bridging Information-Theoretic and Geometric Compression in Language
  Models
Bridging Information-Theoretic and Geometric Compression in Language Models
Emily Cheng
Corentin Kervadec
Marco Baroni
34
17
0
20 Oct 2023
Chain-of-Thought Tuning: Masked Language Models can also Think Step By
  Step in Natural Language Understanding
Chain-of-Thought Tuning: Masked Language Models can also Think Step By Step in Natural Language Understanding
Caoyun Fan
Jidong Tian
Yitian Li
Wenqing Chen
Hao He
Yaohui Jin
LRM
32
3
0
18 Oct 2023
AdaLomo: Low-memory Optimization with Adaptive Learning Rate
AdaLomo: Low-memory Optimization with Adaptive Learning Rate
Kai Lv
Hang Yan
Qipeng Guo
Haijun Lv
Xipeng Qiu
ODL
27
20
0
16 Oct 2023
Decomposed Prompt Tuning via Low-Rank Reparameterization
Decomposed Prompt Tuning via Low-Rank Reparameterization
Yao Xiao
Lu Xu
Jiaxi Li
Wei Lu
Xiaoli Li
VLM
25
6
0
16 Oct 2023
Diversifying the Mixture-of-Experts Representation for Language Models
  with Orthogonal Optimizer
Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer
Boan Liu
Liang Ding
Li Shen
Keqin Peng
Yu Cao
Dazhao Cheng
Dacheng Tao
MoE
36
7
0
15 Oct 2023
Towards Informative Few-Shot Prompt with Maximum Information Gain for
  In-Context Learning
Towards Informative Few-Shot Prompt with Maximum Information Gain for In-Context Learning
Hongfu Liu
Ye Wang
39
9
0
13 Oct 2023
GLoRE: Evaluating Logical Reasoning of Large Language Models
GLoRE: Evaluating Logical Reasoning of Large Language Models
Hanmeng Liu
Zhiyang Teng
Ruoxi Ning
Jian Liu
Qiji Zhou
Yuexin Zhang
Yue Zhang
ReLM
ELM
LRM
70
8
0
13 Oct 2023
Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text
Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text
Chanho Park
Chengsong Lu
Mingjie Chen
Thomas Hain
28
3
0
12 Oct 2023
LawBench: Benchmarking Legal Knowledge of Large Language Models
LawBench: Benchmarking Legal Knowledge of Large Language Models
Zhiwei Fei
Xiaoyu Shen
D. Zhu
Fengzhe Zhou
Zhuo Han
Songyang Zhang
Kai-xiang Chen
Zongwen Shen
Jidong Ge
ELM
AILaw
34
36
0
28 Sep 2023
VDialogUE: A Unified Evaluation Benchmark for Visually-grounded Dialogue
VDialogUE: A Unified Evaluation Benchmark for Visually-grounded Dialogue
Yunshui Li
Binyuan Hui
Zhaochao Yin
Wanwei He
Run Luo
Yuxing Long
Min Yang
Fei Huang
Yongbin Li
26
1
0
14 Sep 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
FLM-101B: An Open LLM and How to Train It with 100KBudget100K Budget100KBudget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
Li Du
Bowen Qin
Zheng-Wei Zhang
Aixin Sun
Yequan Wang
60
21
0
07 Sep 2023
Using language models in the implicit automated assessment of
  mathematical short answer items
Using language models in the implicit automated assessment of mathematical short answer items
Christopher M. Ormerod
36
0
0
21 Aug 2023
GameEval: Evaluating LLMs on Conversational Games
GameEval: Evaluating LLMs on Conversational Games
Dan Qiao
Chenfei Wu
Yaobo Liang
Juntao Li
Nan Duan
ELM
LLMAG
26
20
0
19 Aug 2023
A Methodology for Generative Spelling Correction via Natural Spelling
  Errors Emulation across Multiple Domains and Languages
A Methodology for Generative Spelling Correction via Natural Spelling Errors Emulation across Multiple Domains and Languages
Nikita Martynov
Mark Baushenko
Anastasia Kozlova
Katerina Kolomeytseva
Aleksandr Abramov
Alena Fenogenova
38
2
0
18 Aug 2023
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Anna Rogers
A. Luccioni
53
19
0
14 Aug 2023
VisIT-Bench: A Benchmark for Vision-Language Instruction Following
  Inspired by Real-World Use
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Yonatan Bitton
Hritik Bansal
Jack Hessel
Rulin Shao
Wanrong Zhu
Anas Awadalla
Josh Gardner
Rohan Taori
L. Schimdt
VLM
31
77
0
12 Aug 2023
Simple synthetic data reduces sycophancy in large language models
Simple synthetic data reduces sycophancy in large language models
Jerry W. Wei
Da Huang
Yifeng Lu
Denny Zhou
Quoc V. Le
33
69
0
07 Aug 2023
Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation
  from Text
Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text
Nandana Mihindukulasooriya
Sanju Tiwari
Carlos F. Enguix
K. Lata
31
53
0
04 Aug 2023
Instruction-following Evaluation through Verbalizer Manipulation
Instruction-following Evaluation through Verbalizer Manipulation
Shiyang Li
Jun Yan
Hai Wang
Zheng Tang
Xiang Ren
Vijay Srinivasan
Hongxia Jin
36
25
0
20 Jul 2023
Retentive Network: A Successor to Transformer for Large Language Models
Retentive Network: A Successor to Transformer for Large Language Models
Yutao Sun
Li Dong
Shaohan Huang
Shuming Ma
Yuqing Xia
Jilong Xue
Jianyong Wang
Furu Wei
LRM
78
304
0
17 Jul 2023
No Train No Gain: Revisiting Efficient Training Algorithms For
  Transformer-based Language Models
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
22
41
0
12 Jul 2023
Empowering Cross-lingual Behavioral Testing of NLP Models with
  Typological Features
Empowering Cross-lingual Behavioral Testing of NLP Models with Typological Features
Ester Hlavnova
Sebastian Ruder
35
5
0
11 Jul 2023
Full Parameter Fine-tuning for Large Language Models with Limited
  Resources
Full Parameter Fine-tuning for Large Language Models with Limited Resources
Kai Lv
Yuqing Yang
Tengxiao Liu
Qi-jie Gao
Qipeng Guo
Xipeng Qiu
47
127
0
16 Jun 2023
GKD: A General Knowledge Distillation Framework for Large-scale
  Pre-trained Language Model
GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model
Shicheng Tan
Weng Lam Tam
Yuanchun Wang
Wenwen Gong
Yang Yang
...
Jiahao Liu
Jingang Wang
Shuo Zhao
Peng-Zhen Zhang
Jie Tang
ALM
MoE
33
11
0
11 Jun 2023
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis,
  and LLMs Evaluations
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations
Lifan Yuan
Yangyi Chen
Ganqu Cui
Hongcheng Gao
Fangyuan Zou
Xingyi Cheng
Heng Ji
Zhiyuan Liu
Maosong Sun
39
73
0
07 Jun 2023
Causal interventions expose implicit situation models for commonsense
  language understanding
Causal interventions expose implicit situation models for commonsense language understanding
Takateru Yamakoshi
James L. McClelland
A. Goldberg
Robert D. Hawkins
25
6
0
06 Jun 2023
Measuring the Robustness of NLP Models to Domain Shifts
Measuring the Robustness of NLP Models to Domain Shifts
Nitay Calderon
Naveh Porat
Eyal Ben-David
Alexander Chapanin
Zorik Gekhman
Nadav Oved
Vitaly Shalumov
Roi Reichart
21
7
0
31 May 2023
From Adversarial Arms Race to Model-centric Evaluation: Motivating a
  Unified Automatic Robustness Evaluation Framework
From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework
Yangyi Chen
Hongcheng Gao
Ganqu Cui
Lifan Yuan
Dehan Kong
...
Longtao Huang
H. Xue
Zhiyuan Liu
Maosong Sun
Heng Ji
AAML
ELM
30
6
0
29 May 2023
Entailment as Robust Self-Learner
Entailment as Robust Self-Learner
Jiaxin Ge
Hongyin Luo
Yoon Kim
James R. Glass
42
3
0
26 May 2023
Parameter-Efficient Fine-Tuning without Introducing New Latency
Parameter-Efficient Fine-Tuning without Introducing New Latency
Baohao Liao
Yan Meng
Christof Monz
18
49
0
26 May 2023
Large Language Models are Few-Shot Health Learners
Large Language Models are Few-Shot Health Learners
Xin Liu
Daniel J. McDuff
G. Kovács
I. Galatzer-Levy
Jacob Sunshine
Jiening Zhan
M. Poh
Shun Liao
P. Achille
Shwetak N. Patel
LM&MA
AI4MH
39
103
0
24 May 2023
Large Language Models for User Interest Journeys
Large Language Models for User Interest Journeys
Konstantina Christakopoulou
Alberto Lalama
Cj Adams
Iris Qu
Yifat Amir
...
Dina Bseiso
Sarah Scodel
Lucas Dixon
Ed H. Chi
Minmin Chen
21
25
0
24 May 2023
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Qihuang Zhong
Liang Ding
Juhua Liu
Xuebo Liu
Min Zhang
Bo Du
Dacheng Tao
VLM
34
9
0
24 May 2023
EvEval: A Comprehensive Evaluation of Event Semantics for Large Language
  Models
EvEval: A Comprehensive Evaluation of Event Semantics for Large Language Models
Zhengwei Tao
Zhi Jin
Xiaoying Bai
Haiyan Zhao
Yanlin Feng
Jia Li
Wenpeng Hu
34
4
0
24 May 2023
Previous
123456...91011
Next