ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1905.00537
  4. Cited By
SuperGLUE: A Stickier Benchmark for General-Purpose Language
  Understanding Systems

SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

2 May 2019
Alex Jinpeng Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
    ELM
ArXivPDFHTML

Papers citing "SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems"

50 / 513 papers shown
Title
mGPT: Few-Shot Learners Go Multilingual
mGPT: Few-Shot Learners Go Multilingual
Oleh Shliazhko
Alena Fenogenova
Maria Tikhonova
Vladislav Mikhailov
Anastasia Kozlova
Tatiana Shavrina
49
149
0
15 Apr 2022
Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context
  NLP Models
Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models
Phyllis Ang
Bhuwan Dhingra
Lisa Wu Wills
30
6
0
15 Apr 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
99
802
0
14 Apr 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding
  Language Models with Model Generated Signals
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals
Payal Bajaj
Chenyan Xiong
Guolin Ke
Xiaodong Liu
Di He
Saurabh Tiwary
Tie-Yan Liu
Paul N. Bennett
Xia Song
Jianfeng Gao
50
32
0
13 Apr 2022
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning
  Tasks
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
Swaroop Mishra
Arindam Mitra
Neeraj Varshney
Bhavdeep Singh Sachdeva
Peter Clark
Chitta Baral
Ashwin Kalyan
AIMat
ReLM
ELM
LRM
30
102
0
12 Apr 2022
Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained
  Language Models For Classification Tasks
Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks
Haoran Yang
Piji Li
Wai Lam
31
3
0
10 Apr 2022
Testing the limits of natural language models for predicting human
  language judgments
Testing the limits of natural language models for predicting human language judgments
Tal Golan
Matthew Siegelman
N. Kriegeskorte
Christopher A. Baldassano
22
15
0
07 Apr 2022
Fusing finetuned models for better pretraining
Fusing finetuned models for better pretraining
Leshem Choshen
Elad Venezian
Noam Slonim
Yoav Katz
FedML
AI4CE
MoMe
54
87
0
06 Apr 2022
Fact Checking with Insufficient Evidence
Fact Checking with Insufficient Evidence
Pepa Atanasova
J. Simonsen
Christina Lioma
Isabelle Augenstein
37
14
0
05 Apr 2022
PERFECT: Prompt-free and Efficient Few-shot Learning with Language
  Models
PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models
Rabeeh Karimi Mahabadi
Luke Zettlemoyer
James Henderson
Marzieh Saeidi
Lambert Mathias
Ves Stoyanov
Majid Yazdani
VLM
34
69
0
03 Apr 2022
A Survey on Aspect-Based Sentiment Classification
A Survey on Aspect-Based Sentiment Classification
Gianni Brauwers
Flavius Frasincar
LLMAG
39
110
0
27 Mar 2022
A Theoretically Grounded Benchmark for Evaluating Machine Commonsense
A Theoretically Grounded Benchmark for Evaluating Machine Commonsense
Henrique M. Dinis Santos
Ke Shen
Alice M. Mulvehill
Yasaman Razeghi
D. McGuinness
Mayank Kejriwal
ELM
LRM
31
4
0
23 Mar 2022
Towards Explainable Evaluation Metrics for Natural Language Generation
Towards Explainable Evaluation Metrics for Natural Language Generation
Christoph Leiter
Piyawat Lertvittayakumjorn
M. Fomicheva
Wei-Ye Zhao
Yang Gao
Steffen Eger
AAML
ELM
30
20
0
21 Mar 2022
Report from the NSF Future Directions Workshop on Automatic Evaluation
  of Dialog: Research Directions and Challenges
Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges
Shikib Mehri
Jinho Choi
L. F. D’Haro
Jan Deriu
M. Eskénazi
...
David Traum
Yi-Ting Yeh
Zhou Yu
Yizhe Zhang
Chen Zhang
30
21
0
18 Mar 2022
FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text
  Processing
FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing
Ilias Chalkidis
Tommaso Pasini
Shenmin Zhang
Letizia Tomada
Sebastian Felix Schwemer
Anders Søgaard
AILaw
40
54
0
14 Mar 2022
SciNLI: A Corpus for Natural Language Inference on Scientific Text
SciNLI: A Corpus for Natural Language Inference on Scientific Text
Mobashir Sadat
Cornelia Caragea
AILaw
32
35
0
13 Mar 2022
CoDA21: Evaluating Language Understanding Capabilities of NLP Models
  With Context-Definition Alignment
CoDA21: Evaluating Language Understanding Capabilities of NLP Models With Context-Definition Alignment
Lutfi Kerem Senel
Timo Schick
Hinrich Schütze
ELM
ALM
23
5
0
11 Mar 2022
HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both
  Language and Vision-and-Language Tasks
HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks
Zhengkun Zhang
Wenya Guo
Xiaojun Meng
Yasheng Wang
Yadao Wang
Xin Jiang
Qun Liu
Zhenglu Yang
34
15
0
08 Mar 2022
Divide and Conquer: Text Semantic Matching with Disentangled Keywords
  and Intents
Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents
Yicheng Zou
Hongwei Liu
Tao Gui
Junzhe Wang
Qi Zhang
M. Tang
Haixiang Li
Dan Wang
DRL
35
29
0
06 Mar 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
363
12,003
0
04 Mar 2022
Mukayese: Turkish NLP Strikes Back
Mukayese: Turkish NLP Strikes Back
Ali Safaya
Emirhan Kurtulucs
Arda Goktougan
Deniz Yuret
28
22
0
02 Mar 2022
KMIR: A Benchmark for Evaluating Knowledge Memorization, Identification
  and Reasoning Abilities of Language Models
KMIR: A Benchmark for Evaluating Knowledge Memorization, Identification and Reasoning Abilities of Language Models
Daniel Gao
Yantao Jia
Lei Li
Chengzhen Fu
Zhicheng Dou
Hao Jiang
Xinyu Zhang
Lei Chen
Bo Zhao
KELM
22
8
0
28 Feb 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning
  Work?
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Sewon Min
Xinxi Lyu
Ari Holtzman
Mikel Artetxe
M. Lewis
Hannaneh Hajishirzi
Luke Zettlemoyer
LLMAG
LRM
70
1,403
0
25 Feb 2022
Prompt-Learning for Short Text Classification
Prompt-Learning for Short Text Classification
Yi Zhu
Xinke Zhou
Jipeng Qiang
Yun Li
Yunhao Yuan
Xindong Wu
VLM
18
34
0
23 Feb 2022
$\mathcal{Y}$-Tuning: An Efficient Tuning Paradigm for Large-Scale
  Pre-Trained Models via Label Representation Learning
Y\mathcal{Y}Y-Tuning: An Efficient Tuning Paradigm for Large-Scale Pre-Trained Models via Label Representation Learning
Yitao Liu
Chen An
Xipeng Qiu
29
17
0
20 Feb 2022
Mixture-of-Experts with Expert Choice Routing
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
160
329
0
18 Feb 2022
ST-MoE: Designing Stable and Transferable Sparse Expert Models
ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph
Irwan Bello
Sameer Kumar
Nan Du
Yanping Huang
J. Dean
Noam M. Shazeer
W. Fedus
MoE
24
182
0
17 Feb 2022
Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
Guanzheng Chen
Fangyu Liu
Zaiqiao Meng
Shangsong Liang
26
88
0
16 Feb 2022
Scaling Laws Under the Microscope: Predicting Transformer Performance
  from Small Scale Experiments
Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments
Maor Ivgi
Y. Carmon
Jonathan Berant
19
17
0
13 Feb 2022
Slovene SuperGLUE Benchmark: Translation and Evaluation
Slovene SuperGLUE Benchmark: Translation and Evaluation
Aleš Žagar
Marko Robnik-Šikonja
25
10
0
10 Feb 2022
TimeLMs: Diachronic Language Models from Twitter
TimeLMs: Diachronic Language Models from Twitter
Daniel Loureiro
Francesco Barbieri
Leonardo Neves
Luis Espinosa Anke
Jose Camacho-Collados
41
249
0
08 Feb 2022
Conversational Agents: Theory and Applications
Conversational Agents: Theory and Applications
M. Wahde
M. Virgolin
LLMAG
26
24
0
07 Feb 2022
GatorTron: A Large Clinical Language Model to Unlock Patient Information
  from Unstructured Electronic Health Records
GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records
Xi Yang
Aokun Chen
Nima M. Pournejatian
Hoo-Chang Shin
Kaleb E. Smith
...
Duane A. Mitchell
W. Hogan
E. Shenkman
Jiang Bian
Yonghui Wu
AI4MH
LM&MA
37
508
0
02 Feb 2022
Correcting diacritics and typos with a ByT5 transformer model
Correcting diacritics and typos with a ByT5 transformer model
Lukas Stankevicius
M. Lukoševičius
J. Kapočiūtė-Dzikienė
Monika Briediene
Tomas Krilavičius
28
20
0
31 Jan 2022
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and
  Languages
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages
Emanuele Bugliarello
Fangyu Liu
Jonas Pfeiffer
Siva Reddy
Desmond Elliott
E. Ponti
Ivan Vulić
MLLM
VLM
ELM
50
62
0
27 Jan 2022
Ensemble Transformer for Efficient and Accurate Ranking Tasks: an
  Application to Question Answering Systems
Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems
Yoshitomo Matsubara
Luca Soldaini
Eric Lind
Alessandro Moschitti
29
6
0
15 Jan 2022
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to
  Power Next-Generation AI Scale
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
Samyam Rajbhandari
Conglong Li
Z. Yao
Minjia Zhang
Reza Yazdani Aminabadi
A. A. Awan
Jeff Rasley
Yuxiong He
47
284
0
14 Jan 2022
SCROLLS: Standardized CompaRison Over Long Language Sequences
SCROLLS: Standardized CompaRison Over Long Language Sequences
Uri Shaham
Elad Segal
Maor Ivgi
Avia Efrat
Ori Yoran
...
Ankit Gupta
Wenhan Xiong
Mor Geva
Jonathan Berant
Omer Levy
RALM
37
133
0
10 Jan 2022
Few-shot Learning with Multilingual Language Models
Few-shot Learning with Multilingual Language Models
Xi Lin
Todor Mihaylov
Mikel Artetxe
Tianlu Wang
Shuohui Chen
...
Luke Zettlemoyer
Zornitsa Kozareva
Mona T. Diab
Ves Stoyanov
Xian Li
BDL
ELM
LRM
64
285
0
20 Dec 2021
Pushing the Limits of Rule Reasoning in Transformers through Natural
  Language Satisfiability
Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability
Kyle Richardson
Ashish Sabharwal
ReLM
LRM
30
24
0
16 Dec 2021
Fine-Tuning Large Neural Language Models for Biomedical Natural Language
  Processing
Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing
Robert Tinn
Hao Cheng
Yu Gu
Naoto Usuyama
Xiaodong Liu
Tristan Naumann
Jianfeng Gao
Hoifung Poon
LM&MA
17
111
0
15 Dec 2021
Measuring Context-Word Biases in Lexical Semantic Datasets
Measuring Context-Word Biases in Lexical Semantic Datasets
Qianchu Liu
Diana McCarthy
Anna Korhonen
31
2
0
13 Dec 2021
FLAVA: A Foundational Language And Vision Alignment Model
FLAVA: A Foundational Language And Vision Alignment Model
Amanpreet Singh
Ronghang Hu
Vedanuj Goswami
Guillaume Couairon
Wojciech Galuba
Marcus Rohrbach
Douwe Kiela
CLIP
VLM
40
690
0
08 Dec 2021
How not to Lie with a Benchmark: Rearranging NLP Leaderboards
How not to Lie with a Benchmark: Rearranging NLP Leaderboards
Tatiana Shavrina
Valentin Malykh
ALM
ELM
423
10
0
02 Dec 2021
Dyna-bAbI: unlocking bAbI's potential with dynamic synthetic
  benchmarking
Dyna-bAbI: unlocking bAbI's potential with dynamic synthetic benchmarking
Ronen Tamari
Kyle Richardson
Aviad Sar-Shalom
Noam Kahlon
Nelson F. Liu
Reut Tsarfaty
Dafna Shahaf
43
5
0
30 Nov 2021
AI and the Everything in the Whole Wide World Benchmark
AI and the Everything in the Whole Wide World Benchmark
Inioluwa Deborah Raji
Emily M. Bender
Amandalynne Paullada
Emily L. Denton
A. Hanna
30
291
0
26 Nov 2021
Few-shot Named Entity Recognition with Cloze Questions
Few-shot Named Entity Recognition with Cloze Questions
V. Gatta
V. Moscato
Marco Postiglione
Giancarlo Sperlí
27
4
0
24 Nov 2021
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with
  Gradient-Disentangled Embedding Sharing
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
Pengcheng He
Jianfeng Gao
Weizhu Chen
35
1,120
0
18 Nov 2021
Few-Shot Self-Rationalization with Natural Language Prompts
Few-Shot Self-Rationalization with Natural Language Prompts
Ana Marasović
Iz Beltagy
Doug Downey
Matthew E. Peters
LRM
26
106
0
16 Nov 2021
Adversarially Constructed Evaluation Sets Are More Challenging, but May
  Not Be Fair
Adversarially Constructed Evaluation Sets Are More Challenging, but May Not Be Fair
Jason Phang
Angelica Chen
William Huang
Samuel R. Bowman
AAML
28
13
0
16 Nov 2021
Previous
123...10116789
Next