Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models

25 September 2019
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
    MoE

Papers citing "Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models"

34 / 134 papers shown
Title
AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
VLM
LM&MA
26
258
0
12 Aug 2021
Robust Transfer Learning with Pretrained Language Models through Adapters
Wenjuan Han
Bo Pang
Ying Nian Wu
14
54
0
05 Aug 2021
Noise Stability Regularization for Improving BERT Fine-tuning
Hang Hua
Xingjian Li
Dejing Dou
Chengzhong Xu
Jiebo Luo
19
43
0
10 Jul 2021
The MultiBERTs: BERT Reproductions for Robustness Analysis
Thibault Sellam
Steve Yadlowsky
Jason W. Wei
Naomi Saphra
Alexander D'Amour
...
Iulia Turc
Jacob Eisenstein
Dipanjan Das
Ian Tenney
Ellie Pavlick
22
93
0
30 Jun 2021
A Closer Look at How Fine-tuning Changes BERT
Yichu Zhou
Vivek Srikumar
24
63
0
27 Jun 2021
Well-tuned Simple Nets Excel on Tabular Datasets
Arlind Kadra
Marius Lindauer
Frank Hutter
Josif Grabocka
13
185
0
21 Jun 2021
Variational Information Bottleneck for Effective Low-Resource Fine-Tuning
Rabeeh Karimi Mahabadi
Yonatan Belinkov
James Henderson
DRL
16
71
0
10 Jun 2021
On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation
Ruidan He
Linlin Liu
Hai Ye
Qingyu Tan
Bosheng Ding
Liying Cheng
Jia-Wei Low
Lidong Bing
Luo Si
24
196
0
06 Jun 2021
Bi-Granularity Contrastive Learning for Post-Training in Few-Shot Scene
Ruikun Luo
Guanhuan Huang
Xiaojun Quan
CLL
8
10
0
04 Jun 2021
PTR: Prompt Tuning with Rules for Text Classification
Xu Han
Weilin Zhao
Ning Ding
Zhiyuan Liu
Maosong Sun
VLM
33
513
0
24 May 2021
Joint Text and Label Generation for Spoken Language Understanding
Yang Li
Ben Athiwaratkun
Cicero Nogueira dos Santos
Bing Xiang
17
0
0
11 May 2021
Entailment as Few-Shot Learner
Sinong Wang
Han Fang
Madian Khabsa
Hanzi Mao
Hao Ma
30
183
0
29 Apr 2021
How Many Data Points is a Prompt Worth?
Teven Le Scao
Alexander M. Rush
VLM
49
296
0
15 Mar 2021
SocialNLP EmotionGIF 2020 Challenge Overview: Predicting Reaction GIF Categories on Social Media
Boaz Shmueli
Lun-Wei Ku
Soumya Ray
26
3
0
24 Feb 2021
Studying Catastrophic Forgetting in Neural Ranking Models
Jesús Lovón-Melgarejo
Laure Soulier
K. Pinel-Sauvagnat
L. Tamine
CLL
31
13
0
18 Jan 2021
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
241
1,918
0
31 Dec 2020
Finding Sparse Structures for Domain Specific Neural Machine Translation
Jianze Liang
Chengqi Zhao
Mingxuan Wang
Xipeng Qiu
Lei Li
CLL
19
4
0
19 Dec 2020
Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo
Alexander M. Rush
Yoon Kim
11
383
0
14 Dec 2020
Efficient Estimation of Influence of a Training Instance
Sosuke Kobayashi
Sho Yokoi
Jun Suzuki
Kentaro Inui
TDI
27
15
0
08 Dec 2020
CPM: A Large-scale Generative Chinese Pre-trained Language Model
Zhengyan Zhang
Xu Han
Hao Zhou
Pei Ke
Yuxian Gu
...
Wentao Han
Jie Tang
Juan-Zi Li
Xiaoyan Zhu
Maosong Sun
18
113
0
01 Dec 2020
Pretrained Transformers for Text Ranking: BERT and Beyond
Jimmy J. Lin
Rodrigo Nogueira
Andrew Yates
VLM
219
608
0
13 Oct 2020
Incorporating BERT into Parallel Sequence Decoding with Adapters
Junliang Guo
Zhirui Zhang
Linli Xu
Hao-Ran Wei
Boxing Chen
Enhong Chen
43
69
0
13 Oct 2020
AUBER: Automated BERT Regularization
Hyun Dong Lee
Seongmin Lee
U. Kang
6
7
0
30 Sep 2020
Augmented Natural Language for Generative Sequence Labeling
Ben Athiwaratkun
Cicero Nogueira dos Santos
Jason Krone
Bing Xiang
VLM
6
60
0
15 Sep 2020
Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset
Edwin Zhang
Nikhil Gupta
Raphael Tang
Xiao Han
Ronak Pradeep
...
Yue Zhang
Rodrigo Nogueira
Kyunghyun Cho
Hui Fang
Jimmy J. Lin
15
59
0
14 Jul 2020
Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization
Sang Michael Xie
Tengyu Ma
Percy Liang
30
13
0
29 Jun 2020
Modeling Subjective Assessments of Guilt in Newspaper Crime Narratives
Elisa Kreiss
Zijian Wang
Christopher Potts
6
1
0
17 Jun 2020
Revisiting Few-sample BERT Fine-tuning
Tianyi Zhang
Felix Wu
Arzoo Katiyar
Kilian Q. Weinberger
Yoav Artzi
30
441
0
10 Jun 2020
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
Marius Mosbach
Maksym Andriushchenko
Dietrich Klakow
14
352
0
08 Jun 2020
Selecting Informative Contexts Improves Language Model Finetuning
Richard Antonello
Nicole M. Beckage
Javier S. Turek
Alexander G. Huth
18
10
0
01 May 2020
Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting
Sanyuan Chen
Yutai Hou
Yiming Cui
Wanxiang Che
Ting Liu
Xiangzhan Yu
KELM
CLL
8
212
0
27 Apr 2020
A Primer in BERTology: What we know about how BERT works
Anna Rogers
Olga Kovaleva
Anna Rumshisky
OffRL
30
1,455
0
27 Feb 2020
Renofeation: A Simple Transfer Learning Method for Improved Adversarial Robustness
Ting-Wu Chin
Cha Zhang
Diana Marculescu
AAML
16
1
0
07 Feb 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018