ResearchTrend.AI
arXiv:2002.06305
Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping

15 February 2020
Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Ali Farhadi, Hannaneh Hajishirzi, Noah A. Smith

Papers citing "Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping"

Showing 50 of 137 citing papers.
Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU
Fenia Christopoulou, Gerasimos Lampouras, Ignacio Iacobacci (22 Oct 2022)

Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks
Laura Aina, Nikos Voskarides, Roi Blanco (21 Oct 2022)

lo-fi: distributed fine-tuning without communication
Mitchell Wortsman, Suchin Gururangan, Shen Li, Ali Farhadi, Ludwig Schmidt, Michael G. Rabbat, Ari S. Morcos (19 Oct 2022)

Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Clipping
Chenghao Yang, Xuezhe Ma (19 Oct 2022)

Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling
Haw-Shiuan Chang, Ruei-Yao Sun, Kathryn Ricci, Andrew McCallum (10 Oct 2022)

Efficient Few-Shot Learning Without Prompts
Lewis Tunstall, Nils Reimers, Unso Eun Seo Jo, Luke Bates, Daniel Korat, Moshe Wasserblat, Oren Pereg (22 Sep 2022)

Deep Reinforcement Learning for Cryptocurrency Trading: Practical Approach to Address Backtest Overfitting
Berend Gort, Xiao-Yang Liu, Xinghang Sun, Jiechao Gao, Shuai Chen, Chris Wang (12 Sep 2022)

Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz (31 Aug 2022)

Combating high variance in Data-Scarce Implicit Hate Speech Classification
Debaditya Pal, Kaustubh Chaudhari, Harsh Sharma (29 Aug 2022)

Mere Contrastive Learning for Cross-Domain Sentiment Analysis
Yun Luo, Fang Guo, Zihan Liu, Yue Zhang (18 Aug 2022)

Eco2AI: carbon emissions tracking of machine learning models as the first step towards sustainable AI
S. Budennyy, V. Lazarev, N. Zakharenko, A. Korovin, Olga Plosskaya, ..., Ivan Oseledets, I. Barsola, Ilya M. Egorov, A. Kosterina, L. Zhukov (31 Jul 2022)

Zero-shot Cross-lingual Transfer is Under-specified Optimization
Shijie Wu, Benjamin Van Durme, Mark Dredze (12 Jul 2022)

Explanation-based Counterfactual Retraining (XCR): A Calibration Method for Black-box Models
Liu Zhendong, Wenyu Jiang, Yan Zhang, Chongjun Wang (22 Jun 2022)

ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Z. Yao, Reza Yazdani Aminabadi, Minjia Zhang, Xiaoxia Wu, Conglong Li, Yuxiong He (04 Jun 2022)

Can Foundation Models Help Us Achieve Perfect Secrecy?
Simran Arora, Christopher Ré (27 May 2022)

ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi (24 May 2022)

Few-Shot Natural Language Inference Generation with PDD: Prompt and Dynamic Demonstration
Kaijian Li, Shansan Gong, Kenny Q. Zhu (21 May 2022)

PreQuEL: Quality Estimation of Machine Translation Outputs in Advance
Shachar Don-Yehiya, Leshem Choshen, Omri Abend (18 May 2022)

When to Use Multi-Task Learning vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning
Orion Weller, Kevin Seppi, Matt Gardner (17 May 2022)

How to Fine-tune Models with Few Samples: Update, Data Augmentation, and Test-time Augmentation
Yujin Kim, Jaehoon Oh, Sungnyun Kim, Se-Young Yun (13 May 2022)

A Comparison of Approaches for Imbalanced Classification Problems in the Context of Retrieving Relevant Documents for an Analysis
Sandra Wankmüller (03 May 2022)

Embedding Hallucination for Few-Shot Language Fine-tuning
Yiren Jian, Chongyang Gao, Soroush Vosoughi (03 May 2022)

Super-Prompting: Utilizing Model-Independent Contextual Data to Reduce Data Annotation Required in Visual Commonsense Tasks
Navid Rezaei, Marek Reformat (25 Apr 2022)

mGPT: Few-Shot Learners Go Multilingual
Oleh Shliazhko, Alena Fenogenova, Maria Tikhonova, Vladislav Mikhailov, Anastasia Kozlova, Tatiana Shavrina (15 Apr 2022)

Reducing Model Jitter: Stable Re-training of Semantic Parsers in Production Environments
Christopher Hidey, Fei Liu, Rahul Goel (10 Apr 2022)

PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models
Rabeeh Karimi Mahabadi, Luke Zettlemoyer, James Henderson, Marzieh Saeidi, Lambert Mathias, Ves Stoyanov, Majid Yazdani (03 Apr 2022)

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman, Gabriel Ilharco, S. Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, ..., Hongseok Namkoong, Ali Farhadi, Y. Carmon, Simon Kornblith, Ludwig Schmidt (10 Mar 2022)

Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
Guanzheng Chen, Fangyu Liu, Zaiqiao Meng, Shangsong Liang (16 Feb 2022)

A Differential Entropy Estimator for Training Neural Networks
Georg Pichler, Pierre Colombo, Malik Boudiaf, Günther Koliander, Pablo Piantanida (14 Feb 2022)

Adaptive Fine-Tuning of Transformer-Based Language Models for Named Entity Recognition
Felix Stollenwerk (05 Feb 2022)

Diversity Enhanced Active Learning with Strictly Proper Scoring Rules
Wei Tan, Lan Du, Wray L. Buntine (27 Oct 2021)

SkullEngine: A Multi-stage CNN Framework for Collaborative CBCT Image Segmentation and Landmark Detection
Qin Liu, H. Deng, C. Lian, Xiaoyang Chen, Deqiang Xiao, ..., Xu Chen, Tianshu Kuang, J. Gateno, P. Yap, J. Xia (07 Oct 2021)

UoB at SemEval-2021 Task 5: Extending Pre-Trained Language Models to Include Task and Domain-Specific Information for Toxic Span Prediction
E. Yan, Harish Tayyar Madabushi (07 Oct 2021)

KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier
Linyang Li, Demin Song, Ruotian Ma, Xipeng Qiu, Xuanjing Huang (06 Oct 2021)

Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort (27 Sep 2021)

CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao, Ao Zhang, Zhengyan Zhang, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun (24 Sep 2021)

Weakly Supervised Explainable Phrasal Reasoning with Neural Fuzzy Logic
Zijun Wu, Zi Xuan Zhang, Atharva Naik, Zhijian Mei, Mauajama Firdaus, Lili Mou (18 Sep 2021)

Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
Runxin Xu, Fuli Luo, Zhiyuan Zhang, Chuanqi Tan, Baobao Chang, Songfang Huang, Fei Huang (13 Sep 2021)

Subword Mapping and Anchoring across Languages
Giorgos Vernikos, Andrei Popescu-Belis (09 Sep 2021)

Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning
Prasetya Ajie Utama, N. Moosavi, Victor Sanh, Iryna Gurevych (09 Sep 2021)

On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets
Cheng-Han Chiang, Hung-yi Lee (08 Sep 2021)

Deep Reinforcement Learning at the Edge of the Statistical Precipice
Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare (30 Aug 2021)

Rethinking Why Intermediate-Task Fine-Tuning Works
Ting-Yun Chang, Chi-Jen Lu (26 Aug 2021)

Linking Common Vulnerabilities and Exposures to the MITRE ATT&CK Framework: A Self-Distillation Approach
Benjamin Ampel, Sagar Samtani, Steven Ullman, Hsinchun Chen (03 Aug 2021)

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, Graham Neubig (28 Jul 2021)

FewCLUE: A Chinese Few-shot Learning Evaluation Benchmark
Liang Xu, Xiaojing Lu, Chenyang Yuan, Xuanwei Zhang, Huilin Xu, ..., Guoao Wei, X. Pan, Xin Tian, Libo Qin, Hai Hu (15 Jul 2021)

Noise Stability Regularization for Improving BERT Fine-tuning
Hang Hua, Xingjian Li, Dejing Dou, Chengzhong Xu, Jiebo Luo (10 Jul 2021)

The MultiBERTs: BERT Reproductions for Robustness Analysis
Thibault Sellam, Steve Yadlowsky, Jason W. Wei, Naomi Saphra, Alexander D'Amour, ..., Iulia Turc, Jacob Eisenstein, Dipanjan Das, Ian Tenney, Ellie Pavlick (30 Jun 2021)

A Closer Look at How Fine-tuning Changes BERT
Yichu Zhou, Vivek Srikumar (27 Jun 2021)

Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
Robert L Logan IV, Ivana Balažević, Eric Wallace, Fabio Petroni, Sameer Singh, Sebastian Riedel (24 Jun 2021)