Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models

25 September 2019
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
    MoE
ArXiv · PDF · HTML
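The idea behind the cited paper can be stated briefly: during fine-tuning, mixout replaces each parameter with its pretrained value with probability p (rather than zeroing it as dropout does), then rescales so the expected parameter equals the fine-tuned weight, which keeps the model anchored to the pretrained solution. The snippet below is a minimal PyTorch sketch of that operation; the function name, signature, and default p are illustrative assumptions, not the authors' released code.

    import torch

    def mixout(weight: torch.Tensor,
               pretrained: torch.Tensor,
               p: float = 0.7,
               training: bool = True) -> torch.Tensor:
        # Illustrative sketch: with probability p, swap each fine-tuned
        # parameter back to its pretrained value, then correct the mean
        # shift so E[output] == weight (inverted-dropout-style scaling,
        # but targeting the pretrained weights instead of zero).
        if not training or p == 0.0:
            return weight
        mask = torch.bernoulli(torch.full_like(weight, p))
        mixed = mask * pretrained + (1.0 - mask) * weight
        return (mixed - p * pretrained) / (1.0 - p)

As with dropout, the operation is applied only during training; at evaluation time the unmodified fine-tuned weights are used.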

Papers citing "Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models"

50 / 134 papers shown
A Comprehensive Analysis of Large Language Model Outputs: Similarity, Diversity, and Bias
Brandon Smith
Mohamed Reda Bouadjenek
Tahsin Alamgir Kheya
Phillip Dawson
S. Aryal
ALM
ELM
26
0
0
14 May 2025
Memorization and Knowledge Injection in Gated LLMs
Xu Pan
Ely Hahami
Zechen Zhang
H. Sompolinsky
KELM
CLL
RALM
104
0
0
30 Apr 2025
Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data
Swati Rallapalli
Shannon Gallagher
Andrew O. Mellinger
Jasmine Ratchford
Anusha Sinha
Tyler Brooks
William R. Nichols
Nick Winski
Bryan Brown
48
0
0
10 Mar 2025
IIMedGPT: Promoting Large Language Model Capabilities of Medical Tasks by Efficient Human Preference Alignment
Yiming Zhang
Zheng Chang
Wentao Cai
MengXing Ren
Kang Yuan
Yining Sun
Zenghui Ding
LM&MA
36
3
0
06 Jan 2025
A Practical Guide to Fine-tuning Language Models with Limited Data
Márton Szép
Daniel Rueckert
Rüdiger von Eisenhart-Rothe
Florian Hinterwimmer
SyDa
ALM
44
2
0
14 Nov 2024
Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging
Jacob Morrison
Noah A. Smith
Hannaneh Hajishirzi
Pang Wei Koh
Jesse Dodge
Pradeep Dasigi
KELM
MoMe
CLL
38
1
0
16 Oct 2024
Deep Transfer Learning: Model Framework and Error Analysis
Yuling Jiao
Huazhen Lin
Yuchen Luo
Jerry Zhijian Yang
44
1
0
12 Oct 2024
Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing
Andreas Loukas
Karolis Martinkus
Ed Wagstaff
Kyunghyun Cho
OOD
23
1
0
08 Oct 2024
Resource Allocation for Stable LLM Training in Mobile Edge Computing
Chang Liu
Jun Zhao
30
3
0
30 Sep 2024
Enhancing Parameter Efficiency and Generalization in Large-Scale Models: A Regularized and Masked Low-Rank Adaptation Approach
Yuzhu Mao
Siqi Ping
Zihao Zhao
Yang Liu
Wenbo Ding
37
1
0
16 Jul 2024
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging
Anton Alexandrov
Veselin Raychev
Mark Niklas Muller
Ce Zhang
Martin Vechev
Kristina Toutanova
MoMe
CLL
KELM
40
13
0
11 Jul 2024
Information Guided Regularization for Fine-tuning Language Models
Mandar Sharma
Nikhil Muralidhar
Shengzhe Xu
Raquib Bin Yousuf
Naren Ramakrishnan
35
0
0
20 Jun 2024
Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation
Branislav Pecher
Ján Cegin
Róbert Belanec
Jakub Simko
Ivan Srba
M. Bieliková
39
1
0
18 Jun 2024
Yo'LLaVA: Your Personalized Language and Vision Assistant
Thao Nguyen
Haotian Liu
Yuheng Li
Mu Cai
Utkarsh Ojha
Yong Jae Lee
VLM
MLLM
53
15
0
13 Jun 2024
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
Keming Lu
Bowen Yu
Fei Huang
Yang Fan
Runji Lin
Chang Zhou
MoMe
24
18
0
28 May 2024
Learning to Rebalance Multi-Modal Optimization by Adaptively Masking Subnetworks
Yang Yang
Hongpeng Pan
Qingjun Jiang
Yi Tian Xu
Jinghui Tang
29
4
0
12 Apr 2024
Modality Translation for Object Detection Adaptation Without Forgetting Prior Knowledge
H. R. Medeiros
Masih Aminbeidokhti
F. Guerrero-Peña
David Latortue
Eric Granger
M. Pedersoli
VLM
45
2
0
01 Apr 2024
Improving Pre-trained Language Model Sensitivity via Mask Specific losses: A case study on Biomedical NER
Micheal Abaho
Danushka Bollegala
Gary Leeming
Dan Joyce
Iain E Buchan
25
0
0
26 Mar 2024
MyVLM: Personalizing VLMs for User-Specific Queries
Yuval Alaluf
Elad Richardson
Sergey Tulyakov
Kfir Aberman
Daniel Cohen-Or
MLLM
VLM
38
18
0
21 Mar 2024
Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts
Sai Ashish Somayajula
Youwei Liang
Abhishek Singh
Li Zhang
Pengtao Xie
26
2
0
19 Mar 2024
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models
Didi Zhu
Zhongyi Sun
Zexi Li
Tao Shen
Ke Yan
Shouhong Ding
Kun Kuang
Chao Wu
CLL
KELM
MoMe
63
22
0
19 Feb 2024
The Right Model for the Job: An Evaluation of Legal Multi-Label Classification Baselines
Martina Forster
Claudia Schulz
Prudhvi Nokku
Melicaalsadat Mirsafian
Jaykumar Kasundra
Stavroula Skylaki
AILaw
ELM
22
1
0
22 Jan 2024
AutoFT: Learning an Objective for Robust Fine-Tuning
Caroline Choi
Yoonho Lee
Annie S. Chen
Allan Zhou
Aditi Raghunathan
Chelsea Finn
OOD
44
0
0
18 Jan 2024
Generative AI in EU Law: Liability, Privacy, Intellectual Property, and Cybersecurity
Claudio Novelli
F. Casolari
Philipp Hacker
Giorgio Spedicato
Luciano Floridi
AILaw
SILM
47
43
0
14 Jan 2024
Scaling Laws for Forgetting When Fine-Tuning Large Language Models
Damjan Kalajdzievski
CLL
36
8
0
11 Jan 2024
Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness
Sibo Wang
Jie M. Zhang
Zheng Yuan
Shiguang Shan
VLM
36
18
0
09 Jan 2024
Dynamic Corrective Self-Distillation for Better Fine-Tuning of Pretrained Models
Ibtihel Amara
Vinija Jain
Aman Chadha
32
0
0
12 Dec 2023
Weakly Supervised Detection of Hallucinations in LLM Activations
Miriam Rateike
C. Cintas
John Wamburu
Tanya Akumu
Skyler Speakman
28
11
0
05 Dec 2023
Insights into Classifying and Mitigating LLMs' Hallucinations
Alessandro Bruno
P. Mazzeo
Aladine Chetouani
M. Tliba
M. A. Kerkouri
HILM
43
10
0
14 Nov 2023
PAC-tuning: Fine-tuning Pretrained Language Models with PAC-driven Perturbed Gradient Descent
Guang-Da Liu
Zhiyu Xue
Xitong Zhang
K. Johnson
Rongrong Wang
20
5
0
26 Oct 2023
Controlled Randomness Improves the Performance of Transformer Models
Tobias Deußer
Cong Zhao
Wolfgang Krämer
David Leonhard
Christian Bauckhage
R. Sifa
21
1
0
20 Oct 2023
Rethinking the Construction of Effective Metrics for Understanding the Mechanisms of Pretrained Language Models
You Li
Jinhui Yin
Yuming Lin
23
0
0
19 Oct 2023
Merging Experts into One: Improving Computational Efficiency of Mixture of Experts
Shwai He
Run-Ze Fan
Liang Ding
Li Shen
Tianyi Zhou
Dacheng Tao
MoE
MoMe
32
14
0
15 Oct 2023
Randomized Sparse Neural Galerkin Schemes for Solving Evolution Equations with Deep Networks
Jules Berman
Benjamin Peherstorfer
26
13
0
07 Oct 2023
Neuro Symbolic Reasoning for Planning: Counterexample Guided Inductive Synthesis using Large Language Models and Satisfiability Solving
Matthias Zeller
Susmit Jha
Patrick Lincoln
Jens Behley
Alvaro Velasquez
Rickard Ewetz
C. Stachniss
LRM
17
7
0
28 Sep 2023
Investigating the Catastrophic Forgetting in Multimodal Large Language Models
Yuexiang Zhai
Shengbang Tong
Xiao Li
Mu Cai
Qing Qu
Yong Jae Lee
Y. Ma
VLM
MLLM
CLL
77
78
0
19 Sep 2023
Towards Robust Natural-Looking Mammography Lesion Synthesis on Ipsilateral Dual-Views Breast Cancer Analysis
Thanh-Huy Nguyen
Q. Kha
Thai Ngoc Toan Truong
Ba Thinh Lam
Ba-Hung Ngo
Quang Vinh Dinh
N. Le
16
13
0
07 Sep 2023
MerA: Merging Pretrained Adapters For Few-Shot Learning
Shwai He
Run-Ze Fan
Liang Ding
Li Shen
Tianyi Zhou
Dacheng Tao
MoMe
36
10
0
30 Aug 2023
Bayesian Low-rank Adaptation for Large Language Models
Adam X. Yang
Maxime Robeyns
Xi Wang
Laurence Aitchison
AI4CE
BDL
16
44
0
24 Aug 2023
Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning
Xindi Wang
Yufei Wang
Can Xu
Xiubo Geng
Bowen Zhang
Chongyang Tao
Frank Rudzicz
Robert E. Mercer
Daxin Jiang
20
11
0
28 Jul 2023
Revisiting Fine-Tuning Strategies for Self-supervised Medical Imaging Analysis
Muhammad Osama Khan
Yi Fang
27
3
0
20 Jul 2023
A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning
Zhenyi Wang
Enneng Yang
Li Shen
Heng-Chiao Huang
KELM
MU
31
47
0
16 Jul 2023
Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference
Junhao Zheng
Qianli Ma
Shengjie Qiu
Yue Wu
Peitian Ma
Junlong Liu
Hu Feng
Xichen Shang
Haibin Chen
AAML
KELM
CML
CLL
81
15
0
19 Jun 2023
Advancing Italian Biomedical Information Extraction with Transformers-based Models: Methodological Insights and Multicenter Practical Application
Claudio Crema
T. M. Buonocore
Silvia Fostinelli
Enea Parimbelli
F. Verde
...
Marco Capelli
Alfredo Costa
G. Binetti
Riccardo Bellazzi
A. Redolfi
14
5
0
08 Jun 2023
Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training
Haode Zhang
Haowen Liang
Li-Ming Zhan
Xiao-Ming Wu
Albert Y. S. Lam
VLM
16
8
0
08 Jun 2023
AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
Q. V. Liao
J. Vaughan
36
158
0
02 Jun 2023
Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models
Guande He
Jianfei Chen
Jun Zhu
33
20
0
30 May 2023
Bi-Drop: Enhancing Fine-tuning Generalization via Synchronous sub-net Estimation and Optimization
Shoujie Tong
Heming Xia
Damai Dai
Runxin Xu
Tianyu Liu
Binghuai Lin
Yunbo Cao
Zhifang Sui
17
0
0
24 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
32
81
0
19 May 2023
MLHOps: Machine Learning for Healthcare Operations
Kristoffer Larsen
Vallijah Subasri
A. Krishnan
Cláudio Tinoco Mesquita
Diana Paez
Laleh Seyyed-Kalantari
Amalia Peix
LM&MA
AI4TS
VLM
27
2
0
04 May 2023