MoTE: Mixture of Task-specific Experts for Pre-Trained Model-Based Class-incremental Learning

Main: 29 pages
12 figures
Bibliography: 7 pages
7 tables
Abstract

Class-incremental learning (CIL) requires deep learning models to continuously acquire new knowledge from streaming data while preserving previously learned information. Recently, CIL based on pre-trained models (PTMs) has achieved remarkable success. However, prompt-based approaches suffer from prompt overwriting, while adapter-based methods face challenges such as dimensional misalignment between tasks. While the idea of expert fusion in Mixture of Experts (MoE) can help address dimensional inconsistency, both expert and routing parameters are prone to being overwritten in dynamic environments, making MoE challenging to apply directly in CIL. To tackle these issues, we propose a mixture of task-specific experts (MoTE) framework that effectively mitigates the miscalibration caused by inconsistent output dimensions across tasks. Inspired by the weighted feature fusion and sparse activation mechanisms in MoE, we introduce task-aware expert filtering and reliable expert joint inference during the inference phase, mimicking the behavior of routing layers without inducing catastrophic forgetting. Extensive experiments demonstrate the superiority of our method without requiring an exemplar set. Furthermore, the number of adapters in MoTE scales linearly with the number of tasks. Building on this, we further explore the trade-off between adapter expansion and model performance and propose the Adapter-Limited MoTE. The code is available at this https URL.
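
The inference-time mechanism described in the abstract (task-aware expert filtering followed by reliable-expert joint inference) can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the authors' implementation: the backbone/adapter/head interfaces, the max-softmax confidence score, the threshold tau, and the shared label space across heads are all hypothetical choices, used only to show how per-task adapter experts could be filtered and fused at inference without a learned (and thus overwritable) router.

import torch
import torch.nn.functional as F

@torch.no_grad()
def mote_inference(backbone, experts, heads, x, tau=0.7):
    """Hypothetical MoTE-style inference over task-specific adapter experts.

    backbone : frozen pre-trained feature extractor (shared by all tasks)
    experts  : list of task-specific adapters, one per task seen so far
    heads    : matching classifier heads; for this sketch each head is assumed
               to map into the same global label space so logits can be summed
    x        : input batch of shape (B, ...)
    tau      : illustrative confidence threshold for expert filtering
    """
    z = backbone(x)                                   # shared frozen features
    per_expert_logits, confidences = [], []
    for adapter, head in zip(experts, heads):
        logits = head(adapter(z))                     # task-specific features -> logits
        prob = F.softmax(logits, dim=-1)
        confidences.append(prob.max(dim=-1).values)   # per-sample expert confidence
        per_expert_logits.append(logits)

    conf = torch.stack(confidences, dim=0)            # (num_experts, B)
    # Task-aware filtering: keep only "reliable" experts per sample;
    # fall back to all experts for samples where none pass the threshold.
    mask = (conf >= tau).float()
    mask = torch.where(mask.sum(0, keepdim=True) > 0, mask, torch.ones_like(mask))
    weights = mask * conf
    weights = weights / weights.sum(dim=0, keepdim=True)

    # Reliable-expert joint inference: confidence-weighted fusion of the
    # surviving experts' outputs.
    fused = sum(w.unsqueeze(-1) * l for w, l in zip(weights, per_expert_logits))
    return fused.argmax(dim=-1)

In this sketch the thresholding step plays the role of MoE's sparse activation and the confidence-weighted sum plays the role of the routing layer's weighted feature fusion, which is the behavior the abstract attributes to MoTE at inference time.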

@article{li2025_2506.11038,
  title={MoTE: Mixture of Task-specific Experts for Pre-Trained Model-Based Class-incremental Learning},
  author={Linjie Li and Zhenyu Wu and Yang Ji},
  journal={arXiv preprint arXiv:2506.11038},
  year={2025}
}