ResearchTrend.AI
An Empirical Investigation of the Role of Pre-training in Lifelong Learning

16 December 2021
Sanket Vaibhav Mehta
Darshan Patil
Sarath Chandar
Emma Strubell
    CLL

Papers citing "An Empirical Investigation of the Role of Pre-training in Lifelong Learning"

33 / 33 papers shown
Title
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
61
0
0
02 Apr 2025
Achieving Upper Bound Accuracy of Joint Training in Continual Learning
Saleh Momeni
Bing Liu
CLL
84
1
0
17 Feb 2025
OverThink: Slowdown Attacks on Reasoning LLMs
A. Kumar
Jaechul Roh
A. Naseh
Marzena Karpinska
Mohit Iyyer
Amir Houmansadr
Eugene Bagdasarian
LRM
64
14
0
04 Feb 2025
Generate to Discriminate: Expert Routing for Continual Learning
Yewon Byun
Sanket Vaibhav Mehta
Saurabh Garg
Emma Strubell
Michael Oberst
Bryan Wilder
Zachary Chase Lipton
78
0
0
31 Dec 2024
Buffer-based Gradient Projection for Continual Federated Learning
Shenghong Dai
Jy-yong Sohn
Yicong Chen
S. Alam
Ravikumar Balakrishnan
Suman Banerjee
N. Himayat
Kangwook Lee
FedML
75
2
0
03 Sep 2024
An Investigation of Warning Erroneous Chat Translations in Cross-lingual Communication
Yunmeng Li
Jun Suzuki
Makoto Morishita
Kaori Abe
Kentaro Inui
65
1
0
28 Aug 2024
HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning
Liyuan Wang
Jingyi Xie
Xingxing Zhang
Hang Su
Jun Zhu
CLL
47
5
0
07 Jul 2024
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Adam Ibrahim
Benjamin Thérien
Kshitij Gupta
Mats L. Richter
Quentin Anthony
Timothée Lesort
Eugene Belilovsky
Irina Rish
KELM
CLL
44
52
0
13 Mar 2024
Investigating Continual Pretraining in Large Language Models: Insights and Implications
Çağatay Yıldız
Nishaanth Kanna Ravichandran
Prishruit Punia
Matthias Bethge
Beyza Ermis
CLL
KELM
LRM
58
25
0
27 Feb 2024
Towards a General Framework for Continual Learning with Pre-training
Liyuan Wang
Jingyi Xie
Xingxing Zhang
Hang Su
Jun Zhu
CLL
29
3
0
21 Oct 2023
Continual Generalized Intent Discovery: Marching Towards Dynamic and Open-world Intent Recognition
Xiaoshuai Song
Yutao Mou
Keqing He
Yueyan Qiu
Pei Wang
Weiran Xu
25
2
0
16 Oct 2023
Lookbehind-SAM: k steps back, 1 step forward
Gonçalo Mordido
Pranshu Malviya
A. Baratin
Sarath Chandar
AAML
45
1
0
31 Jul 2023
Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency
Lingfeng Shen
Weiting Tan
Boyuan Zheng
Daniel Khashabi
VLM
39
6
0
18 May 2023
Remind of the Past: Incremental Learning with Analogical Prompts
Zhiheng Ma
Xiaopeng Hong
Beinan Liu
Yabin Wang
Pinyue Guo
Huiyun Li
CLL
34
1
0
24 Mar 2023
Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study
Mingxu Tao
Yansong Feng
Dongyan Zhao
CLL
KELM
32
10
0
02 Mar 2023
Lightweight Transformers for Clinical Natural Language Processing
Omid Rohanian
Mohammadmahdi Nouriborji
Hannah Jauncey
Samaneh Kouchaki
ISARIC Clinical Characterisation Group
Lei A. Clifton
L. Merson
David A. Clifton
MedIm
LM&MA
21
12
0
09 Feb 2023
DSI++: Updating Transformer Memory with New Documents
Sanket Vaibhav Mehta
Jai Gupta
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
J. Rao
Marc Najork
Emma Strubell
Donald Metzler
CLL
32
39
0
19 Dec 2022
PIVOT: Prompting for Video Continual Learning
Andrés Villa
Juan Carlos León Alcázar
Motasem Alfarra
Kumail Alhamoud
J. Hurtado
Fabian Caba Heilbron
Alvaro Soto
Guohao Li
VLM
CLL
40
45
0
09 Dec 2022
SQuAT: Sharpness- and Quantization-Aware Training for BERT
Zheng Wang
Juncheng Billy Li
Shuhui Qu
Florian Metze
Emma Strubell
MQ
24
7
0
13 Oct 2022
Schedule-Robust Online Continual Learning
Ruohan Wang
Marco Ciccone
Giulia Luise
A. Yapp
Massimiliano Pontil
C. Ciliberto
CLL
32
4
0
11 Oct 2022
Causes of Catastrophic Forgetting in Class-Incremental Semantic Segmentation
Tobias Kalb
Jürgen Beyerer
CLL
35
8
0
16 Sep 2022
Progressive Latent Replay for efficient Generative Rehearsal
Stanislaw Pawlak
Filip Szatkowski
Michal Bortkiewicz
Jan Dubiński
Tomasz Trzciński
24
2
0
04 Jul 2022
Transfer without Forgetting
Matteo Boschini
Lorenzo Bonicelli
Angelo Porrello
Giovanni Bellitto
M. Pennisi
S. Palazzo
C. Spampinato
Simone Calderara
CLL
22
46
0
01 Jun 2022
The Effect of Task Ordering in Continual Learning
Samuel J. Bell
Neil D. Lawrence
CLL
48
17
0
26 May 2022
Fine-tuned Language Models are Continual Learners
Thomas Scialom
Tuhin Chakrabarty
Smaranda Muresan
CLL
LRM
145
117
0
24 May 2022
Simpler is Better: off-the-shelf Continual Learning Through Pretrained Backbones
Francesco Pelosin
VLM
16
11
0
03 May 2022
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning
Zifeng Wang
Zizhao Zhang
Sayna Ebrahimi
Ruoxi Sun
Han Zhang
...
Xiaoqi Ren
Guolong Su
Vincent Perot
Jennifer Dy
Tomas Pfister
CLL
VLM
VPVLM
36
455
0
10 Apr 2022
Adversarial Continual Learning
Sayna Ebrahimi
Franziska Meier
Roberto Calandra
Trevor Darrell
Marcus Rohrbach
CLL
VLM
152
198
0
21 Mar 2020
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao
Yasaman Bahri
Jascha Narain Sohl-Dickstein
S. Schoenholz
Jeffrey Pennington
227
348
0
14 Jun 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
353
11,684
0
09 Mar 2017
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
308
2,890
0
15 Sep 2016
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
278
31,267
0
16 Jan 2013