Net2Net: Accelerating Learning via Knowledge Transfer

18 November 2015
Tianqi Chen
Ian Goodfellow
Jonathon Shlens

Papers citing "Net2Net: Accelerating Learning via Knowledge Transfer"

50 / 343 papers shown
Curriculum-Guided Layer Scaling for Language Model Pretraining
Karanpartap Singh
Neil Band
Ehsan Adeli
ALM LRM
41
0
0
13 Jun 2025
TaskVAE: Task-Specific Variational Autoencoders for Exemplar Generation in Continual Learning for Human Activity Recognition
B. Kann
Sandra Castellanos-Paez
Romain Rombourg
P. Lalanda
CLL VLM
34
0
0
10 May 2025
FedADP: Unified Model Aggregation for Federated Learning with Heterogeneous Model Architectures
Jiacheng Wang
Hongtao Lv
Lei Liu
FedML
54
0
0
10 May 2025
Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws
Xiyuan Wei
Ming Lin
Fanjiang Ye
Fengguang Song
Liangliang Cao
My T. Thai
Tianbao Yang
LLMSV
110
0
0
10 May 2025
A Framework for Elastic Adaptation of User Multiple Intents in Sequential Recommendation
Zhikai Wang
Yanyan Shen
AI4TS
211
0
0
30 Apr 2025
A multilevel approach to accelerate the training of Transformers
Guillaume Lauga
Maël Chaumette
Edgar Desainte-Maréville
Étienne Lasalle
Arthur Lebeurrier
AI4CE
127
0
0
24 Apr 2025
Noisy Deep Ensemble: Accelerating Deep Ensemble Learning via Noise Injection
Shunsuke Sakai
Shunsuke Tsuge
Tatsuhito Hasegawa
44
0
0
08 Apr 2025
Les Dissonances: Cross-Tool Harvesting and Polluting in Multi-Tool Empowered LLM Agents
Zichuan Li
Jian Cui
Xiaojing Liao
Luyi Xing
LLMAG
63
0
0
04 Apr 2025
SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs
Jaewoo Song
Fangzhen Lin
MQ
102
0
0
07 Mar 2025
LESA: Learnable LLM Layer Scaling-Up
Yifei Yang
Zouying Cao
Xinbei Ma
Yao Yao
L. Qin
Zhongfu Chen
Hai Zhao
179
0
0
20 Feb 2025
Learning Morphisms with Gauss-Newton Approximation for Growing Networks
Neal Lawton
Aram Galstyan
Greg Ver Steeg
51
0
0
07 Nov 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
107
2
0
30 Oct 2024
Towards a More Complete Theory of Function Preserving Transforms
Michael Painter
27
0
0
14 Oct 2024
Neural Metamorphosis
Xingyi Yang
Xinchao Wang
77
2
0
10 Oct 2024
Growing Efficient Accurate and Robust Neural Networks on the Edge
Vignesh Sundaresha
Naresh Shanbhag
79
0
0
10 Oct 2024
Gap Preserving Distillation by Building Bidirectional Mappings with A Dynamic Teacher
Yong Guo
Shulian Zhang
Haolin Pan
Jing Liu
Yulun Zhang
Jian Chen
87
0
0
05 Oct 2024
On the Inductive Bias of Stacking Towards Improving Reasoning
Nikunj Saunshi
Stefani Karp
Shankar Krishnan
Sobhan Miryoosefi
Sashank J. Reddi
Sanjiv Kumar
LRM AI4CE
90
7
0
27 Sep 2024
A Survey on Neural Architecture Search Based on Reinforcement Learning
Wenzhu Shao
49
0
0
26 Sep 2024
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
Mohammad Samragh
Iman Mirzadeh
Keivan Alizadeh Vahid
Fartash Faghri
Minsik Cho
Moin Nabi
Devang Naik
Mehrdad Farajtabar
LRM AI4CE
104
9
0
19 Sep 2024
Efficient Training of Large Vision Models via Advanced Automated Progressive Learning
Changlin Li
Jiawei Zhang
Sihao Lin
Zongxin Yang
Junwei Liang
Xiaodan Liang
Xiaojun Chang
VLM
70
0
0
06 Sep 2024
Growing Deep Neural Network Considering with Similarity between Neurons
Taigo Sakai
Kazuhiro Hotta
66
0
0
23 Aug 2024
DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs
Zhen Tan
Daize Dong
Xinyu Zhao
Jie Peng
Yu Cheng
Tianlong Chen
MoE
91
4
0
03 Jul 2024
52B to 1T: Lessons Learned via Tele-FLM Series
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Chao Wang
...
Yequan Wang
Zhongjiang He
Zhongyuan Wang
Xuelong Li
Tiejun Huang
ALM LRM
94
3
0
03 Jul 2024
Federating to Grow Transformers with Constrained Resources without Model Sharing
Shikun Shen
Yifei Zou
Yuan Yuan
Yanwei Zheng
Peng Li
Xiuzhen Cheng
Dongxiao Yu
79
0
0
19 Jun 2024
Towards Lifelong Learning of Large Language Models: A Survey
Junhao Zheng
Shengjie Qiu
Chengming Shi
Qianli Ma
KELM CLL
83
28
0
10 Jun 2024
Landscape-Aware Growing: The Power of a Little LAG
Stefani Karp
Nikunj Saunshi
Sobhan Miryoosefi
Sashank J. Reddi
Sanjiv Kumar
80
1
0
04 Jun 2024
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Yilong Chen
Linhao Zhang
Junyuan Shang
Zhenyu Zhang
Tingwen Liu
Shuohuan Wang
Yu Sun
69
1
0
03 Jun 2024
Growing Tiny Networks: Spotting Expressivity Bottlenecks and Fixing Them Optimally
Manon Verbockhaven
Sylvain Chevallier
Guillaume Charpiat
71
4
0
30 May 2024
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
Wenyu Du
Tongxu Luo
Zihan Qiu
Zeyu Huang
Songlin Yang
Reynold Cheng
Yike Guo
Jie Fu
82
15
0
24 May 2024
Manifold Metric: A Loss Landscape Approach for Predicting Model Performance
Pranshu Malviya
Jerry Huang
A. Baratin
Quentin Fournier
Sarath Chandar
81
0
0
24 May 2024
Text-to-Model: Text-Conditioned Neural Network Diffusion for Train-Once-for-All Personalization
Zexi Li
Lingzhi Gao
Chao Wu
AI4CE DiffM
131
4
0
23 May 2024
EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training
Yulin Wang
Yang Yue
Rui Lu
Yizeng Han
Shiji Song
Gao Huang
VLM
114
12
0
14 May 2024
A survey of dynamic graph neural networks
Yanping Zheng
Lu Yi
Zhewei Wei
AI4TS AI4CE
63
13
0
28 Apr 2024
Impedance Matching: Enabling an RL-Based Running Jump in a Quadruped Robot
Neil Guan
Shangqun Yu
Shifan Zhu
Donghyun Kim
92
0
0
23 Apr 2024
FedTrans: Efficient Federated Learning via Multi-Model Transformation
Yuxuan Zhu
Jiachen Liu
Mosharaf Chowdhury
Fan Lai
89
0
0
21 Apr 2024
A Multi-Level Framework for Accelerating Training Transformer Models
Longwei Zou
Han Zhang
Yangdong Deng
AI4CE
72
1
0
07 Apr 2024
Continual Few-shot Event Detection via Hierarchical Augmentation Networks
Chenlong Zhang
Pengfei Cao
Yubo Chen
Kang Liu
Qing Cui
Mengshu Sun
Jun Zhao
73
3
0
26 Mar 2024
Self-generated Replay Memories for Continual Neural Machine Translation
Michele Resta
Davide Bacciu
CLL
74
4
0
19 Mar 2024
ECToNAS: Evolutionary Cross-Topology Neural Architecture Search
Elisabeth J. Schiessler
R. Aydin
C. Cyron
68
0
0
08 Mar 2024
Class-incremental Learning for Time Series: Benchmark and Evaluation
Zhongzheng Qiao
Quang Pham
Zhen Cao
Hoang H Le
Ponnuthurai Nagaratnam Suganthan
Xudong Jiang
Savitha Ramasamy
101
10
0
19 Feb 2024
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T
Arnav Chavan
Deepak Gupta
MDE
52
0
0
19 Feb 2024
Enabling Multi-Agent Transfer Reinforcement Learning via Scenario Independent Representation
Ayesha Siddika Nipu
Siming Liu
Anthony Harris
56
2
0
13 Feb 2024
Efficient Stagewise Pretraining via Progressive Subnetworks
Abhishek Panigrahi
Nikunj Saunshi
Kaifeng Lyu
Sobhan Miryoosefi
Sashank J. Reddi
Satyen Kale
Sanjiv Kumar
65
6
0
08 Feb 2024
Enhancing Human Experience in Human-Agent Collaboration: A Human-Centered Modeling Approach Based on Positive Human Gain
Yiming Gao
Feiyu Liu
Liang Wang
Zhenjie Lian
Dehua Zheng
...
Jing Dai
Qiang Fu
Wei Yang
Lanxiao Huang
Wei Liu
82
1
0
28 Jan 2024
Preparing Lessons for Progressive Training on Language Models
Yu Pan
Ye Yuan
Yichun Yin
Jiaxin Shi
Zenglin Xu
Ming Zhang
Lifeng Shang
Xin Jiang
Qun Liu
70
9
0
17 Jan 2024
When To Grow? A Fitting Risk-Aware Policy for Layer Growing in Deep Neural Networks
Haihang Wu
Wei Wang
T. Malepathirana
Damith A. Senanayake
D. Oetomo
Saman K. Halgamuge
58
2
0
06 Jan 2024
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
Haoyuan Wu
Haisheng Zheng
Zhuolun He
Bei Yu
MoE ALM
104
16
0
05 Jan 2024
ElasticTrainer: Speeding Up On-Device Training with Runtime Elastic Tensor Selection
Kai Huang
Boyuan Yang
Wei Gao
93
21
0
21 Dec 2023
Initializing Models with Larger Ones
Zhiqiu Xu
Yanjie Chen
Kirill Vishniakov
Yida Yin
Zhiqiang Shen
Trevor Darrell
Lingjie Liu
Zhuang Liu
93
21
0
30 Nov 2023
SensLI: Sensitivity-Based Layer Insertion for Neural Networks
Evelyn Herberg
Roland A. Herzog
Frederik Köhne
Leonie Kreis
Anton Schiela
50
0
0
27 Nov 2023