2203.02485
Better Supervisory Signals by Observing Learning Paths
4 March 2022
Yi Ren
Shangmin Guo
Danica J. Sutherland
Papers citing
"Better Supervisory Signals by Observing Learning Paths"
15 / 15 papers shown
Title
A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs
A. S. Rawat
Veeranjaneyulu Sadhanala
Afshin Rostamizadeh
Ayan Chakrabarti
Wittawat Jitkrittum
...
Rakesh Shivanna
Sashank J. Reddi
A. Menon
Rohan Anil
Sanjiv Kumar
28
2
0
24 Oct 2024
Understanding Simplicity Bias towards Compositional Mappings via Learning Dynamics
Yi Ren
Danica J. Sutherland
CoGe
32
3
0
15 Sep 2024
How to Train the Teacher Model for Effective Knowledge Distillation
Shayan Mohajer Hamidi
Xizhen Deng
Renhao Tan
Linfeng Ye
Ahmed H. Salamah
32
2
0
25 Jul 2024
Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space
Core Francisco Park
Maya Okawa
Andrew Lee
Ekdeep Singh Lubana
Hidenori Tanaka
62
7
0
27 Jun 2024
Understanding Linear Probing then Fine-tuning Language Models from NTK Perspective
Akiyoshi Tomihari
Issei Sato
30
4
0
27 May 2024
lpNTK: Better Generalisation with Less Data via Sample Interaction During Learning
Shangmin Guo
Yi Ren
Stefano V. Albrecht
Kenny Smith
26
3
0
16 Jan 2024
Bayes Conditional Distribution Estimation for Knowledge Distillation Based on Conditional Mutual Information
Linfeng Ye
Shayan Mohajer Hamidi
Renhao Tan
En-Hui Yang
VLM
37
12
0
16 Jan 2024
AdaFlood: Adaptive Flood Regularization
Wonho Bae
Yi Ren
Mohamad Osama Ahmed
Frederick Tung
Danica J. Sutherland
Gabriel L. Oliveira
AI4CE
37
1
0
06 Nov 2023
Improving Compositional Generalization Using Iterated Learning and Simplicial Embeddings
Yi Ren
Samuel Lavoie
Mikhail Galkin
Danica J. Sutherland
Aaron Courville
35
15
0
28 Oct 2023
Do Not Blindly Imitate the Teacher: Using Perturbed Loss for Knowledge Distillation
Rongzhi Zhang
Jiaming Shen
Tianqi Liu
Jia-Ling Liu
Michael Bendersky
Marc Najork
Chao Zhang
48
18
0
08 May 2023
Learning Trajectories are Generalization Indicators
Jingwen Fu
Zhizheng Zhang
Dacheng Yin
Yan Lu
Nanning Zheng
AI4CE
28
3
0
25 Apr 2023
Differentially Private Neural Tangent Kernels for Privacy-Preserving Data Generation
Yilin Yang
Kamil Adamczewski
Danica J. Sutherland
Xiaoxiao Li
Mijung Park
33
14
0
03 Mar 2023
On student-teacher deviations in distillation: does it pay to disobey?
Vaishnavh Nagarajan
A. Menon
Srinadh Bhojanapalli
H. Mobahi
Surinder Kumar
41
9
0
30 Jan 2023
Supervision Complexity and its Role in Knowledge Distillation
Hrayr Harutyunyan
A. S. Rawat
A. Menon
Seungyeon Kim
Surinder Kumar
22
12
0
28 Jan 2023
Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation
Runfa Chen
Yu Rong
Shangmin Guo
Jiaqi Han
Fuchun Sun
Tingyang Xu
Wenbing Huang
ViT
15
20
0
15 Mar 2022