Self-Knowledge Distillation in Natural Language Processing
Sangchul Hahn, Heeyoul Choi
2 August 2019 · arXiv:1908.01851

Papers citing "Self-Knowledge Distillation in Natural Language Processing"

50 of 61 citing papers shown.
Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models
Junjie Yang, Junhao Song, Xudong Han, Ziqian Bi, Tianyang Wang, ..., Yujie Zhang, Qian Niu, Benji Peng, Keyu Chen, Ming Liu
18 Apr 2025 · VLM

Not All LoRA Parameters Are Essential: Insights on Inference Necessity
Guanhua Chen, Yutong Yao, Ci-Jun Gao, Lidia S. Chao, Feng Wan, Derek F. Wong
30 Mar 2025

CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems
Rui Liu, Yu-cui Shen, Peng Gao, Pratap Tokekar, Ming C. Lin
25 Feb 2025

The Effect of Optimal Self-Distillation in Noisy Gaussian Mixture Model
Kaito Takanami, Takashi Takahashi, Ayaka Sakata
27 Jan 2025

Dynamic Self-Distillation via Previous Mini-batches for Fine-tuning Small Language Models
Y. Fu, Yin Yu, Xiaotian Han, Runchao Li, Xianxuan Long, Haotian Yu, Pan Li
25 Nov 2024 · SyDa

SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning
Shivam Adarsh, Kumar Shridhar, Caglar Gulcehre, Nicholas Monath, Mrinmaya Sachan
24 Oct 2024 · LRM

Collaborative Knowledge Distillation via a Learning-by-Education Node Community
Anestis Kaimakamidis, Ioannis Mademlis, Ioannis Pitas
30 Sep 2024

Mitigating the Negative Impact of Over-association for Conversational Query Production
Ante Wang, Linfeng Song, Zijun Min, Ge Xu, Xiaoli Wang, Junfeng Yao, Jinsong Su
29 Sep 2024

Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
Aviv Bick, Kevin Y. Li, Eric P. Xing, J. Zico Kolter, Albert Gu
19 Aug 2024 · Mamba

Tackling Noisy Clients in Federated Learning with End-to-end Label Correction
Xuefeng Jiang, Sheng Sun, Jia Li, Jingjing Xue, Runhan Li, Zhiyuan Wu, Gang Xu, Yuwei Wang, Min Liu
08 Aug 2024 · FedML

Instance Temperature Knowledge Distillation
Zhengbo Zhang, Yuxi Zhou, Jia Gong, Jun Liu, Zhigang Tu
27 Jun 2024

Decoupled Alignment for Robust Plug-and-Play Adaptation
Haozheng Luo, Jiahao Yu, Wenxin Zhang, Jialong Li, Jerry Yao-Chieh Hu, Xingyu Xing, Han Liu
03 Jun 2024

Beyond MOS: Subjective Image Quality Score Preprocessing Method Based on Perceptual Similarity
Lei Wang, Desen Yuan
30 Apr 2024

CTSM: Combining Trait and State Emotions for Empathetic Response Model
Yufeng Wang, Chao Chen, Zhou Yang, Shuhui Wang, Xiangwen Liao
22 Mar 2024

Non-Exchangeable Conformal Language Generation with Nearest Neighbors
Dennis Ulmer, Chrysoula Zerva, André F. T. Martins
01 Feb 2024

Learning with Noisy Low-Cost MOS for Image Quality Assessment via Dual-Bias Calibration
Lei Wang, Qingbo Wu, Desen Yuan, K. Ngan, Hongliang Li, Fanman Meng, Linfeng Xu
27 Nov 2023

ViPE: Visualise Pretty-much Everything
Hassan Shahmohammadi, Adhiraj Ghosh, Hendrik P. A. Lensch
16 Oct 2023 · DiffM

Data Upcycling Knowledge Distillation for Image Super-Resolution
Yun-feng Zhang, Wei Li, Simiao Li, Hanting Chen, Zhaopeng Tu, Wenjun Wang, Bingyi Jing, Hai-lin Wang, Jie Hu
25 Sep 2023

Teacher-Student Architecture for Knowledge Distillation: A Survey
Chengming Hu, Xuan Li, Danyang Liu, Haolun Wu, Xi Chen, Ju Wang, Xue Liu
08 Aug 2023

Incorporating Graph Information in Transformer-based AMR Parsing
Pavlo Vasylenko, Pere-Lluís Huguet Cabot, Abelardo Carlos Martínez Lorenzo, Roberto Navigli
23 Jun 2023

UADB: Unsupervised Anomaly Detection Booster
Hangting Ye, Zhining Liu, Xinyi Shen, Wei Cao, Shun Zheng, Xiaofan Gui, Huishuai Zhang, Yi Chang, Jiang Bian
03 Jun 2023

Distilling Robustness into Natural Language Inference Models with Domain-Targeted Augmentation
Joe Stacey, Marek Rei
22 May 2023

Pseudo-Label Training and Model Inertia in Neural Machine Translation
B. Hsu, Anna Currey, Xing Niu, Maria Nădejde, Georgiana Dinu
19 May 2023 · ODL

Heterogeneous-Branch Collaborative Learning for Dialogue Generation
Yiwei Li, Shaoxiong Feng, Bin Sun, Kan Li
21 Mar 2023

Improving Video Retrieval by Adaptive Margin
Feng He, Qi Wang, Zhifan Feng, Wenbin Jiang, Yajuan Lü, Yong Zhu, Xiao Tan
09 Mar 2023

Topics in Contextualised Attention Embeddings
Mozhgan Talebpour, A. G. S. D. Herrera, Shoaib Jameel
11 Jan 2023

Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
Filip Radenovic, Abhimanyu Dubey, Abhishek Kadian, Todor Mihaylov, Simon Vandenhende, Yash J. Patel, Y. Wen, Vignesh Ramanathan, D. Mahajan
05 Jan 2023 · VLM

Adaptive Contrastive Learning on Multimodal Transformer for Review Helpfulness Predictions
Thong Nguyen, Xiaobao Wu, Anh Tuan Luu, Cong-Duy Nguyen, Zhen Hai, Lidong Bing
07 Nov 2022

Teacher-Student Architecture for Knowledge Learning: A Survey
Chengming Hu, Xuan Li, Dan Liu, Xi Chen, Ju Wang, Xue Liu
28 Oct 2022

A Novel Self-Knowledge Distillation Approach with Siamese Representation Learning for Action Recognition
Duc-Quang Vu, T. Phung, Jia-Ching Wang
03 Sep 2022

Towards Federated Learning against Noisy Labels via Local Self-Regularization
Xue Jiang, Sheng Sun, Yuwei Wang, Min Liu
25 Aug 2022

PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation
Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao
22 Aug 2022 · VLM, CLL

Label Semantic Knowledge Distillation for Unbiased Scene Graph Generation
Lin Li, Long Chen, Hanrong Shi, Wenxiao Wang, Jian Shao, Yi Yang, Jun Xiao
07 Aug 2022 · VLM

Multi-Faceted Distillation of Base-Novel Commonality for Few-shot Object Detection
Shuang Wu, Wenjie Pei, Dianwen Mei, Fanglin Chen, Jiandong Tian, Guangming Lu
22 Jul 2022 · VLM, ObjD

End-to-end Spoken Conversational Question Answering: Task, Dataset and Model
Chenyu You, Nuo Chen, Fenglin Liu, Shen Ge, Xian Wu, Yuexian Zou
29 Apr 2022 · AuLLM

Robust Cross-Modal Representation Learning with Progressive Self-Distillation
A. Andonian, Shixing Chen, Raffay Hamid
10 Apr 2022 · VLM

Adaptive Mixing of Auxiliary Losses in Supervised Learning
D. Sivasubramanian, Ayush Maheshwari, Pradeep Shenoy, A. Prathosh, Ganesh Ramakrishnan
07 Feb 2022

Adaptive Image Inpainting
Maitreya Suin, Kuldeep Purohit, A. N. Rajagopalan
01 Jan 2022

Conditional Generative Data-free Knowledge Distillation
Xinyi Yu, Ling Yan, Yang Yang, Libo Zhou, Linlin Ou
31 Dec 2021

Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis
Juhua Liu, Qihuang Zhong, Liang Ding, Hua Jin, Bo Du, Dacheng Tao
26 Oct 2021

Language Modelling via Learning to Rank
A. Frydenlund, Gagandeep Singh, Frank Rudzicz
13 Oct 2021

Improving Question Answering Performance Using Knowledge Distillation and Active Learning
Yasaman Boreshban, Seyed Morteza Mirbostani, Gholamreza Ghassem-Sani, Seyed Abolghasem Mirroshandel, Shahin Amiriparian
26 Sep 2021

Adversarial Training with Contrastive Learning in NLP
Daniela N. Rim, DongNyeong Heo, Heeyoul Choi
19 Sep 2021 · AAML

Cross-Lingual Text Classification of Transliterated Hindi and Malayalam
Jitin Krishnan, Antonios Anastasopoulos, Hemant Purohit, Huzefa Rangwala
31 Aug 2021

Learning from Matured Dumb Teacher for Fine Generalization
Heeseung Jung, Kangil Kim, Hoyong Kim, Jong-Hun Shin
12 Aug 2021

Linking Common Vulnerabilities and Exposures to the MITRE ATT&CK Framework: A Self-Distillation Approach
Benjamin Ampel, Sagar Samtani, Steven Ullman, Hsinchun Chen
03 Aug 2021

Learning ULMFiT and Self-Distillation with Calibration for Medical Dialogue System
Shuang Ao, Xeno Acharya
20 Jul 2021

Confidence Conditioned Knowledge Distillation
Sourav Mishra, Suresh Sundaram
06 Jul 2021

Mixed Cross Entropy Loss for Neural Machine Translation
Haoran Li, Wei Lu
30 Jun 2021

R-Drop: Regularized Dropout for Neural Networks
Xiaobo Liang, Lijun Wu, Juntao Li, Yue Wang, Qi Meng, Tao Qin, Wei Chen, Hao Fei, Tie-Yan Liu
28 Jun 2021