ResearchTrend.AI

Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding

20 April 2019
Xiaodong Liu
Pengcheng He
Weizhu Chen
Jianfeng Gao
    FedML

Papers citing "Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding"

16 / 16 papers shown
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
Rahul Zalkikar
Kanchan Chandra
72
1
0
21 Feb 2024
Multilingual Neural Machine Translation with Knowledge Distillation
Xu Tan
Yi Ren
Di He
Tao Qin
Zhou Zhao
Tie-Yan Liu
60
248
0
27 Feb 2019
Multi-Task Deep Neural Networks for Natural Language Understanding
Xiaodong Liu
Pengcheng He
Weizhu Chen
Jianfeng Gao
AI4CE
98
1,269
0
31 Jan 2019
Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks
Jason Phang
Thibault Févry
Samuel R. Bowman
67
467
0
02 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
882
93,936
0
11 Oct 2018
Neural Approaches to Conversational AI
Jianfeng Gao
Michel Galley
Lihong Li
74
672
0
21 Sep 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
594
7,080
0
20 Apr 2018
Stochastic Answer Networks for Machine Reading Comprehension
Xiaodong Liu
Yelong Shen
Kevin Duh
Jianfeng Gao
RALM
28
198
0
10 Dec 2017
FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension
Hsin-Yuan Huang
Chenguang Zhu
Yelong Shen
Weizhu Chen
FedML
55
183
0
16 Nov 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
427
129,831
0
12 Jun 2017
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
237
10,412
0
21 Jul 2016
Net2Net: Accelerating Learning via Knowledge Transfer
Tianqi Chen
Ian Goodfellow
Jonathon Shlens
88
663
0
18 Nov 2015
Bayesian Dark Knowledge
Anoop Korattikara
Vivek Rathod
Kevin Murphy
Max Welling
BDL
UQCV
48
258
0
14 Jun 2015
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
198
19,448
0
09 Mar 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
736
149,474
0
22 Dec 2014
Natural Language Processing (almost) from Scratch
R. Collobert
Jason Weston
Léon Bottou
Michael Karlen
Koray Kavukcuoglu
Pavel P. Kuksa
121
7,711
0
02 Mar 2011