ResearchTrend.AI
Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor

10 October 2020
Xinyu Wang, Yong Jiang, Zhaohui Yan, Zixia Jia, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

Papers citing "Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor"

34 / 34 papers shown

Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning
  Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu · 08 May 2021 · 75 / 147 / 0

Automated Concatenation of Embeddings for Structured Prediction
  Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu · 10 Oct 2020 · 93 / 177 / 0

Second-Order Neural Dependency Parsing with Message Passing and End-to-End Training
  Xinyu Wang, Kewei Tu · [3DV] · 10 Oct 2020 · 83 / 37 / 0

More Embeddings, Better Sequence Labelers?
  Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu · 17 Sep 2020 · 47 / 10 / 0

AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network
  Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu · [BDL] · 17 Sep 2020 · 41 / 3 / 0

Enhanced Universal Dependency Parsing with Second-Order Inference and Mixture of Training Data
  Xinyu Wang, Yong Jiang, Kewei Tu · 02 Jun 2020 · 63 / 11 / 0

Distilling Neural Networks for Greener and Faster Dependency Parsing
  Mark Anderson, Carlos Gómez-Rodríguez · 01 Jun 2020 · 44 / 18 / 0

Named Entity Recognition as Dependency Parsing
  Juntao Yu, Bernd Bohnet, Massimo Poesio · 14 May 2020 · 76 / 419 / 0

Efficient Second-Order TreeCRF for Neural Dependency Parsing
  Yu Zhang, Zhenghua Li, Min Zhang · 03 May 2020 · 52 / 105 / 0

XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
  Subhabrata Mukherjee, Ahmed Hassan Awadallah · 12 Apr 2020 · 67 / 59 / 0

Structure-Level Knowledge Distillation For Multilingual Sequence Labeling
  Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Fei Huang, Kewei Tu · 08 Apr 2020 · 84 / 36 / 0

Unsupervised Cross-lingual Representation Learning at Scale
  Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov · 05 Nov 2019 · 228 / 6,593 / 0

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf · 02 Oct 2019 · 269 / 7,554 / 0

Small and Practical BERT Models for Sequence Labeling
  Henry Tsai, Jason Riesa, Melvin Johnson, N. Arivazhagan, Xin Li, Amelia Archer · [VLM] · 31 Aug 2019 · 74 / 121 / 0

BAM! Born-Again Multi-Task Networks for Natural Language Understanding
  Kevin Clark, Minh-Thang Luong, Urvashi Khandelwal, Christopher D. Manning, Quoc V. Le · 10 Jul 2019 · 72 / 230 / 0

Second-Order Semantic Dependency Parsing with End-to-End Neural Networks
  Xinyu Wang, Jingxian Huang, Kewei Tu · [3DV] · 19 Jun 2019 · 51 / 66 / 0

GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling
  Yanjun Liu, Fandong Meng, Jinchao Zhang, Jinan Xu, Jie Zhou · 06 Jun 2019 · 56 / 90 / 0

How multilingual is Multilingual BERT?
  Telmo Pires, Eva Schlinger, Dan Garrette · [LRM, VLM] · 04 Jun 2019 · 164 / 1,415 / 0

Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
  Shijie Wu, Mark Dredze · [VLM, SSeg] · 19 Apr 2019 · 114 / 681 / 0

Benchmarking Approximate Inference Methods for Neural Structured Prediction
  Lifu Tu, Kevin Gimpel · [BDL] · 01 Apr 2019 · 85 / 17 / 0

Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
  Raphael Tang, Yao Lu, Linqing Liu, Lili Mou, Olga Vechtomova, Jimmy J. Lin · 28 Mar 2019 · 75 / 421 / 0

Structured Knowledge Distillation for Dense Prediction
  Yifan Liu, Chris Liu, Jingdong Wang, Zhenbo Luo · 11 Mar 2019 · 104 / 585 / 0

Viable Dependency Parsing as Sequence Labeling
  Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez · 27 Feb 2019 · 63 / 69 / 0

Multilingual Neural Machine Translation with Knowledge Distillation
  Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, Tie-Yan Liu · 27 Feb 2019 · 98 / 250 / 0

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova · [VLM, SSL, SSeg] · 11 Oct 2018 · 1.8K / 95,324 / 0

Design Challenges and Misconceptions in Neural Sequence Labeling
  Jie Yang, Shuailong Liang, Yue Zhang · 12 Jun 2018 · 177 / 164 / 0

Stack-Pointer Networks for Dependency Parsing
  Xuezhe Ma, Zecong Hu, J. Liu, Nanyun Peng, Graham Neubig, Eduard H. Hovy · [GNN] · 03 May 2018 · 83 / 167 / 0

Deep Biaffine Attention for Neural Dependency Parsing
  Timothy Dozat, Christopher D. Manning · 06 Nov 2016 · 116 / 1,224 / 0

Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser
  A. Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Noah A. Smith · [MoE] · 24 Sep 2016 · 84 / 77 / 0

Enriching Word Vectors with Subword Information
  Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov · [NAI, SSL, VLM] · 15 Jul 2016 · 234 / 9,986 / 0

Sequence-Level Knowledge Distillation
  Yoon Kim, Alexander M. Rush · 25 Jun 2016 · 132 / 1,123 / 0

End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
  Xuezhe Ma, Eduard H. Hovy · 04 Mar 2016 · 120 / 2,659 / 0

Distilling the Knowledge in a Neural Network
  Geoffrey E. Hinton, Oriol Vinyals, J. Dean · [FedML] · 09 Mar 2015 · 367 / 19,745 / 0

Do Deep Nets Really Need to be Deep?
  Lei Jimmy Ba, R. Caruana · 21 Dec 2013 · 188 / 2,120 / 0