arXiv: 2010.05010
Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor
Xinyu Wang, Yong Jiang, Zhaohui Yan, Zixia Jia, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu
10 October 2020
Papers citing "Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor" (34 papers)
Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning
Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu · 08 May 2021 · 147 citations

Automated Concatenation of Embeddings for Structured Prediction
Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu · 10 Oct 2020 · 177 citations

Second-Order Neural Dependency Parsing with Message Passing and End-to-End Training
Xinyu Wang, Kewei Tu · 10 Oct 2020 · 37 citations · tags: 3DV

More Embeddings, Better Sequence Labelers?
Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu · 17 Sep 2020 · 10 citations

AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network
Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu · 17 Sep 2020 · 3 citations · tags: BDL

Enhanced Universal Dependency Parsing with Second-Order Inference and Mixture of Training Data
Xinyu Wang, Yong Jiang, Kewei Tu · 02 Jun 2020 · 11 citations

Distilling Neural Networks for Greener and Faster Dependency Parsing
Mark Anderson, Carlos Gómez-Rodríguez · 01 Jun 2020 · 18 citations

Named Entity Recognition as Dependency Parsing
Juntao Yu, Bernd Bohnet, Massimo Poesio · 14 May 2020 · 419 citations

Efficient Second-Order TreeCRF for Neural Dependency Parsing
Yu Zhang, Zhenghua Li, Min Zhang · 03 May 2020 · 105 citations

XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
Subhabrata Mukherjee, Ahmed Hassan Awadallah · 12 Apr 2020 · 59 citations

Structure-Level Knowledge Distillation For Multilingual Sequence Labeling
Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Fei Huang, Kewei Tu · 08 Apr 2020 · 36 citations

Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov · 05 Nov 2019 · 6,593 citations

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf · 02 Oct 2019 · 7,554 citations

Small and Practical BERT Models for Sequence Labeling
Henry Tsai, Jason Riesa, Melvin Johnson, N. Arivazhagan, Xin Li, Amelia Archer · 31 Aug 2019 · 121 citations · tags: VLM

BAM! Born-Again Multi-Task Networks for Natural Language Understanding
Kevin Clark, Minh-Thang Luong, Urvashi Khandelwal, Christopher D. Manning, Quoc V. Le · 10 Jul 2019 · 230 citations

Second-Order Semantic Dependency Parsing with End-to-End Neural Networks
Xinyu Wang, Jingxian Huang, Kewei Tu · 19 Jun 2019 · 66 citations · tags: 3DV

GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling
Yanjun Liu, Fandong Meng, Jinchao Zhang, Jinan Xu, Jie Zhou · 06 Jun 2019 · 90 citations

How multilingual is Multilingual BERT?
Telmo Pires, Eva Schlinger, Dan Garrette · 04 Jun 2019 · 1,415 citations · tags: LRM, VLM

Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
Shijie Wu, Mark Dredze · 19 Apr 2019 · 681 citations · tags: VLM, SSeg

Benchmarking Approximate Inference Methods for Neural Structured Prediction
Lifu Tu, Kevin Gimpel · 01 Apr 2019 · 17 citations · tags: BDL

Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
Raphael Tang, Yao Lu, Linqing Liu, Lili Mou, Olga Vechtomova, Jimmy J. Lin · 28 Mar 2019 · 421 citations

Structured Knowledge Distillation for Dense Prediction
Yifan Liu, Chris Liu, Jingdong Wang, Zhenbo Luo · 11 Mar 2019 · 585 citations

Viable Dependency Parsing as Sequence Labeling
Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez · 27 Feb 2019 · 69 citations

Multilingual Neural Machine Translation with Knowledge Distillation
Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, Tie-Yan Liu · 27 Feb 2019 · 250 citations

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova · 11 Oct 2018 · 95,324 citations · tags: VLM, SSL, SSeg

Design Challenges and Misconceptions in Neural Sequence Labeling
Jie Yang, Shuailong Liang, Yue Zhang · 12 Jun 2018 · 164 citations

Stack-Pointer Networks for Dependency Parsing
Xuezhe Ma, Zecong Hu, J. Liu, Nanyun Peng, Graham Neubig, Eduard H. Hovy · 03 May 2018 · 167 citations · tags: GNN

Deep Biaffine Attention for Neural Dependency Parsing
Timothy Dozat, Christopher D. Manning · 06 Nov 2016 · 1,224 citations

Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser
A. Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Noah A. Smith · 24 Sep 2016 · 77 citations · tags: MoE

Enriching Word Vectors with Subword Information
Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov · 15 Jul 2016 · 9,986 citations · tags: NAI, SSL, VLM

Sequence-Level Knowledge Distillation
Yoon Kim, Alexander M. Rush · 25 Jun 2016 · 1,123 citations

End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
Xuezhe Ma, Eduard H. Hovy · 04 Mar 2016 · 2,659 citations

Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton, Oriol Vinyals, J. Dean · 09 Mar 2015 · 19,745 citations · tags: FedML

Do Deep Nets Really Need to be Deep?
Lei Jimmy Ba, R. Caruana · 21 Dec 2013 · 2,120 citations