2109.05748
GradTS: A Gradient-Based Automatic Auxiliary Task Selection Method Based on Transformer Networks
13 September 2021
Weicheng Ma
Renze Lou
Kai Zhang
Lili Wang
Soroush Vosoughi
Papers citing
"GradTS: A Gradient-Based Automatic Auxiliary Task Selection Method Based on Transformer Networks"
5 / 5 papers shown
Title
What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
MILM
209
1,592
0
11 Jun 2019
Are Sixteen Heads Really Better than One?
Paul Michel
Omer Levy
Graham Neubig
MoE
100
1,060
0
25 May 2019
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita
David Talbot
Fedor Moiseev
Rico Sennrich
Ivan Titov
106
1,134
0
23 May 2019
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations
Soujanya Poria
Devamanyu Hazarika
Navonil Majumder
Gautam Naik
Erik Cambria
Rada Mihalcea
98
1,065
0
05 Oct 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
989
7,152
0
20 Apr 2018