
Pruning Self-attentions into Convolutional Layers in Single Path
Papers citing "Pruning Self-attentions into Convolutional Layers in Single Path"
| Title | Authors |
|---|---|
| Sparse Networks from Scratch: Faster Training without Losing Performance | Tim Dettmers, Luke Zettlemoyer |