ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2109.03939
  4. Cited By
What's Hidden in a One-layer Randomly Weighted Transformer?

What's Hidden in a One-layer Randomly Weighted Transformer?

8 September 2021
Sheng Shen
Z. Yao
Douwe Kiela
Kurt Keutzer
Michael W. Mahoney
ArXivPDFHTML

Papers citing "What's Hidden in a One-layer Randomly Weighted Transformer?"

9 / 9 papers shown
Title
Linguistic Blind Spots of Large Language Models
Linguistic Blind Spots of Large Language Models
Jiali Cheng
Hadi Amiri
52
1
0
25 Mar 2025
LEAP: Learnable Pruning for Transformer-based Models
LEAP: Learnable Pruning for Transformer-based Models
Z. Yao
Xiaoxia Wu
Linjian Ma
Sheng Shen
Kurt Keutzer
Michael W. Mahoney
Yuxiong He
20
7
0
30 May 2021
Playing Lottery Tickets with Vision and Language
Playing Lottery Tickets with Vision and Language
Zhe Gan
Yen-Chun Chen
Linjie Li
Tianlong Chen
Yu Cheng
Shuohang Wang
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
103
54
0
23 Apr 2021
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
153
345
0
23 Jul 2020
Comparing Rewinding and Fine-tuning in Neural Network Pruning
Comparing Rewinding and Fine-tuning in Neural Network Pruning
Alex Renda
Jonathan Frankle
Michael Carbin
224
383
0
05 Mar 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using
  Model Parallelism
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,821
0
17 Sep 2019
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
230
575
0
12 Sep 2019
Classical Structured Prediction Losses for Sequence to Sequence Learning
Classical Structured Prediction Losses for Sequence to Sequence Learning
Sergey Edunov
Myle Ott
Michael Auli
David Grangier
MarcÁurelio Ranzato
AIMat
53
185
0
14 Nov 2017
Efficient Estimation of Word Representations in Vector Space
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
242
31,257
0
16 Jan 2013
1