Don't Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers

26 July 2021
Dan Rothermel
Margaret Li
Tim Rocktäschel
Jakob N. Foerster

Papers citing "Don't Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers"

13 / 13 papers shown
Revisiting Random Walks for Learning on Graphs
Jinwoo Kim
Olga Zaghen
Ayhan Suleymanzade
Youngmin Ryou
Seunghoon Hong
116
1
0
01 Jul 2024
Pretrained Transformers as Universal Computation Engines
Kevin Lu
Aditya Grover
Pieter Abbeel
Igor Mordatch
54
221
0
09 Mar 2021
Long Range Arena: A Benchmark for Efficient Transformers
Yi Tay
Mostafa Dehghani
Samira Abnar
Yikang Shen
Dara Bahri
Philip Pham
J. Rao
Liu Yang
Sebastian Ruder
Donald Metzler
147
720
0
08 Nov 2020
On Empirical Comparisons of Optimizers for Deep Learning
Dami Choi
Christopher J. Shallue
Zachary Nado
Jaehoon Lee
Chris J. Maddison
George E. Dahl
78
260
0
11 Oct 2019
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM, AI4MH
571
2,670
0
03 Sep 2019
Evaluating Protein Transfer Learning with TAPE
Roshan Rao
Nicholas Bhattacharya
Neil Thomas
Yan Duan
Xi Chen
John F. Canny
Pieter Abbeel
Yun S. Song
SSL
94
803
0
19 Jun 2019
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
Alex Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
271
2,315
0
02 May 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,182
0
20 Apr 2018
ListOps: A Diagnostic Dataset for Latent Tree Learning
Nikita Nangia
Samuel R. Bowman
57
138
0
17 Apr 2018
Differentiable plasticity: training plastic neural networks with backpropagation
Thomas Miconi
Jeff Clune
Kenneth O. Stanley
AI4CE
64
154
0
06 Apr 2018
DeepSF: deep convolutional neural network for mapping protein sequences to folds
Jie Hou
B. Adhikari
Jianlin Cheng
63
200
0
04 Jun 2017
One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
Ciprian Chelba
Tomas Mikolov
M. Schuster
Qi Ge
T. Brants
P. Koehn
T. Robinson
181
1,108
0
11 Dec 2013
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov
Ilya Sutskever
Kai Chen
G. Corrado
J. Dean
NAI, OCL
397
33,550
0
16 Oct 2013