2306.03725
Cited By
Towards Memory-Efficient Training for Extremely Large Output Spaces -- Learning with 500k Labels on a Single Commodity GPU
6 June 2023
Erik Schultheis
Rohit Babbar
Papers citing
"Towards Memory-Efficient Training for Extremely Large Output Spaces -- Learning with 500k Labels on a Single Commodity GPU"
7 / 7 papers shown
Title
SD²: Self-Distilled Sparse Drafters
Mike Lasby
Nish Sinnadurai
Valavan Manohararajah
Sean Lie
Vithursan Thangarasa
143
1
0
10 Apr 2025
Navigating Extremes: Dynamic Sparsity in Large Output Spaces
Nasib Ullah
Erik Schultheis
Mike Lasby
Yani Andrew Ioannou
Rohit Babbar
35
0
0
05 Nov 2024
Sparse maximal update parameterization: A holistic approach to sparse training dynamics
Nolan Dey
Shane Bergsma
Joel Hestness
38
5
0
24 May 2024
Dynamic Sparse Training with Structured Sparsity
Mike Lasby
A. Golubeva
Utku Evci
Mihai Nica
Yani Andrew Ioannou
29
19
0
03 May 2023
CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification
Siddhant Kharbanda
Atmadeep Banerjee
Erik Schultheis
Rohit Babbar
41
13
0
29 Oct 2022
LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification
Ting Jiang
Deqing Wang
Leilei Sun
Huayi Yang
Zhengyang Zhao
Fuzhen Zhuang
122
136
0
09 Jan 2021
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
278
31,267
0
16 Jan 2013