Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2410.11451
Cited By
Tending Towards Stability: Convergence Challenges in Small Language Models
15 October 2024
Richard Diehl Martinez
Pietro Lesci
P. Buttery
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Tending Towards Stability: Convergence Challenges in Small Language Models"
13 / 13 papers shown
Title
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models
Hao Kang
Zichun Yu
Chenyan Xiong
MoE
55
0
0
26 May 2025
Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models
Patrick Leask
Neel Nanda
Noura Al Moubayed
77
1
0
23 May 2025
Neural Networks Learn Statistics of Increasing Complexity
Nora Belrose
Quintin Pope
Lucia Quirke
Alex Troy Mallen
Xiaoli Z. Fern
56
11
0
06 Feb 2024
Are Large Pre-Trained Language Models Leaking Your Personal Information?
Jie Huang
Hanyin Shao
Kevin Chen-Chuan Chang
PILM
91
198
0
25 May 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
512
6,279
0
05 Apr 2022
Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers
Jason Phang
Haokun Liu
Samuel R. Bowman
64
29
0
17 Sep 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
452
2,113
0
31 Dec 2020
Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth
Thao Nguyen
M. Raghu
Simon Kornblith
OOD
52
282
0
29 Oct 2020
Similarity Analysis of Contextual Word Representation Models
John M. Wu
Yonatan Belinkov
Hassan Sajjad
Nadir Durrani
Fahim Dalvi
James R. Glass
92
75
0
03 May 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
523
42,559
0
03 Dec 2019
Green AI
Roy Schwartz
Jesse Dodge
Noah A. Smith
Oren Etzioni
116
1,149
0
22 Jul 2019
Similarity of Neural Network Representations Revisited
Simon Kornblith
Mohammad Norouzi
Honglak Lee
Geoffrey E. Hinton
141
1,429
0
01 May 2019
Breaking the Softmax Bottleneck: A High-Rank RNN Language Model
Zhilin Yang
Zihang Dai
Ruslan Salakhutdinov
William W. Cohen
BDL
68
372
0
10 Nov 2017
1