Scavenging Hyena: Distilling Transformers into Long Convolution Models
31 January 2024
Tokiniaina Raharison Ralambomihanta, Shahrad Mohammadzadeh, Mohammad Sami Nur Islam, Wassim Jabbour, Laurence Liang
Papers citing "Scavenging Hyena: Distilling Transformers into Long Convolution Models" (4 papers)
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Junxiong Wang, Wen-Ding Li, Daniele Paliotta, Daniel Ritter, Alexander M. Rush, Tri Dao
LRM
14 Apr 2025
Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners
Daniele Paliotta, Junxiong Wang, Matteo Pagliardini, Kevin Y. Li, Aviv Bick, J. Zico Kolter, Albert Gu, F. Fleuret, Tri Dao
ReLM, LRM
27 Feb 2025
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu
LM&MA
26 Apr 2023
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
AIMat
31 Dec 2020