ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2302.10866
  4. Cited By
Hyena Hierarchy: Towards Larger Convolutional Language Models

Hyena Hierarchy: Towards Larger Convolutional Language Models

21 February 2023
Michael Poli
Stefano Massaroli
Eric Q. Nguyen
Daniel Y. Fu
Tri Dao
S. Baccus
Yoshua Bengio
Stefano Ermon
Christopher Ré
    VLM
ArXivPDFHTML

Papers citing "Hyena Hierarchy: Towards Larger Convolutional Language Models"

50 / 53 papers shown
Title
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
Mamba
48
0
0
13 May 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
54
0
0
13 May 2025
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models
Shriyank Somvanshi
Md Monzurul Islam
Mahmuda Sultana Mimi
Sazzad Bin Bashar Polock
Gaurab Chhetri
Subasish Das
Mamba
AI4TS
45
0
0
22 Mar 2025
HELM: Hierarchical Encoding for mRNA Language Modeling
HELM: Hierarchical Encoding for mRNA Language Modeling
Mehdi Yazdani-Jahromi
Mangal Prakash
Tommaso Mansi
Artem Moskalev
Rui Liao
97
2
0
13 Mar 2025
Revisiting Convolution Architecture in the Realm of DNA Foundation Models
Revisiting Convolution Architecture in the Realm of DNA Foundation Models
Yu Bo
Weian Mao
Yanjun Shao
Weiqiang Bai
Peng Ye
Xinzhu Ma
Junbo Zhao
Hao Chen
Chunhua Shen
3DV
60
1
0
25 Feb 2025
Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention
Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention
Arya Honarpisheh
Mustafa Bozdag
Octavia Camps
Mario Sznaier
Mamba
75
1
0
03 Feb 2025
State-space models are accurate and efficient neural operators for dynamical systems
State-space models are accurate and efficient neural operators for dynamical systems
Zheyuan Hu
Nazanin Ahmadi Daryakenari
Qianli Shen
Kenji Kawaguchi
George Karniadakis
Mamba
AI4CE
72
11
0
28 Jan 2025
BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity
BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity
Zahra Gharaee
Scott C. Lowe
ZeMing Gong
Pablo Millán Arias
Nicholas Pellegrino
...
Lila Kari
Dirk Steinke
Graham W. Taylor
Paul Fieguth
Angel X. Chang
56
7
0
28 Jan 2025
Towards Scalable and Stable Parallelization of Nonlinear RNNs
Towards Scalable and Stable Parallelization of Nonlinear RNNs
Xavier Gonzalez
Andrew Warrington
Jimmy T.H. Smith
Scott W. Linderman
93
8
0
17 Jan 2025
A Comprehensive Survey of Foundation Models in Medicine
A Comprehensive Survey of Foundation Models in Medicine
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE
LM&MA
VLM
105
18
0
17 Jan 2025
PTQ4VM: Post-Training Quantization for Visual Mamba
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
43
2
0
29 Dec 2024
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHRs
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHRs
Michael Wornow
Suhana Bedi
Miguel Angel Fuentes Hernandez
E. Steinberg
Jason Alan Fries
Christopher Ré
Sanmi Koyejo
N. Shah
95
4
0
09 Dec 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide
  Image Analysis
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Pingyi Chen
Zhongyi Shui
Chenglu Zhu
Lin Yang
MedIm
44
4
0
18 Oct 2024
State-space models can learn in-context by gradient descent
State-space models can learn in-context by gradient descent
Neeraj Mohan Sushma
Yudou Tian
Harshvardhan Mestha
Nicolo Colombo
David Kappel
Anand Subramoney
41
3
0
15 Oct 2024
HSR-Enhanced Sparse Attention Acceleration
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
95
18
0
14 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
62
16
0
06 Oct 2024
LongGenBench: Long-context Generation Benchmark
LongGenBench: Long-context Generation Benchmark
Xiang Liu
Peijie Dong
Xuming Hu
Xiaowen Chu
RALM
50
8
0
05 Oct 2024
System 2 Reasoning Capabilities Are Nigh
System 2 Reasoning Capabilities Are Nigh
Scott C. Lowe
VLM
LRM
48
0
0
04 Oct 2024
Improving Pretraining Data Using Perplexity Correlations
Improving Pretraining Data Using Perplexity Correlations
Tristan Thrush
Christopher Potts
Tatsunori Hashimoto
34
17
0
09 Sep 2024
Masked Mixers for Language Generation and Retrieval
Masked Mixers for Language Generation and Retrieval
Benjamin L. Badger
47
0
0
02 Sep 2024
Cell-ontology guided transcriptome foundation model
Cell-ontology guided transcriptome foundation model
Xinyu Yuan
Zhihao Zhan
Zuobai Zhang
Manqi Zhou
Jianan Zhao
Boyu Han
Yue Li
Jian Tang
47
1
0
22 Aug 2024
MambaGesture: Enhancing Co-Speech Gesture Generation with Mamba and
  Disentangled Multi-Modality Fusion
MambaGesture: Enhancing Co-Speech Gesture Generation with Mamba and Disentangled Multi-Modality Fusion
Chencan Fu
Yabiao Wang
Jiangning Zhang
Zhengkai Jiang
Xiaofeng Mao
Jiafu Wu
Weijian Cao
Chengjie Wang
Yanhao Ge
Yong Liu
Mamba
43
2
0
29 Jul 2024
Genomic Language Models: Opportunities and Challenges
Genomic Language Models: Opportunities and Challenges
Gonzalo Benegas
Chengzhong Ye
C. Albors
Jianan Canal Li
Yun S. Song
AI4CE
LM&MA
ELM
43
18
0
16 Jul 2024
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient
  Vision Transformer
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer
Pierre-David Létourneau
Manish Kumar Singh
Hsin-Pai Cheng
Shizhong Han
Yunxiao Shi
Dalton Jones
M. H. Langston
Hong Cai
Fatih Porikli
39
0
0
16 Jul 2024
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Yu Sun
Xinhao Li
Karan Dalal
Jiarui Xu
Arjun Vikram
...
Xinlei Chen
Xiaolong Wang
Sanmi Koyejo
Tatsunori Hashimoto
Carlos Guestrin
63
92
0
05 Jul 2024
SE(3)-Hyena Operator for Scalable Equivariant Learning
SE(3)-Hyena Operator for Scalable Equivariant Learning
Artem Moskalev
Mangal Prakash
Rui Liao
Tommaso Mansi
49
2
0
01 Jul 2024
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
Assaf Ben-Kish
Itamar Zimerman
Shady Abu Hussein
Nadav Cohen
Amir Globerson
Lior Wolf
Raja Giryes
Mamba
77
13
0
20 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
74
56
0
11 Jun 2024
LOLAMEME: Logic, Language, Memory, Mechanistic Framework
LOLAMEME: Logic, Language, Memory, Mechanistic Framework
Jay Desai
Xiaobo Guo
Srinivasan H. Sengamedu
21
0
0
31 May 2024
Understanding the differences in Foundation Models: Attention, State
  Space Models, and Recurrent Neural Networks
Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
Jerome Sieber
Carmen Amo Alonso
A. Didier
M. Zeilinger
Antonio Orvieto
AAML
50
8
0
24 May 2024
SpiralMLP: A Lightweight Vision MLP Architecture
SpiralMLP: A Lightweight Vision MLP Architecture
Haojie Mu
Burhan Ul Tayyab
Nicholas Chua
43
0
0
31 Mar 2024
Understanding Emergent Abilities of Language Models from the Loss Perspective
Understanding Emergent Abilities of Language Models from the Loss Perspective
Zhengxiao Du
Aohan Zeng
Yuxiao Dong
Jie Tang
UQCV
LRM
70
46
0
23 Mar 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami
Ali Ghodsi
VLM
48
6
0
28 Feb 2024
Large Language Models: A Survey
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
134
371
0
09 Feb 2024
Efficiency-oriented approaches for self-supervised speech representation
  learning
Efficiency-oriented approaches for self-supervised speech representation learning
Luis Lugo
Valentin Vielzeuf
SSL
29
1
0
18 Dec 2023
StableSSM: Alleviating the Curse of Memory in State-space Models through
  Stable Reparameterization
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
Shida Wang
Qianxiao Li
22
13
0
24 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for
  Histopathology Whole Slide Image Analysis
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
35
4
0
21 Nov 2023
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Albert Mohwald
31
15
0
28 Sep 2023
Learned Gridification for Efficient Point Cloud Processing
Learned Gridification for Efficient Point Cloud Processing
P. A. V. D. Linden
David W. Romero
Erik J. Bekkers
3DPC
28
2
0
22 Jul 2023
Retentive Network: A Successor to Transformer for Large Language Models
Retentive Network: A Successor to Transformer for Large Language Models
Yutao Sun
Li Dong
Shaohan Huang
Shuming Ma
Yuqing Xia
Jilong Xue
Jianyong Wang
Furu Wei
LRM
78
301
0
17 Jul 2023
Lost in the Middle: How Language Models Use Long Contexts
Lost in the Middle: How Language Models Use Long Contexts
Nelson F. Liu
Kevin Lin
John Hewitt
Ashwin Paranjape
Michele Bevilacqua
Fabio Petroni
Percy Liang
RALM
40
1,404
0
06 Jul 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
41
151
0
05 Jul 2023
Long Sequence Hopfield Memory
Long Sequence Hopfield Memory
Hamza Tahir Chaudhry
Jacob A. Zavatone-Veth
Dmitry Krotov
Cengiz Pehlevan
38
17
0
07 Jun 2023
Vision Transformers for Mobile Applications: A Short Survey
Vision Transformers for Mobile Applications: A Short Survey
Nahid Alam
Steven Kolawole
S. Sethi
Nishant Bansali
Karina Nguyen
ViT
31
3
0
30 May 2023
Focus Your Attention (with Adaptive IIR Filters)
Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati
Itamar Zimerman
Lior Wolf
32
9
0
24 May 2023
RWKV: Reinventing RNNs for the Transformer Era
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
90
557
0
22 May 2023
SKI to go Faster: Accelerating Toeplitz Neural Networks via Asymmetric
  Kernels
SKI to go Faster: Accelerating Toeplitz Neural Networks via Asymmetric Kernels
Alexander Moreno
Jonathan Mei
Luke Walters
21
0
0
15 May 2023
Ask Me Anything: A simple strategy for prompting language models
Ask Me Anything: A simple strategy for prompting language models
Simran Arora
A. Narayan
Mayee F. Chen
Laurel J. Orr
Neel Guha
Kush S. Bhatia
Ines Chami
Frederic Sala
Christopher Ré
ReLM
LRM
220
208
0
05 Oct 2022
In-context Learning and Induction Heads
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
250
460
0
24 Sep 2022
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel
  Sizes
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes
David W. Romero
Robert-Jan Bruintjes
Jakub M. Tomczak
Erik J. Bekkers
Mark Hoogendoorn
J. C. V. Gemert
80
82
0
15 Oct 2021
12
Next