1904.10509
Cited By
Generating Long Sequences with Sparse Transformers
23 April 2019
R. Child
Scott Gray
Alec Radford
Ilya Sutskever
Papers citing "Generating Long Sequences with Sparse Transformers"
50 / 1,140 papers shown
Recent Advances in Multi-Choice Machine Reading Comprehension: A Survey on Methods and Datasets
Shima Foolad
Kourosh Kiani
R. Rastgoo
FaML
45
0
0
04 Aug 2024
LDFaceNet: Latent Diffusion-based Network for High-Fidelity Deepfake Generation
Dwij Mehta
Aditya Mehta
Pratik Narang
DiffM
53
0
0
04 Aug 2024
DeMansia: Mamba Never Forgets Any Tokens
Ricky Fang
Mamba
27
0
0
04 Aug 2024
What comes after transformers? -- A selective survey connecting ideas in deep learning
Johannes Schneider
AI4CE
43
2
0
01 Aug 2024
A2SF: Accumulative Attention Scoring with Forgetting Factor for Token Pruning in Transformer Decoder
Hyun Rae Jo
Dong Kun Shin
40
4
0
30 Jul 2024
FlexAttention for Efficient High-Resolution Vision-Language Models
Junyan Li
Delin Chen
Tianle Cai
Peihao Chen
Yining Hong
Zhenfang Chen
Yikang Shen
Chuang Gan
VLM
72
5
0
29 Jul 2024
Practical and Reproducible Symbolic Music Generation by Large Language Models with Structural Embeddings
Seungyeon Rhyu
Kichang Yang
Sungjun Cho
Jaehyeon Kim
Kyogu Lee
Moontae Lee
43
0
0
29 Jul 2024
Efficient LLM Training and Serving with Heterogeneous Context Sharding among Attention Heads
Xihui Lin
Yunan Zhang
Suyu Ge
Barun Patra
Vishrav Chaudhary
Hao Peng
Xia Song
40
0
0
25 Jul 2024
Towards Robust Knowledge Tracing Models via k-Sparse Attention
Shuyan Huang
Zitao Liu
Xiangyu Zhao
Weiqing Luo
Jian Weng
AI4Ed
27
21
0
24 Jul 2024
Evaluating Long Range Dependency Handling in Code Generation Models using Multi-Step Key Retrieval
Yannick Assogba
Donghao Ren
54
1
0
23 Jul 2024
Mamba meets crack segmentation
Zhili He
Yuhao Wang
Mamba
42
3
0
22 Jul 2024
MINI-SEQUENCE TRANSFORMER: Optimizing Intermediate Memory for Long Sequences Training
Cheng Luo
Jiawei Zhao
Zhuoming Chen
Beidi Chen
A. Anandkumar
37
3
0
22 Jul 2024
RazorAttention: Efficient KV Cache Compression Through Retrieval Heads
Hanlin Tang
Yang Lin
Jing Lin
Qingsen Han
Shikuan Hong
Yiwu Yao
Gongyi Wang
MQ
42
27
0
22 Jul 2024
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
D. Hagos
Rick Battle
Danda B. Rawat
LM&MA
OffRL
45
23
0
20 Jul 2024
DeepGate3: Towards Scalable Circuit Representation Learning
Zhengyuan Shi
Ziyang Zheng
Sadaf Khan
Qiang Xu
Min Li
Qiang Xu
GNN
AI4CE
44
9
0
15 Jul 2024
Exploring the Potentials and Challenges of Deep Generative Models in Product Design Conception
Phillip Mueller
Lars Mikelsons
AI4CE
46
1
0
15 Jul 2024
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts
Zhenpeng Su
Zijia Lin
Xue Bai
Xing Wu
Yizhe Xiong
...
Guangyuan Ma
Hui Chen
Guiguang Ding
Wei Zhou
Songlin Hu
MoE
34
5
0
13 Jul 2024
Beyond KV Caching: Shared Attention for Efficient LLMs
Bingli Liao
Danilo Vasconcellos Vargas
16
4
0
13 Jul 2024
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
Jay Shah
Ganesh Bikshandi
Ying Zhang
Vijay Thakkar
Pradeep Ramani
Tri Dao
74
117
0
11 Jul 2024
HDT: Hierarchical Document Transformer
Haoyu He
Markus Flicke
Jan Buchmann
Iryna Gurevych
Andreas Geiger
43
0
0
11 Jul 2024
How Well Can a Long Sequence Model Model Long Sequences? Comparing Architechtural Inductive Biases on Long-Context Abilities
Jerry Huang
57
7
0
11 Jul 2024
Mobius: A High Efficient Spatial-Temporal Parallel Training Paradigm for Text-to-Video Generation Task
Yiran Yang
Jinchao Zhang
Ying Deng
Jie Zhou
DiffM
31
0
0
09 Jul 2024
How Effective are State Space Models for Machine Translation?
Hugo Pitorro
Pavlo Vasylenko
Marcos Vinícius Treviso
André F. T. Martins
Mamba
45
3
0
07 Jul 2024
The Mysterious Case of Neuron 1512: Injectable Realignment Architectures Reveal Internal Characteristics of Meta's Llama 2 Model
Brenden Smith
Dallin Baker
Clayton Chase
Myles Barney
Kaden Parker
Makenna Allred
Peter Hu
Alex Evans
Nancy Fulda
37
0
0
04 Jul 2024
Let the Code LLM Edit Itself When You Edit the Code
Zhenyu He
Jun Zhang
Shengjie Luo
Jingjing Xu
Z. Zhang
Di He
KELM
39
0
0
03 Jul 2024
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
Huiqiang Jiang
Yucheng Li
Chengruidong Zhang
Qianhui Wu
Xufang Luo
...
Amir H. Abdi
Dongsheng Li
Chin-Yew Lin
Yuqing Yang
L. Qiu
72
87
0
02 Jul 2024
Neurocache: Efficient Vector Retrieval for Long-range Language Modeling
Ali Safaya
Deniz Yuret
36
1
0
02 Jul 2024
Efficient Sparse Attention needs Adaptive Token Release
Chaoran Zhang
Lixin Zou
Dan Luo
Min Tang
Xiangyang Luo
Zihao Li
Chenliang Li
49
2
0
02 Jul 2024
LPViT: Low-Power Semi-structured Pruning for Vision Transformers
Kaixin Xu
Zhe Wang
Chunyun Chen
Xue Geng
Jie Lin
Xulei Yang
Min-man Wu
Min Wu
Xiaoli Li
Weisi Lin
ViT
VLM
51
7
0
02 Jul 2024
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
Enshu Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Matthew B. Blaschko
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MoE
62
5
0
01 Jul 2024
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management
Wonbeom Lee
Jungi Lee
Junghwan Seo
Jaewoong Sim
RALM
34
75
0
28 Jun 2024
Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads
Ali Khaleghi Rahimian
Manish Kumar Govind
Subhajit Maity
Dominick Reilly
Christian Kummerle
Srijan Das
A. Dutta
43
1
0
27 Jun 2024
From Efficient Multimodal Models to World Models: A Survey
Xinji Mai
Zeng Tao
Junxiong Lin
Haoran Wang
Yang Chang
Yanlan Kang
Yan Wang
Wenqiang Zhang
37
5
0
27 Jun 2024
Temporally Multi-Scale Sparse Self-Attention for Physical Activity Data Imputation
Hui Wei
Maxwell A. Xu
Colin Samplawski
James M. Rehg
Santosh Kumar
Benjamin M. Marlin
35
0
0
27 Jun 2024
Few-Shot Medical Image Segmentation with High-Fidelity Prototypes
Song Tang
Shaxu Yan
Xiaozhi Qi
Jianxin Gao
Mao Ye
Jianwei Zhang
Xiatian Zhu
51
0
0
26 Jun 2024
Long Context Transfer from Language to Vision
Peiyuan Zhang
Kaichen Zhang
Bo Li
Guangtao Zeng
Jingkang Yang
Yuanhan Zhang
Ziyue Wang
Haoran Tan
Chunyuan Li
Ziwei Liu
VLM
72
143
0
24 Jun 2024
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers
Chao Lou
Zixia Jia
Zilong Zheng
Kewei Tu
ODL
35
19
0
24 Jun 2024
Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
Yuchen Yang
Yingdong Shi
Cheems Wang
Xiantong Zhen
Yuxuan Shi
Jun Xu
40
1
0
24 Jun 2024
SimSMoE: Solving Representational Collapse via Similarity Measure
Giang Do
Hung Le
T. Tran
MoE
49
1
0
22 Jun 2024
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
Tianyu Fu
Haofeng Huang
Xuefei Ning
Genghan Zhang
Boju Chen
...
Shiyao Li
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MQ
52
17
0
21 Jun 2024
In Tree Structure Should Sentence Be Generated
Yaguang Li
Xin Chen
28
0
0
20 Jun 2024
A Primal-Dual Framework for Transformers and Neural Networks
Tan M. Nguyen
Tam Nguyen
Nhat Ho
Andrea L. Bertozzi
Richard G. Baraniuk
Stanley J. Osher
ViT
29
13
0
19 Jun 2024
In-Context Former: Lightning-fast Compressing Context for Large Language Model
Xiangfeng Wang
Zaiyi Chen
Zheyong Xie
Tong Xu
Yongyi He
Enhong Chen
51
1
0
19 Jun 2024
SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention
Qianchao Zhu
Jiangfei Duan
Chang Chen
Siran Liu
Xiuhong Li
...
Huanqi Cao
Xiao Chuanfu
Xingcheng Zhang
Dahua Lin
Chao Yang
30
16
0
17 Jun 2024
Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens
Weiyao Luo
Suncong Zheng
Heming Xia
Weikang Wang
Yan Lei
Tianyu Liu
Shuang Chen
Zhifang Sui
45
1
0
16 Jun 2024
Hierarchical Compression of Text-Rich Graphs via Large Language Models
Shichang Zhang
Da Zheng
Jiani Zhang
Qi Zhu
Xiang Song
Soji Adeshina
Christos Faloutsos
George Karypis
Yizhou Sun
VLM
31
1
0
13 Jun 2024
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences
Zicheng Liu
Siyuan Li
Li Wang
Zedong Wang
Yunfan Liu
Stan Z. Li
35
8
0
12 Jun 2024
QuickLLaMA: Query-aware Inference Acceleration for Large Language Models
Jingyao Li
Han Shi
Xin Jiang
Zhenguo Li
Hong Xu
Jiaya Jia
LRM
35
2
0
11 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
77
57
0
11 Jun 2024
Recurrent Context Compression: Efficiently Expanding the Context Window of LLM
Chensen Huang
Guibo Zhu
Xuepeng Wang
Yifei Luo
Guojing Ge
Haoran Chen
Dong Yi
Jinqiao Wang
67
1
0
10 Jun 2024