Poolingformer: Long Document Modeling with Pooling Attention

10 May 2021
Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen
arXiv:2105.04371

Papers citing "Poolingformer: Long Document Modeling with Pooling Attention"

26 / 26 papers shown

Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning
Xingyu Tan, Xiaoyang Wang, Qing Liu, Xiwei Xu, Xin Yuan, Wenjie Zhang
LRM · 78 · 4 · 0 · 18 Oct 2024

Target conversation extraction: Source separation using turn-taking dynamics
Tuochao Chen, Qirui Wang, Bohan Wu, Malek Itani, Sefik Emre Eskimez, Takuya Yoshioka, Shyamnath Gollakota
31 · 4 · 0 · 15 Jul 2024

VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye, Yukang Gan, Xiaoke Huang, Yixiao Ge, Yansong Tang
MLLM · VLM · 43 · 23 · 0 · 18 Jun 2024

Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences
Zicheng Liu, Siyuan Li, Li Wang, Zedong Wang, Yunfan Liu, Stan Z. Li
35 · 7 · 0 · 12 Jun 2024

Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami, Ali Ghodsi
VLM · 48 · 6 · 0 · 28 Feb 2024

Dynamic Multi-Scale Context Aggregation for Conversational Aspect-Based Sentiment Quadruple Analysis
Yuqing Li, Wenyuan Zhang, Binbin Li, Siyu Jia, Zisen Qi, Xingbang Tan
42 · 3 · 0 · 27 Sep 2023

SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet, Rogier van Dalen, Shucong Zhang, S. Bhattacharya
26 · 6 · 0 · 12 Jul 2023

Plug-and-Play Document Modules for Pre-trained Models
Chaojun Xiao, Zhengyan Zhang, Xu Han, Chi-Min Chan, Yankai Lin, Zhiyuan Liu, Xiangyang Li, Zhonghua Li, Bo Zhao, Maosong Sun
KELM · 27 · 5 · 0 · 28 May 2023

Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models
Jiashuo Sun, Yi Luo, Yeyun Gong, Chen Lin, Yelong Shen, Jian Guo, Nan Duan
LRM · 41 · 19 · 0 · 23 Apr 2023

Learning to Compress Prompts with Gist Tokens
Jesse Mu, Xiang Lisa Li, Noah D. Goodman
VLM · 50 · 206 · 0 · 17 Apr 2023

Modeling Sequential Sentence Relation to Improve Cross-lingual Dense Retrieval
Shunyu Zhang, Yaobo Liang, Ming Gong, Daxin Jiang, Nan Duan
25 · 4 · 0 · 03 Feb 2023

Convolution-enhanced Evolving Attention Networks
Yujing Wang, Yaming Yang, Zhuowan Li, Jiangang Bai, Mingliang Zhang, Xiangtai Li, Jiahao Yu, Ce Zhang, Gao Huang, Yu Tong
ViT · 27 · 6 · 0 · 16 Dec 2022

Efficient Long Sequence Modeling via State Space Augmented Transformer
Simiao Zuo, Xiaodong Liu, Jian Jiao, Denis Xavier Charles, Eren Manavoglu, Tuo Zhao, Jianfeng Gao
125 · 36 · 0 · 15 Dec 2022

Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation
Botao Yu, Peiling Lu, Rui Wang, Wei Hu, Xu Tan, Wei Ye, Shikun Zhang, Tao Qin, Tie-Yan Liu
MGen · 25 · 55 · 0 · 19 Oct 2022

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong
3DV · 43 · 9 · 0 · 14 Oct 2022

Adapting Pretrained Text-to-Text Models for Long Text Sequences
Wenhan Xiong, Anchit Gupta, Shubham Toshniwal, Yashar Mehdad, Wen-tau Yih
RALM · VLM · 59 · 30 · 0 · 21 Sep 2022

GAAMA 2.0: An Integrated System that Answers Boolean and Extractive Questions
Scott McCarley, Mihaela A. Bornea, Sara Rosenthal, Anthony Ferritto, Md Arafat Sultan, Avirup Sil, Radu Florian
14 · 1 · 0 · 16 Jun 2022

Fastformer: Additive Attention Can Be All You Need
Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
46 · 117 · 0 · 20 Aug 2021

A Survey of Transformers
Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu
ViT · 53 · 1,088 · 0 · 08 Jun 2021

Improving Multilingual Models with Language-Clustered Vocabularies
Hyung Won Chung, Dan Garrette, Kiat Chuan Tan, Jason Riesa
VLM · 77 · 65 · 0 · 24 Oct 2020

Efficient Transformers: A Survey
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
VLM · 114 · 1,102 · 0 · 14 Sep 2020

Sparsifying Transformer Models with Trainable Representation Pooling
Michał Pietruszka, Łukasz Borchmann, Łukasz Garncarek
17 · 10 · 0 · 10 Sep 2020

Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
VLM · 285 · 2,017 · 0 · 28 Jul 2020

Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
MoE · 252 · 580 · 0 · 12 Mar 2020

On Extractive and Abstractive Neural Document Summarization with Transformer Language Models
Sandeep Subramanian, Raymond Li, Jonathan Pilault, C. Pal
246 · 215 · 0 · 07 Sep 2019

A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit
213 · 1,367 · 0 · 06 Jun 2016