ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2309.11523
  4. Cited By
RMT: Retentive Networks Meet Vision Transformers

RMT: Retentive Networks Meet Vision Transformers

20 September 2023
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
    ViT
ArXivPDFHTML

Papers citing "RMT: Retentive Networks Meet Vision Transformers"

50 / 54 papers shown
Title
AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer
AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer
Jiquan Shan
Junxiao Wang
Lifeng Zhao
Liang Cai
Hongyuan Zhang
Ioannis Liritzis
ViT
193
0
0
22 May 2025
Unveiling the Hidden: Movie Genre and User Bias in Spoiler Detection
Unveiling the Hidden: Movie Genre and User Bias in Spoiler Detection
Haokai Zhang
Shengtao Zhang
Zijian Cai
Heng Wang
Ruixuan Zhu
Zinan Zeng
Minnan Luo
101
0
0
24 Apr 2025
Bokehlicious: Photorealistic Bokeh Rendering with Controllable Apertures
Bokehlicious: Photorealistic Bokeh Rendering with Controllable Apertures
Tim Seizinger
Florin-Alexandru Vasluianu
Marcos V. Conde
Zongwei Wu
Radu Timofte
77
0
0
20 Mar 2025
Do we really have to filter out random noise in pre-training data for language models?
Do we really have to filter out random noise in pre-training data for language models?
Jinghan Ru
Yuxin Xie
Xianwei Zhuang
Yuguo Yin
Zhihui Guo
Zhiming Liu
Qianli Ren
Yuexian Zou
158
4
0
10 Feb 2025
VMamba: Visual State Space Model
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
241
683
0
31 Dec 2024
Breaking the Low-Rank Dilemma of Linear Attention
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
86
1
0
12 Nov 2024
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Yuchen Duan
Weiyun Wang
Zhe Chen
Xizhou Zhu
Lewei Lu
Tong Lu
Yu Qiao
Hongsheng Li
Jifeng Dai
Wenhai Wang
ViT
60
45
0
04 Mar 2024
Scale-Aware Modulation Meet Transformer
Scale-Aware Modulation Meet Transformer
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
101
69
0
17 Jul 2023
Lightweight Vision Transformer with Bidirectional Interaction
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
95
28
0
01 Jun 2023
RWKV: Reinventing RNNs for the Transformer Era
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
195
593
0
22 May 2023
InceptionNeXt: When Inception Meets ConvNeXt
InceptionNeXt: When Inception Meets ConvNeXt
Weihao Yu
Pan Zhou
Shuicheng Yan
Xinchao Wang
111
128
0
29 Mar 2023
BiFormer: Vision Transformer with Bi-Level Routing Attention
BiFormer: Vision Transformer with Bi-Level Routing Attention
Lei Zhu
Xinjiang Wang
Zhanghan Ke
Wayne Zhang
Rynson W. H. Lau
172
510
0
15 Mar 2023
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Mingg-Ming Cheng
Jiashi Feng
ViT
85
135
0
22 Nov 2022
Vision Transformer with Super Token Sampling
Vision Transformer with Super Token Sampling
Huaibo Huang
Xiaoqiang Zhou
Jie Cao
Ran He
Tieniu Tan
ViT
38
58
0
21 Nov 2022
InternImage: Exploring Large-Scale Vision Foundation Models with
  Deformable Convolutions
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
...
Tong Lu
Lewei Lu
Hongsheng Li
Xiaogang Wang
Yu Qiao
VLM
104
680
0
10 Nov 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision
  Models
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
71
62
0
04 Oct 2022
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated
  Convolutions
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Yongming Rao
Wenliang Zhao
Yansong Tang
Jie Zhou
Ser-Nam Lim
Jiwen Lu
ViT
68
254
0
28 Jul 2022
Dual Vision Transformer
Dual Vision Transformer
Ting Yao
Yehao Li
Yingwei Pan
Yu Wang
Xiaoping Zhang
Tao Mei
ViT
188
79
0
11 Jul 2022
Global Context Vision Transformers
Global Context Vision Transformers
Ali Hatamizadeh
Hongxu Yin
Greg Heinrich
Jan Kautz
Pavlo Molchanov
ViT
62
125
0
20 Jun 2022
Inception Transformer
Inception Transformer
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
58
193
0
25 May 2022
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision
  Transformers
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Junting Pan
Adrian Bulat
Fuwen Tan
Xiatian Zhu
Łukasz Dudziak
Hongsheng Li
Georgios Tzimiropoulos
Brais Martínez
ViT
63
189
0
06 May 2022
VSA: Learning Varied-Size Window Attention in Vision Transformers
VSA: Learning Varied-Size Window Attention in Vision Transformers
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
50
53
0
18 Apr 2022
Neighborhood Attention Transformer
Neighborhood Attention Transformer
Ali Hassani
Steven Walton
Jiacheng Li
Shengjia Li
Humphrey Shi
ViT
AI4TS
78
268
0
14 Apr 2022
DaViT: Dual Attention Vision Transformers
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
102
248
0
07 Apr 2022
MaxViT: Multi-Axis Vision Transformer
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
105
660
0
04 Apr 2022
Focal Modulation Networks
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
69
271
0
22 Mar 2022
ScalableViT: Rethinking the Context-oriented Generalization of Vision
  Transformer
ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer
Rui Yang
Hailong Ma
Jie Wu
Yansong Tang
Xuefeng Xiao
Min Zheng
Xiu Li
ViT
82
54
0
21 Mar 2022
UniFormer: Unified Transformer for Efficient Spatiotemporal
  Representation Learning
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
Kunchang Li
Yali Wang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
114
249
0
12 Jan 2022
A ConvNet for the 2020s
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chaozheng Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
118
5,137
0
10 Jan 2022
Vision Transformer with Deformable Attention
Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
S. Song
Li Erran Li
Gao Huang
ViT
72
478
0
03 Jan 2022
MPViT: Multi-Path Vision Transformer for Dense Prediction
MPViT: Multi-Path Vision Transformer for Dense Prediction
Youngwan Lee
Jonghee Kim
Jeffrey Willette
Sung Ju Hwang
ViT
62
250
0
21 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and
  Detection
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
144
686
0
02 Dec 2021
Shunted Self-Attention via Multi-Scale Token Aggregation
Shunted Self-Attention via Multi-Scale Token Aggregation
Sucheng Ren
Daquan Zhou
Shengfeng He
Jiashi Feng
Xinchao Wang
ViT
68
226
0
30 Nov 2021
Contextual Transformer Networks for Visual Recognition
Contextual Transformer Networks for Visual Recognition
Yehao Li
Ting Yao
Yingwei Pan
Tao Mei
ViT
68
482
0
26 Jul 2021
CSWin Transformer: A General Vision Transformer Backbone with
  Cross-Shaped Windows
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
ViT
134
976
0
01 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision
  Transformers
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
70
433
0
01 Jul 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
87
1,652
0
25 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
95
325
0
24 Jun 2021
Multiscale Vision Transformers
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
127
1,257
0
22 Apr 2021
All Tokens Matter: Token Labeling for Training Better Vision
  Transformers
All Tokens Matter: Token Labeling for Training Better Vision Transformers
Zihang Jiang
Qibin Hou
Li-xin Yuan
Daquan Zhou
Yujun Shi
Xiaojie Jin
Anran Wang
Jiashi Feng
ViT
61
208
0
22 Apr 2021
Going deeper with Image Transformers
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
133
1,006
0
31 Mar 2021
CvT: Introducing Convolutions to Vision Transformers
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
128
1,901
0
29 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image
  Classification
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Yikang Shen
ViT
68
1,469
0
27 Mar 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction
  without Convolutions
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
487
3,709
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
419
3,826
0
11 Feb 2021
RandAugment: Practical automated data augmentation with a reduced search
  space
RandAugment: Practical automated data augmentation with a reduced search space
E. D. Cubuk
Barret Zoph
Jonathon Shlens
Quoc V. Le
MQ
208
3,480
0
30 Sep 2019
MMDetection: Open MMLab Detection Toolbox and Benchmark
MMDetection: Open MMLab Detection Toolbox and Benchmark
Kai-xiang Chen
Jiaqi Wang
Jiangmiao Pang
Yuhang Cao
Yu Xiong
...
Jingdong Wang
Jianping Shi
Wanli Ouyang
Chen Change Loy
Dahua Lin
VOS
130
2,858
0
17 Jun 2019
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan
Quoc V. Le
3DV
MedIm
129
18,058
0
28 May 2019
CutMix: Regularization Strategy to Train Strong Classifiers with
  Localizable Features
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
Sangdoo Yun
Dongyoon Han
Seong Joon Oh
Sanghyuk Chun
Junsuk Choe
Y. Yoo
OOD
604
4,766
0
13 May 2019
Panoptic Feature Pyramid Networks
Panoptic Feature Pyramid Networks
Alexander Kirillov
Ross B. Girshick
Kaiming He
Piotr Dollár
ISeg
SSeg
108
1,285
0
08 Jan 2019
12
Next