ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation

26 April 2022
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
    ViT
ArXiv (abs) | PDF | HTML | GitHub (1623★)

Papers citing "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation"

50 / 65 papers shown
Adept: Annotation-Denoising Auxiliary Tasks with Discrete Cosine Transform Map and Keypoint for Human-Centric Pretraining
Weizhen He
Yunfeng Yan
Shixiang Tang
Yiheng Deng
Yangyang Zhong
Pengxin Luo
Donglian Qi
VLM
190
1
0
29 Apr 2025
Exploring Mutual Cross-Modal Attention for Context-Aware Human Affordance Generation
Prasun Roy
Saumik Bhattacharya
Subhankar Ghosh
Umapada Pal
Michael Blumenstein
109
0
0
20 Feb 2025
RT-DEMT: A hybrid real-time acupoint detection model combining mamba and transformer
Shilong Yang
Qi Zang
Chulong Zhang
Lingfeng Huang
Yaoqin Xie
Mamba
203
1
0
16 Feb 2025
Transfer Learning for Keypoint Detection in Low-Resolution Thermal TUG Test Images
Wei-Lun Chen
Chia-Yeh Hsieh
Yu-Hsiang Kao
Kai-Chun Liu
Sheng-Yu Peng
Yu Tsao
157
0
0
30 Jan 2025
WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation
Tianjian Jiang
Johsan Billingham
Sebastian Müksch
Juan Jose Zarate
Nicolas Evans
Martin R. Oswald
Marc Pollefeys
Otmar Hilliges
Manuel Kaufmann
Jie Song
3DH
154
3
0
06 Jan 2025
Measurement of Medial Elbow Joint Space using Landmark Detection
Shizuka Akahori
Shotaro Teruya
Pragyan Shrestha
Yuichi Yoshii
Ryuhei Michinobu
S. Iizuka
I. Kitahara
177
0
0
17 Dec 2024
Object Agnostic 3D Lifting in Space and Time
Christopher Fusco
Mosam Dabhi
Shin-Fang Chng
Simon Lucey
3DPC
141
0
0
02 Dec 2024
HandOS: 3D Hand Reconstruction in One Stage
Xingyu Chen
Zhuheng Song
Xiaoke Jiang
Yaoqing Hu
Junzhi Yu
Lei Zhang
3DH HAI
182
0
0
02 Dec 2024
Comparison of marker-less 2D image-based methods for infant pose estimation
Lennart Jahn
Sarah Flugge
Dajie Zhang
Luise Poustka
Sven Bolte
Florentin Wörgötter
Peter B Marschik
Tomas Kulvicius
154
1
0
07 Oct 2024
Leveraging Anthropometric Measurements to Improve Human Mesh Estimation and Ensure Consistent Body Shapes
K. Ludwig
Julian Lorenz
Daniel Kienzle
Tuan Bui
Rainer Lienhart
3DH
118
1
0
26 Sep 2024
WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild
Rolandos Alexandros Potamias
Jinglei Zhang
Jiankang Deng
Stefanos Zafeiriou
3DH
111
12
0
18 Sep 2024
STGFormer: Spatio-Temporal GraphFormer for 3D Human Pose Estimation in Video
Yang Liu
Zhiyong Zhang
3DH
142
0
0
14 Jul 2024
Automatic infant 2D pose estimation from videos: comparing seven deep neural network methods
Filipe Gama
Matej Misar
Lukas Navara
S. T. Popescu
Matej Hoffmann
3DH
124
2
0
25 Jun 2024
Pose Priors from Language Models
Sanjay Subramanian
Evonne Ng
Lea Müller
Dan Klein
Shiry Ginosar
Trevor Darrell
109
4
0
06 May 2024
Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data Augmentation and Consistency Training
Huayi Zhou
Mukun Luo
Fei Jiang
Yue Ding
Hongtao Lu
Kui Jia
120
0
0
18 Feb 2024
APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking
Yuxiang Yang
Junjie Yang
Yufei Xu
Jing Zhang
Long Lan
Dacheng Tao
89
44
0
12 Jun 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM VLM
418
3,610
0
29 Apr 2022
VSA: Learning Varied-Size Window Attention in Vision Transformers
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
85
57
0
18 Apr 2022
Exploring Plain Vision Transformer Backbones for Object Detection
Yanghao Li
Hanzi Mao
Ross B. Girshick
Kaiming He
ViT
104
816
0
30 Mar 2022
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
Likun Cai
Zhi-Li Zhang
Yi Zhu
Li Zhang
Mu Li
Xiangyang Xue
VLM ObjD
107
41
0
24 Mar 2022
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
ViT
109
234
0
21 Feb 2022
ELSA: Enhanced Local Self-Attention for Vision Transformer
Jingkai Zhou
Pichao Wang
Fan Wang
Qiong Liu
Hao Li
Rong Jin
ViT
112
41
0
23 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chao-Yuan Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
157
694
0
02 Dec 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT TPM
485
7,837
0
11 Nov 2021
HRFormer: High-Resolution Transformer for Dense Prediction
Yuhui Yuan
Rao Fu
Lang Huang
Weihong Lin
Chao Zhang
Xilin Chen
Jingdong Wang
ViT
111
233
0
18 Oct 2021
Scaled ReLU Matters for Training Vision Transformers
Pichao Wang
Xue Wang
Haowen Luo
Jingkai Zhou
Zhipeng Zhou
Fan Wang
Hao Li
Rong Jin
94
43
0
08 Sep 2021
AP-10K: A Benchmark for Animal Pose Estimation in the Wild
Hang Yu
Yufei Xu
Jing Zhang
Wei Zhao
Ziyu Guan
Dacheng Tao
92
113
0
28 Aug 2021
CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention
Wenxiao Wang
Lulian Yao
Long Chen
Binbin Lin
Deng Cai
Xiaofei He
Wei Liu
200
272
0
31 Jul 2021
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Pengfei Liu
Weizhe Yuan
Jinlan Fu
Zhengbao Jiang
Hiroaki Hayashi
Graham Neubig
VLM SyDa
248
4,017
0
28 Jul 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
302
2,848
0
15 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
183
339
0
07 Jun 2021
Segmenter: Transformer for Semantic Segmentation
Robin Strudel
Ricardo Garcia Pinel
Ivan Laptev
Cordelia Schmid
ViT
220
1,474
0
12 May 2021
Pose Recognition with Cascade Transformers
Ke Li
Shijie Wang
Xiang Zhang
Yifan Xu
Weijian Xu
Zhuowen Tu
ViT
76
212
0
14 Apr 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
472
21,656
0
25 Mar 2021
Scalable Vision Transformers with Hierarchical Pooling
Zizheng Pan
Bohan Zhuang
Jing Liu
Haoyu He
Jianfei Cai
ViT
91
130
0
19 Mar 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
547
3,742
0
24 Feb 2021
Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation
Rawal Khirodkar
Visesh Chari
Amit Agrawal
A. Tyagi
77
68
0
27 Jan 2021
TransPose: Keypoint Localization via Transformer
Sen Yang
Zhibin Quan
Mu Nie
Wankou Yang
ViT
196
270
0
28 Dec 2020
Empowering Things with Intelligence: A Survey of the Progress, Challenges, and Opportunities in Artificial Intelligence of Things
Jing Zhang
Dacheng Tao
103
475
0
17 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
684
41,563
0
22 Oct 2020
Adversarial Semantic Data Augmentation for Human Pose Estimation
Yanrui Bin
Xuan Cao
Xinya Chen
Yanhao Ge
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Changxin Gao
Nong Sang
68
60
0
03 Aug 2020
Knowledge Distillation: A Survey
Jianping Gou
B. Yu
Stephen J. Maybank
Dacheng Tao
VLM
210
2,993
0
09 Jun 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT 3DV PINN
460
13,153
0
26 May 2020
Peeking into occluded joints: A novel framework for crowd pose estimation
Lingteng Qiu
Xuanye Zhang
Yanran Li
Guanbin Li
Xiaojun Wu
Zixiang Xiong
Xiaoguang Han
Shuguang Cui
183
72
0
23 Mar 2020
Towards High Performance Human Keypoint Detection
Jing Zhang
Zhe Chen
Dacheng Tao
3DH
149
72
0
03 Feb 2020
The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation
Junjie Huang
Zheng Zhu
Feng Guo
Guan Huang
Dalong Du
3DH
76
192
0
18 Nov 2019
Distribution-Aware Coordinate Representation for Human Pose Estimation
Feng Zhang
Xiatian Zhu
Hanbin Dai
Mao Ye
Ce Zhu
3DH
73
424
0
14 Oct 2019
Cross-Domain Adaptation for Animal Pose Estimation
Jinkun Cao
Hongyang Tang
Haoshu Fang
Xiaoyong Shen
Cewu Lu
Yu-Wing Tai
OOD
75
164
0
16 Aug 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
238
8,455
0
19 Jun 2019
On the Convergence of Adam and Beyond
Sashank J. Reddi
Satyen Kale
Sanjiv Kumar
109
2,506
0
19 Apr 2019