Cross-view Action Recognition Understanding From Exocentric to Egocentric Perspective

25 May 2023

Papers citing "Cross-view Action Recognition Understanding From Exocentric to Egocentric Perspective"

50 / 66 papers shown

Title
FALCON: Fairness Learning via Contrastive Attention Approach to Continual Semantic Scene Understanding Thanh-Dat Truong Utsav Prabhu Bhiksha Raj Jackson Cothren Khoa Luu CLL 161 3 0 27 Nov 2023
Fairness Continual Learning Approach to Semantic Scene Understanding in Open-World Environments Thanh-Dat Truong Hoang-Quan Nguyen Bhiksha Raj Khoa Luu CLL 106 14 0 25 May 2023
SVFormer: Semi-supervised Video Transformer for Action Recognition Zhen Xing Qi Dai Hang-Rui Hu Jingjing Chen Zuxuan Wu Yu-Gang Jiang ViT 87 72 0 23 Nov 2022
Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning Yuchong Sun Hongwei Xue Ruihua Song Bei Liu Huan Yang Jianlong Fu AI4TS VLM 78 71 0 12 Oct 2022
M&M Mix: A Multimodal Multiview Transformer Ensemble Xuehan Xiong Anurag Arnab Arsha Nagrani Cordelia Schmid ViT 50 20 0 20 Jun 2022
Egocentric Video-Language Pretraining Kevin Qinghong Lin Alex Jinpeng Wang Mattia Soldan Michael Wray Rui Yan ... Hongfa Wang Dima Damen Guohao Li Wei Liu Mike Zheng Shou VLM EgoV 84 206 0 03 Jun 2022
OTAdapt: Optimal Transport-based Approach For Unsupervised Domain Adaptation Thanh-Dat Truong N. V. R. Chappa Xuan-Bac Nguyen Ngan Le Ashley Dowling Khoa Luu OOD OT 84 11 0 22 May 2022
TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization Sijie Zhu M. Shah Chong Chen ViT 94 160 0 31 Mar 2022
Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos Tomáš Souček Jean-Baptiste Alayrac Antoine Miech Ivan Laptev Josef Sivic 75 33 0 22 Mar 2022
DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition Thanh-Dat Truong Quoc-Huy Bui C. Duong Han-Seok Seo Son Lam Phung Xin Li Khoa Luu ViT 113 50 0 19 Mar 2022
All in One: Exploring Unified Video-Language Pre-training Alex Jinpeng Wang Yixiao Ge Rui Yan Yuying Ge Xudong Lin Guanyu Cai Jianping Wu Ying Shan Xiaohu Qie Mike Zheng Shou 95 202 0 14 Mar 2022
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction Yunze Liu Yun-Hai Liu Chen Jiang Kangbo Lyu Weikang Wan Hao Shen Bo-Hua Liang Zhoujie Fu He Wang Li Yi 112 188 0 03 Mar 2022
Multiview Transformers for Video Recognition Shen Yan Xuehan Xiong Anurag Arnab Zhichao Lu Mi Zhang Chen Sun Cordelia Schmid ViT 78 221 0 12 Jan 2022
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection Yanghao Li Chaoxia Wu Haoqi Fan K. Mangalam Bo Xiong Jitendra Malik Christoph Feichtenhofer ViT 155 693 0 02 Dec 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video Kristen Grauman Andrew Westbury Eugene Byrne Zachary Chavis Antonino Furnari ... Mike Zheng Shou Antonio Torralba Lorenzo Torresani Mingfei Yan Jitendra Malik EgoV 410 1,114 0 13 Oct 2021
TAda! Temporally-Adaptive Convolutions for Video Understanding Ziyuan Huang Shiwei Zhang Liang Pan Zhiwu Qing Mingqian Tang Ziwei Liu M. Ang 101 49 0 12 Oct 2021
Video Swin Transformer Ze Liu Jia Ning Yue Cao Yixuan Wei Zheng Zhang Stephen Lin Han Hu ViT 121 1,490 0 24 Jun 2021
Space-time Mixing Attention for Video Transformer Adrian Bulat Juan-Manuel Perez-Rua Swathikiran Sudhakaran Brais Martínez Georgios Tzimiropoulos ViT 91 127 0 10 Jun 2021
Multiscale Vision Transformers Haoqi Fan Bo Xiong K. Mangalam Yanghao Li Zhicheng Yan Jitendra Malik Christoph Feichtenhofer ViT 135 1,265 0 22 Apr 2021
Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos Yanghao Li Tushar Nagarajan Bo Xiong Kristen Grauman EgoV 99 94 0 16 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval Max Bain Arsha Nagrani Gül Varol Andrew Zisserman VGen 170 1,189 0 01 Apr 2021
ViViT: A Video Vision Transformer Anurag Arnab Mostafa Dehghani G. Heigold Chen Sun Mario Lucic Cordelia Schmid ViT 225 2,168 0 29 Mar 2021
Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization Aysim Toker Qunjie Zhou Maxim Maximov Laura Leal-Taixé 70 151 0 11 Mar 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision Chao Jia Yinfei Yang Ye Xia Yi-Ting Chen Zarana Parekh Hieu H. Pham Quoc V. Le Yun-hsuan Sung Zhen Li Tom Duerig VLM CLIP 469 3,906 0 11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding? Gedas Bertasius Heng Wang Lorenzo Torresani ViT 403 2,066 0 09 Feb 2021
Understanding Human Hands in Contact at Internet Scale Dandan Shan Jiaqi Geng Michelle Shu David Fouhey 108 325 0 11 Jun 2020
Egocentric Object Manipulation Graphs Eadom Dessalene Michael Maynord Chinmaya Devaraj Cornelia Fermuller Yiannis Aloimonos EgoV 76 19 0 05 Jun 2020
Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching Yujiao Shi Xin Yu Dylan Campbell Hongdong Li 67 174 0 08 May 2020
Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video Antonino Furnari G. Farinella EgoV 57 141 0 04 May 2020
X3D: Expanding Architectures for Efficient Video Recognition Christoph Feichtenhofer 146 1,024 0 09 Apr 2020
Vec2Face: Unveil Human Faces from their Blackbox Features in Face Recognition C. Duong Thanh-Dat Truong Kha Gia Quach Hung Bui Kaushik Roy Khoa Luu CVBM 65 54 0 16 Mar 2020
Exocentric to Egocentric Image Generation via Parallel Generative Adversarial Network Gaowen Liu Hao Tang Hugo Latapie Yan Yan GAN 68 29 0 08 Feb 2020
EGO-TOPO: Environment Affordances from Egocentric Video Tushar Nagarajan Yanghao Li Christoph Feichtenhofer Kristen Grauman EgoV 131 124 0 14 Jan 2020
Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video Miao Liu Siyu Tang Yin Li James M. Rehg EgoV 70 21 0 25 Nov 2019
EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition Evangelos Kazakos Arsha Nagrani Andrew Zisserman Dima Damen EgoV 73 339 0 22 Aug 2019
A Short Note on the Kinetics-700 Human Action Dataset João Carreira Eric Noland Chloe Hillier Andrew Zisserman 82 457 0 15 Jul 2019
Bridging the Domain Gap for Ground-to-Aerial Image Matching Krishna Regmi M. Shah 73 154 0 24 Apr 2019
H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions Bugra Tekin Federica Bogo Marc Pollefeys EgoV 97 254 0 10 Apr 2019
Next-Active-Object prediction from Egocentric Videos Antonino Furnari Sebastiano Battiato Kristen Grauman G. Farinella EgoV 57 97 0 10 Apr 2019
SlowFast Networks for Video Recognition Christoph Feichtenhofer Haoqi Fan Jitendra Malik Kaiming He 169 3,286 0 10 Dec 2018
Ego-Downward and Ambient Video based Person Location Association Liang Yang Hao Jiang Jizhong Xiao Zhouyuan Huo EgoV 55 5 0 02 Dec 2018
From Third Person to First Person: Dataset and Baselines for Synthesis and Retrieval Mohamed Elfeki Krishna Regmi Shervin Ardeshir Ali Borji EgoV 58 18 0 01 Dec 2018
LSTA: Long Short-Term Attention for Egocentric Action Recognition Swathikiran Sudhakaran Sergio Escalera Oswald Lanz EgoV 66 143 0 26 Nov 2018
TSM: Temporal Shift Module for Efficient Video Understanding Ji Lin Chuang Gan Song Han 98 1,694 0 20 Nov 2018
Object Level Visual Reasoning in Videos Fabien Baradel Natalia Neverova Christian Wolf J. Mille Greg Mori 97 164 0 16 Jun 2018
Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos Gunnar Sigurdsson Abhinav Gupta Cordelia Schmid Ali Farhadi Alahari Karteek SLR EgoV 78 171 0 25 Apr 2018
Cross-View Image Synthesis using Conditional GANs Krishna Regmi Ali Borji GAN 81 189 0 09 Mar 2018
Computational Optimal Transport Gabriel Peyré Marco Cuturi OT 239 2,158 0 01 Mar 2018
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification Saining Xie Chen Sun Jonathan Huang Zhuowen Tu Kevin Patrick Murphy 3DH 155 1,333 0 13 Dec 2017
A Closer Look at Spatiotemporal Convolutions for Action Recognition Du Tran Heng Wang Lorenzo Torresani Jamie Ray Yann LeCun Manohar Paluri 240 3,033 0 30 Nov 2017