arXiv: 2305.15699
Cross-view Action Recognition Understanding From Exocentric to Egocentric Perspective

25 May 2023
Thanh-Dat Truong
Khoa Luu
    EgoV

Papers citing "Cross-view Action Recognition Understanding From Exocentric to Egocentric Perspective"

50 / 66 papers shown
FALCON: Fairness Learning via Contrastive Attention Approach to Continual Semantic Scene Understanding
Thanh-Dat Truong
Utsav Prabhu
Bhiksha Raj
Jackson Cothren
Khoa Luu
CLL
161
3
0
27 Nov 2023
Fairness Continual Learning Approach to Semantic Scene Understanding in Open-World Environments
Thanh-Dat Truong
Hoang-Quan Nguyen
Bhiksha Raj
Khoa Luu
CLL
106
14
0
25 May 2023
SVFormer: Semi-supervised Video Transformer for Action Recognition
Zhen Xing
Qi Dai
Hang-Rui Hu
Jingjing Chen
Zuxuan Wu
Yu-Gang Jiang
ViT
87
72
0
23 Nov 2022
Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning
Yuchong Sun
Hongwei Xue
Ruihua Song
Bei Liu
Huan Yang
Jianlong Fu
AI4TS, VLM
78
71
0
12 Oct 2022
M&M Mix: A Multimodal Multiview Transformer Ensemble
Xuehan Xiong
Anurag Arnab
Arsha Nagrani
Cordelia Schmid
ViT
50
20
0
20 Jun 2022
Egocentric Video-Language Pretraining
Kevin Qinghong Lin
Alex Jinpeng Wang
Mattia Soldan
Michael Wray
Rui Yan
...
Hongfa Wang
Dima Damen
Guohao Li
Wei Liu
Mike Zheng Shou
VLM, EgoV
84
206
0
03 Jun 2022
OTAdapt: Optimal Transport-based Approach For Unsupervised Domain Adaptation
Thanh-Dat Truong
N. V. R. Chappa
Xuan-Bac Nguyen
Ngan Le
Ashley Dowling
Khoa Luu
OOD, OT
84
11
0
22 May 2022
TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization
Sijie Zhu
M. Shah
Chong Chen
ViT
94
160
0
31 Mar 2022
Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos
Tomáš Souček
Jean-Baptiste Alayrac
Antoine Miech
Ivan Laptev
Josef Sivic
75
33
0
22 Mar 2022
DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition
Thanh-Dat Truong
Quoc-Huy Bui
C. Duong
Han-Seok Seo
Son Lam Phung
Xin Li
Khoa Luu
ViT
113
50
0
19 Mar 2022
All in One: Exploring Unified Video-Language Pre-training
Alex Jinpeng Wang
Yixiao Ge
Rui Yan
Yuying Ge
Xudong Lin
Guanyu Cai
Jianping Wu
Ying Shan
Xiaohu Qie
Mike Zheng Shou
95
202
0
14 Mar 2022
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction
Yunze Liu
Yun-Hai Liu
Chen Jiang
Kangbo Lyu
Weikang Wan
Hao Shen
Bo-Hua Liang
Zhoujie Fu
He Wang
Li Yi
112
188
0
03 Mar 2022
Multiview Transformers for Video Recognition
Shen Yan
Xuehan Xiong
Anurag Arnab
Zhichao Lu
Mi Zhang
Chen Sun
Cordelia Schmid
ViT
78
221
0
12 Jan 2022
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
155
693
0
02 Dec 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
410
1,114
0
13 Oct 2021
TAda! Temporally-Adaptive Convolutions for Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Mingqian Tang
Ziwei Liu
M. Ang
101
49
0
12 Oct 2021
Video Swin Transformer
Ze Liu
Jia Ning
Yue Cao
Yixuan Wei
Zheng Zhang
Stephen Lin
Han Hu
ViT
121
1,490
0
24 Jun 2021
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
91
127
0
10 Jun 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
135
1,265
0
22 Apr 2021
Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos
Yanghao Li
Tushar Nagarajan
Bo Xiong
Kristen Grauman
EgoV
99
94
0
16 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
170
1,189
0
01 Apr 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
225
2,168
0
29 Mar 2021
Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization
Aysim Toker
Qunjie Zhou
Maxim Maximov
Laura Leal-Taixé
70
151
0
11 Mar 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM, CLIP
469
3,906
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
403
2,066
0
09 Feb 2021
Understanding Human Hands in Contact at Internet Scale
Dandan Shan
Jiaqi Geng
Michelle Shu
David Fouhey
108
325
0
11 Jun 2020
Egocentric Object Manipulation Graphs
Eadom Dessalene
Michael Maynord
Chinmaya Devaraj
Cornelia Fermuller
Yiannis Aloimonos
EgoV
76
19
0
05 Jun 2020
Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching
Yujiao Shi
Xin Yu
Dylan Campbell
Hongdong Li
67
174
0
08 May 2020
Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video
Antonino Furnari
G. Farinella
EgoV
57
141
0
04 May 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
146
1,024
0
09 Apr 2020
Vec2Face: Unveil Human Faces from their Blackbox Features in Face Recognition
C. Duong
Thanh-Dat Truong
Kha Gia Quach
Hung Bui
Kaushik Roy
Khoa Luu
CVBM
65
54
0
16 Mar 2020
Exocentric to Egocentric Image Generation via Parallel Generative Adversarial Network
Gaowen Liu
Hao Tang
Hugo Latapie
Yan Yan
GAN
68
29
0
08 Feb 2020
EGO-TOPO: Environment Affordances from Egocentric Video
Tushar Nagarajan
Yanghao Li
Christoph Feichtenhofer
Kristen Grauman
EgoV
131
124
0
14 Jan 2020
Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video
Miao Liu
Siyu Tang
Yin Li
James M. Rehg
EgoV
70
21
0
25 Nov 2019
EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition
Evangelos Kazakos
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
73
339
0
22 Aug 2019
A Short Note on the Kinetics-700 Human Action Dataset
João Carreira
Eric Noland
Chloe Hillier
Andrew Zisserman
82
457
0
15 Jul 2019
Bridging the Domain Gap for Ground-to-Aerial Image Matching
Krishna Regmi
M. Shah
73
154
0
24 Apr 2019
H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions
Bugra Tekin
Federica Bogo
Marc Pollefeys
EgoV
97
254
0
10 Apr 2019
Next-Active-Object prediction from Egocentric Videos
Antonino Furnari
Sebastiano Battiato
Kristen Grauman
G. Farinella
EgoV
57
97
0
10 Apr 2019
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
169
3,286
0
10 Dec 2018
Ego-Downward and Ambient Video based Person Location Association
Liang Yang
Hao Jiang
Jizhong Xiao
Zhouyuan Huo
EgoV
55
5
0
02 Dec 2018
From Third Person to First Person: Dataset and Baselines for Synthesis and Retrieval
Mohamed Elfeki
Krishna Regmi
Shervin Ardeshir
Ali Borji
EgoV
58
18
0
01 Dec 2018
LSTA: Long Short-Term Attention for Egocentric Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
EgoV
66
143
0
26 Nov 2018
TSM: Temporal Shift Module for Efficient Video Understanding
Ji Lin
Chuang Gan
Song Han
98
1,694
0
20 Nov 2018
Object Level Visual Reasoning in Videos
Fabien Baradel
Natalia Neverova
Christian Wolf
J. Mille
Greg Mori
97
164
0
16 Jun 2018
Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos
Gunnar Sigurdsson
Abhinav Gupta
Cordelia Schmid
Ali Farhadi
Alahari Karteek
SLR, EgoV
78
171
0
25 Apr 2018
Cross-View Image Synthesis using Conditional GANs
Krishna Regmi
Ali Borji
GAN
81
189
0
09 Mar 2018
Computational Optimal Transport
Gabriel Peyré
Marco Cuturi
OT
239
2,158
0
01 Mar 2018
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification
Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Patrick Murphy
3DH
155
1,333
0
13 Dec 2017
A Closer Look at Spatiotemporal Convolutions for Action Recognition
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
240
3,033
0
30 Nov 2017