Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning

31 May 2024

Papers citing "Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning"

8 / 8 papers shown

Title
Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition | Yang Chen, Jingcai Guo, Song Guo, Dacheng Tao | 37 0 0 | 18 Nov 2024
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval | Hailang Huang, Zhijie Nie, Ziqiao Wang, Ziyu Shang | 35 10 0 | 08 Mar 2024
Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding | Shengkai Sun, Daizong Liu, Jianfeng Dong, Xiaoye Qu, Junyu Gao, Xun Yang, Xun Wang, Meng Wang | OffRL 27 14 0 | 06 Nov 2023
Prompt-Guided Zero-Shot Anomaly Action Recognition using Pretrained Deep Skeleton Features | Fumiaki Sato, Ryo Hachiuma, Taiki Sekii | 42 22 0 | 27 Mar 2023
SkeletonMAE: Spatial-Temporal Masked Autoencoders for Self-supervised Skeleton Action Recognition | Wenhan Wu, Yilei Hua, Ce Zheng, Shi-Bao Wu, Cheng Chen, Aidong Lu | ViT 30 31 0 | 01 Sep 2022
ActionCLIP: A New Paradigm for Video Action Recognition | Mengmeng Wang, Jiazheng Xing, Yong Liu | VLM 152 362 0 | 17 Sep 2021
3D Human Action Representation Learning via Cross-View Consistency Pursuit | Linguo Li, Minsi Wang, Bingbing Ni, Hang Wang, Jiancheng Yang, Wenjun Zhang | 135 156 0 | 29 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu H. Pham, Quoc V. Le, Yun-hsuan Sung, Zhen Li, Tom Duerig | VLM CLIP 298 3,700 0 | 11 Feb 2021