ResearchTrend.AI
Multimodal Action Quality Assessment

31 January 2024
Ling-an Zeng
Wei-Shi Zheng
arXiv: 2402.09444

Papers citing "Multimodal Action Quality Assessment"

49 / 49 papers shown
SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation
Edoardo Bianchi
Antonio Liotta
39
0
0
13 May 2025
Efficient Explicit Joint-level Interaction Modeling with Mamba for Text-guided HOI Generation
Guohong Huang
Ling-an Zeng
Zexin Zheng
Shengbo Gu
Wei-Shi Zheng
55
0
0
29 Mar 2025
Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks
Wei-Jin Huang
Yuan-Ming Li
Zhi-Wei Xia
Yu-Ming Tang
Kun-Yu Lin
Jian-Fang Hu
Wei-Shi Zheng
63
0
0
28 Mar 2025
Progressive Human Motion Generation Based on Text and Few Motion Frames
Ling-an Zeng
Gaojie Wu
Ancong Wu
Jian-Fang Hu
Wei-Shi Zheng
73
1
0
17 Mar 2025
ChainHOI: Joint-based Kinematic Chain Modeling for Human-Object Interaction Generation
Ling-an Zeng
Guohong Huang
Yi-Lin Wei
Shengbo Gu
Yu-Ming Tang
Jingke Meng
Wei-Shi Zheng
81
2
0
17 Mar 2025
TechCoach: Towards Technical-Point-Aware Descriptive Action Coaching
Yuan-Ming Li
An-Lan Wang
Kun-Yu Lin
Yu-Ming Tang
Ling-an Zeng
Jian-Fang Hu
Wei-Shi Zheng
132
6
0
26 Nov 2024
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding
Yuan-Ming Li
Wei-Jin Huang
An-Lan Wang
Ling-an Zeng
Jing-Ke Meng
Wei-Shi Zheng
62
13
0
13 Jun 2024
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Kunchang Li
Yali Wang
Yizhuo Li
Yi Wang
Yinan He
Limin Wang
Yu Qiao
VGen
86
160
0
28 Mar 2023
Multiscale Audio Spectrogram Transformer for Efficient Audio Classification
Wentao Zhu
M. Omar
59
22
0
19 Mar 2023
Action Quality Assessment with Temporal Parsing Transformer
Yang Bai
Desen Zhou
Songyang Zhang
Jian Wang
Errui Ding
Yu Guan
Yang Long
Jingdong Wang
ViT
43
41
0
19 Jul 2022
Probing Visual-Audio Representation for Video Highlight Detection via Hard-Pairs Guided Contrastive Learning
Shuaicheng Li
Feng Zhang
Kunlin Yang
Lin-Na Liu
Shinan Liu
Jun Hou
Shuai Yi
57
8
0
21 Jun 2022
Learning Pixel-Level Distinctions for Video Highlight Detection
Fanyue Wei
Biao Wang
T. Ge
Yuning Jiang
Wen Li
Lixin Duan
15
19
0
10 Apr 2022
FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment
Jinglin Xu
Yongming Rao
Xumin Yu
Guangyi Chen
Jie Zhou
Jiwen Lu
35
88
0
07 Apr 2022
UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection
Ye Liu
Siyuan Li
Yang Wu
C. Chen
Ying Shan
Xiaohu Qie
ViT
50
145
0
23 Mar 2022
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
Juan F. Montesinos
V. S. Kadandale
G. Haro
ViT
44
19
0
08 Mar 2022
Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
Saghir Alfasly
Jian Lu
C. Xu
Yuru Zou
52
18
0
06 Mar 2022
Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment
Paritosh Parmar
Amol Gharat
Helge Rhodin
3DH
SSL
34
19
0
28 Feb 2022
TSA-Net: Tube Self-Attention Network for Action Quality Assessment
Shunli Wang
Dingkang Yang
Peng Zhai
Chixiao Chen
Lihua Zhang
ViT
47
64
0
11 Jan 2022
Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization
Hao Jiang
Calvin Murdock
V. Ithapu
EgoV
49
41
0
06 Jan 2022
Class-aware Sounding Objects Localization via Audiovisual Correspondence
Di Hu
Yake Wei
Rui Qian
Weiyao Lin
Ruihua Song
Ji-Rong Wen
38
41
0
22 Dec 2021
Group-aware Contrastive Regression for Action Quality Assessment
Xumin Yu
Yongming Rao
Wenliang Zhao
Jiwen Lu
Jie Zhou
AI4TS
38
95
0
17 Aug 2021
Towards Unified Surgical Skill Assessment
Daochang Liu
Qiyue Li
Tingting Jiang
Yizhou Wang
R. Miao
F. Shan
Ziyu Li
23
75
0
02 Jun 2021
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
Rameswar Panda
Chun-Fu Chen
Quanfu Fan
Ximeng Sun
Kate Saenko
A. Oliva
Rogerio Feris
41
47
0
11 May 2021
Contrastive Learning of Global-Local Video Representations
Shuang Ma
Zhaoyang Zeng
Daniel J. McDuff
Yale Song
SSL
39
7
0
07 Apr 2021
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
73
849
0
05 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
526
28,659
0
26 Feb 2021
MSAF: Multimodal Split Attention Fusion
Lang Su
Chuqing Hu
Guofa Li
Dongpu Cao
46
37
0
13 Dec 2020
Learning Trailer Moments in Full-Length Movies
Lezi Wang
Dong Liu
R. Puri
Dimitris N. Metaxas
12
42
0
19 Aug 2020
MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection
Fa-Ting Hong
Xuanteng Huang
Weihong Li
Weishi Zheng
35
61
0
20 Jul 2020
Uncertainty-aware Score Distribution Learning for Action Quality Assessment
Yansong Tang
Zanlin Ni
Jiahuan Zhou
Danyang Zhang
Jiwen Lu
Ying Nian Wu
Jie Zhou
EDL
64
124
0
13 Jun 2020
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
222
207
0
23 Jan 2020
Listen to Look: Action Recognition by Previewing Audio
Ruohan Gao
Tae-Hyun Oh
Kristen Grauman
Lorenzo Torresani
VLM
45
251
0
10 Dec 2019
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
113
42,038
0
03 Dec 2019
Less is More: Learning Highlight Detection from Video Duration
Bo Xiong
Yannis Kalantidis
Deepti Ghadiyaram
Kristen Grauman
23
109
0
03 Mar 2019
Manipulation-skill Assessment from Videos with Spatial Attention Network
Zhenqiang Li
Yifei Huang
Minjie Cai
Yoichi Sato
26
59
0
09 Jan 2019
The Pros and Cons: Rank-aware Temporal Attention for Skill Determination in Long Videos
Hazel Doughty
W. Mayol-Cuevas
Dima Damen
48
139
0
13 Dec 2018
Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination
Zhirong Wu
Yuanjun Xiong
Stella X. Yu
Dahua Lin
SSL
127
3,437
0
05 May 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
Andrew Owens
Alexei A. Efros
SSL
63
747
0
10 Apr 2018
Audio-Visual Event Localization in Unconstrained Videos
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
81
431
0
23 Mar 2018
Learning to score the figure skating sports videos
C. Xu
Yanwei Fu
Bing Zhang
Z. Chen
Yu-Gang Jiang
Xiangyang Xue
49
112
0
08 Feb 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
309
129,831
0
12 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
178
7,961
0
22 May 2017
Video and Accelerometer-Based Motion Analysis for Automated Surgical Skills Assessment
Aneeq Zia
Yachna Sharma
Vinay Bettadapura
E. Sarin
Irfan Essa
40
109
0
24 Feb 2017
Learning To Score Olympic Events
Paritosh Parmar
B. Morris
25
167
0
16 Nov 2016
Categorical Reparameterization with Gumbel-Softmax
Eric Jang
S. Gu
Ben Poole
BDL
193
5,323
0
03 Nov 2016
Video2GIF: Automatic Generation of Animated GIFs from Video
Michael Gygli
Yale Song
Liangliang Cao
28
142
0
16 May 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
1.1K
192,638
0
10 Dec 2015
Unsupervised Extraction of Video Highlights Via Robust Recurrent Auto-encoders
Huan Yang
Baoyuan Wang
Stephen Lin
David Wipf
Minyi Guo
B. Guo
79
178
0
06 Oct 2015
A* Sampling
Chris J. Maddison
Daniel Tarlow
T. Minka
52
390
0
31 Oct 2014