ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1902.10322
  4. Cited By
Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding
  for Video Captioning

Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning

27 February 2019
Nayyer Aafaq
Naveed Akhtar
Wei Liu
Syed Zulqarnain Gilani
Ajmal Saeed Mian
ArXivPDFHTML

Papers citing "Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning"

22 / 22 papers shown
Title
TechCoach: Towards Technical-Point-Aware Descriptive Action Coaching
TechCoach: Towards Technical-Point-Aware Descriptive Action Coaching
Yuan-Ming Li
An-Lan Wang
Kun-Yu Lin
Yu-Ming Tang
Ling-an Zeng
Jian-Fang Hu
Wei-Shi Zheng
96
6
0
26 Nov 2024
Deep Neural Networks in Video Human Action Recognition: A Review
Deep Neural Networks in Video Human Action Recognition: A Review
Zihan Wang
Yang Yang
Zhi Liu
Y. Zheng
53
4
0
25 May 2023
Refined Semantic Enhancement towards Frequency Diffusion for Video
  Captioning
Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning
Xian Zhong
Zipeng Li
Shuqin Chen
Kui Jiang
Chen Chen
Mang Ye
DiffM
VGen
19
40
0
28 Nov 2022
Thinking Hallucination for Video Captioning
Thinking Hallucination for Video Captioning
Nasib Ullah
Partha Pratim Mohanta
VLM
36
4
0
28 Sep 2022
Multimodal learning with graphs
Multimodal learning with graphs
Yasha Ektefaie
George Dasoulas
Ayush Noori
Maha Farhat
Marinka Zitnik
51
82
0
07 Sep 2022
Large-scale Robustness Analysis of Video Action Recognition Models
Large-scale Robustness Analysis of Video Action Recognition Models
Madeline Chantry Schiappa
Naman Biyani
Prudvi Kamtam
Shruti Vyas
Hamid Palangi
Vibhav Vineet
Y. S. Rawat
AAML
34
24
0
04 Jul 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
41
528
0
27 May 2022
Support-set based Multi-modal Representation Enhancement for Video
  Captioning
Support-set based Multi-modal Representation Enhancement for Video Captioning
Xiaoya Chen
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Hengtao Shen
24
4
0
19 May 2022
Global2Local: A Joint-Hierarchical Attention for Video Captioning
Global2Local: A Joint-Hierarchical Attention for Video Captioning
Chengpeng Dai
Fuhai Chen
Xiaoshuai Sun
Rongrong Ji
QiXiang Ye
Yongjian Wu
17
1
0
13 Mar 2022
Hierarchical Modular Network for Video Captioning
Hierarchical Modular Network for Video Captioning
Hanhua Ye
Guorong Li
Yuankai Qi
Shuhui Wang
Qingming Huang
Ming-Hsuan Yang
19
67
0
24 Nov 2021
Visual-aware Attention Dual-stream Decoder for Video Captioning
Visual-aware Attention Dual-stream Decoder for Video Captioning
Zhixin Sun
Xian Zhong
Shuqin Chen
Lin Li
Luo Zhong
31
3
0
16 Oct 2021
Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal
  Attention
Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal Attention
Katsuyuki Nakamura
Hiroki Ohashi
Mitsuhiro Okada
EgoV
31
12
0
07 Sep 2021
Curious Case of Language Generation Evaluation Metrics: A Cautionary
  Tale
Curious Case of Language Generation Evaluation Metrics: A Cautionary Tale
Ozan Caglayan
Pranava Madhyastha
Lucia Specia
ELM
39
35
0
26 Oct 2020
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded
  Dialogues
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues
Hung Le
Doyen Sahoo
Nancy F. Chen
S. Hoi
40
30
0
20 Oct 2020
Learning Modality Interaction for Temporal Sentence Localization and
  Event Captioning in Videos
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Shaoxiang Chen
Wenhao Jiang
Wei Liu
Yu-Gang Jiang
25
101
0
28 Jul 2020
SBAT: Video Captioning with Sparse Boundary-Aware Transformer
SBAT: Video Captioning with Sparse Boundary-Aware Transformer
Tao Jin
Siyu Huang
Ming Chen
Yingming Li
Zhongfei Zhang
32
52
0
23 Jul 2020
Deep hierarchical pooling design for cross-granularity action
  recognition
Deep hierarchical pooling design for cross-granularity action recognition
A. Mazari
H. Sahbi
21
0
0
08 Jun 2020
Object Relational Graph with Teacher-Recommended Learning for Video
  Captioning
Object Relational Graph with Teacher-Recommended Learning for Video Captioning
Ziqi Zhang
Yaya Shi
Chunfen Yuan
Bing Li
Peijin Wang
Weiming Hu
Zhengjun Zha
VLM
28
271
0
26 Feb 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning
Spatio-Temporal Ranked-Attention Networks for Video Captioning
A. Cherian
Jue Wang
Chiori Hori
Tim K. Marks
AI4TS
22
19
0
17 Jan 2020
Delving Deeper into the Decoder for Video Captioning
Delving Deeper into the Decoder for Video Captioning
Haoran Chen
Jianmin Li
Xiaolin Hu
43
34
0
16 Jan 2020
Relational Reasoning using Prior Knowledge for Visual Captioning
Relational Reasoning using Prior Knowledge for Visual Captioning
Jingyi Hou
Xinxiao Wu
Yayun Qi
Wentian Zhao
Jiebo Luo
Yunde Jia
17
14
0
04 Jun 2019
Memory-augmented Attention Modelling for Videos
Memory-augmented Attention Modelling for Videos
Rasool Fakoor
Abdel-rahman Mohamed
Margaret Mitchell
S. B. Kang
Pushmeet Kohli
35
20
0
07 Nov 2016
1