ResearchTrend.AI
Reconstruction Network for Video Captioning

30 March 2018
Bairui Wang
Lin Ma
Wei Zhang
Wen Liu

Papers citing "Reconstruction Network for Video Captioning"

50 / 87 papers shown
Title
PR-DETR: Injecting Position and Relation Prior for Dense Video Captioning
Yizhe Li
Sanping Zhou
Zheng Qin
Le Wang
ViT
22
0
0
19 Jun 2025
F$^3$Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos
Zhaoyu Liu
Kan Jiang
Murong Ma
Zhe Hou
Yun Lin
Jin Song Dong
79
0
0
11 Apr 2025
SPECTRUM: Semantic Processing and Emotion-informed video-Captioning Through Retrieval and Understanding Modalities
Ehsan Faghihi
Mohammedreza Zarenejad
Ali-Asghar Beheshti Shirazi
72
1
0
04 Nov 2024
Procedure-Aware Surgical Video-language Pretraining with Hierarchical Knowledge Augmentation
Kun Yuan
V. Srivastav
Nassir Navab
N. Padoy
142
9
0
30 Sep 2024
VCEval: Rethinking What is a Good Educational Video and How to Automatically Evaluate It
Xiaoxuan Zhu
Zhouhong Gu
Sihang Jiang
Zhixu Li
Hongwei Feng
Yanghua Xiao
94
0
0
15 Jun 2024
Sentiment-oriented Transformer-based Variational Autoencoder Network for Live Video Commenting
Fengyi Fu
Shancheng Fang
Weidong Chen
Zhendong Mao
ViT, VGen
69
4
0
19 Apr 2024
A Challenging Multimodal Video Summary: Simultaneously Extracting and Generating Keyframe-Caption Pairs from Video
Keito Kudo
Haruki Nagasawa
Jun Suzuki
Nobuyuki Shimizu
75
2
0
04 Dec 2023
Multi Sentence Description of Complex Manipulation Action Videos
Fatemeh Ziaeetabar
Reza Safabakhsh
S. Momtazi
M. Tamosiunaite
Florentin Wörgötter
79
3
0
13 Nov 2023
Few-shot Action Recognition with Captioning Foundation Models
Xiang Wang
Shiwei Zhang
Hangjie Yuan
Yingya Zhang
Changxin Gao
Deli Zhao
Nong Sang
VLM
126
7
0
16 Oct 2023
A Hierarchical Graph-based Approach for Recognition and Description Generation of Bimanual Actions in Videos
Fatemeh Ziaeetabar
Reza Safabakhsh
S. Momtazi
M. Tamosiunaite
Florentin Wörgötter
60
2
0
01 Oct 2023
Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
Tongtong Yuan
Xuange Zhang
Kun Liu
Bo Liu
Chen Chen
Jian Jin
Zhenzhen Jiao
AI4TS
107
19
0
25 Sep 2023
Guided Slot Attention for Unsupervised Video Object Segmentation
Minhyeok Lee
Suhwan Cho
Dogyoon Lee
Chaewon Park
Jungho Lee
Sangyoun Lee
VOS
148
11
0
15 Mar 2023
VindLU: A Recipe for Effective Video-and-Language Pretraining
Feng Cheng
Xizi Wang
Jie Lei
David J. Crandall
Joey Tianyi Zhou
Gedas Bertasius
VLM
125
81
0
09 Dec 2022
Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning
Xian Zhong
Zipeng Li
Shuqin Chen
Kui Jiang
Chen Chen
Mang Ye
DiffM, VGen
107
43
0
28 Nov 2022
SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-training
Yuanze Lin
Chen Wei
Huiyu Wang
Alan Yuille
Cihang Xie
3DGS
117
15
0
21 Nov 2022
Visual Commonsense-aware Representation Network for Video Captioning
Pengpeng Zeng
Haonan Zhang
Lianli Gao
Xiangpeng Li
Jin Qian
Hengtao Shen
85
18
0
17 Nov 2022
End-to-End Multimodal Representation Learning for Video Dialog
Huda AlAmri
Anthony Bilic
Michael Hu
Apoorva Beedu
Irfan Essa
90
7
0
26 Oct 2022
Thinking Hallucination for Video Captioning
Nasib Ullah
Partha Pratim Mohanta
VLM
84
6
0
28 Sep 2022
Multi-modal Video Chapter Generation
Xiao Cao
Zitan Chen
Canyu Le
Lei Meng
VGen
70
3
0
26 Sep 2022
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
Haoran Wang
Dongliang He
Wenhao Wu
Boyang Xia
Min Yang
Fu Li
YunLong Yu
Zhong Ji
Errui Ding
Jingdong Wang
64
23
0
21 Aug 2022
Automatic Concept Extraction for Concept Bottleneck-based Video Classification
J. Jeyakumar
Luke Dickens
L. Garcia
Yu Cheng
Diego Ramirez Echavarria
Joseph Noor
Alessandra Russo
Lance M. Kaplan
Erik P. Blasch
Mani B. Srivastava
79
9
0
21 Jun 2022
VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation
Kai Zheng
Xiaotong Chen
Odest Chadwicke Jenkins
Xinze Wang
LM&Ro, CoGe
95
63
0
17 Jun 2022
Egocentric Video-Language Pretraining
Kevin Qinghong Lin
Alex Jinpeng Wang
Mattia Soldan
Michael Wray
Rui Yan
...
Hongfa Wang
Dima Damen
Guohao Li
Wei Liu
Mike Zheng Shou
VLM, EgoV
104
207
0
03 Jun 2022
GL-RG: Global-Local Representation Granularity for Video Captioning
Liqi Yan
Qifan Wang
Yiming Cui
Fuli Feng
Xiaojun Quan
Xinming Zhang
Dongfang Liu
121
59
0
22 May 2022
Global2Local: A Joint-Hierarchical Attention for Video Captioning
Chengpeng Dai
Fuhai Chen
Xiaoshuai Sun
Rongrong Ji
QiXiang Ye
Yongjian Wu
86
1
0
13 Mar 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
186
228
0
18 Feb 2022
An Integrated Approach for Video Captioning and Applications
Soheyla Amirian
T. Taha
Khaled Rasheed
H. Arabnia
64
1
0
23 Jan 2022
Bridging the Gap: Using Deep Acoustic Representations to Learn Grounded Language from Percepts and Raw Speech
Gaoussou Youssouf Kebe
Luke E. Richards
Edward Raff
Francis Ferraro
Cynthia Matuszek
SSL
92
5
0
27 Dec 2021
Syntax Customized Video Captioning by Imitating Exemplar Sentences
Yitian Yuan
Lin Ma
Wenwu Zhu
65
7
0
02 Dec 2021
Hierarchical Modular Network for Video Captioning
Hanhua Ye
Guorong Li
Yuankai Qi
Shuhui Wang
Qingming Huang
Ming-Hsuan Yang
129
70
0
24 Nov 2021
CLIP4Caption: CLIP for Video Caption
Mingkang Tang
Zhanyu Wang
Zhenhua Liu
Fengyun Rao
Dian Li
Xiu Li
CLIP, VLM
91
155
0
13 Oct 2021
End-to-End Dense Video Captioning with Parallel Decoding
Teng Wang
Ruimao Zhang
Zhichao Lu
Feng Zheng
Ran Cheng
Ping Luo
3DV
101
188
0
17 Aug 2021
Cross-Modal Graph with Meta Concepts for Video Captioning
Hao Wang
Guosheng Lin
Steven C. H. Hoi
Chunyan Miao
67
7
0
14 Aug 2021
Discriminative Latent Semantic Graph for Video Captioning
Yang Bai
Junyan Wang
Yang Long
Bingzhang Hu
Yang Song
Maurice Pagnucco
Yu Guan
90
31
0
08 Aug 2021
O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Bang-ju Yang
Shen Ge
Yuexian Zou
Xu Sun
83
32
0
05 Aug 2021
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization
Rui Qian
Yuxi Li
Huabin Liu
John See
Shuangrui Ding
Xian Liu
Dian Li
Weiyao Lin
82
42
0
04 Aug 2021
Towards Diverse Paragraph Captioning for Untrimmed Videos
Yuqing Song
Shizhe Chen
Qin Jin
66
38
0
30 May 2021
Recent Advances and Trends in Multimodal Deep Learning: A Review
Jabeen Summaira
Xi Li
Amin Muhammad Shoib
Songyuan Li
Abdul Jabbar
HAI
245
59
0
24 May 2021
A Survey on Natural Language Video Localization
Xinfang Liu
Xiushan Nie
Zhifang Tan
Jie Guo
Yilong Yin
121
7
0
01 Apr 2021
A Comprehensive Review of the Video-to-Text Problem
Jesus Perez-Martin
B. Bustos
S. Guimarães
I. Sipiran
Jorge A. Pérez
Grethel Coello Said
71
17
0
27 Mar 2021
Exploration of Visual Features and their weighted-additive fusion for Video Captioning
V. PraveenS.
Akhilesh Bharadwaj
Harsh Raj
Janhavi Dadhania
Ganesh Samarth C.A
Nikhil Pareek
S. M. I. S. R. Mahadeva Prasanna
50
1
0
14 Jan 2021
Guidance Module Network for Video Captioning
Xiao Zhang
Chunsheng Liu
F. Chang
43
4
0
20 Dec 2020
A Comprehensive Review on Recent Methods and Challenges of Video Description
Ashutosh Kumar Singh
Thoudam Doren Singh
Sivaji Bandyopadhyay
3DV, VLM
47
5
0
30 Nov 2020
Data-efficient Alignment of Multimodal Sequences by Aligning Gradient Updates and Internal Feature Distributions
Jianan Wang
Boyang Albert Li
Xiangyu Fan
Jing-Hua Lin
Yanwei Fu
54
2
0
15 Nov 2020
Parsimonious Quantile Regression of Financial Asset Tail Dynamics via Sequential Learning
Xing Yan
Weizhong Zhang
Lin Ma
Wen Liu
Qi Wu
AI4TS
38
23
0
16 Oct 2020
Video captioning with stacked attention and semantic hard pull
Md. Mushfiqur Rahman
Thasinul Abedin
Khondokar S. S. Prottoy
Ayana Moshruba
Fazlul Hasan Siddiqui
67
2
0
15 Sep 2020
Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents
Ye Zhu
Yu Wu
Yi Yang
Yan Yan
82
13
0
18 Aug 2020
Poet: Product-oriented Video Captioner for E-commerce
Shengyu Zhang
Ziqi Tan
Jin Yu
Zhou Zhao
Kun Kuang
Jie Liu
Jingren Zhou
Hongxia Yang
Leilei Gan
73
36
0
16 Aug 2020
Self-supervised Video Representation Learning by Pace Prediction
Jiangliu Wang
Jianbo Jiao
Yunhui Liu
SSL, AI4TS
84
237
0
13 Aug 2020
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Shaoxiang Chen
Wenhao Jiang
Wei Liu
Yu-Gang Jiang
99
102
0
28 Jul 2020