ResearchTrend.AI
arXiv: 1708.09667
Video Captioning with Guidance of Multimodal Latent Topics

31 August 2017
Shizhe Chen
Jia Chen
Qin Jin
Alexander G. Hauptmann

Papers citing "Video Captioning with Guidance of Multimodal Latent Topics"

10 / 10 papers shown
ADAPT: Action-aware Driving Caption Transformer
Bu Jin
Xinyi Liu
Yupeng Zheng
Pengfei Li
Hao Zhao
Tong Zhang
Yuhang Zheng
Guyue Zhou
Jingjing Liu
25
69
0
01 Feb 2023
Relational Graph Learning for Grounded Video Description Generation
Wenqiao Zhang
Qing Guo
Siliang Tang
Haizhou Shi
Haochen Shi
Jun Xiao
Yueting Zhuang
Luu Anh Tuan
27
33
0
02 Dec 2021
CLIP Meets Video Captioning: Concept-Aware Representation Learning Does Matter
Bang-ju Yang
Tong Zhang
Yuexian Zou
CLIP
25
20
0
30 Nov 2021
Hierarchical Multimodal Transformer to Summarize Videos
Bin Zhao
Maoguo Gong
Xuelong Li
ViT
30
55
0
22 Sep 2021
End-to-End Dense Video Captioning with Parallel Decoding
Teng Wang
Ruimao Zhang
Zhichao Lu
Feng Zheng
Ran Cheng
Ping Luo
3DV
47
179
0
17 Aug 2021
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Shaoxiang Chen
Wenhao Jiang
Wei Liu
Yu-Gang Jiang
25
101
0
28 Jul 2020
Better Captioning with Sequence-Level Exploration
Jia Chen
Qin Jin
37
12
0
08 Mar 2020
Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning
Jingwen Chen
Yingwei Pan
Yehao Li
Ting Yao
Hongyang Chao
Tao Mei
21
104
0
03 May 2019
An Attempt towards Interpretable Audio-Visual Video Captioning
Yapeng Tian
Chenxiao Guan
Justin Goodman
Marc Moore
Chenliang Xu
36
20
0
07 Dec 2018
Mining for meaning: from vision to language through multiple networks consensus
Iulia Duta
Andrei Liviu Nicolicioiu
Simion-Vlad Bogolin
Marius Leordeanu
18
3
0
05 Jun 2018