Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation

17 August 2016
Rakshith Shetty
Jorma T. Laaksonen

Papers citing "Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation"

9 / 9 papers shown
Visual Subtitle Feature Enhanced Video Outline Generation
Qi Lv
Ziqiang Cao
Wenrui Xie
Derui Wang
Jingwen Wang
...
Yuan-Fang Li
Min Cao
Wenjie Li
Sujian Li
Guohong Fu
VGen
26
0
0
24 Aug 2022
Video Captioning with Text-based Dynamic Attention and Step-by-Step Learning
Huanhou Xiao
Jinglun Shi
11
24
0
05 Nov 2019
Prediction and Description of Near-Future Activities in Video
T. Mahmud
Mohammad Billah
Mahmudul Hasan
Amit K. Roy-Chowdhury
28
16
0
02 Aug 2019
Reconstruct and Represent Video Contents for Captioning via Reinforcement Learning
Wei Zhang
Bairui Wang
Lin Ma
Wei Liu
20
67
0
03 Jun 2019
Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning
Nayyer Aafaq
Naveed Akhtar
Wen Liu
Syed Zulqarnain Gilani
Ajmal Mian
31
204
0
27 Feb 2019
An Attempt towards Interpretable Audio-Visual Video Captioning
Yapeng Tian
Chenxiao Guan
Justin Goodman
Marc Moore
Chenliang Xu
36
20
0
07 Dec 2018
Mining for meaning: from vision to language through multiple networks consensus
Iulia Duta
Andrei Liviu Nicolicioiu
Simion-Vlad Bogolin
Marius Leordeanu
18
3
0
05 Jun 2018
Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning
Xin Wang
Yuan-Fang Wang
William Yang Wang
16
76
0
15 Apr 2018
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
32
353
0
12 May 2016