ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1501.02530
  4. Cited By
A Dataset for Movie Description

A Dataset for Movie Description

12 January 2015
Anna Rohrbach
Marcus Rohrbach
Niket Tandon
Bernt Schiele
    VGen
ArXivPDFHTML

Papers citing "A Dataset for Movie Description"

50 / 257 papers shown
Title
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
Maksim Dzabraev
M. Kalashnikov
Stepan Alekseevich Komkov
Aleksandr Petiushko
24
128
0
19 Mar 2021
On Semantic Similarity in Video Retrieval
On Semantic Similarity in Video Retrieval
Michael Wray
Hazel Doughty
Dima Damen
33
66
0
18 Mar 2021
A Straightforward Framework For Video Retrieval Using CLIP
A Straightforward Framework For Video Retrieval Using CLIP
Jesús Andrés Portillo-Quintero
J. C. Ortíz-Bayliss
Hugo Terashima-Marín
CLIP
324
117
0
24 Feb 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse
  Sampling
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
46
648
0
11 Feb 2021
The Role of the Input in Natural Language Video Description
The Role of the Input in Natural Language Video Description
S. Cascianelli
G. Costante
Alessandro Devo
Thomas Alessandro Ciarfuglia
P. Valigi
M. L. Fravolini
21
5
0
09 Feb 2021
Narration Generation for Cartoon Videos
Narration Generation for Cartoon Videos
Nikos Papasarantopoulos
Shay B. Cohen
VGen
25
2
0
17 Jan 2021
Recent Advances in Video Question Answering: A Review of Datasets and
  Methods
Recent Advances in Video Question Answering: A Review of Datasets and Methods
Devshree Patel
Ratnam Parikh
Yesha Shastri
15
18
0
15 Jan 2021
Learning Temporal Dynamics from Cycles in Narrated Video
Learning Temporal Dynamics from Cycles in Narrated Video
Dave Epstein
Jiajun Wu
Cordelia Schmid
Chen Sun
AI4TS
38
14
0
07 Jan 2021
DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded
  Dialogue
DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue
Hung Le
Chinnadhurai Sankar
Seungwhan Moon
Ahmad Beirami
A. Geramifard
Satwik Kottur
VGen
39
18
0
01 Jan 2021
Movie Summarization via Sparse Graph Construction
Movie Summarization via Sparse Graph Construction
Pinelopi Papalampidi
Frank Keller
Mirella Lapata
27
32
0
14 Dec 2020
MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision
  and Language Research in Turkish
MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish
Begum Citamak
Ozan Caglayan
Menekse Kuyu
Erkut Erdem
Aykut Erdem
Pranava Madhyastha
Lucia Specia
31
8
0
13 Dec 2020
A Comprehensive Review on Recent Methods and Challenges of Video
  Description
A Comprehensive Review on Recent Methods and Challenges of Video Description
Ashutosh Kumar Singh
Thoudam Doren Singh
Sivaji Bandyopadhyay
3DV
VLM
19
5
0
30 Nov 2020
QuerYD: A video dataset with high-quality text and audio narrations
QuerYD: A video dataset with high-quality text and audio narrations
Andreea-Maria Oncescu
João F. Henriques
Yang Liu
Andrew Zisserman
Samuel Albanie
VGen
22
11
0
22 Nov 2020
Video Action Understanding
Video Action Understanding
Matthew Hutchinson
V. Gadepally
43
20
0
13 Oct 2020
Dual Encoding for Video Retrieval by Text
Dual Encoding for Video Retrieval by Text
Jianfeng Dong
Xirong Li
Chaoxi Xu
Xun Yang
Gang Yang
Xun Wang
Meng Wang
24
2
0
10 Sep 2020
Identity-Aware Multi-Sentence Video Description
Identity-Aware Multi-Sentence Video Description
J. S. Park
Trevor Darrell
Anna Rohrbach
26
17
0
22 Aug 2020
Text-based Localization of Moments in a Video Corpus
Text-based Localization of Moments in a Video Corpus
Sudipta Paul
Niluthpol Chowdhury Mithun
Amit K. Roy-Chowdhury
10
14
0
20 Aug 2020
Poet: Product-oriented Video Captioner for E-commerce
Poet: Product-oriented Video Captioner for E-commerce
Shengyu Zhang
Ziqi Tan
Jin Yu
Zhou Zhao
Kun Kuang
Jie Liu
Jingren Zhou
Hongxia Yang
Fei Wu
14
34
0
16 Aug 2020
Enriching Video Captions With Contextual Text
Enriching Video Captions With Contextual Text
Philipp Rimle
Pelin Dogan
Markus Gross
30
3
0
29 Jul 2020
Active Learning for Video Description With Cluster-Regularized Ensemble
  Ranking
Active Learning for Video Description With Cluster-Regularized Ensemble Ranking
David M. Chan
Sudheendra Vijayanarasimhan
David A. Ross
John F. Canny
VLM
14
6
0
27 Jul 2020
MovieNet: A Holistic Dataset for Movie Understanding
MovieNet: A Holistic Dataset for Movie Understanding
Qingqiu Huang
Yu Xiong
Anyi Rao
Jiaze Wang
Dahua Lin
VGen
45
235
0
21 Jul 2020
Multi-modal Transformer for Video Retrieval
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
430
596
0
21 Jul 2020
Knowledge-Based Video Question Answering with Unsupervised Scene
  Descriptions
Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions
Noa Garcia
Yuta Nakashima
26
32
0
17 Jul 2020
Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval
Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval
Xun Yang
Jianfeng Dong
Yixin Cao
Xun Wang
Meng Wang
Tat-Seng Chua
33
137
0
06 Jul 2020
Auto-captions on GIF: A Large-scale Video-sentence Dataset for
  Vision-language Pre-training
Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training
Yingwei Pan
Yehao Li
Jianjie Luo
Jun Xu
Ting Yao
Tao Mei
38
57
0
05 Jul 2020
Comprehensive Information Integration Modeling Framework for Video
  Titling
Comprehensive Information Integration Modeling Framework for Video Titling
Shengyu Zhang
Ziqi Tan
Jin Yu
Zhou Zhao
Kun Kuang
Tan Jiang
Jingren Zhou
Hongxia Yang
Fei Wu
31
40
0
24 Jun 2020
Rescaling Egocentric Vision
Rescaling Egocentric Vision
Dima Damen
Hazel Doughty
G. Farinella
Antonino Furnari
Evangelos Kazakos
...
Davide Moltisanti
Jonathan Munro
Toby Perrett
Will Price
Michael Wray
EgoV
19
437
0
23 Jun 2020
The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines
The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines
Dima Damen
Hazel Doughty
G. Farinella
Sanja Fidler
Antonino Furnari
...
Davide Moltisanti
Jonathan Munro
Toby Perrett
Will Price
Michael Wray
EgoV
23
225
0
29 Apr 2020
Noise Estimation Using Density Estimation for Self-Supervised Multimodal
  Learning
Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning
Elad Amrani
Rami Ben-Ari
Daniel Rotman
A. Bronstein
17
121
0
06 Mar 2020
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
Darryl Hannan
Akshay Jain
Joey Tianyi Zhou
AAML
38
57
0
22 Jan 2020
On the Evaluation of Intelligent Process Automation
On the Evaluation of Intelligent Process Automation
Deborah Ferreira
Julia Rozanova
K. Dubba
Dell Zhang
André Freitas
9
9
0
08 Jan 2020
End-to-End Learning of Visual Representations from Uncurated
  Instructional Videos
End-to-End Learning of Visual Representations from Uncurated Instructional Videos
Antoine Miech
Jean-Baptiste Alayrac
Lucas Smaira
Ivan Laptev
Josef Sivic
Andrew Zisserman
VGen
SSL
42
703
0
13 Dec 2019
Assessing the Robustness of Visual Question Answering Models
Assessing the Robustness of Visual Question Answering Models
Jia-Hong Huang
Modar Alfadly
Guohao Li
M. Worring
AAML
OOD
23
23
0
30 Nov 2019
A Graph-Based Framework to Bridge Movies and Synopses
A Graph-Based Framework to Bridge Movies and Synopses
Yu Xiong
Chengyi Zhang
Lingfeng Guo
Hang Zhou
Bolei Zhou
Dahua Lin
27
62
0
24 Oct 2019
Embodied Language Grounding with 3D Visual Feature Representations
Embodied Language Grounding with 3D Visual Feature Representations
Mihir Prabhudesai
H. Tung
Syed Ashar Javed
Maximilian Sieb
Adam W. Harley
Katerina Fragkiadaki
25
21
0
02 Oct 2019
Proposal-free Temporal Moment Localization of a Natural-Language Query
  in Video using Guided Attention
Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention
Cristian Rodriguez-Opazo
Edison Marrese-Taylor
F. Saleh
Hongdong Li
Stephen Gould
30
147
0
20 Aug 2019
Use What You Have: Video Retrieval Using Representations From
  Collaborative Experts
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
Yang Liu
Samuel Albanie
Arsha Nagrani
Andrew Zisserman
36
387
0
31 Jul 2019
Finding Moments in Video Collections Using Natural Language
Finding Moments in Video Collections Using Natural Language
Victor Escorcia
Mattia Soldan
Josef Sivic
Guohao Li
Bryan C. Russell
31
6
0
30 Jul 2019
Trends in Integration of Vision and Language Research: A Survey of
  Tasks, Datasets, and Methods
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
25
133
0
22 Jul 2019
Cross-Lingual Transfer Learning for Question Answering
Cross-Lingual Transfer Learning for Question Answering
Chia-Hsuan Lee
Hung-yi Lee
28
23
0
13 Jul 2019
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million
  Narrated Video Clips
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
27
1,175
0
07 Jun 2019
Synthetic Defocus and Look-Ahead Autofocus for Casual Videography
Synthetic Defocus and Look-Ahead Autofocus for Casual Videography
X. Zhang
Kevin Blackburn-Matzen
Vivien Nguyen
Dillon Yao
You Zhang
Ren Ng
VGen
26
37
0
15 May 2019
VATEX: A Large-Scale, High-Quality Multilingual Dataset for
  Video-and-Language Research
VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research
Xin Eric Wang
Jiawei Wu
Junkun Chen
Lei Li
Yuan-fang Wang
William Yang Wang
32
539
0
06 Apr 2019
M-VAD Names: a Dataset for Video Captioning with Naming
M-VAD Names: a Dataset for Video Captioning with Naming
S. Pini
Marcella Cornia
Federico Bolelli
Lorenzo Baraldi
Rita Cucchiara
27
29
0
04 Mar 2019
Hierarchical LSTMs with Adaptive Attention for Visual Captioning
Hierarchical LSTMs with Adaptive Attention for Visual Captioning
Jingkuan Song
Xiangpeng Li
Lianli Gao
Heng Tao Shen
23
221
0
26 Dec 2018
MTLE: A Multitask Learning Encoder of Visual Feature Representations for
  Video and Movie Description
MTLE: A Multitask Learning Encoder of Visual Feature Representations for Video and Movie Description
Oliver A. Nina
Washington Garcia
Scott Clouse
Alper Yilmaz
23
4
0
19 Sep 2018
Dual Encoding for Zero-Example Video Retrieval
Dual Encoding for Zero-Example Video Retrieval
Jianfeng Dong
Xirong Li
Chaoxi Xu
S. Ji
Yuan He
Gang Yang
Xun Wang
30
268
0
17 Sep 2018
LiveBot: Generating Live Video Comments Based on Visual and Textual
  Contexts
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts
Shuming Ma
Lei Cui
Damai Dai
Furu Wei
Xu Sun
VGen
28
61
0
13 Sep 2018
TVQA: Localized, Compositional Video Question Answering
TVQA: Localized, Compositional Video Question Answering
Muhammad Abdul Wahab
Licheng Yu
Mounir Nasr Allah
Tamara L. Berg
36
617
0
05 Sep 2018
A Joint Sequence Fusion Model for Video Question Answering and Retrieval
A Joint Sequence Fusion Model for Video Question Answering and Retrieval
Youngjae Yu
Jongseok Kim
Gunhee Kim
40
340
0
07 Aug 2018
Previous
123456
Next