ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2112.00431
  4. Cited By
MAD: A Scalable Dataset for Language Grounding in Videos from Movie
  Audio Descriptions
v1v2 (latest)

MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions

1 December 2021
Mattia Soldan
Alejandro Pardo
Juan Carlos León Alcázar
Fabian Caba Heilbron
Chen Zhao
Silvio Giancola
Guohao Li
    VGen
ArXiv (abs)PDFHTMLGithub (164★)

Papers citing "MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions"

32 / 32 papers shown
Title
DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos
DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos
Zijia Lu
A S M Iftekhar
Gaurav Mittal
Tianjian Meng
Xiawei Wang
Cheng Zhao
Rohith Kukkala
Ehsan Elhamifar
Mei Chen
64
0
0
22 May 2025
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffMVGen
189
3
0
22 Nov 2024
Contextual AD Narration with Interleaved Multimodal Sequence
Contextual AD Narration with Interleaved Multimodal Sequence
Hanlin Wang
Zhan Tong
Kecheng Zheng
Yujun Shen
Limin Wang
VGen
100
4
0
19 Mar 2024
Towards Debiasing Temporal Sentence Grounding in Video
Towards Debiasing Temporal Sentence Grounding in Video
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
90
16
0
08 Nov 2021
A Survey on Temporal Sentence Grounding in Videos
A Survey on Temporal Sentence Grounding in Videos
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Zhi Wang
Wenwu Zhu
92
47
0
16 Sep 2021
MovieCuts: A New Dataset and Benchmark for Cut Type Recognition
MovieCuts: A New Dataset and Benchmark for Cut Type Recognition
Alejandro Pardo
Fabian Caba Heilbron
Juan Carlos León Alcázar
Ali K. Thabet
Guohao Li
VGen
68
28
0
12 Sep 2021
Learning to Cut by Watching Movies
Learning to Cut by Watching Movies
Alejandro Pardo
Fabian Caba Heilbron
Juan Carlos León Alcázar
Ali K. Thabet
Guohao Li
VGen
90
20
0
09 Aug 2021
Transcript to Video: Efficient Clip Sequencing from Texts
Transcript to Video: Efficient Clip Sequencing from Texts
Yu Xiong
Fabian Caba Heilbron
Dahua Lin
CLIP
48
10
0
25 Jul 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip
  Retrieval
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIPVLM
419
809
0
18 Apr 2021
Embracing Uncertainty: Decoupling and De-bias for Robust Temporal
  Grounding
Embracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding
Hao Zhou
Chongyang Zhang
Yan Luo
Yanjun Chen
Chuanping Hu
53
52
0
31 Mar 2021
Context-aware Biaffine Localizing Network for Temporal Sentence
  Grounding
Context-aware Biaffine Localizing Network for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Yu Cheng
Wei Wei
Zichuan Xu
Yulai Xie
63
145
0
22 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIPVLM
969
29,810
0
26 Feb 2021
Video Self-Stitching Graph Network for Temporal Action Localization
Video Self-Stitching Graph Network for Temporal Action Localization
Chen Zhao
Ali K. Thabet
Guohao Li
76
141
0
30 Nov 2020
VLG-Net: Video-Language Graph Matching Network for Video Grounding
VLG-Net: Video-Language Graph Matching Network for Video Grounding
Mattia Soldan
Mengmeng Xu
Sisi Qu
Jesper N. Tegnér
Guohao Li
81
70
0
19 Nov 2020
Uncovering Hidden Challenges in Query-Based Video Moment Retrieval
Uncovering Hidden Challenges in Query-Based Video Moment Retrieval
Mayu Otani
Yuta Nakashima
Esa Rahtu
J. Heikkilä
130
76
0
01 Sep 2020
Jointly Cross- and Self-Modal Graph Attention Network for Query-Based
  Moment Localization
Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization
Daizong Liu
Xiaoye Qu
Xiao-Yang Liu
Jianfeng Dong
Pan Zhou
Zichuan Xu
75
129
0
04 Aug 2020
DeeperGCN: All You Need to Train Deeper GCNs
DeeperGCN: All You Need to Train Deeper GCNs
Guohao Li
Chenxin Xiong
Ali K. Thabet
Guohao Li
GNN
226
442
0
13 Jun 2020
Local-Global Video-Text Interactions for Temporal Grounding
Local-Global Video-Text Interactions for Temporal Grounding
Jonghwan Mun
Minsu Cho
Bohyung Han
81
269
0
16 Apr 2020
Dense Regression Network for Video Grounding
Dense Regression Network for Video Grounding
Runhao Zeng
Haoming Xu
Wenbing Huang
Peihao Chen
Mingkui Tan
Chuang Gan
81
283
0
07 Apr 2020
Learning 2D Temporal Adjacent Networks for Moment Localization with
  Natural Language
Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language
Songyang Zhang
Houwen Peng
Jianlong Fu
Jiebo Luo
75
470
0
08 Dec 2019
G-TAD: Sub-Graph Localization for Temporal Action Detection
G-TAD: Sub-Graph Localization for Temporal Action Detection
Mengmeng Xu
Chen Zhao
D. Rojas
Ali K. Thabet
Guohao Li
127
437
0
26 Nov 2019
DeepGCNs: Making GCNs Go as Deep as CNNs
DeepGCNs: Making GCNs Go as Deep as CNNs
Ge Li
Matthias Muller
Guocheng Qian
Itzel C. Delgadillo
Abdulellah Abualshour
Ali K. Thabet
Guohao Li
3DPCGNN
83
174
0
15 Oct 2019
DeepGCNs: Can GCNs Go as Deep as CNNs?
DeepGCNs: Can GCNs Go as Deep as CNNs?
Ge Li
Matthias Muller
Ali K. Thabet
Guohao Li
3DPCGNN
130
1,350
0
07 Apr 2019
Localizing Moments in Video with Natural Language
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
123
949
0
04 Aug 2017
TALL: Temporal Activity Localization via Language Query
TALL: Temporal Activity Localization via Language Query
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
127
824
0
05 May 2017
Dense-Captioning Events in Videos
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
144
1,249
0
02 May 2017
Movie Description
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DVVGen
81
359
0
12 May 2016
Hollywood in Homes: Crowdsourcing Data Collection for Activity
  Understanding
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Gunnar Sigurdsson
Gül Varol
Xinyu Wang
Ali Farhadi
Ivan Laptev
Abhinav Gupta
VGen
111
1,246
0
06 Apr 2016
Using Descriptive Video Services to Create a Large Data Source for Video
  Annotation Research
Using Descriptive Video Services to Create a Large Data Source for Video Annotation Research
Atousa Torabi
C. Pal
Hugo Larochelle
Aaron Courville
VGen
101
205
0
03 Mar 2015
A Dataset for Movie Description
A Dataset for Movie Description
Anna Rohrbach
Marcus Rohrbach
Niket Tandon
Bernt Schiele
VGen
124
502
0
12 Jan 2015
Adam: A Method for Stochastic Optimization
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.0K
150,312
0
22 Dec 2014
Coherent Multi-Sentence Video Description with Variable Level of Detail
Coherent Multi-Sentence Video Description with Variable Level of Detail
Anna Rohrbach
Marcus Rohrbach
Weijian Qiu
Annemarie Friedrich
Sikandar Amin
Mykhaylo Andriluka
Manfred Pinkal
Bernt Schiele
91
218
0
24 Mar 2014
1