It's Just Another Day: Unique Video Captioning by Discriminative Prompting

15 October 2024

Papers citing "It's Just Another Day: Unique Video Captioning by Discriminative Prompting"

23 / 23 papers shown

Title
Progress-Aware Video Frame Captioning Zihui Xue Joungbin An Xitong Yang Kristen Grauman 217 1 0 03 Dec 2024
Step Differences in Instructional Video Tushar Nagarajan Lorenzo Torresani VGen 98 5 0 24 Apr 2024
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs Shengbang Tong Zhuang Liu Yuexiang Zhai Yi-An Ma Yann LeCun Saining Xie VLM MLLM 118 349 0 11 Jan 2024
AutoAD: Movie Description in Context Tengda Han Max Bain Arsha Nagrani Gül Varol Weidi Xie Andrew Zisserman VGen 74 35 0 29 Mar 2023
EVA-CLIP: Improved Training Techniques for CLIP at Scale Quan-Sen Sun Yuxin Fang Ledell Yu Wu Xinlong Wang Yue Cao CLIP VLM 151 513 0 27 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models Junnan Li Dongxu Li Silvio Savarese Steven C. H. Hoi VLM MLLM 432 4,663 0 30 Jan 2023
Egocentric Video-Language Pretraining Kevin Qinghong Lin Alex Jinpeng Wang Mattia Soldan Michael Wray Rui Yan ... Hongfa Wang Dima Damen Guohao Li Wei Liu Mike Zheng Shou VLM EgoV 89 206 0 03 Jun 2022
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners Zhenhailong Wang Manling Li Ruochen Xu Luowei Zhou Jie Lei ... Chenguang Zhu Derek Hoiem Shih-Fu Chang Joey Tianyi Zhou Heng Ji MLLM VLM 218 142 0 22 May 2022
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality Tristan Thrush Ryan Jiang Max Bartolo Amanpreet Singh Adina Williams Douwe Kiela Candace Ross CoGe 122 429 0 07 Apr 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation Junnan Li Dongxu Li Caiming Xiong Guosheng Lin MLLM BDL VLM CLIP 557 4,421 0 28 Jan 2022
End-to-end Generative Pretraining for Multimodal Video Captioning Paul Hongsuck Seo Arsha Nagrani Anurag Arnab Cordelia Schmid 73 170 0 20 Jan 2022
SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning Kevin Qinghong Lin Linjie Li Chung-Ching Lin Faisal Ahmed Zhe Gan Zicheng Liu Yumao Lu Lijuan Wang ViT 85 245 0 25 Nov 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video Kristen Grauman Andrew Westbury Eugene Byrne Zachary Chavis Antonino Furnari ... Mike Zheng Shou Antonio Torralba Lorenzo Torresani Mingfei Yan Jitendra Malik EgoV 410 1,114 0 13 Oct 2021
Towards Unique and Informative Captioning of Images Zeyu Wang Berthy Feng Karthik Narasimhan Olga Russakovsky 64 37 0 08 Sep 2020
MovieNet: A Holistic Dataset for Movie Understanding Qingqiu Huang Yu Xiong Anyi Rao Jiaze Wang Dahua Lin VGen 106 244 0 21 Jul 2020
Rescaling Egocentric Vision Dima Damen Hazel Doughty G. Farinella Antonino Furnari Evangelos Kazakos ... Davide Moltisanti Jonathan Munro Toby Perrett Will Price Michael Wray EgoV 122 466 0 23 Jun 2020
Condensed Movies: Story Based Retrieval with Contextual Embeddings Max Bain Arsha Nagrani A. Brown Andrew Zisserman 115 102 0 08 May 2020
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips Antoine Miech Dimitri Zhukov Jean-Baptiste Alayrac Makarand Tapaswi Ivan Laptev Josef Sivic VGen 122 1,208 0 07 Jun 2019
The "something something" video database for learning and evaluating visual common sense Raghav Goyal Samira Ebrahimi Kahou Vincent Michalski Joanna Materzynska S. Westphal ... Moritz Mueller-Freitag F. Hoppe Christian Thurau Ingo Bax Roland Memisevic VLM 108 1,542 0 13 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset João Carreira Andrew Zisserman 245 8,045 0 22 May 2017
Towards Automatic Learning of Procedures from Web Instructional Videos Luowei Zhou Chenliang Xu Jason J. Corso EgoV 79 832 0 28 Mar 2017
MovieQA: Understanding Stories in Movies through Question-Answering Makarand Tapaswi Yukun Zhu Rainer Stiefelhagen Antonio Torralba R. Urtasun Sanja Fidler 120 752 0 09 Dec 2015
A Dataset for Movie Description Anna Rohrbach Marcus Rohrbach Niket Tandon Bernt Schiele VGen 124 502 0 12 Jan 2015