MHMS: Multimodal Hierarchical Multimedia Summarization

7 April 2022
Jielin Qiu
Jiacheng Zhu
Mengdi Xu
Franck Dernoncourt
Trung Bui
Zhaowen Wang
Yue Liu
Ding Zhao
Hailin Jin

Papers citing "MHMS: Multimodal Hierarchical Multimedia Summarization"

50 / 59 papers shown
Title
Multi-modal Alignment using Representation Codebook
Jiali Duan
Liqun Chen
Son Tran
Jinyu Yang
Yi Xu
Belinda Zeng
Trishul Chilimbi
55
68
0
28 Feb 2022
Using Optimal Transport as Alignment Objective for fine-tuning Multilingual Contextualized Embeddings
Sawsan Alqahtani
Garima Lalwani
Yi Zhang
Salvatore Romeo
Saab Mansour
OT
54
25
0
06 Oct 2021
TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment
Jianwei Yang
Yonatan Bisk
Jianfeng Gao
81
140
0
23 Aug 2021
Scalable Optimal Transport in High Dimensions for Graph Distances, Embedding Alignment, and More
Johannes Klicpera
Marten Lienen
Stephan Günnemann
OT
49
13
0
14 Jul 2021
See, Hear, Read: Leveraging Multimodality with Guided Attention for Abstractive Text Summarization
Yash Kumar Atri
Shraman Pramanick
Vikram Goyal
Tanmoy Chakraborty
59
35
0
20 May 2021
Shot Contrastive Self-Supervised Learning for Scene Boundary Detection
Shixing Chen
Xiaohan Nie
David D. Fan
Dongqing Zhang
Vimal Bhat
Raffay Hamid
SSL
49
62
0
28 Apr 2021
Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation
M. Sarfraz
Naila Murray
Vivek Sharma
Ali Diba
Luc Van Gool
Rainer Stiefelhagen
71
71
0
20 Mar 2021
Video Summarization Using Deep Neural Networks: A Survey
Evlampios Apostolidis
E. Adamantidou
Alexandros I. Metsai
Vasileios Mezaris
Ioannis Patras
AI4TS
81
210
0
15 Jan 2021
VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles
Mingzhe Li
Xiuying Chen
Shen Gao
Zhangming Chan
Dongyan Zhao
Rui Yan
83
83
0
12 Oct 2020
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding
Qinxin Wang
Hao Tan
Sheng Shen
Michael W. Mahoney
Z. Yao
ObjD
133
11
0
12 Oct 2020
Multi-modal Summarization for Video-containing Documents
Xiyan Fu
Jun Wang
Zhenglu Yang
45
23
0
17 Sep 2020
Graph Optimal Transport for Cross-Domain Alignment
Liqun Chen
Zhe Gan
Yu Cheng
Linjie Li
Lawrence Carin
Jingjing Liu
OT
85
152
0
26 Jun 2020
Text Segmentation by Cross Segment Attention
Michal Lukasik
Boris Dadachev
Gonçalo Simões
Kishore Papineni
VLM
39
84
0
30 Apr 2020
A Local-to-Global Approach to Multi-modal Movie Scene Segmentation
Anyi Rao
Linning Xu
Yu Xiong
Guodong Xu
Qingqiu Huang
Bolei Zhou
Dahua Lin
44
111
0
06 Apr 2020
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
Shizhe Chen
Yida Zhao
Qin Jin
Qi Wu
82
314
0
01 Mar 2020
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMat
VLM
225
10,815
0
29 Oct 2019
Text Summarization with Pretrained Encoders
Yang Liu
Mirella Lapata
MILM
443
1,450
0
22 Aug 2019
Fine-Grained Action Retrieval Through Multiple Parts-of-Speech Embeddings
Michael Wray
Diane Larlus
G. Csurka
Dima Damen
81
152
0
09 Aug 2019
Hierarchical Optimal Transport for Multimodal Distribution Alignment
John Lee
M. Dabagia
Eva L. Dyer
Christopher Rozell
OT
34
65
0
27 Jun 2019
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
105
1,199
0
07 Jun 2019
A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation
Hilde Kuehne
Alexander Richard
Juergen Gall
96
83
0
03 Jun 2019
HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization
Xingxing Zhang
Furu Wei
M. Zhou
75
379
0
16 May 2019
COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis
Yansong Tang
Dajun Ding
Yongming Rao
Yu Zheng
Danyang Zhang
Lili Zhao
Jiwen Lu
Jie Zhou
117
315
0
07 Mar 2019
Image-Question-Answer Synergistic Network for Visual Dialog
Dalu Guo
Chang Xu
Dacheng Tao
46
74
0
26 Feb 2019
A Perceptual Prediction Framework for Self Supervised Event Segmentation
Sathyanarayanan N. Aakur
Sudeep Sarkar
67
69
0
12 Nov 2018
Cross-Modal and Hierarchical Modeling of Video and Text
Bowen Zhang
Hexiang Hu
Fei Sha
BDL
AI4TS
56
191
0
16 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.6K
94,729
0
11 Oct 2018
Iterative Document Representation Learning Towards Summarization with Polishing
Xiuying Chen
Shen Gao
Chongyang Tao
Yan Song
Dongyan Zhao
Rui Yan
52
41
0
27 Sep 2018
Exploring Visual Relationship for Image Captioning
Ting Yao
Yingwei Pan
Yehao Li
Tao Mei
74
833
0
19 Sep 2018
Toward Fast and Accurate Neural Discourse Segmentation
Yizhong Wang
Sujian Li
Jingfeng Yang
37
94
0
28 Aug 2018
Video Summarisation by Classification with Deep Reinforcement Learning
Kaiyang Zhou
Tao Xiang
Andrea Cavallaro
OffRL
38
35
0
09 Jul 2018
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features
Chiori Hori
Huda AlAmri
Jue Wang
Gordon Wichern
Takaaki Hori
...
Raphael Gontijo-Lopes
Abhishek Das
Irfan Essa
Dhruv Batra
Devi Parikh
VGen
51
125
0
21 Jun 2018
A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss
W. Hsu
Chieh-Kai Lin
Ming-Ying Lee
Kerui Min
Jing Tang
Min Sun
CVBM
70
240
0
16 May 2018
Learning to Extract Coherent Summary via Deep Reinforcement Learning
Yuxiang Wu
Baotian Hu
AI4TS
38
170
0
19 Apr 2018
Superframes, A Temporal Video Segmentation
Hajar Sadeghi Sokeh
Vasileios Argyriou
D. Monekosso
Paolo Remagnino
32
13
0
18 Apr 2018
Text Segmentation as a Supervised Learning Task
Omri Koshorek
Adir Cohen
Noam Mor
Michael Rotman
Jonathan Berant
41
144
0
25 Mar 2018
Stacked Cross Attention for Image-Text Matching
Kuang-Huei Lee
Xi Chen
G. Hua
Houdong Hu
Xiaodong He
74
1,151
0
21 Mar 2018
Ranking Sentences for Extractive Summarization with Reinforcement Learning
Shashi Narayan
Shay B. Cohen
Mirella Lapata
165
549
0
23 Feb 2018
Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward
Kaiyang Zhou
Yu Qiao
Tao Xiang
40
429
0
29 Dec 2017
Graph Attention Networks
Petar Velickovic
Guillem Cucurull
Arantxa Casanova
Adriana Romero
Pietro Lio
Yoshua Bengio
GNN
443
20,089
0
30 Oct 2017
Video Summarization with Attention-Based Encoder-Decoder Networks
Zhong Ji
Kailin Xiong
Yanwei Pang
Xuelong Li
38
307
0
31 Aug 2017
Tensor Fusion Network for Multimodal Sentiment Analysis
Amir Zadeh
Minghai Chen
Soujanya Poria
Min Zhang
Louis-Philippe Morency
68
1,231
0
23 Jul 2017
Large-scale, Fast and Accurate Shot Boundary Detection through Spatio-temporal Convolutional Neural Networks
Ahmed Hassanien
Mohamed A. Elgharib
Ahmed A. S. Seleim
Sung-Ho Bae
M. Hefeeda
Wojciech Matusik
45
51
0
09 May 2017
Temporal Segment Networks for Action Recognition in Videos
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
ViT
110
810
0
08 May 2017
Temporal Action Detection with Structured Segment Networks
Yue Zhao
Yuanjun Xiong
Limin Wang
Zhirong Wu
Xiaoou Tang
Dahua Lin
67
914
0
20 Apr 2017
Get To The Point: Summarization with Pointer-Generator Networks
A. See
Peter J. Liu
Christopher D. Manning
3DPC
267
4,014
0
14 Apr 2017
Temporal Convolutional Networks for Action Segmentation and Detection
Colin S. Lea
Michael D. Flynn
René Vidal
A. Reiter
Gregory Hager
91
1,490
0
16 Nov 2016
SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents
Ramesh Nallapati
Feifei Zhai
Bowen Zhou
331
1,261
0
14 Nov 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
115
231
0
10 Oct 2016
Video Summarization using Deep Semantic Features
Mayu Otani
Yuta Nakashima
Esa Rahtu
J. Heikkilä
N. Yokoya
46
113
0
28 Sep 2016