arXiv:2110.06058
Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos
12 October 2021
Zongmeng Zhang
Xianjing Han
Xuemeng Song
Yan Yan
Liqiang Nie
Papers citing
"Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos"
16 / 16 papers shown
Weakly Supervised Temporal Sentence Grounding via Positive Sample Mining
Lu Dong
H. Zhang
Hongjie Zhang
Y. Huang
Z. Ling
Yu Qiao
Limin Wang
Yishuo Wang
AI4TS
31
0
0
10 May 2025
Segment Any RGB-Thermal Model with Language-aided Distillation
Dong Xing
Xianxun Zhu
Wei Zhou
Qika Lin
Hang Yang
Yuqing Wang
VLM
61
0
0
04 May 2025
Exploiting Inter-Sample Correlation and Intra-Sample Redundancy for Partially Relevant Video Retrieval
Junlong Ren
Gangjian Zhang
Y. Hu
Jian Shu
Haoran Wang
29
0
0
28 Apr 2025
A Survey on Multimodal Music Emotion Recognition
Rashini Liyanarachchi
Aditya Joshi
Erik Meijering
29
1
0
26 Apr 2025
NowYouSee Me: Context-Aware Automatic Audio Description
Seon-Ho Lee
Jue Wang
D. Fan
Zhikang Zhang
Linda Liu
Xiang Hao
Vimal Bhat
Xinyu Li
93
0
0
13 Dec 2024
Temporal-Enhanced Multimodal Transformer for Referring Multi-Object Tracking and Segmentation
Changcheng Xiao
Qiong Cao
Yujie Zhong
Xiang Zhang
Tao Wang
Canqun Yang
L. Lan
28
0
0
17 Oct 2024
A Survey on Video Moment Localization
Meng Liu
Liqiang Nie
Yunxiao Wang
Meng Wang
Yong Rui
29
28
0
13 Jun 2023
Transform-Equivariant Consistency Learning for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Zichuan Xu
Yining Qi
Xing Di
Weining Lu
Yu Cheng
46
8
0
06 May 2023
Stylized Data-to-Text Generation: A Case Study in the E-Commerce Domain
Liqiang Jing
Xuemeng Song
Xuming Lin
Zhongzhou Zhao
Wei Zhou
Liqiang Nie
27
14
0
05 May 2023
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer
Yifang Xu
Yunzhuo Sun
Yang Li
Yilei Shi
Xiaoxia Zhu
S. Du
ViT
51
33
0
29 Apr 2023
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos
Yulin Pan
Xiangteng He
Biao Gong
Yiliang Lv
Yujun Shen
Yuxin Peng
Deli Zhao
40
12
0
15 Mar 2023
ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval
A. Fragomeni
Michael Wray
Dima Damen
CLIP
ViT
25
3
0
09 Oct 2022
Semantic2Graph: Graph-based Multi-modal Feature Fusion for Action Segmentation in Videos
Jun-Bin Zhang
Pei-Hsuan Tsai
Meng-Hsun Tsai
31
20
0
13 Sep 2022
Temporal Sentence Grounding in Videos: A Survey and Future Directions
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
3DGS
36
38
0
20 Jan 2022
Hierarchical Deep Residual Reasoning for Temporal Moment Localization
Ziyang Ma
Xianjing Han
Xuemeng Song
Yiran Cui
Liqiang Nie
13
9
0
31 Oct 2021
Cross-Modal Graph with Meta Concepts for Video Captioning
Hao Wang
Guosheng Lin
S. Hoi
C. Miao
20
6
0
14 Aug 2021