OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification

4 July 2022
Ye Liu
Lingfeng Qiao
Di Yin
Zhuoxuan Jiang
Xinghua Jiang
Deqiang Jiang
Bo Ren

Papers cited by "OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification"

19 / 19 papers shown
Learnable Optimal Sequential Grouping for Video Scene Detection
Daniel Rotman
Yevgeny Yaroker
Elad Amrani
Udi Barzelay
Rami Ben-Ari
26
10
0
17 May 2022
End-to-end Temporal Action Detection with Transformer
Xiaolong Liu
Qimeng Wang
Yao Hu
Xu Tang
Shiwei Zhang
S. Bai
X. Bai
ViT
92
232
0
18 Jun 2021
Shot Contrastive Self-Supervised Learning for Scene Boundary Detection
Shixing Chen
Xiaohan Nie
David D. Fan
Dongqing Zhang
Vimal Bhat
Raffay Hamid
SSL
49
62
0
28 Apr 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
201
2,137
0
29 Mar 2021
TransNet V2: An effective deep network architecture for fast shot transition detection
Tomáš Souček
Jakub Lokoč
54
124
0
11 Aug 2020
MovieNet: A Holistic Dataset for Movie Understanding
Qingqiu Huang
Yu Xiong
Anyi Rao
Jiaze Wang
Dahua Lin
VGen
76
237
0
21 Jul 2020
A Local-to-Global Approach to Multi-modal Movie Scene Segmentation
Anyi Rao
Linning Xu
Yu Xiong
Guodong Xu
Qingqiu Huang
Bolei Zhou
Dahua Lin
44
111
0
06 Apr 2020
Utterance-level Aggregation For Speaker Recognition In The Wild
Weidi Xie
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
52
344
0
26 Feb 2019
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
162
3,262
0
10 Dec 2018
BSN: Boundary Sensitive Network for Temporal Action Proposal Generation
Tianwei Lin
Xu Zhao
Haisheng Su
Chongjing Wang
Ming Yang
195
701
0
08 Jun 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
646
130,942
0
12 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
219
7,989
0
22 May 2017
Temporal Segment Networks for Action Recognition in Videos
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
ViT
110
809
0
08 May 2017
TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals
J. Gao
Zhenheng Yang
Chen Sun
Kan Chen
Ram Nevatia
ViT
AI4TS
52
461
0
17 Mar 2017
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.1K
193,426
0
10 Dec 2015
A Deep Siamese Network for Scene Detection in Broadcast Videos
Lorenzo Baraldi
C. Grana
Rita Cucchiara
37
101
0
29 Oct 2015
Bidirectional LSTM-CRF Models for Sequence Tagging
Zhiheng Huang
Wenyuan Xu
Kai Yu
171
4,018
0
09 Aug 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
432
43,234
0
11 Feb 2015
Two-Stream Convolutional Networks for Action Recognition in Videos
Karen Simonyan
Andrew Zisserman
237
7,526
0
09 Jun 2014