ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.07841
  4. Cited By
Labeling Comic Mischief Content in Online Videos with a Multimodal
  Hierarchical-Cross-Attention Model

Labeling Comic Mischief Content in Online Videos with a Multimodal Hierarchical-Cross-Attention Model

12 June 2024
Elaheh Baharlouei
Mahsa Shafaei
Yigeng Zhang
Hugo Jair Escalante
Thamar Solorio
ArXivPDFHTML

Papers citing "Labeling Comic Mischief Content in Online Videos with a Multimodal Hierarchical-Cross-Attention Model"

5 / 5 papers shown
Title
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
From None to Severe: Predicting Severity in Movie Scripts
From None to Severe: Predicting Severity in Movie Scripts
Yigeng Zhang
Mahsa Shafaei
Fabio Gonzalez
Thamar Solorio
56
5
0
20 Sep 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw
  Video, Audio and Text
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Huayu Chen
Boqing Gong
ViT
248
577
0
22 Apr 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
Self-supervised Co-training for Video Representation Learning
Self-supervised Co-training for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
215
309
0
19 Oct 2020
1