One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-scale and Action Label Features

30 April 2024
Trung Thanh Nguyen
Yasutomo Kawanishi
Takahiro Komamizu
Ichiro Ide
    VLM

Papers citing "One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-scale and Action Label Features"

22 / 22 papers shown
Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning
Ivona Najdenkoska
Xiantong Zhen
Marcel Worring
VLM
85
18
0
28 Feb 2023
Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features
V. Rathod
Bryan Seybold
Sudheendra Vijayanarasimhan
Austin Myers
Xiuye Gu
Vighnesh Birodkar
David A. Ross
VLM
ObjD
34
7
0
20 Dec 2022
CLIP-Nav: Using CLIP for Zero-Shot Vision-and-Language Navigation
Vishnu Sashank Dorbala
Gunnar Sigurdsson
Robinson Piramuthu
Jesse Thomason
Gaurav Sukhatme
LM&Ro
48
55
0
30 Nov 2022
Zero-Shot Temporal Action Detection via Vision-Language Prompting
Sauradip Nag
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
VLM
54
66
0
17 Jul 2022
ReAct: Temporal Action Detection with Relational Queries
Ding Shi
Yujie Zhong
Qiong Cao
Jing Zhang
Lin Ma
Jia Li
Dacheng Tao
ViT
71
68
0
14 Jul 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
283
3,458
0
29 Apr 2022
Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation
Wenliang Dai
Lu Hou
Lifeng Shang
Xin Jiang
Qun Liu
Pascale Fung
VLM
55
91
0
12 Mar 2022
OpenTAL: Towards Open Set Temporal Action Localization
Wentao Bao
Qi Yu
Yu Kong
EDL
47
26
0
10 Mar 2022
ActionFormer: Localizing Moments of Actions with Transformers
Chenlin Zhang
Jianxin Wu
Yin Li
ViT
54
336
0
16 Feb 2022
Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju
Tengda Han
Kunhao Zheng
Ya Zhang
Weidi Xie
VPVLM
VLM
54
371
0
08 Dec 2021
Combined Scaling for Zero-shot Transfer Learning
Hieu H. Pham
Zihang Dai
Golnaz Ghiasi
Kenji Kawaguchi
Hanxiao Liu
...
Yi-Ting Chen
Minh-Thang Luong
Yonghui Wu
Mingxing Tan
Quoc V. Le
VLM
46
197
0
19 Nov 2021
End-to-end Temporal Action Detection with Transformer
Xiaolong Liu
Qimeng Wang
Yao Hu
Xu Tang
Shiwei Zhang
S. Bai
X. Bai
ViT
69
230
0
18 Jun 2021
Temporal Context Aggregation Network for Temporal Action Proposal Refinement
Zhiwu Qing
Haisheng Su
Weihao Gan
Dongliang Wang
Wei Wu
Xiang Wang
Yu Qiao
Junjie Yan
Changxin Gao
Nong Sang
106
174
0
24 Mar 2021
Learning Salient Boundary Feature for Anchor-free Temporal Action Localization
Chuming Lin
C. Xu
Donghao Luo
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Yanwei Fu
56
251
0
24 Mar 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
392
3,778
0
11 Feb 2021
Video Self-Stitching Graph Network for Temporal Action Localization
Chen Zhao
Ali K. Thabet
Guohao Li
46
139
0
30 Nov 2020
Open-Vocabulary Object Detection Using Captions
Alireza Zareian
Kevin Dela Rosa
Derek Hao Hu
Shih-Fu Chang
VLM
ObjD
114
426
0
20 Nov 2020
Contrastive Representation Learning: A Framework and Review
Phúc H. Lê Khắc
Graham Healy
Alan F. Smeaton
SSL
AI4TS
238
697
0
10 Oct 2020
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li
Xi Yin
Chunyuan Li
Pengchuan Zhang
Xiaowei Hu
...
Houdong Hu
Li Dong
Furu Wei
Yejin Choi
Jianfeng Gao
VLM
72
1,927
0
13 Apr 2020
Graph Convolutional Networks for Temporal Action Localization
Runhao Zeng
Wenbing Huang
Mingkui Tan
Yu Rong
P. Zhao
Junzhou Huang
Chuang Gan
GNN
74
476
0
07 Sep 2019
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
199
7,961
0
22 May 2017
The THUMOS Challenge on Action Recognition for Videos "in the Wild"
Haroon Idrees
Amir Zamir
Yu-Gang Jiang
Alexander N. Gorban
Ivan Laptev
Rahul Sukthankar
M. Shah
68
776
0
21 Apr 2016