Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization

19 December 2022
Chen Ju
Kunhao Zheng
Jinxian Liu
Peisen Zhao
Ya Zhang
Jianlong Chang
Yanfeng Wang
Qi Tian

Papers citing "Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization"

50 / 84 papers shown
Multi-modal Prompting for Low-Shot Temporal Action Localization
Chen Ju
Zeqian Li
Peisen Zhao
Ya Zhang
Xiaopeng Zhang
Qi Tian
Yanfeng Wang
Weidi Xie
50
19
0
21 Mar 2023
DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery
Chaofan Ma
Yu-Hao Yang
Chen Ju
Feifan Zhang
Jinxian Liu
Yu Wang
Ya Zhang
Yanfeng Wang
DiffM
41
37
0
17 Mar 2023
Constraint and Union for Partially-Supervised Temporal Sentence Grounding
Chen Ju
Haicheng Wang
Jinxian Liu
Chaofan Ma
Ya Zhang
Peisen Zhao
Jianlong Chang
Qi Tian
32
15
0
20 Feb 2023
Frozen CLIP Models are Efficient Video Learners
Ziyi Lin
Shijie Geng
Renrui Zhang
Peng Gao
Gerard de Melo
Xiaogang Wang
Jifeng Dai
Yu Qiao
Hongsheng Li
CLIP
VLM
56
203
0
06 Aug 2022
LocVTP: Video-Text Pre-training for Temporal Localization
Meng Cao
Tianyu Yang
Junwu Weng
Can Zhang
Jue Wang
Yuexian Zou
44
64
0
21 Jul 2022
Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation
Jinxian Liu
Chen Ju
Weidi Xie
Ya Zhang
43
38
0
26 Jun 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
102
1,279
0
04 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
230
3,458
0
29 Apr 2022
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
Katherine Crowson
Stella Biderman
Daniel Kornis
Dashiell Stander
Eric Hallahan
Louis Castricato
Edward Raff
CLIP
92
375
0
18 Apr 2022
An Empirical Study of End-to-End Temporal Action Detection
Xiaolong Liu
S. Bai
Xiang Bai
53
59
0
06 Apr 2022
TALLFormer: Temporal Action Localization with a Long-memory Transformer
Feng Cheng
Gedas Bertasius
ViT
45
93
0
04 Apr 2022
Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization
Junyu Gao
Mengyuan Chen
Changsheng Xu
25
66
0
31 Mar 2022
ASM-Loc: Action-aware Segment Modeling for Weakly-Supervised Temporal Action Localization
Bo He
Xitong Yang
Le Kang
Zhiyu Cheng
Xingfa Zhou
Abhinav Shrivastava
47
77
0
29 Mar 2022
GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection
Yue Liao
Aixi Zhang
Miao Lu
Yongliang Wang
Xiaobo Li
Si Liu
VLM
38
126
0
26 Mar 2022
MotionCLIP: Exposing Human Motion Generation to CLIP Space
Guy Tevet
Brian Gordon
Amir Hertz
Amit H. Bermano
Daniel Cohen-Or
CLIP
91
335
0
15 Mar 2022
RCL: Recurrent Continuous Localization for Temporal Action Detection
Qiang Wang
Yanhao Zhang
Yun Zheng
Pan Pan
ObjD
46
38
0
14 Mar 2022
Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation
Linjiang Huang
Liang Wang
Hongsheng Li
AI4TS
35
66
0
06 Mar 2022
ActionFormer: Localizing Moments of Actions with Transformers
Chen-Lin Zhang
Jianxin Wu
Yin Li
ViT
45
336
0
16 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
107
859
0
07 Feb 2022
ACGNet: Action Complement Graph Network for Weakly-supervised Temporal Action Localization
Zichen Yang
Jie Qin
Di Huang
49
57
0
21 Dec 2021
Temporal Action Proposal Generation with Background Constraint
Haosen Yang
Wenhao Wu
Lining Wang
Sheng Jin
Boyang Xia
Huanjin Yao
Hujie Huang
83
27
0
15 Dec 2021
More Control for Free! Image Synthesis with Semantic Diffusion Guidance
Xihui Liu
Dong Huk Park
S. Azadi
Gong Zhang
Arman Chopikyan
Yuxiao Hu
Humphrey Shi
Anna Rohrbach
Trevor Darrell
DiffM
60
253
0
10 Dec 2021
CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields
Can Wang
Menglei Chai
Mingming He
Dongdong Chen
Jing Liao
CLIP
93
383
0
09 Dec 2021
Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju
Tengda Han
Kunhao Zheng
Ya Zhang
Weidi Xie
VPVLM
VLM
47
371
0
08 Dec 2021
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Yongming Rao
Wenliang Zhao
Guangyi Chen
Yansong Tang
Zheng Zhu
Guan Huang
Jie Zhou
Jiwen Lu
VLM
CLIP
163
564
0
02 Dec 2021
Extract Free Dense Labels from CLIP
Chong Zhou
Chen Change Loy
Bo Dai
VLM
CLIP
98
467
0
02 Dec 2021
Simple but Effective: CLIP Embeddings for Embodied AI
Apoorv Khandelwal
Luca Weihs
Roozbeh Mottaghi
Aniruddha Kembhavi
VLM
LM&Ro
57
220
0
18 Nov 2021
FILIP: Fine-grained Interactive Language-Image Pre-Training
Lewei Yao
Runhu Huang
Lu Hou
Guansong Lu
Minzhe Niu
Hang Xu
Xiaodan Liang
Zhenguo Li
Xin Jiang
Chunjing Xu
VLM
CLIP
51
627
0
09 Nov 2021
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
175
367
0
17 Sep 2021
Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization
Linjiang Huang
Liang Wang
Hongsheng Li
74
76
0
14 Aug 2021
Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization
Pilhyeon Lee
H. Byun
74
64
0
11 Aug 2021
Enriching Local and Global Contexts for Temporal Action Localization
Zixin Zhu
Wei Tang
Le Wang
N. Zheng
G. Hua
52
109
0
27 Jul 2021
Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization
Fa-Ting Hong
Jialuo Feng
Dan Xu
Ying Shan
Weishi Zheng
75
85
0
27 Jul 2021
Action Unit Memory Network for Weakly Supervised Temporal Action Localization
Wang Luo
Tianzhu Zhang
Wenfei Yang
Jingen Liu
Tao Mei
Feng Wu
Yongdong Zhang
65
79
0
29 Apr 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
247
906
0
28 Apr 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
353
796
0
18 Apr 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
420
3,952
0
18 Apr 2021
Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization
Chen Ju
Peisen Zhao
Siheng Chen
Ya Zhang
Xiaoyun Zhang
Qi Tian
WSOL
60
19
0
06 Apr 2021
The Blessings of Unlabeled Background in Untrimmed Videos
Yuan Liu
Jingyuan Chen
Zhenfang Chen
Bing Deng
Jianqiang Huang
Hanwang Zhang
CML
53
43
0
24 Mar 2021
Temporal Context Aggregation Network for Temporal Action Proposal Refinement
Zhiwu Qing
Haisheng Su
Weihao Gan
Dongliang Wang
Wei Wu
Xiang Wang
Yu Qiao
Junjie Yan
Changxin Gao
Nong Sang
100
174
0
24 Mar 2021
Learning Salient Boundary Feature for Anchor-free Temporal Action Localization
Chuming Lin
C. Xu
Donghao Luo
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Yanwei Fu
49
251
0
24 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
526
28,659
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
376
3,778
0
11 Feb 2021
Relaxed Transformer Decoders for Direct Action Proposal Generation
Jing Tan
Jiaqi Tang
Limin Wang
Gangshan Wu
ViT
86
178
0
03 Feb 2021
A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization
Ashraful Islam
Chengjiang Long
Richard J. Radke
55
124
0
03 Jan 2021
Point-Level Temporal Action Localization: Bridging Fully-supervised Proposals to Weakly-supervised Losses
Chen Ju
Peisen Zhao
Ya Zhang
Yanfeng Wang
Qi Tian
28
27
0
15 Dec 2020
D2-Net: Weakly-Supervised Action Localization via Discriminative Embeddings and Denoised Activations
Sanath Narayan
Hisham Cholakkal
Munawar Hayat
Fahad Shahbaz Khan
Ming-Hsuan Yang
Ling Shao
44
54
0
11 Dec 2020
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks
Humam Alwassel
Silvio Giancola
Bernard Ghanem
52
124
0
23 Nov 2020
Boundary-sensitive Pre-training for Temporal Localization in Videos
Mengmeng Xu
Juan-Manuel Perez-Rua
Victor Escorcia
Brais Martínez
Xiatian Zhu
Li Zhang
Bernard Ghanem
Tao Xiang
40
61
0
21 Nov 2020
Open-Vocabulary Object Detection Using Captions
Alireza Zareian
Kevin Dela Rosa
Derek Hao Hu
Shih-Fu Chang
VLM
ObjD
92
423
0
20 Nov 2020