ResearchTrend.AI
EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection

31 October 2024
Qinqian Lei
Bo Wang
Robby T. Tan
    VLM

Papers citing "EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection"

30 / 30 papers shown
Top-Down Compression: Revisit Efficient Vision Token Projection for Visual Instruction Tuning
Bonan Li
Zicheng Zhang
Songhua Liu
Weihao Yu
Xinchao Wang
VLM
120
0
0
17 May 2025
Pairwise Optimal Transports for Training All-to-All Flow-Based Condition Transfer Model
Kotaro Ikeda
Masanori Koyama
Jinzhe Zhang
Kohei Hayashi
Kenji Fukumizu
OT
495
0
0
04 Apr 2025
Domain-Adaptive 2D Human Pose Estimation via Dual Teachers in Extremely Low-Light Conditions
Yihao Ai
Yifei Qi
Bo Wang
Yu-Feng Cheng
Xinchao Wang
Robby T. Tan
85
2
0
22 Jul 2024
Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models
Yichao Cao
Qingfei Tang
Xiu Su
Chen Song
Shan You
Xiaobo Lu
Chang Xu
61
22
0
07 Nov 2023
Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models
Baoshuo Kan
Teng Wang
Wenpeng Lu
Xiantong Zhen
Weili Guan
Feng Zheng
VPVLM
VLM
80
26
0
22 Aug 2023
Agglomerative Transformer for Human-Object Interaction Detection
Danyang Tu
Wei Sun
Guangtao Zhai
Wei Shen
ViT
74
6
0
16 Aug 2023
Exploring Predicate Visual Context in Detecting Human-Object Interactions
Frederic Z. Zhang
Yuhui Yuan
Dylan Campbell
Zhuoyao Zhong
Stephen Gould
75
40
0
11 Aug 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
529
4,725
0
17 Apr 2023
Relational Context Learning for Human-Object Interaction Detection
Sanghyun Kim
Deunsol Jung
Minsu Cho
80
40
0
11 Apr 2023
Category Query Learning for Human-Object Interaction Classification
Chi Xie
Fangao Zeng
Yue Hu
Shuang Liang
Yichen Wei
VLM
49
20
0
24 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
424
4,539
0
30 Jan 2023
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
Hanoona Rasheed
Muhammad Maaz
Salman Khan
Fahad Shahbaz Khan
VPVLM
VLM
251
565
0
06 Oct 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
371
3,535
0
29 Apr 2022
What to look at and where: Semantic and Spatial Refined Transformer for detecting human-object interactions
A S M Iftekhar
Hao Chen
Kaustav Kundu
Xinyu Li
Joseph Tighe
Davide Modolo
ViT
88
51
0
02 Apr 2022
Visual Prompt Tuning
Menglin Jia
Luming Tang
Bor-Chun Chen
Claire Cardie
Serge Belongie
Bharath Hariharan
Ser-Nam Lim
VLM
VPVLM
148
1,624
0
23 Mar 2022
Conditional Prompt Learning for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VLM
CLIP
VPVLM
125
1,348
0
10 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Steven C. H. Hoi
MLLM
BDL
VLM
CLIP
524
4,343
0
28 Jan 2022
Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer
Frederic Z. Zhang
Dylan Campbell
Stephen Gould
ViT
64
107
0
03 Dec 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
490
2,396
0
02 Sep 2021
Affordance Transfer Learning for Human-Object Interaction Detection
Zhi Hou
Baosheng Yu
Yu Qiao
Xiaojiang Peng
Dacheng Tao
69
106
0
07 Apr 2021
Detecting Human-Object Interaction via Fabricated Compositional Learning
Zhi Hou
Baosheng Yu
Yu Qiao
Xiaojiang Peng
Dacheng Tao
96
98
0
15 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
903
29,372
0
26 Feb 2021
Spatially Conditioned Graphs for Detecting Human-Object Interactions
Frederic Z. Zhang
Dylan Campbell
Stephen Gould
68
127
0
11 Dec 2020
HOI Analysis: Integrating and Decomposing Human-Object Interaction
Yong-Lu Li
Xinpeng Liu
Xiaoqian Wu
Yizhuo Li
Cewu Lu
59
123
0
30 Oct 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
382
13,035
0
26 May 2020
Exploring Visual Relationship for Image Captioning
Ting Yao
Yingwei Pan
Yehao Li
Tao Mei
74
833
0
19 Sep 2018
Detecting and Recognizing Human-Object Interactions
Georgia Gkioxari
Ross B. Girshick
Piotr Dollár
Kaiming He
76
576
0
24 Apr 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
350
27,181
0
20 Mar 2017
Learning to Detect Human-Object Interactions
Yu-Wei Chao
Yunfan Liu
Michael Xieyang Liu
Huayi Zeng
Jia Deng
66
508
0
17 Feb 2017
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
413
43,638
0
01 May 2014