EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection

31 October 2024

Papers citing "EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection"

Top-Down Compression: Revisit Efficient Vision Token Projection for Visual Instruction Tuning Bonan Li Zicheng Zhang Songhua Liu Weihao Yu Xinchao Wang VLM 120 0 0 17 May 2025
Pairwise Optimal Transports for Training All-to-All Flow-Based Condition Transfer Model Kotaro Ikeda Masanori Koyama Jinzhe Zhang Kohei Hayashi Kenji Fukumizu OT 495 0 0 04 Apr 2025
Domain-Adaptive 2D Human Pose Estimation via Dual Teachers in Extremely Low-Light Conditions Yihao Ai Yifei Qi Bo Wang Yu-Feng Cheng Xinchao Wang Robby T. Tan 85 2 0 22 Jul 2024
Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models Yichao Cao Qingfei Tang Xiu Su Chen Song Shan You Xiaobo Lu Chang Xu 61 22 0 07 Nov 2023
Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models Baoshuo Kan Teng Wang Wenpeng Lu Xiantong Zhen Weili Guan Feng Zheng VPVLM VLM 80 26 0 22 Aug 2023
Agglomerative Transformer for Human-Object Interaction Detection Danyang Tu Wei Sun Guangtao Zhai Wei Shen ViT 74 6 0 16 Aug 2023
Exploring Predicate Visual Context in Detecting Human-Object Interactions Frederic Z. Zhang Yuhui Yuan Dylan Campbell Zhuoyao Zhong Stephen Gould 75 40 0 11 Aug 2023
Visual Instruction Tuning Haotian Liu Chunyuan Li Qingyang Wu Yong Jae Lee SyDa VLM MLLM 529 4,725 0 17 Apr 2023
Relational Context Learning for Human-Object Interaction Detection Sanghyun Kim Deunsol Jung Minsu Cho 80 40 0 11 Apr 2023
Category Query Learning for Human-Object Interaction Classification Chi Xie Fangao Zeng Yue Hu Shuang Liang Yichen Wei VLM 49 20 0 24 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models Junnan Li Dongxu Li Silvio Savarese Steven C. H. Hoi VLM MLLM 424 4,539 0 30 Jan 2023
MaPLe: Multi-modal Prompt Learning Muhammad Uzair Khattak H. Rasheed Muhammad Maaz Salman Khan Fahad Shahbaz Khan VPVLM VLM 251 565 0 06 Oct 2022
Flamingo: a Visual Language Model for Few-Shot Learning Jean-Baptiste Alayrac Jeff Donahue Pauline Luc Antoine Miech Iain Barr ... Mikolaj Binkowski Ricardo Barreira Oriol Vinyals Andrew Zisserman Karen Simonyan MLLM VLM 371 3,535 0 29 Apr 2022
What to look at and where: Semantic and Spatial Refined Transformer for detecting human-object interactions A S M Iftekhar Hao Chen Kaustav Kundu Xinyu Li Joseph Tighe Davide Modolo ViT 88 51 0 02 Apr 2022
Visual Prompt Tuning Menglin Jia Luming Tang Bor-Chun Chen Claire Cardie Serge Belongie Bharath Hariharan Ser-Nam Lim VLM VPVLM 148 1,624 0 23 Mar 2022
Conditional Prompt Learning for Vision-Language Models Kaiyang Zhou Jingkang Yang Chen Change Loy Ziwei Liu VLM CLIP VPVLM 125 1,348 0 10 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation Junnan Li Dongxu Li Caiming Xiong Guosheng Lin MLLM BDL VLM CLIP 524 4,343 0 28 Jan 2022
Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer Frederic Z. Zhang Dylan Campbell Stephen Gould ViT 64 107 0 03 Dec 2021
Learning to Prompt for Vision-Language Models Kaiyang Zhou Jingkang Yang Chen Change Loy Ziwei Liu VPVLM CLIP VLM 490 2,396 0 02 Sep 2021
Affordance Transfer Learning for Human-Object Interaction Detection Zhi Hou Baosheng Yu Yu Qiao Xiaojiang Peng Dacheng Tao 69 106 0 07 Apr 2021
Detecting Human-Object Interaction via Fabricated Compositional Learning Zhi Hou B. Yu Yu Qiao Xiaojiang Peng Dacheng Tao 96 98 0 15 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision Alec Radford Jong Wook Kim Chris Hallacy Aditya A. Ramesh Gabriel Goh ... Amanda Askell Pamela Mishkin Jack Clark Gretchen Krueger Ilya Sutskever CLIP VLM 903 29,372 0 26 Feb 2021
Spatially Conditioned Graphs for Detecting Human-Object Interactions Frederic Z. Zhang Dylan Campbell Stephen Gould 68 127 0 11 Dec 2020
HOI Analysis: Integrating and Decomposing Human-Object Interaction Yong-Lu Li Xinpeng Liu Xiaoqian Wu Yizhuo Li Cewu Lu 59 123 0 30 Oct 2020
End-to-End Object Detection with Transformers Nicolas Carion Francisco Massa Gabriel Synnaeve Nicolas Usunier Alexander Kirillov Sergey Zagoruyko ViT 3DV PINN 382 13,035 0 26 May 2020
Exploring Visual Relationship for Image Captioning Ting Yao Yingwei Pan Yehao Li Tao Mei 74 833 0 19 Sep 2018
Detecting and Recognizing Human-Object Interactions Georgia Gkioxari Ross B. Girshick Piotr Dollár Kaiming He 76 576 0 24 Apr 2017
Mask R-CNN Kaiming He Georgia Gkioxari Piotr Dollár Ross B. Girshick ObjD 350 27,181 0 20 Mar 2017
Learning to Detect Human-Object Interactions Yu-Wei Chao Yunfan Liu Michael Xieyang Liu Huayi Zeng Jia Deng 66 508 0 17 Feb 2017
Microsoft COCO: Common Objects in Context Tsung-Yi Lin Michael Maire Serge J. Belongie Lubomir Bourdev Ross B. Girshick James Hays Pietro Perona Deva Ramanan C. L. Zitnick Piotr Dollár ObjD 413 43,638 0 01 May 2014