ResearchTrend.AI
RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection

5 September 2022
Hangjie Yuan
Jianwen Jiang
Samuel Albanie
Tao Feng
Ziyuan Huang
Dong Ni
Mingqian Tang
    VLM
arXiv: 2209.01814

Papers citing "RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection"

20 / 20 papers shown
Dynamic Scene Understanding from Vision-Language Representations
Shahaf Pruss
Morris Alper
Hadar Averbuch-Elor
OCL
173
0
0
20 Jan 2025
Open-World Human-Object Interaction Detection via Multi-modal Prompts
Jie-jin Yang
Bingliang Li
Ailing Zeng
L. Zhang
Ruimao Zhang
VLM
32
8
0
11 Jun 2024
Towards Zero-shot Human-Object Interaction Detection via Vision-Language Integration
Weiying Xue
Qi Liu
Qiwei Xiong
Yuxiao Wang
Zhenao Wei
Xiaofen Xing
Xiangmin Xu
VLM
45
3
0
12 Mar 2024
InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models
Jiun Tian Hoe
Xudong Jiang
Chee Seng Chan
Yap-Peng Tan
Weipeng Hu
19
11
0
10 Dec 2023
Neural-Logic Human-Object Interaction Detection
Liulei Li
Jianan Wei
Wenguan Wang
Yi Yang
43
16
0
16 Nov 2023
Few-shot Action Recognition with Captioning Foundation Models
Xiang Wang
Shiwei Zhang
Hangjie Yuan
Yingya Zhang
Changxin Gao
Deli Zhao
Nong Sang
VLM
28
7
0
16 Oct 2023
Diagnosing Human-object Interaction Detectors
Fangrui Zhu
Yiming Xie
Weidi Xie
Huaizu Jiang
28
7
0
16 Aug 2023
HICO-DET-SG and V-COCO-SG: New Data Splits for Evaluating the Systematic Generalization Performance of Human-Object Interaction Detection Models
Kenta Takemoto
Moyuru Yamada
Tomotake Sasaki
H. Akima
37
0
0
17 May 2023
SPAN: Learning Similarity between Scene Graphs and Images with Transformers
Yuren Cong
Wentong Liao
Bodo Rosenhahn
M. Yang
35
6
0
02 Apr 2023
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
Sha Ning
Longtian Qiu
Yongfei Liu
Xuming He
VLM
30
42
0
28 Mar 2023
Category Query Learning for Human-Object Interaction Classification
Chi Xie
Fangao Zeng
Yue Hu
Shuang Liang
Yichen Wei
VLM
26
20
0
24 Mar 2023
Progressive Learning without Forgetting
Tao Feng
Hangjie Yuan
Mang Wang
Ziyuan Huang
Ang Bian
Jianzhou Zhang
CLL
KELM
44
4
0
28 Nov 2022
DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
Shilong Liu
Feng Li
Hao Zhang
X. Yang
Xianbiao Qi
Hang Su
Jun Zhu
Lei Zhang
ViT
155
728
0
28 Jan 2022
RelTR: Relation Transformer for Scene Graph Generation
Yuren Cong
M. Yang
Bodo Rosenhahn
ViT
97
133
0
27 Jan 2022
Rethinking Supervised Pre-training for Better Downstream Transferring
Yutong Feng
Jianwen Jiang
Mingqian Tang
R. L. Jin
Yue Gao
SSL
48
39
0
12 Oct 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Huayu Chen
Boqing Gong
ViT
248
577
0
22 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
304
3,708
0
11 Feb 2021
Revisiting the Sibling Head in Object Detector
Guanglu Song
Yu Liu
Xiaogang Wang
ObjD
181
348
0
17 Mar 2020
PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection
Yue Liao
Si Liu
Fei-Yue Wang
Yanjie Chen
Chen Qian
Jiashi Feng
71
264
0
30 Dec 2019
Image Generation from Scene Graphs
Justin Johnson
Agrim Gupta
Li Fei-Fei
GNN
223
815
0
04 Apr 2018