ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2411.06869
  4. Cited By
CapeLLM: Support-Free Category-Agnostic Pose Estimation with Multimodal
  Large Language Models

CapeLLM: Support-Free Category-Agnostic Pose Estimation with Multimodal Large Language Models

11 November 2024
Junho Kim
Hyungjin Chung
Byung-Hoon Kim
    VLM
ArXivPDFHTML

Papers citing "CapeLLM: Support-Free Category-Agnostic Pose Estimation with Multimodal Large Language Models"

12 / 12 papers shown
Title
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
285
55
0
03 Jan 2025
CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
M. Rusanovsky
Or Hirschorn
S. Avidan
60
3
0
01 Jun 2024
Long-CLIP: Unlocking the Long-Text Capability of CLIP
Long-CLIP: Unlocking the Long-Text Capability of CLIP
Beichen Zhang
Pan Zhang
Xiao-wen Dong
Yuhang Zang
Jiaqi Wang
CLIP
VLM
72
132
0
22 Mar 2024
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
318
4,288
0
09 Jun 2023
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
Chaitanya K. Ryali
Yuan-Ting Hu
Daniel Bolya
Chen Wei
Haoqi Fan
...
Omid Poursaeed
Judy Hoffman
Jitendra Malik
Yanghao Li
Christoph Feichtenhofer
3DH
84
179
0
01 Jun 2023
InstructBLIP: Towards General-purpose Vision-Language Models with
  Instruction Tuning
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLM
VLM
95
2,049
0
11 May 2023
Visual Instruction Tuning
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
451
4,715
0
17 Apr 2023
Pix2seq: A Language Modeling Framework for Object Detection
Pix2seq: A Language Modeling Framework for Object Detection
Ting-Li Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
262
347
0
22 Sep 2021
AP-10K: A Benchmark for Animal Pose Estimation in the Wild
AP-10K: A Benchmark for Animal Pose Estimation in the Wild
Hang Yu
Yufei Xu
Jing Zhang
Wei Zhao
Ziyu Guan
Dacheng Tao
66
111
0
28 Aug 2021
LoRA: Low-Rank Adaptation of Large Language Models
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
383
10,301
0
17 Jun 2021
Deep High-Resolution Representation Learning for Human Pose Estimation
Deep High-Resolution Representation Learning for Human Pose Estimation
Ke Sun
Bin Xiao
Dong Liu
Jingdong Wang
3DV
120
4,049
0
25 Feb 2019
Simple Baselines for Human Pose Estimation and Tracking
Simple Baselines for Human Pose Estimation and Tracking
Bin Xiao
Haiping Wu
Yichen Wei
3DH
VOT
116
1,791
0
17 Apr 2018
1