ResearchTrend.AI
arXiv: 2408.16500
CogVLM2: Visual Language Models for Image and Video Understanding

29 August 2024
Wenyi Hong
Weihan Wang
Ming Ding
Wenmeng Yu
Qingsong Lv
Yan Wang
Yean Cheng
Shiyu Huang
Junhui Ji
Zhao Xue
Lei Zhao
Zhuoyi Yang
Xiaotao Gu
Xiaohan Zhang
Guanyu Feng
Da Yin
Zihan Wang
Ji Qi
Xixuan Song
Peng Zhang
Debing Liu
Bin Xu
Juanzi Li
Yuxiao Dong
Jie Tang
    VLM, MLLM
ArXiv (abs) | PDF | HTML | GitHub (2356★)

Papers citing "CogVLM2: Visual Language Models for Image and Video Understanding"

3 / 53 papers shown
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
Y. Jang
Yale Song
Youngjae Yu
Youngjin Kim
Gunhee Kim
87
561
0
14 Apr 2017
An Analysis of Visual Question Answering Algorithms
Kushal Kafle
Christopher Kanan
85
234
0
28 Mar 2017
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
233
5,509
0
03 May 2015