Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data

31 January 2024

Papers citing "Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data"


Title | Authors | Tag | Metrics | Date
Group-in-Group Policy Optimization for LLM Agent Training | Lang Feng, Zhenghai Xue, Tingcong Liu, Bo An | OffRL | 12 0 0 | 16 May 2025
Decoding Neighborhood Environments with Large Language Models | Andrew Cart, Shaohu Zhang, Melanie Escue, Xugui Zhou, Haitao Zhao, Prashanth BusiReddyGari, Beiyu Lin, Shuang Li | | 23 0 0 | 13 May 2025
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem | Declan Campbell, Sunayana Rane, Tyler Giallanza, Nicolò De Sabbata, Kia Ghods, ..., Alexander Ku, Steven M. Frankland, Thomas L. Griffiths, Jonathan D. Cohen, Taylor W. Webb | | 42 13 0 | 31 Oct 2024
Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach | Sai Krishna Revanth Vuruma, Dezhi Wu, Saborny Sen Gupta, Lucas Aust, Valerie Lookingbill, ..., Erin Kasson, Li-Shiun Chen, Patricia Cavazos-Rehg, Dian Hu, Ming Huang | | 36 0 0 | 28 Jun 2024
Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark | Evan M. Williams, Kathleen M. Carley | CoGe | 44 0 0 | 10 May 2024