
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
Papers citing "LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models"
32 / 82 papers shown
Title | Authors |
---|---|
Qwen Technical Report | Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, ... Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu |
Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision | Haoning Wu, Zicheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, ... Chunyi Li, Wenxiu Sun, Qiong Yan, Guangtao Zhai, Weisi Lin |
Llama 2: Open Foundation and Fine-Tuned Chat Models | Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ... Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom |