
v1v2 (latest)
Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic
Papers citing "Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic"
50 / 163 papers shown
Title |
---|
![]() VideoLLM-online: Online Video Large Language Model for Streaming Video Joya Chen Zhaoyang Lv Shiwei Wu Kevin Qinghong Lin Chenan Song Difei Gao Jia-Wei Liu Ziteng Gao Dongxing Mao Mike Zheng Shou |
![]() Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with
Instruction Tuning Zebang Cheng Zhi-Qi Cheng Jun-Yan He Jingdong Sun Kai Wang Yuxiang Lin Zheng Lian Xiaojiang Peng Alexander G. Hauptmann |
![]() Needle In A Multimodal Haystack Weiyun Wang Shuibo Zhang Yiming Ren Yuchen Duan Tiantong Li ...Ping Luo Yu Qiao Jifeng Dai Wenqi Shao Wenhai Wang |
![]() DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception Run Luo Yunshui Li Longze Chen Wanwei He Ting-En Lin ...Zikai Song Xiaobo Xia Tongliang Liu Min Yang Binyuan Hui |