Title |
---|
LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model. Duy M. H. Nguyen, N. T. Diep, Trung Q. Nguyen, Hoang-Bao Le, Tai Nguyen, ..., Pengtao Xie, Roger Wattenhofer, James Zhou, Daniel Sonntag, Mathias Niepert |
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines. Dongzhi Jiang, Renrui Zhang, Ziyu Guo, Yanmin Wu, Jiayi Lei, ..., Guanglu Song, Peng Gao, Yu Liu, Chunyuan Li, Hongsheng Li |
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation. Shiwei Wu, Joya Chen, Kevin Qinghong Lin, Qimeng Wang, Yan Gao, Qianli Xu, Tong Bill Xu, Yao Hu, Enhong Chen, Mike Zheng Shou |
VideoLLM-online: Online Video Large Language Model for Streaming Video. Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou |
Needle In A Multimodal Haystack. Weiyun Wang, Shuibo Zhang, Yiming Ren, Yuchen Duan, Tiantong Li, ..., Ping Luo, Yu Qiao, Jifeng Dai, Wenqi Shao, Wenhai Wang |