Title |
---|
![]() MM-Ego: Towards Building Egocentric Multimodal LLMs for Video QA Hanrong Ye Haotian Zhang Erik Daxberger Lin Chen Zongyu Lin ...Haoxuan You Dan Xu Zhe Gan Jiasen Lu Yinfei Yang |
![]() MIO: A Foundation Model on Multimodal Tokens Zekun Wang King Zhu Chunpu Xu Wangchunshu Zhou Jiaheng Liu ...Yuanxing Zhang Ge Zhang Ke Xu Jie Fu Wenhao Huang |
![]() KeyVideoLLM: Towards Large-scale Video Keyframe Selection Hao Liang Jiapeng Li Tianyi Bai Xijie Huang Linzhuang Sun Zhengren Wang Conghui He Bin Cui Chong Chen Wentao Zhang |