Title |
---|
GUI Action Narrator: Where and When Did That Action Take Place? Qinchen Wu Difei Gao Kevin Qinghong Lin Zhuoyu Wu Xiangwu Guo Peiran Li Weichen Zhang Hengxu Wang Mike Zheng Shou |
VideoLLM-online: Online Video Large Language Model for Streaming Video Joya Chen Zhaoyang Lv Shiwei Wu Kevin Qinghong Lin Chenan Song Difei Gao Jia-Wei Liu Ziteng Gao Dongxing Mao Mike Zheng Shou |
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning Zebang Cheng Zhi-Qi Cheng Jun-Yan He Jingdong Sun Kai Wang Yuxiang Lin Zheng Lian Xiaojiang Peng Alexander G. Hauptmann |
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos Xuehai He Weixi Feng Kaizhi Zheng Yujie Lu Wanrong Zhu ... Zhengyuan Yang Kevin Lin William Yang Wang Lijuan Wang Xin Eric Wang |