Title |
---|
MIO: A Foundation Model on Multimodal Tokens. Zekun Wang, King Zhu, Chunpu Xu, Wangchunshu Zhou, Jiaheng Liu, ..., Yuanxing Zhang, Ge Zhang, Ke Xu, Jie Fu, Wenhao Huang |
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions. Weifeng Lin, Xinyu Wei, Renrui Zhang, Le Zhuo, Shitian Zhao, ..., Junlin Xie, Yu Qiao, Peng Gao, Hongsheng Li |
ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning. Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Hao-Ran Wei, ..., Jian-Yuan Sun, Yuang Peng, Runpei Dong, Chunrui Han, Xiangyu Zhang |