
v1v2 (latest)
Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective
Papers citing "Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective"
15 / 15 papers shown
Title |
---|
![]() Aria: An Open Multimodal Native Mixture-of-Experts Model Dongxu Li Yudong Liu Haoning Wu Yue Wang Zhiqi Shen ...Lihuan Zhang Hanshu Yan Guoyin Wang Bei Chen Junnan Li |
![]() ControlAR: Controllable Image Generation with Autoregressive Models Zongming Li Tianheng Cheng Shoufa Chen Peize Sun Haocheng Shen Longjin Ran Xiaoxin Chen Wenyu Liu Xinggang Wang |
![]() MIO: A Foundation Model on Multimodal Tokens Zekun Wang King Zhu Chunpu Xu Wangchunshu Zhou Jiaheng Liu ...Yuanxing Zhang Ge Zhang Ke Xu Jie Fu Wenhao Huang |