Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2503.14324
Cited By
v1
v2 (latest)
DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies
18 March 2025
Wei Song
Yansen Wang
Zijia Song
Yadong Li
Haoze Sun
Xin Wu
Guosheng Dong
Jianhua Xu
Jiaqi Wang
Kaicheng Yu
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies"
8 / 8 papers shown
Title
Show-o2: Improved Native Unified Multimodal Models
Jinheng Xie
Zhenheng Yang
Mike Zheng Shou
VGen
35
0
0
18 Jun 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
297
1
0
05 May 2025
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
D. Jiang
Ziyu Guo
Renrui Zhang
Zhuofan Zong
Hao Li
Le Zhuo
Shilin Yan
Pheng-Ann Heng
Haoyang Li
LRM
185
24
0
01 May 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLM
EGVM
191
24
0
03 Apr 2025
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
Tianpeng Li
Qingbin Liu
Tao Zhang
Yuanbo Fang
Zheng Liang
...
Bin Cui
Jianhua Xu
Haoze Sun
Guosheng Dong
Xin Wu
AuLLM
119
7
0
24 Feb 2025
Baichuan-Omni-1.5 Technical Report
Yadong Li
Qingbin Liu
Tao Zhang
Tao Zhang
Tian Jin
...
Jianhua Xu
Haoze Sun
Mingan Lin
Guosheng Dong
Xin Wu
AuLLM
175
23
0
28 Jan 2025
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation
Zhuoyan Luo
Fengyuan Shi
Yixiao Ge
Yujiu Yang
Limin Wang
Ying Shan
VLM
160
59
0
06 Sep 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
210
338
0
16 May 2024
1