2305.11846
Any-to-Any Generation via Composable Diffusion
19 May 2023
Zineng Tang
Ziyi Yang
Chenguang Zhu
Michael Zeng
Mohit Bansal
VGen
DiffM
Papers citing "Any-to-Any Generation via Composable Diffusion"
35 / 35 papers shown
Title
Show-o2: Improved Native Unified Multimodal Models
Jinheng Xie
Zhenheng Yang
Mike Zheng Shou
VGen
48
0
0
18 Jun 2025
Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation
Zhiyang Xu
Jiuhai Chen
Zhaojiang Lin
Xichen Pan
Lifu Huang
...
Di Jin
Michihiro Yasunaga
Lili Yu
Xi Lin
Shaoliang Nie
125
1
0
12 Jun 2025
Multimodal Representation Alignment for Cross-modal Information Retrieval
Fan Xu
Luis A. Leiva
19
0
0
10 Jun 2025
Uncertainty-o: One Model-agnostic Framework for Unveiling Uncertainty in Large Multimodal Models
Ruiyang Zhang
Hu Zhang
Hao Fei
Zhedong Zheng
UQCV
46
0
0
09 Jun 2025
Average Calibration Losses for Reliable Uncertainty in Medical Image Segmentation
Theodore Barfoot
Luis C. Garcia-Peraza-Herrera
Samet Akcay
Ben Glocker
Tom Vercauteren
UQCV
140
0
0
04 Jun 2025
SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios
Lingwei Dang
Ruizhi Shao
Hongwen Zhang
Wei Min
Yebin Liu
Qingyao Wu
DiffM
VGen
95
0
0
03 Jun 2025
Any-to-Any Vision-Language Model for Multimodal X-ray Imaging and Radiological Report Generation
Daniele Molino
Francesco Di Feola
Linlin Shen
Paolo Soda
V. Guarrasi
MedIm
LM&MA
132
1
0
02 May 2025
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
Xianwei Zhuang
Yuxin Xie
Yufan Deng
Dongchao Yang
Liming Liang
Jinghan Ru
Yuguo Yin
Yuexian Zou
162
5
0
03 Apr 2025
What Makes an Evaluation Useful? Common Pitfalls and Best Practices
Gil Gekker
Meirav Segal
Dan Lahav
Omer Nevo
ELM
110
0
0
30 Mar 2025
Upcycling Text-to-Image Diffusion Models for Multi-Task Capabilities
Ruchika Chavhan
Abhinav Mehrotra
Malcolm Chadwick
Alberto Gil C. P. Ramos
Luca Morreale
Mehdi Noroozi
Sourav Bhattacharya
91
0
0
14 Mar 2025
UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths
Weijia Mao
Zhiyong Yang
Mike Zheng Shou
MoE
202
1
0
10 Feb 2025
Parameter-Efficient Fine-Tuning for Foundation Models
Dan Zhang
Tao Feng
Lilong Xue
Yuandong Wang
Yuxiao Dong
J. Tang
236
12
0
23 Jan 2025
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
191
3
0
10 Jan 2025
XGeM: A Multi-Prompt Foundation Model for Multimodal Medical Data Generation
Daniele Molino
Francesco Di Feola
E. Faiella
Deborah Fazzini
D. Santucci
Linlin Shen
V. Guarrasi
Paolo Soda
SyDa
MedIm
127
1
0
08 Jan 2025
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Hao Fei
Shengqiong Wu
Hao Zhang
Tat-Seng Chua
Shuicheng Yan
190
42
0
31 Dec 2024
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Ho Kei Cheng
Masato Ishii
Akio Hayakawa
Takashi Shibuya
Alex Schwing
Yuki Mitsufuji
VGen
294
18
0
19 Dec 2024
Olympus: A Universal Task Router for Computer Vision Tasks
Yuanze Lin
Yunsheng Li
Dongdong Chen
Weijian Xu
Ronald Clark
Philip Torr
VLM
ObjD
548
1
0
12 Dec 2024
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Zichun Liao
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VGen
183
9
0
02 Dec 2024
Spider: Any-to-Many Multimodal LLM
Jinxiang Lai
Jie Zhang
Jun Liu
Jian Li
Xiaocheng Lu
Song Guo
MLLM
191
2
0
14 Nov 2024
MEV Capture Through Time-Advantaged Arbitrage
Robin Fritsch
Maria Ines Silva
A. Mamageishvili
Benjamin Livshits
E. Felten
108
3
0
14 Oct 2024
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation
Masato Ishii
Akio Hayakawa
Takashi Shibuya
Yuki Mitsufuji
VGen
DiffM
165
4
0
26 Sep 2024
An overview of domain-specific foundation model: key technologies, applications and challenges
Haolong Chen
Hanzhi Chen
Zijian Zhao
Kaifeng Han
Guangxu Zhu
Yichen Zhao
Ying Du
Wei Xu
Qingjiang Shi
ALM
VLM
131
5
0
06 Sep 2024
Read, Watch and Scream! Sound Generation from Text and Video
Yujin Jeong
Yunji Kim
Sanghyuk Chun
Jiyoung Lee
VGen
DiffM
89
15
0
08 Jul 2024
Sequential Contrastive Audio-Visual Learning
Ioannis Tsiamas
Santiago Pascual
Chunghsin Yeh
Joan Serrà
100
3
0
08 Jul 2024
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
Rishit Dagli
Shivesh Prakash
Robert Wu
H. Khosravani
143
6
0
06 Jun 2024
Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion
Jiangkai Wu
Liming Liu
Yunpeng Tan
Junlin Hao
Xinggong Zhang
146
3
0
30 May 2024
Video Diffusion Models: A Survey
Andrew Melnik
Michal Ljubljanac
Cong Lu
Qi Yan
Weiming Ren
Helge J. Ritter
VGen
145
16
0
06 May 2024
Denoising Task Difficulty-based Curriculum for Training Diffusion Models
Jin-Young Kim
Hyojun Go
Soonwoo Kwon
Hyun-Gyoon Kim
DiffM
180
6
0
15 Mar 2024
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
Shoubin Yu
Jaehong Yoon
Mohit Bansal
175
7
0
08 Feb 2024
Towards Flexible, Scalable, and Adaptive Multi-Modal Conditioned Face Synthesis
Jingjing Ren
Cheng Xu
Haoyu Chen
Xinran Qin
Lei Zhu
CVBM
DiffM
99
4
0
26 Dec 2023
InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following
Shufan Li
Harkanwar Singh
Aditya Grover
DiffM
93
10
0
11 Dec 2023
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Yatong Bai
Trung D. Q. Dang
Dung N. Tran
K. Koishida
Somayeh Sojoudi
DiffM
165
23
0
19 Sep 2023
Mobile Foundation Model as Firmware
Jinliang Yuan
Chenchen Yang
Dongqi Cai
Shihe Wang
Xin Yuan
...
Di Zhang
Hanzi Mei
Xianqing Jia
Shangguang Wang
Mengwei Xu
120
22
0
28 Aug 2023
On the Design Fundamentals of Diffusion Models: A Survey
Ziyi Chang
George Alex Koulieris
Hyung Jin Chang
Hubert P. H. Shum
DiffM
183
56
0
07 Jun 2023
CoBIT: A Contrastive Bi-directional Image-Text Generation Model
Haoxuan You
Mandy Guo
Zhecan Wang
Kai-Wei Chang
Jason Baldridge
Jiahui Yu
DiffM
83
13
0
23 Mar 2023