ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2306.00973
  4. Cited By
Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion
  Models

Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models

1 June 2023
Chang-rui Liu
Haoning Wu
Yujie Zhong
Xiaoyu Zhang
Yanfeng Wang
Weidi Xie
    DiffM
    VLM
ArXivPDFHTML

Papers citing "Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models"

35 / 35 papers shown
Title
Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling
Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling
Jaskirat Singh
Junshen Kevin Chen
Jonas Kohler
Michael Cohen
DiffM
VGen
43
0
0
08 Apr 2025
One-Minute Video Generation with Test-Time Training
One-Minute Video Generation with Test-Time Training
Karan Dalal
Daniel Koceja
Gashon Hussein
Jiarui Xu
Yue Zhao
...
Tatsunori Hashimoto
Sanmi Koyejo
Yejin Choi
Yu Sun
Xiaolong Wang
ViT
91
4
0
07 Apr 2025
Consistent Subject Generation via Contrastive Instantiated Concepts
Consistent Subject Generation via Contrastive Instantiated Concepts
Lee Hsin-Ying
Kelvin Chan
Ming Yang
DiffM
95
0
0
31 Mar 2025
Object Isolated Attention for Consistent Story Visualization
Object Isolated Attention for Consistent Story Visualization
Xiangyang Luo
Junhao Cheng
Yifan Xie
Xin Zhang
Tao Feng
Ziqiang Liu
Fei Ma
Fei Richard Yu
DiffM
50
1
0
30 Mar 2025
MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving
MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving
Haiguang Wang
Daqi Liu
Hongwei Xie
Haisong Liu
Enhui Ma
Kaicheng Yu
Limin Wang
Bing Wang
VGen
72
0
0
20 Mar 2025
Automated Movie Generation via Multi-Agent CoT Planning
Weijia Wu
Zeyu Zhu
Mike Zheng Shou
VGen
80
2
0
10 Mar 2025
VisAgent: Narrative-Preserving Story Visualization Framework
Seungkwon Kim
GyuTae Park
Sangyeon Kim
Seung-Hun Nam
45
0
0
04 Mar 2025
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffM
VGen
122
1
0
22 Nov 2024
StoryAgent: Customized Storytelling Video Generation via Multi-Agent
  Collaboration
StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration
Panwen Hu
Jin Jiang
Jianqi Chen
Mingfei Han
Shengcai Liao
Xiaojun Chang
Xiaodan Liang
VGen
DiffM
43
5
0
07 Nov 2024
KAHANI: Culturally-Nuanced Visual Storytelling Tool for Non-Western Cultures
KAHANI: Culturally-Nuanced Visual Storytelling Tool for Non-Western Cultures
Hamna
Deepthi Sudharsan
Agrima Seth
Ritvik Budhiraja
Deepika Khullar
Vyshak Jain
Kalika Bali
Aditya Vashistha
Sameer Segal
DiffM
39
0
0
25 Oct 2024
M2Diffuser: Diffusion-based Trajectory Optimization for Mobile
  Manipulation in 3D Scenes
M2Diffuser: Diffusion-based Trajectory Optimization for Mobile Manipulation in 3D Scenes
Sixu Yan
Zeyu Zhang
Muzhi Han
Zaijin Wang
Qi Xie
Zhitian Li
Zhehan Li
Hangxin Liu
Xinggang Wang
Song-Chun Zhu
57
4
0
15 Oct 2024
Story-Adapter: A Training-free Iterative Framework for Long Story
  Visualization
Story-Adapter: A Training-free Iterative Framework for Long Story Visualization
Jiawei Mao
Xiaoke Huang
Yunfei Xie
Yuanqi Chang
Mude Hui
Bingjie Xu
Yuyin Zhou
VGen
DiffM
43
0
0
08 Oct 2024
A Survey on Multimodal Benchmarks: In the Era of Large AI Models
A Survey on Multimodal Benchmarks: In the Era of Large AI Models
Lin Li
Guikun Chen
Hanrong Shi
Jun Xiao
Long Chen
42
9
0
21 Sep 2024
CinePreGen: Camera Controllable Video Previsualization via
  Engine-powered Diffusion
CinePreGen: Camera Controllable Video Previsualization via Engine-powered Diffusion
Yiran Chen
Anyi Rao
Xuekun Jiang
Shishi Xiao
Ruiqing Ma
Zeyu Wang
Hui Xiong
Bo Dai
VGen
DiffM
38
1
0
30 Aug 2024
MegaFusion: Extend Diffusion Models towards Higher-resolution Image
  Generation without Further Tuning
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning
Haoning Wu
Shaocheng Shen
Qiang Hu
Xiaoyun Zhang
Ya Zhang
Yanfeng Wang
40
10
0
20 Aug 2024
Openstory++: A Large-scale Dataset and Benchmark for Instance-aware
  Open-domain Visual Storytelling
Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling
Zilyu Ye
Yu Lei
Ruotian Peng
Jinjin Cao
Zhiyang Chen
...
Mingyuan Zhou
Xiaoqian Shen
Mohamed Elhoseiny
Nan Zhuang
Guo-Jun Qi
VGen
VLM
40
1
0
07 Aug 2024
DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion
DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion
Huiguo He
Huan Yang
Zixi Tuo
Yuan Zhou
Qiuyue Wang
Yuhang Zhang
Zeyu Liu
Wenhao Huang
Hongyang Chao
Jian Yin
DiffM
VGen
62
12
0
17 Jul 2024
SEED-Story: Multimodal Long Story Generation with Large Language Model
SEED-Story: Multimodal Long Story Generation with Large Language Model
Shuai Yang
Yuying Ge
Yang Li
Yukang Chen
Yixiao Ge
Ying Shan
Yingcong Chen
VGen
DiffM
83
26
0
11 Jul 2024
StoryDiffusion: How to Support UX Storyboarding With Generative-AI
StoryDiffusion: How to Support UX Storyboarding With Generative-AI
Zhaohui Liang
Xiaoyu Zhang
Kevin Ma
Zhao Liu
Xipei Ren
K. Goucher-Lambert
Can Liu
DiffM
40
6
0
10 Jul 2024
Replication in Visual Diffusion Models: A Survey and Outlook
Replication in Visual Diffusion Models: A Survey and Outlook
Wenhao Wang
Yifan Sun
Zongxin Yang
Zhengdong Hu
Zhentao Tan
Yi Yang
86
7
0
07 Jul 2024
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
Wei Chen
Lin Li
Yongqi Yang
Bin Wen
Fan Yang
Tingting Gao
Yu Wu
Long Chen
VLM
VGen
47
6
0
15 Jun 2024
Binarized Diffusion Model for Image Super-Resolution
Binarized Diffusion Model for Image Super-Resolution
Zheng Chen
Haotong Qin
Yong Guo
Xiongfei Su
Xin Yuan
Linghe Kong
Yulun Zhang
DiffM
34
9
0
09 Jun 2024
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image
  Generation
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation
Junhao Cheng
Xi Lu
Hanhui Li
Khun Loun Zai
Baiqiao Yin
Yuhao Cheng
Yiqiang Yan
Xiaodan Liang
DiffM
VGen
45
10
0
03 Jun 2024
RefDrop: Controllable Consistency in Image or Video Generation via
  Reference Feature Guidance
RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance
JiaoJiao Fan
Haotian Xue
Qinsheng Zhang
Yongxin Chen
38
1
0
27 May 2024
StoryImager: A Unified and Efficient Framework for Coherent Story
  Visualization and Completion
StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
Ming Tao
Bing-Kun Bao
Hao Tang
Yaowei Wang
Changsheng Xu
DiffM
46
5
0
09 Apr 2024
Many-to-many Image Generation with Auto-regressive Diffusion Models
Many-to-many Image Generation with Auto-regressive Diffusion Models
Ying Shen
Yizhe Zhang
Shuangfei Zhai
Lifu Huang
J. Susskind
Jiatao Gu
43
6
0
03 Apr 2024
WonderJourney: Going from Anywhere to Everywhere
WonderJourney: Going from Anywhere to Everywhere
Hong-Xing Yu
Haoyi Duan
Junhwa Hur
Kyle Sargent
Michael Rubinstein
...
Forrester Cole
Deqing Sun
Noah Snavely
Jiajun Wu
Charles Herrmann
VGen
33
51
0
06 Dec 2023
Make-A-Storyboard: A General Framework for Storyboard with Disentangled
  and Merged Control
Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control
Sitong Su
Litao Guo
Lianli Gao
Hengtao Shen
Jingkuan Song
DiffM
35
3
0
06 Dec 2023
OpenLEAF: Open-Domain Interleaved Image-Text Generation and Evaluation
OpenLEAF: Open-Domain Interleaved Image-Text Generation and Evaluation
Jie An
Zhengyuan Yang
Linjie Li
Jianfeng Wang
K. Lin
Zicheng Liu
Lijuan Wang
Jiebo Luo
25
11
0
11 Oct 2023
Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image
  Generation
Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation
Yuval Kirstain
Adam Polyak
Uriel Singer
Shahbuland Matiana
Joe Penna
Omer Levy
EGVM
168
352
0
02 May 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
392
4,154
0
28 Jan 2022
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
Andreas Lugmayr
Martin Danelljan
Andrés Romero
Feng Yu
Radu Timofte
Luc Van Gool
DiffM
233
1,355
0
24 Jan 2022
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
362
5,811
0
29 Apr 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,796
0
24 Feb 2021
Imagine This! Scripts to Compositions to Videos
Imagine This! Scripts to Compositions to Videos
Tanmay Gupta
Dustin Schwenk
Ali Farhadi
Derek Hoiem
Aniruddha Kembhavi
CoGe
VGen
113
87
0
10 Apr 2018
1