Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.12037
Cited By
MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control
18 March 2024
Enshen Zhou
Yiran Qin
Zhen-fei Yin
Yuzhou Huang
Ruimao Zhang
Lu Sheng
Yu Qiao
Jing Shao
LM&Ro
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control"
16 / 16 papers shown
Title
A Survey of Interactive Generative Video
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xueliang Wang
Pengfei Wan
Di Zhang
Kun Gai
Hao Chen
Xihui Liu
VGen
67
0
0
30 Apr 2025
Solving New Tasks by Adapting Internet Video Knowledge
Calvin Luo
Zilai Zeng
Yilun Du
Chen Sun
33
1
0
21 Apr 2025
Grounding Video Models to Actions through Goal Conditioned Exploration
Yunhao Luo
Yilun Du
LM&Ro
VGen
90
1
0
11 Nov 2024
Beyond Chain-of-Thought: A Survey of Chain-of-X Paradigms for LLMs
Yu Xia
Rui Wang
Xu Liu
Mingyan Li
Tong Yu
Xiang Chen
Julian McAuley
Shuai Li
LRM
53
19
0
24 Apr 2024
AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation
Jingkun An
Yinghao Zhu
Zongjian Li
Haoran Feng
Bohua Chen
Yemin Shi
Chengwei Pan
43
2
0
20 Mar 2024
MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception
Yiran Qin
Enshen Zhou
Qichang Liu
Zhen-fei Yin
Lu Sheng
Ruimao Zhang
Yu Qiao
Jing Shao
LM&Ro
32
39
0
12 Dec 2023
GROOT: Learning to Follow Instructions by Watching Gameplay Videos
Shaofei Cai
Bowei Zhang
Zihao Wang
Xiaojian Ma
Guy Van den Broeck
Yitao Liang
83
26
0
12 Oct 2023
MindAgent: Emergent Gaming Interaction
Ran Gong
Qiuyuan Huang
Xiaojian Ma
Hoi Vo
Zane Durante
...
Zilong Zheng
Song-Chun Zhu
Demetri Terzopoulos
Fei-Fei Li
Jianfeng Gao
LM&Ro
107
65
0
18 Sep 2023
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing
Kai Zhang
Lingbo Mo
Wenhu Chen
Huan Sun
Yu-Chuan Su
EGVM
111
240
0
16 Jun 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
208
906
0
27 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
320
4,279
0
30 Jan 2023
Red-Teaming the Stable Diffusion Safety Filter
Javier Rando
Daniel Paleka
David Lindner
Lennard Heim
Florian Tramèr
DiffM
132
184
0
03 Oct 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
369
12,081
0
04 Mar 2022
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
348
2,279
0
02 Sep 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
299
1,084
0
17 Feb 2021
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
375
75,972
0
18 May 2015
1