ResearchTrend.AI

Bridging Language, Vision and Action: Multimodal VAEs in Robotic Manipulation Tasks
arXiv: 2404.01932 · 2 April 2024
G. Sejnova, M. Vavrecka, Karla Stepanova

Papers citing "Bridging Language, Vision and Action: Multimodal VAEs in Robotic Manipulation Tasks"

3 / 3 papers shown
- LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
  Dhruv Shah, B. Osinski, Brian Ichter, Sergey Levine (10 Jul 2022)
- A survey of multimodal deep generative models
  Masahiro Suzuki, Y. Matsuo (05 Jul 2022)
- VideoGPT: Video Generation using VQ-VAE and Transformers
  Wilson Yan, Yunzhi Zhang, Pieter Abbeel, A. Srinivas (20 Apr 2021)