ResearchTrend.AI
PlaTe: Visually-Grounded Planning with Transformers in Procedural Tasks

10 September 2021
Jiankai Sun
De-An Huang
Bo Lu
Yunhui Liu
Bolei Zhou
Animesh Garg

Papers citing "PlaTe: Visually-Grounded Planning with Transformers in Procedural Tasks"

37 / 37 papers shown
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
Hanlin Wang
Yilu Wu
Sheng Guo
Limin Wang
VGen
DiffM
127
31
0
26 Mar 2023
Egocentric Human Trajectory Forecasting with a Wearable Camera and Multi-Modal Fusion
Jianing Qiu
Lipeng Chen
Xiao Gu
Frank P.-W. Lo
Ya-Yen Tsai
Jiankai Sun
Jiaqi Liu
Benny Lo
46
15
0
01 Nov 2021
Procedure Planning in Instructional Videos via Contextual Modeling and Model-based Policy Learning
Jing Bi
Jiebo Luo
Chenliang Xu
105
49
0
05 Oct 2021
A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Valts Blukis
Chris Paxton
Dieter Fox
Animesh Garg
Yoav Artzi
LM&Ro
244
137
0
12 Jul 2021
Modular Action Concept Grounding in Semantic Video Prediction
Wei Yu
Wenxin Chen
Songheng Yin
S. Easterbrook
Animesh Garg
37
13
0
23 Nov 2020
Actionet: An Interactive End-To-End Platform For Task-Based Data Collection And Augmentation In 3D Environment
Jiafei Duan
Samson Yu
Hui Li Tan
Cheston Tan
31
8
0
03 Oct 2020
On the model-based stochastic value gradient for continuous reinforcement learning
Brandon Amos
Samuel Stanton
Denis Yarats
A. Wilson
58
71
0
28 Aug 2020
Best-First Beam Search
Clara Meister
Tim Vieira
Ryan Cotterell
46
71
0
08 Jul 2020
Transferable Active Grasping and Real Embodied Dataset
Xiangyu Chen
Zelin Ye
Jiankai Sun
Yuda Fan
Fangwei Hu
Chenxi Wang
Cewu Lu
37
19
0
28 Apr 2020
Learning a Decision Module by Imitating Driver's Control Behaviors
Junning Huang
Sirui Xie
Jiankai Sun
Gary Qiurui Ma
Chunxiao Liu
Jianping Shi
Dahua Lin
Bolei Zhou
32
31
0
30 Nov 2019
Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation
Kuan Fang
Yuke Zhu
Animesh Garg
Silvio Savarese
Li Fei-Fei
DRL
52
48
0
29 Oct 2019
Uncertainty-Aware Anticipation of Activities
Yazan Abu Farha
Juergen Gall
58
49
0
26 Aug 2019
Procedure Planning in Instructional Videos
C. Chang
De-An Huang
Danfei Xu
Ehsan Adeli
Li Fei-Fei
Juan Carlos Niebles
52
103
0
02 Jul 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
83
948
0
19 Jun 2019
Cross-view Semantic Segmentation for Sensing Surroundings
Bowen Pan
Jiankai Sun
Ho Yin Tiga Leung
A. Andonian
Bolei Zhou
48
264
0
09 Jun 2019
VideoBERT: A Joint Model for Video and Language Representation Learning
Chen Sun
Austin Myers
Carl Vondrick
Kevin Patrick Murphy
Cordelia Schmid
VLM
SSL
75
1,243
0
03 Apr 2019
Cross-task weakly supervised learning from instructional videos
Dimitri Zhukov
Jean-Baptiste Alayrac
R. G. Cinbis
David Fouhey
Ivan Laptev
Josef Sivic
SSL
113
249
0
19 Mar 2019
D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation
C. Chang
De-An Huang
Yanan Sui
Li Fei-Fei
Juan Carlos Niebles
90
156
0
09 Jan 2019
Zero-Shot Anticipation for Instructional Activities
Fadime Sener
Angela Yao
LM&Ro
119
68
0
06 Dec 2018
Learning Latent Dynamics for Planning from Pixels
Danijar Hafner
Timothy Lillicrap
Ian S. Fischer
Ruben Villegas
David R Ha
Honglak Lee
James Davidson
BDL
84
1,430
0
12 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.6K
94,511
0
11 Oct 2018
Learning Plannable Representations with Causal InfoGAN
Thanard Kurutach
Aviv Tamar
Ge Yang
Stuart J. Russell
Pieter Abbeel
GAN
DRL
66
180
0
24 Jul 2018
Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration
De-An Huang
Suraj Nair
Danfei Xu
Yuke Zhu
Animesh Garg
Li Fei-Fei
Silvio Savarese
Juan Carlos Niebles
49
140
0
10 Jul 2018
Who Let The Dogs Out? Modeling Dog Behavior From Visual Data
Kiana Ehsani
Hessam Bagherinezhad
Joseph Redmon
Roozbeh Mottaghi
Ali Farhadi
VGen
51
59
0
28 Mar 2018
AI2-THOR: An Interactive 3D Environment for Visual AI
Eric Kolve
Roozbeh Mottaghi
Winson Han
Eli VanderBilt
Luca Weihs
...
Daniel Gordon
Yuke Zhu
Aniruddha Kembhavi
Abhinav Gupta
Ali Farhadi
LM&Ro
58
1,096
0
14 Dec 2017
Neural Task Programming: Learning to Generalize Across Hierarchical Tasks
Danfei Xu
Suraj Nair
Yuke Zhu
J. Gao
Animesh Garg
Li Fei-Fei
Silvio Savarese
61
195
0
04 Oct 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
654
130,942
0
12 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
219
7,989
0
22 May 2017
Curiosity-driven Exploration by Self-supervised Prediction
Deepak Pathak
Pulkit Agrawal
Alexei A. Efros
Trevor Darrell
LRM
SSL
106
2,433
0
15 May 2017
Towards Automatic Learning of Procedures from Web Instructional Videos
Luowei Zhou
Chenliang Xu
Jason J. Corso
EgoV
72
825
0
28 Mar 2017
Joint Discovery of Object States and Manipulation Actions
Jean-Baptiste Alayrac
Josef Sivic
Ivan Laptev
Simon Lacoste-Julien
64
79
0
09 Feb 2017
Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates
S. Gu
E. Holly
Timothy Lillicrap
Sergey Levine
OffRL
SSL
114
1,479
0
03 Oct 2016
CNN Architectures for Large-Scale Audio Classification
Shawn Hershey
Sourish Chaudhuri
D. Ellis
J. Gemmeke
A. Jansen
...
Rif A. Saurous
Bryan Seybold
M. Slaney
Ron J. Weiss
K. Wilson
111
2,497
0
29 Sep 2016
Connectionist Temporal Modeling for Weakly Supervised Action Labeling
De-An Huang
Li Fei-Fei
Juan Carlos Niebles
66
237
0
28 Jul 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.1K
193,426
0
10 Dec 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.6K
149,842
0
22 Dec 2014
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
Junyoung Chung
Çağlar Gülçehre
Kyunghyun Cho
Yoshua Bengio
545
12,692
0
11 Dec 2014