2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition

29 December 2020

Zuxuan Wu

Papers citing "2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition"

Learning to Inference Adaptively for Multimodal Large Language Models Zhuoyan Xu Khoi Duc Nguyen Preeti Mukherjee Saurabh Bagchi Somali Chaterji Yingyu Liang Yin Li LRM 82 2 0 13 Mar 2025
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition Yue Meng Chung-Ching Lin Yikang Shen P. Sattigeri Leonid Karlinsky A. Oliva Kate Saenko Rogerio Feris 37 142 0 31 Jul 2020
Dynamic Sampling Networks for Efficient Action Recognition in Videos Yin-Dong Zheng Zhaoyang Liu Tong Lu Limin Wang 35 75 0 28 Jun 2020
X3D: Expanding Architectures for Efficient Video Recognition Christoph Feichtenhofer 118 1,013 0 09 Apr 2020
Temporal Pyramid Network for Action Recognition Ceyuan Yang Yinghao Xu Jianping Shi Bo Dai Bolei Zhou 38 369 0 07 Apr 2020
Resolution Adaptive Networks for Efficient Inference Le Yang Yizeng Han Xi Chen Shiji Song Jifeng Dai Gao Huang 48 215 0 16 Mar 2020
Learning When and Where to Zoom with Deep Reinforcement Learning Burak Uzkent Stefano Ermon 54 68 0 01 Mar 2020
Listen to Look: Action Recognition by Previewing Audio Ruohan Gao Tae-Hyun Oh Kristen Grauman Lorenzo Torresani VLM 58 251 0 10 Dec 2019
Improved Techniques for Training Adaptive Deep Networks Hao Li Hong Zhang Xiaojuan Qi Ruigang Yang Gao Huang 51 132 0 17 Aug 2019
Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition Wenhao Wu Dongliang He Xiao Tan Shifeng Chen Shilei Wen 42 128 0 31 Jul 2019
Batch-Shaping for Learning Conditional Channel Gated Networks B. Bejnordi Tijmen Blankevoort Max Welling AI4CE 37 76 0 15 Jul 2019
SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition Bruno Korbar Du Tran Lorenzo Torresani 46 224 0 08 Apr 2019
Video Classification with Channel-Separated Convolutional Networks Du Tran Heng Wang Lorenzo Torresani Matt Feiszli 3DV 50 583 0 04 Apr 2019
SlowFast Networks for Video Recognition Christoph Feichtenhofer Haoqi Fan Jitendra Malik Kaiming He 146 3,244 0 10 Dec 2018
AutoFocus: Efficient Multi-Scale Inference Mahyar Najibi Bharat Singh L. Davis ObjD 61 127 0 04 Dec 2018
AdaFrame: Adaptive Frame Selection for Fast Video Recognition Zuxuan Wu Caiming Xiong Chih-Yao Ma R. Socher L. Davis 142 195 0 29 Nov 2018
TSM: Temporal Shift Module for Efficient Video Understanding Ji Lin Chuang Gan Song Han 78 1,677 0 20 Nov 2018
Multi-Fiber Networks for Video Recognition Yunpeng Chen Yannis Kalantidis Jianshu Li Shuicheng Yan Jiashi Feng CVBM 97 217 0 30 Jul 2018
ECO: Efficient Convolutional Network for Online Video Understanding Mohammadreza Zolfaghari Kamaljeet Singh Thomas Brox 170 498 0 24 Apr 2018
AMC: AutoML for Model Compression and Acceleration on Mobile Devices Yihui He Ji Lin Zhijian Liu Hanrui Wang Li-Jia Li Song Han 69 1,348 0 10 Feb 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks Mark Sandler Andrew G. Howard Menglong Zhu A. Zhmoginov Liang-Chieh Chen 148 19,124 0 13 Jan 2018
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification Saining Xie Chen Sun Jonathan Huang Zhuowen Tu Kevin Patrick Murphy 3DH 133 1,317 0 13 Dec 2017
Convolutional Networks with Adaptive Inference Graphs Andreas Veit Serge J. Belongie OOD GNN 79 383 0 30 Nov 2017
A Closer Look at Spatiotemporal Convolutions for Action Recognition Du Tran Heng Wang Lorenzo Torresani Jamie Ray Yann LeCun Manohar Paluri 184 3,007 0 30 Nov 2017
Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks Zhaofan Qiu Ting Yao Tao Mei 71 1,655 0 28 Nov 2017
SkipNet: Learning Dynamic Routing in Convolutional Networks Xin Wang Feng Yu Zi-Yi Dou Trevor Darrell Joseph E. Gonzalez 59 628 0 26 Nov 2017
Temporal Relational Reasoning in Videos Bolei Zhou A. Andonian Aude Oliva Antonio Torralba NAI 78 1,035 0 22 Nov 2017
BlockDrop: Dynamic Inference Paths in Residual Networks Zuxuan Wu Tushar Nagarajan Abhishek Kumar Steven J. Rennie L. Davis Kristen Grauman Rogerio Feris 79 463 0 22 Nov 2017
Non-local Neural Networks Xiaolong Wang Ross B. Girshick Abhinav Gupta Kaiming He OffRL 218 8,867 0 21 Nov 2017
The "something something" video database for learning and evaluating visual common sense Raghav Goyal Samira Ebrahimi Kahou Vincent Michalski Joanna Materzynska S. Westphal ... Moritz Mueller-Freitag F. Hoppe Christian Thurau Ingo Bax Roland Memisevic VLM 71 1,516 0 13 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset João Carreira Andrew Zisserman 199 7,961 0 22 May 2017
Temporal Segment Networks for Action Recognition in Videos Limin Wang Yuanjun Xiong Zhe Wang Yu Qiao Dahua Lin Xiaoou Tang Luc Van Gool ViT 83 807 0 08 May 2017
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications Andrew G. Howard Menglong Zhu Bo Chen Dmitry Kalenichenko Weijun Wang Tobias Weyand M. Andreetto Hartwig Adam 3DH 1.1K 20,747 0 17 Apr 2017
Spatially Adaptive Computation Time for Residual Networks Michael Figurnov Maxwell D. Collins Yukun Zhu Li Zhang Jonathan Huang Dmitry Vetrov Ruslan Salakhutdinov 44 346 0 07 Dec 2016
VideoLSTM Convolves, Attends and Flows for Action Recognition Zhenyang Li E. Gavves Mihir Jain Cees G. M. Snoek 77 465 0 06 Jul 2016
Convolutional Two-Stream Network Fusion for Video Action Recognition Christoph Feichtenhofer A. Pinz Andrew Zisserman 124 2,606 0 22 Apr 2016
Deep Residual Learning for Image Recognition Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun MedIm 1.4K 192,638 0 10 Dec 2015
Beyond Short Snippets: Deep Networks for Video Classification Joe Yue-Hei Ng Matthew J. Hausknecht Sudheendra Vijayanarasimhan Oriol Vinyals R. Monga G. Toderici 111 2,334 0 31 Mar 2015
Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks Yu-Gang Jiang Zuxuan Wu Jun Wang Xiangyang Xue Shih-Fu Chang 59 360 0 25 Feb 2015
Learning Spatiotemporal Features with 3D Convolutional Networks Du Tran Lubomir D. Bourdev Rob Fergus Lorenzo Torresani Manohar Paluri 3DPC 46 410 0 02 Dec 2014
Long-term Recurrent Convolutional Networks for Visual Recognition and Description Jeff Donahue Lisa Anne Hendricks Marcus Rohrbach Subhashini Venugopalan S. Guadarrama Kate Saenko Trevor Darrell VLM 121 6,046 0 17 Nov 2014
Two-Stream Convolutional Networks for Action Recognition in Videos Karen Simonyan Andrew Zisserman 225 7,518 0 09 Jun 2014