Visual Knowledge in the Big Model Era: Retrospect and Prospect

5 April 2024

Papers citing "Visual Knowledge in the Big Model Era: Retrospect and Prospect"

Learning Clustering-based Prototypes for Compositional Zero-shot Learning Hongyu Qu Jianan Wei Xiangbo Shu Wenguan Wang VLM 145 1 0 10 Feb 2025
Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation Minghan Chen Guikun Chen Wenguan Wang Yi Yang 114 4 0 16 Sep 2024
Fine-Grained Domain Generalization with Feature Structuralization Wenlong Yu Dongyue Chen Qilong Wang Qinghua Hu 81 0 0 13 Jun 2024
A Survey on 3D Gaussian Splatting Guikun Chen Wenguan Wang 3DGS 192 192 0 08 Jan 2024
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation Nataniel Ruiz Yuanzhen Li Varun Jampani Yael Pritch Michael Rubinstein Kfir Aberman 279 2,891 0 25 Aug 2022
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion Rinon Gal Yuval Alaluf Yuval Atzmon Or Patashnik Amit H. Bermano Gal Chechik Daniel Cohen-Or 164 1,897 0 02 Aug 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents Aditya A. Ramesh Prafulla Dhariwal Alex Nichol Casey Chu Mark Chen VLM DiffM 413 6,916 0 13 Apr 2022
Rethinking Semantic Segmentation: A Prototype View Tianfei Zhou Wenguan Wang E. Konukoglu Luc Van Gool SSeg 109 274 0 28 Mar 2022
Temporal-Spatial Causal Interpretations for Vision-Based Reinforcement Learning Wenjie Shi Gao Huang Shiji Song Cheng Wu 66 9 0 06 Dec 2021
Relational World Knowledge Representation in Contextual Language Models: A Review Tara Safavi Danai Koutra KELM 82 51 0 12 Apr 2021
Counterfactual Zero-Shot and Open-Set Visual Recognition Zhongqi Yue Tan Wang Hanwang Zhang Qianru Sun Xiansheng Hua BDL 216 197 0 01 Mar 2021
Zero-Shot Text-to-Image Generation Aditya A. Ramesh Mikhail Pavlov Gabriel Goh Scott Gray Chelsea Voss Alec Radford Mark Chen Ilya Sutskever VLM 418 5,000 0 24 Feb 2021
D-NeRF: Neural Radiance Fields for Dynamic Scenes Albert Pumarola Enric Corona Gerard Pons-Moll Francesc Moreno-Noguer 121 1,448 0 27 Nov 2020
Visual Commonsense R-CNN Tan Wang Jianqiang Huang Hanwang Zhang Qianru Sun SSL ObjD CML 60 250 0 27 Feb 2020
From Recognition to Cognition: Visual Commonsense Reasoning Rowan Zellers Yonatan Bisk Ali Farhadi Yejin Choi LRM BDL OCL ReLM 178 883 0 27 Nov 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding Kexin Yi Jiajun Wu Chuang Gan Antonio Torralba Pushmeet Kohli J. Tenenbaum NAI 84 611 0 04 Oct 2018
Anticipating Traffic Accidents with Adaptive Loss and Large-scale Incident DB Tomoyuki Suzuki Hirokatsu Kataoka Y. Aoki Y. Satoh 68 114 0 08 Apr 2018
Dynamic Routing Between Capsules S. Sabour Nicholas Frosst Geoffrey E. Hinton 182 4,606 0 26 Oct 2017
The "something something" video database for learning and evaluating visual common sense Raghav Goyal Samira Ebrahimi Kahou Vincent Michalski Joanna Materzynska S. Westphal ... Moritz Mueller-Freitag F. Hoppe Christian Thurau Ingo Bax Roland Memisevic VLM 98 1,542 0 13 Jun 2017
Visual Interaction Networks Nicholas Watters Andrea Tacchetti T. Weber Razvan Pascanu Peter W. Battaglia Daniel Zoran PINN 3DH 98 279 0 05 Jun 2017
Prototypical Networks for Few-shot Learning Jake C. Snell Kevin Swersky R. Zemel 305 8,154 0 15 Mar 2017
Generative Adversarial Text to Image Synthesis Scott E. Reed Zeynep Akata Xinchen Yan Lajanugen Logeswaran Bernt Schiele Honglak Lee GAN 207 3,149 0 17 May 2016
Deep multi-scale video prediction beyond mean square error Michaël Mathieu Camille Couprie Yann LeCun GAN 126 1,882 0 17 Nov 2015
Neural Module Networks Jacob Andreas Marcus Rohrbach Trevor Darrell Dan Klein CoGe 139 1,076 0 09 Nov 2015
ImageNet Large Scale Visual Recognition Challenge Olga Russakovsky Jia Deng Hao Su J. Krause S. Satheesh ... A. Karpathy A. Khosla Michael S. Bernstein Alexander C. Berg Li Fei-Fei VLM ObjD 1.7K 39,615 0 01 Sep 2014