Learning Visual Representations via Language-Guided Sampling

23 February 2023

Papers citing "Learning Visual Representations via Language-Guided Sampling"

35 / 35 papers shown

EcoWikiRS: Learning Ecological Representation of Satellite Images from Weak Supervision with Species Observations and Wikipedia Valerie Zermatten J. Castillo-Navarro Pallavi Jain D. Tuia Diego Marcos 62 0 0 28 Apr 2025
Impact of Language Guidance: A Reproducibility Study Cherish Puniani Advika Sinha Shree Singhi Aayan Yadav VLM 47 0 0 10 Apr 2025
DiffCLIP: Differential Attention Meets CLIP Hasan Hammoud Guohao Li VLM 44 0 0 09 Mar 2025
VLM-Vac: Enhancing Smart Vacuums through VLM Knowledge Distillation and Language-Guided Experience Replay Reihaneh Mirjalili Michael Krawez Florian Walter Wolfram Burgard 34 0 0 21 Sep 2024
What to align in multimodal contrastive learning? Benoit Dufumier J. Castillo-Navarro D. Tuia Jean-Philippe Thiran 29 3 0 11 Sep 2024
Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment Arda Senocak H. Ryu Junsik Kim Tae-Hyun Oh Hanspeter Pfister Joon Son Chung 38 3 0 18 Jul 2024
ConceptHash: Interpretable Fine-Grained Hashing via Concept Discovery Kam Woh Ng Xiatian Zhu Yi-Zhe Song Tao Xiang 37 2 0 12 Jun 2024
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models Yifei Ming Yixuan Li VLM 39 7 0 02 May 2024
Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions Leena Mathur Paul Pu Liang Louis-Philippe Morency LLMAG 32 7 0 17 Apr 2024
ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition Otto Brookes Majid Mirmehdi H. Kühl T. Burghardt 35 3 0 13 Apr 2024
Probing the 3D Awareness of Visual Foundation Models Mohamed El Banani Amit Raj Kevis-Kokitsi Maninis Abhishek Kar Yuanzhen Li Michael Rubinstein Deqing Sun Leonidas J. Guibas Justin Johnson Varun Jampani 40 79 0 12 Apr 2024
Multi Positive Contrastive Learning with Pose-Consistent Generated Images Sho Inayoshi Aji Resindra Widya Satoshi Ozaki Junji Otsuka Takeshi Ohashi 3DH 52 1 0 04 Apr 2024
UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All Yuanhuiyi Lyu Xueye Zheng Jiazhou Zhou Lin Wang 32 15 0 19 Mar 2024
Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning Mark D Mcdonnell Dong Gong Ehsan Abbasnejad Anton Van Den Hengel VLM DiffM 83 3 0 12 Mar 2024
Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos Tarun Kalluri Bodhisattwa Prasad Majumder Manmohan Chandraker VLM 31 4 0 08 Mar 2024
MNN: Mixed Nearest-Neighbors for Self-Supervised Learning Xianzhong Long Chen Peng Yun Li SSL 33 0 0 01 Nov 2023
Sound Source Localization is All about Cross-Modal Alignment Arda Senocak H. Ryu Junsik Kim Tae-Hyun Oh Hanspeter Pfister Joon Son Chung 30 18 0 19 Sep 2023
A General-Purpose Self-Supervised Model for Computational Pathology Richard J. Chen Tong Ding Ming Y. Lu Drew F. K. Williamson Guillaume Jaume ... Judy J. Wang Walt Williams L. Le Georg Gerber Faisal Mahmood MedIm 22 42 0 29 Aug 2023
Compositionally Equivariant Representation Learning Xiao Liu Pedro Sanchez Spyridon Thermos Alison Q. O'Neil Sotirios A. Tsaftaris CoGe OOD 24 2 0 13 Jun 2023
Scalable 3D Captioning with Pretrained Models Tiange Luo C. Rockwell Honglak Lee Justin Johnson 24 152 0 12 Jun 2023
Retrieval-Enhanced Contrastive Vision-Text Models Ahmet Iscen Mathilde Caron Alireza Fathi Cordelia Schmid CLIP VLM 31 26 0 12 Jun 2023
StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners Yonglong Tian Lijie Fan Phillip Isola Huiwen Chang Dilip Krishnan VLM DiffM 26 140 0 01 Jun 2023
If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection Shyamgopal Karthik Karsten Roth Massimiliano Mancini Zeynep Akata 33 20 0 22 May 2023
Hyperbolic Image-Text Representations Karan Desai Maximilian Nickel Tanmay Rajpurohit Justin Johnson Ramakrishna Vedantam VLM 39 57 0 18 Apr 2023
UniCLIP: Unified Framework for Contrastive Language-Image Pre-training Janghyeon Lee Jongsuk Kim Hyounguk Shon Bumsoo Kim Seung Wook Kim Honglak Lee Junmo Kim CLIP VLM 50 53 0 27 Sep 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision Jiarui Xu Shalini De Mello Sifei Liu Wonmin Byeon Thomas Breuel Jan Kautz Xinyu Wang ViT VLM 189 499 0 22 Feb 2022
Masked Autoencoders Are Scalable Vision Learners Kaiming He Xinlei Chen Saining Xie Yanghao Li Piotr Dollár Ross B. Girshick ViT TPM 305 7,443 0 11 Nov 2021
ResNet strikes back: An improved training procedure in timm Ross Wightman Hugo Touvron Hervé Jégou AI4TS 212 487 0 01 Oct 2021
With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations Debidatta Dwibedi Y. Aytar Jonathan Tompson P. Sermanet Andrew Zisserman SSL 188 453 0 29 Apr 2021
Pri3D: Can 3D Priors Help 2D Representation Learning? Ji Hou Saining Xie Benjamin Graham Angela Dai Matthias Nießner SSL 3DPC MDE 85 79 0 22 Apr 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts Soravit Changpinyo P. Sharma Nan Ding Radu Soricut VLM 278 1,082 0 17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision Chao Jia Yinfei Yang Ye Xia Yi-Ting Chen Zarana Parekh Hieu H. Pham Quoc V. Le Yun-hsuan Sung Zhen Li Tom Duerig VLM CLIP 298 3,700 0 11 Feb 2021
Self-supervised Co-training for Video Representation Learning Tengda Han Weidi Xie Andrew Zisserman SSL 215 309 0 19 Oct 2020
Improved Baselines with Momentum Contrastive Learning Xinlei Chen Haoqi Fan Ross B. Girshick Kaiming He SSL 267 3,371 0 09 Mar 2020
A Multi-View Embedding Space for Modeling Internet Images, Tags, and their Semantics Yunchao Gong Qifa Ke Michael Isard Svetlana Lazebnik 3DV 76 584 0 18 Dec 2012