ResearchTrend.AI
Learning Visual Representations via Language-Guided Sampling

23 February 2023
Mohamed El Banani
Karan Desai
Justin Johnson
    SSL
    VLM

Papers citing "Learning Visual Representations via Language-Guided Sampling"

35 / 35 papers shown
EcoWikiRS: Learning Ecological Representation of Satellite Images from Weak Supervision with Species Observations and Wikipedia
Valerie Zermatten
J. Castillo-Navarro
Pallavi Jain
D. Tuia
Diego Marcos
62
0
0
28 Apr 2025
Impact of Language Guidance: A Reproducibility Study
Cherish Puniani
Advika Sinha
Shree Singhi
Aayan Yadav
VLM
47
0
0
10 Apr 2025
DiffCLIP: Differential Attention Meets CLIP
Hasan Hammoud
Guohao Li
VLM
44
0
0
09 Mar 2025
VLM-Vac: Enhancing Smart Vacuums through VLM Knowledge Distillation and Language-Guided Experience Replay
Reihaneh Mirjalili
Michael Krawez
Florian Walter
Wolfram Burgard
34
0
0
21 Sep 2024
What to align in multimodal contrastive learning?
Benoit Dufumier
J. Castillo-Navarro
D. Tuia
Jean-Philippe Thiran
29
3
0
11 Sep 2024
Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
38
3
0
18 Jul 2024
ConceptHash: Interpretable Fine-Grained Hashing via Concept Discovery
Kam Woh Ng
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
37
2
0
12 Jun 2024
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
Yifei Ming
Yixuan Li
VLM
39
7
0
02 May 2024
Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions
Leena Mathur
Paul Pu Liang
Louis-Philippe Morency
LLMAG
32
7
0
17 Apr 2024
ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition
Otto Brookes
Majid Mirmehdi
H. Kühl
T. Burghardt
35
3
0
13 Apr 2024
Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani
Amit Raj
Kevis-Kokitsi Maninis
Abhishek Kar
Yuanzhen Li
Michael Rubinstein
Deqing Sun
Leonidas J. Guibas
Justin Johnson
Varun Jampani
40
79
0
12 Apr 2024
Multi Positive Contrastive Learning with Pose-Consistent Generated Images
Sho Inayoshi
Aji Resindra Widya
Satoshi Ozaki
Junji Otsuka
Takeshi Ohashi
3DH
52
1
0
04 Apr 2024
UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All
Yuanhuiyi Lyu
Xueye Zheng
Jiazhou Zhou
Lin Wang
32
15
0
19 Mar 2024
Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning
Mark D. McDonnell
Dong Gong
Ehsan Abbasnejad
Anton Van Den Hengel
VLM
DiffM
83
3
0
12 Mar 2024
Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos
Tarun Kalluri
Bodhisattwa Prasad Majumder
Manmohan Chandraker
VLM
31
4
0
08 Mar 2024
MNN: Mixed Nearest-Neighbors for Self-Supervised Learning
Xianzhong Long
Chen Peng
Yun Li
SSL
33
0
0
01 Nov 2023
Sound Source Localization is All about Cross-Modal Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
30
18
0
19 Sep 2023
A General-Purpose Self-Supervised Model for Computational Pathology
Richard J. Chen
Tong Ding
Ming Y. Lu
Drew F. K. Williamson
Guillaume Jaume
...
Judy J. Wang
Walt Williams
L. Le
Georg Gerber
Faisal Mahmood
MedIm
22
42
0
29 Aug 2023
Compositionally Equivariant Representation Learning
Xiao Liu
Pedro Sanchez
Spyridon Thermos
Alison Q. O'Neil
Sotirios A. Tsaftaris
CoGe
OOD
24
2
0
13 Jun 2023
Scalable 3D Captioning with Pretrained Models
Tiange Luo
C. Rockwell
Honglak Lee
Justin Johnson
24
152
0
12 Jun 2023
Retrieval-Enhanced Contrastive Vision-Text Models
Ahmet Iscen
Mathilde Caron
Alireza Fathi
Cordelia Schmid
CLIP
VLM
31
26
0
12 Jun 2023
StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners
Yonglong Tian
Lijie Fan
Phillip Isola
Huiwen Chang
Dilip Krishnan
VLM
DiffM
26
140
0
01 Jun 2023
If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
Shyamgopal Karthik
Karsten Roth
Massimiliano Mancini
Zeynep Akata
33
20
0
22 May 2023
Hyperbolic Image-Text Representations
Karan Desai
Maximilian Nickel
Tanmay Rajpurohit
Justin Johnson
Ramakrishna Vedantam
VLM
39
57
0
18 Apr 2023
UniCLIP: Unified Framework for Contrastive Language-Image Pre-training
Janghyeon Lee
Jongsuk Kim
Hyounguk Shon
Bumsoo Kim
Seung Wook Kim
Honglak Lee
Junmo Kim
CLIP
VLM
50
53
0
27 Sep 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
Xiaolong Wang
ViT
VLM
189
499
0
22 Feb 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021
ResNet strikes back: An improved training procedure in timm
Ross Wightman
Hugo Touvron
Hervé Jégou
AI4TS
212
487
0
01 Oct 2021
With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations
Debidatta Dwibedi
Y. Aytar
Jonathan Tompson
P. Sermanet
Andrew Zisserman
SSL
188
453
0
29 Apr 2021
Pri3D: Can 3D Priors Help 2D Representation Learning?
Ji Hou
Saining Xie
Benjamin Graham
Angela Dai
Matthias Nießner
SSL
3DPC
MDE
85
79
0
22 Apr 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
278
1,082
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
298
3,700
0
11 Feb 2021
Self-supervised Co-training for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
215
309
0
19 Oct 2020
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
267
3,371
0
09 Mar 2020
A Multi-View Embedding Space for Modeling Internet Images, Tags, and their Semantics
Yunchao Gong
Qifa Ke
Michael Isard
Svetlana Lazebnik
3DV
76
584
0
18 Dec 2012