Learning Deep Representations of Fine-grained Visual Descriptions

17 May 2016
Scott E. Reed
Zeynep Akata
Bernt Schiele
Honglak Lee
OCL, VLM

Papers citing "Learning Deep Representations of Fine-grained Visual Descriptions"

50 / 351 papers shown
Few-shot Novel Category Discovery
Chunming Li
Shidong Wang
Haofeng Zhang
64
0
0
13 May 2025
Probabilistic Embeddings for Frozen Vision-Language Models: Uncertainty Quantification with Gaussian Process Latent Variable Models
Aishwarya Venkataramanan
P. Bodesheim
Joachim Denzler
BDL, VLM
102
0
0
08 May 2025
Multi-modal Reference Learning for Fine-grained Text-to-Image Retrieval
Zehong Ma
Hao Chen
Wei Zeng
Limin Su
Shiliang Zhang
AI4TS
126
0
0
10 Apr 2025
Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning
Huajie Jiang
Zechao Li
Xiaohan Yu
Yongli Hu
Baocai Yin
Jian Yang
Yuankai Qi
VLM
78
0
0
29 Mar 2025
Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities
Raman Dutt
Harleen Hanspal
Guoxuan Xia
Petru-Daniel Tudosiu
Alexander Black
Yongxin Yang
Jingyu Sun
Sarah Parisot
MoE
102
0
0
28 Mar 2025
ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing
Yulin Pan
Xiangteng He
Chaojie Mao
Zhen Han
Zeyinzi Jiang
Junxuan Zhang
Yu Liu
EGVM, VLM
114
2
0
18 Mar 2025
MADS: Multi-Attribute Document Supervision for Zero-Shot Image Classification
Xiangyan Qu
Jing Yu
Jiamin Zhuang
Gaopeng Gou
Gang Xiong
Qi Wu
VLM
134
0
0
10 Mar 2025
End-to-end Training for Text-to-Image Synthesis using Dual-Text Embeddings
Yeruru Asrar Ahmed
Anurag Mittal
DiffM
122
0
0
03 Feb 2025
ArtFormer: Controllable Generation of Diverse 3D Articulated Objects
Jiayi Su
Youhe Feng
Zheng Li
Jinhua Song
Yangfan He
Botao Ren
Botian Xu
AI4CE
158
3
0
10 Dec 2024
TaxaBind: A Unified Embedding Space for Ecological Applications
Srikumar Sastry
Subash Khanal
Aayush Dhakal
Adeel Ahmad
Nathan Jacobs
132
11
0
01 Nov 2024
TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving Fine-Grained Zero-Shot Image Captioning
Joshua Forster Feinglass
Yezhou Yang
65
0
0
30 Sep 2024
A Multimodal Single-Branch Embedding Network for Recommendation in Cold-Start and Missing Modality Scenarios
Christian Ganhor
Marta Moscati
Anna Hausberger
Shah Nawaz
Markus Schedl
65
2
0
26 Sep 2024
Finetuning CLIP to Reason about Pairwise Differences
Dylan Sam
Devin Willmott
João Dias Semedo
J. Zico Kolter
VLM
115
4
0
15 Sep 2024
Making Large Vision Language Models to be Good Few-shot Learners
Fan Liu
Wenwen Cai
Jian Huo
Chuanyi Zhang
Delong Chen
Jun Zhou
89
0
0
21 Aug 2024
Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach
Muhammad Saad Saeed
Shah Nawaz
Muhammad Zaigham Zaheer
Muhammad Haris Khan
Karthik Nandakumar
Muhammad Haroon Yousaf
Hassan Sajjad
Tom De Schepper
Markus Schedl
93
0
0
14 Aug 2024
From Attributes to Natural Language: A Survey and Foresight on Text-based Person Re-identification
Fanzhi Jiang
Su Yang
Mark W. Jones
Liumei Zhang
104
1
0
31 Jul 2024
ZeroDDI: A Zero-Shot Drug-Drug Interaction Event Prediction Method with Semantic Enhanced Learning and Dual-Modal Uniform Alignment
Ziyan Wang
Zhankun Xiong
Feng Huang
Xuan Liu
Wen Zhang
82
6
0
01 Jul 2024
On the Limits of Multi-modal Meta-Learning with Auxiliary Task Modulation Using Conditional Batch Normalization
Jordi Armengol-Estapé
Vincent Michalski
Ramnath Kumar
P. St-Charles
Doina Precup
Samira Ebrahimi Kahou
143
0
0
29 May 2024
Faithful Attention Explainer: Verbalizing Decisions Based on Discriminative Features
Yao Rong
David Scheerer
Enkelejda Kasneci
80
0
0
16 May 2024
A separability-based approach to quantifying generalization: which layer is best?
Luciano Dyballa
Evan Gerritz
Steven W. Zucker
OOD
113
4
0
02 May 2024
AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models
Zhiqiang Tang
Haoyang Fang
Su Zhou
Taojiannan Yang
Zihan Zhong
Tony Hu
Katrin Kirchhoff
George Karypis
109
14
0
24 Apr 2024
Visual-Augmented Dynamic Semantic Prototype for Generative Zero-Shot Learning
W. Hou
Shiming Chen
Shuhuang Chen
Ziming Hong
Yan Wang
Xuetao Feng
Salman Khan
Fahad Shahbaz Khan
Xinge You
99
12
0
23 Apr 2024
From Data Deluge to Data Curation: A Filtering-WoRA Paradigm for Efficient Text-based Person Search
Jintao Sun
Zhedong Zheng
Gangyi Ding
124
8
0
16 Apr 2024
CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning
Haojian Huang
Xiaozhen Qiao
Zhuo Chen
Haodong Chen
Bingyu Li
Zhe Sun
Mulin Chen
Xuelong Li
117
11
0
15 Apr 2024
Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection
Ting Lei
Shaofeng Yin
Yang Liu
VLM
115
9
0
09 Apr 2024
Improving deep learning with prior knowledge and cognitive models: A survey on enhancing explainability, adversarial robustness and zero-shot learning
F. Mumuni
A. Mumuni
AAML
103
7
0
11 Mar 2024
Cross-Modal Coordination Across a Diverse Set of Input Modalities
Jorge Sánchez
Rodrigo Laguna
VLM
80
0
0
29 Jan 2024
Data-Free Generalized Zero-Shot Learning
Bowen Tang
Long Yan
Jing Zhang
Qian Yu
Lu Sheng
Dong Xu
VLM
83
11
0
28 Jan 2024
Improved Zero-Shot Classification by Adapting VLMs with Text Descriptions
Oindrila Saha
Grant Van Horn
Subhransu Maji
VLM
144
24
0
04 Jan 2024
Prototype-Guided Text-based Person Search based on Rich Chinese Descriptions
ZiQiang Wu
Bingpeng Ma
56
0
0
22 Dec 2023
ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval
Kaipeng Fang
Jingkuan Song
Lianli Gao
Pengpeng Zeng
Zhi-Qi Cheng
Xiyao Li
Hengtao Shen
VLM
71
11
0
19 Dec 2023
LIME: Localized Image Editing via Attention Regularization in Diffusion Models
Enis Simsar
A. Tonioni
Yongqin Xian
Thomas Hofmann
Federico Tombari
DiffM
68
9
0
14 Dec 2023
Large Language Models are Good Prompt Learners for Low-Shot Image Classification
Zhao-Heng Zheng
Jingmin Wei
Xuefeng Hu
Haidong Zhu
Ramkant Nevatia
VLM
106
5
0
07 Dec 2023
TextAug: Test time Text Augmentation for Multimodal Person Re-identification
Mulham Fawakherji
Eduard Vazquez
P. Giampa
Binod Bhattarai
86
3
0
04 Dec 2023
Holistic Evaluation of Text-To-Image Models
Tony Lee
Michihiro Yasunaga
Chenlin Meng
Yifan Mai
Joon Sung Park
...
Jun-Yan Zhu
Fei-Fei Li
Jiajun Wu
Stefano Ermon
Percy Liang
241
139
0
07 Nov 2023
Multimodal Foundation Models for Zero-shot Animal Species Recognition in Camera Trap Images
Zalan Fabian
Zhongqi Miao
Chunyuan Li
Yuanhan Zhang
Ziwei Liu
...
Laura Siabatto
Andrés Link
Pablo Arbelaez
Rahul Dodhia
J. L. Ferres
98
11
0
02 Nov 2023
Recognize Any Regions
Haosen Yang
Chuofan Ma
Bin Wen
Yi Jiang
Zehuan Yuan
Xiatian Zhu
ObjD, VLM
96
3
0
02 Nov 2023
BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping
Srikumar Sastry
Subash Khanal
Aayush Dhakal
Di Huang
Nathan Jacobs
77
10
0
29 Oct 2023
Open-Set Image Tagging with Multi-Grained Text Supervision
Xinyu Huang
Yi-Jie Huang
Youcai Zhang
Weiwei Tian
Rui Feng
Yuejie Zhang
Yanchun Xie
Yaqian Li
Lei Zhang
VLM
87
35
0
23 Oct 2023
Dual Feature Augmentation Network for Generalized Zero-shot Learning
L. Xiang
Yuan Zhou
Haoran Duan
Yang Long
80
1
0
25 Sep 2023
Exploring Meta Information for Audio-based Zero-shot Bird Classification
Alexander Gebhard
Andreas Triantafyllopoulos
Teresa Bez
Lukas Christ
Alexander Kathan
Björn W. Schuller
97
6
0
15 Sep 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
Lin Geng Foo
Hossein Rahmani
Jing Liu
278
31
0
27 Aug 2023
Improving Generalization of Image Captioning with Unsupervised Prompt Learning
Hongchen Wei
Zhenzhong Chen
VLM
79
3
0
05 Aug 2023
General-Purpose Multi-Modal OOD Detection Framework
Viet Duong
Qiong Wu
Zhengyi Zhou
Eric Zavesky
Jiahe Chen
Xiangzhou Liu
Wen-Ling Hsu
Huajie Shao
OODD
78
2
0
24 Jul 2023
Learning Adversarial Semantic Embeddings for Zero-Shot Recognition in Open Worlds
Tianqi Li
Guansong Pang
Xiao Bai
Jingyi Zheng
Lei Zhou
Xin Ning
VLM
86
30
0
07 Jul 2023
ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models
Uddeshya Upadhyay
Shyamgopal Karthik
Massimiliano Mancini
Zeynep Akata
MLLM, VLM
86
4
0
01 Jul 2023
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing
Kai Zhang
Lingbo Mo
Wenhu Chen
Huan Sun
Yu-Chuan Su
EGVM
226
277
0
16 Jun 2023
Waffling around for Performance: Visual Classification with Random Words and Broad Concepts
Karsten Roth
Jae Myung Kim
A. Sophia Koepke
Oriol Vinyals
Cordelia Schmid
Zeynep Akata
VLM
95
76
0
12 Jun 2023
Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark
Shuyu Yang
Yinan Zhou
Yaxiong Wang
Yujiao Wu
Li Zhu
Zhedong Zheng
VLM, DiffM
144
92
0
05 Jun 2023
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Qiangchang Wang
Yilong Yin
100
0
0
02 Jun 2023