arXiv: 2102.05644
Training Vision Transformers for Image Retrieval

10 February 2021
Alaaeldin El-Nouby
Natalia Neverova
Ivan Laptev
Hervé Jégou
    ViT

Papers citing "Training Vision Transformers for Image Retrieval"

40 / 40 papers shown
Title
ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval
Guanqi Zhan
Yuanpei Liu
Kai Han
Weidi Xie
Andrew Zisserman
VLM
243
0
0
21 Feb 2025
Triplet Synthesis For Enhancing Composed Image Retrieval via Counterfactual Image Generation
Kenta Uesugi
Naoki Saito
Keisuke Maeda
Takahiro Ogawa
Miki Haseyama
44
0
0
22 Jan 2025
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers
Chancharik Mitra
Brandon Huang
Tianning Chai
Zhiqiu Lin
Assaf Arbelle
Rogerio Feris
Leonid Karlinsky
Trevor Darrell
Deva Ramanan
Roei Herzig
VLM
137
4
0
28 Nov 2024
HVT: A Comprehensive Vision Framework for Learning in Non-Euclidean Space
Jacob Fein-Ashley
Ethan Feng
Minh Pham
32
3
0
25 Sep 2024
Understanding Hyperbolic Metric Learning through Hard Negative Sampling
Yun Yue
Fangzhou Lin
Guanyi Mou
Ziming Zhang
SSL
34
1
0
23 Apr 2024
Towards Improved Proxy-based Deep Metric Learning via Data-Augmented Domain Adaptation
Li Ren
Chen Chen
Liqiang Wang
Kien Hua
46
7
0
01 Jan 2024
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
Ao Wang
Hui Chen
Zijia Lin
Sicheng Zhao
Jiawei Han
Guiguang Ding
ViT
36
6
0
27 Sep 2023
Masked Momentum Contrastive Learning for Zero-shot Semantic Understanding
Jiantao Wu
Shentong Mo
Muhammad Awais
Sara Atito
Zhenhua Feng
J. Kittler
VLM
36
4
0
22 Aug 2023
Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval
Yunquan Zhu
Xinkai Gao
Bo Ke
Ruizhi Qiao
Xing Sun
26
4
0
08 Aug 2023
Reading Radiology Imaging Like The Radiologist
Yuhao Wang
MedIm
39
0
0
12 Jul 2023
Graph Convolution Based Efficient Re-Ranking for Visual Retrieval
Yuqi Zhang
Qi Qian
Hongsong Wang
Chong Liu
Weihua Chen
Fan Wang
29
16
0
15 Jun 2023
Efficient OCR for Building a Diverse Digital History
Jacob Carlson
Tom Bryan
Melissa Dell
38
11
0
05 Apr 2023
MABNet: Master Assistant Buddy Network with Hybrid Learning for Image Retrieval
Rohit Agarwal
Gyanendra Das
Saksham Aggarwal
Alexander Horsch
Dilip K. Prasad
26
0
0
06 Mar 2023
Image Segmentation-based Unsupervised Multiple Objects Discovery
Sandra Kara
Hejer Ammar
Florian Chabot
Q. C. Pham
OCL
24
6
0
20 Dec 2022
Co-training $2^L$ Submodels for Visual Recognition
Hugo Touvron
Matthieu Cord
Maxime Oquab
Piotr Bojanowski
Jakob Verbeek
Hervé Jégou
VLM
37
9
0
09 Dec 2022
Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training
Zhenglun Kong
Haoyu Ma
Geng Yuan
Mengshu Sun
Yanyue Xie
...
Tianlong Chen
Xiaolong Ma
Xiaohui Xie
Zhangyang Wang
Yanzhi Wang
ViT
34
22
0
19 Nov 2022
Boosting vision transformers for image retrieval
Chull Hwan Song
Jooyoung Yoon
Shunghyun Choi
Yannis Avrithis
ViT
34
32
0
21 Oct 2022
General Image Descriptors for Open World Image Retrieval using ViT CLIP
Marcos V. Conde
Ivan Aerlic
Simon Jégou
CLIP
34
2
0
20 Oct 2022
ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval
A. Fragomeni
Michael Wray
Dima Damen
CLIP
ViT
25
3
0
09 Oct 2022
Coded Residual Transform for Generalizable Deep Metric Learning
Shichao Kan
Yixiong Liang
Min Li
Yigang Cen
Jianxin Wang
Z. He
34
3
0
09 Oct 2022
Supervised Metric Learning to Rank for Retrieval via Contextual Similarity Optimization
Christopher Liao
Theodoros Tsiligkaridis
Brian Kulis
SSL
34
5
0
04 Oct 2022
BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval
Wenqiao Zhang
Jiannan Guo
Meng Li
Haochen Shi
Shengyu Zhang
Juncheng Li
Siliang Tang
Yueting Zhuang
55
6
0
09 Jul 2022
Learning Sequential Descriptors for Sequence-based Visual Place Recognition
R. Mereu
Gabriele Trivigno
Gabriele Berton
Carlo Masone
Barbara Caputo
21
30
0
08 Jul 2022
BodyMap: Learning Full-Body Dense Correspondence Map
A. Ianina
N. Sarafianos
Yuanlu Xu
Ignacio Rocco
Tony Tung
3DH
30
14
0
18 May 2022
Residual Mixture of Experts
Lemeng Wu
Mengchen Liu
Yinpeng Chen
Dongdong Chen
Xiyang Dai
Lu Yuan
MoE
27
36
0
20 Apr 2022
Hyperbolic Vision Transformers: Combining Improvements in Metric Learning
Aleksandr Ermolov
L. Mirvakhabova
Valentin Khrulkov
N. Sebe
Ivan Oseledets
36
100
0
21 Mar 2022
A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection
Sifeng He
Xudong Yang
Chenhan Jiang
Gang Liang
Wei Zhang
...
Kaiming Huang
Yuan Cheng
Feng Qian
Xiaobo Zhang
Lei Yang
26
12
0
05 Mar 2022
A Self-Supervised Descriptor for Image Copy Detection
Ed Pizzi
Sreya Dutta Roy
Sugosh Nagavara Ravindra
Priya Goyal
Matthijs Douze
SSL
34
117
0
21 Feb 2022
Scene-Adaptive Attention Network for Crowd Counting
Xing Wei
Yuanrui Kang
Jihao Yang
Yunfeng Qiu
Dahu Shi
Wenming Tan
Yihong Gong
ViT
27
18
0
31 Dec 2021
All the attention you need: Global-local, spatial-channel attention for image retrieval
Chull Hwan Song
Hye Joo Han
Yannis Avrithis
18
39
0
16 Jul 2021
Feature Fusion Vision Transformer for Fine-Grained Visual Categorization
Jun Wang
Xiaohan Yu
Yongsheng Gao
ViT
43
105
0
06 Jul 2021
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
ViT
51
959
0
01 Jul 2021
XCiT: Cross-Covariance Image Transformers
Alaaeldin El-Nouby
Hugo Touvron
Mathilde Caron
Piotr Bojanowski
Matthijs Douze
...
Ivan Laptev
Natalia Neverova
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
ViT
42
499
0
17 Jun 2021
Person Re-Identification with a Locally Aware Transformer
Charu Sharma
S. R. Kapil
David Chapman
ViT
48
45
0
07 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
50
4,841
0
31 May 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
63
1,224
0
22 Apr 2021
SiT: Self-supervised vIsion Transformer
Sara Atito Ali Ahmed
Muhammad Awais
J. Kittler
ViT
39
139
0
08 Apr 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
27
988
0
31 Mar 2021
Deep Learning for Instance Retrieval: A Survey
Wei Chen
Yu Liu
Weiping Wang
E. Bakker
Theodoros Georgiou
Paul Fieguth
Li Liu
M. Lew
VLM
13
145
0
27 Jan 2021
Investigating the Vision Transformer Model for Image Retrieval Tasks
S. Gkelios
Y. Boutalis
S. Chatzichristofis
VLM
ViT
26
30
0
11 Jan 2021