ResearchTrend.AI
Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding

29 March 2022
Jiabo Ye
Junfeng Tian
Ming Yan
Xiaoshan Yang
Xuwu Wang
Ji Zhang
Liang He
Xin Lin
    ObjD
arXiv: 2203.15442

Papers citing "Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding"

35 / 35 papers shown
Title
Progressive Language-guided Visual Learning for Multi-Task Visual Grounding
Jingchao Wang
Hong Wang
Wenlong Zhang
Kunhua Ji
Dingjiang Huang
Yefeng Zheng
ObjD
50
0
0
22 Apr 2025
Efficient Adaptation For Remote Sensing Visual Grounding
Hasan Moughnieh
Mohamad Chalhoub
Hasan Nasrallah
Cristiano Nattero
Paolo Campanella
Giovanni Nico
A. Ghandour
51
0
0
29 Mar 2025
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding
Liangtao Shi
Ting Liu
Xiantao Hu
Yue Hu
Quanjun Yin
Richang Hong
ObjD
54
0
0
24 Feb 2025
Multi-modal Fusion and Query Refinement Network for Video Moment Retrieval and Highlight Detection
Yifang Xu
Yunzhuo Sun
Benxiang Zhai
Zien Xie
Youyao Jia
S. Du
63
2
0
18 Jan 2025
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
Ting Liu
Zunnan Xu
Yue Hu
Liangtao Shi
Zhiqiang Wang
Quanjun Yin
67
2
0
03 Jan 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
67
4
0
31 Dec 2024
To Predict or Not To Predict? Proportionally Masked Autoencoders for Tabular Data Imputation
Jungkyu Kim
Kibok Lee
Taeyoung Park
52
0
0
26 Dec 2024
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du
Haoxin Li
Jianfei Yu
Boyang Li
227
0
0
01 Dec 2024
Phrase Decoupling Cross-Modal Hierarchical Matching and Progressive Position Correction for Visual Grounding
Minghong Xie
Ming Wang
Huafeng Li
Yafei Zhang
Dapeng Tao
Z. Yu
ObjD
40
1
0
31 Oct 2024
OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling
Linhui Xiao
Xiaoshan Yang
Fang Peng
Yaowei Wang
Changsheng Xu
ObjD
40
5
0
10 Oct 2024
Make Graph-based Referring Expression Comprehension Great Again through Expression-guided Dynamic Gating and Regression
Jingcheng Ke
Dele Wang
Jun-Cheng Chen
I-Hong Jhuo
Chia-Wen Lin
Yen-Yu Lin
35
0
0
05 Sep 2024
ACTRESS: Active Retraining for Semi-supervised Visual Grounding
Weitai Kang
Mengxue Qu
Yunchao Wei
Yan Yan
44
6
0
03 Jul 2024
Visual Grounding with Attention-Driven Constraint Balancing
Weitai Kang
Luowei Zhou
Junyi Wu
Changchang Sun
Yan Yan
45
4
0
03 Jul 2024
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang
Gaowen Liu
Mubarak Shah
Yan Yan
ObjD
41
9
0
03 Jul 2024
ScanFormer: Referring Expression Comprehension by Iteratively Scanning
Wei Su
Peihan Miao
Huanzhang Dou
Xi Li
ObjD
50
7
0
26 Jun 2024
HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding
Linhui Xiao
Xiaoshan Yang
Fang Peng
Yaowei Wang
Changsheng Xu
ObjD
34
8
0
20 Apr 2024
Unifying Visual and Vision-Language Tracking via Contrastive Learning
Yinchao Ma
Yuyang Tang
Wenfei Yang
Tianzhu Zhang
Jinpeng Zhang
Mengxue Kang
ObjD
23
12
0
20 Jan 2024
Bridging Modality Gap for Visual Grounding with Effecitve Cross-modal Distillation
Jiaxi Wang
Wenhui Hu
Xueyang Liu
Beihu Wu
Yuting Qiu
Yingying Cai
26
0
0
29 Dec 2023
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang
Jiajun Deng
Mingbo Jia
ObjD
45
7
0
23 Dec 2023
Mono3DVG: 3D Visual Grounding in Monocular Images
Yangfan Zhan
Yuan Yuan
Zhitong Xiong
MDE
36
9
0
13 Dec 2023
Language-Guided Diffusion Model for Visual Grounding
Sijia Chen
Baochun Li
37
5
0
18 Aug 2023
Language Adaptive Weight Generation for Multi-task Visual Grounding
Wei Su
Peihan Miao
Huanzhang Dou
Gaoang Wang
Liang Qiao
Zheyang Li
Xi Li
ObjD
27
32
0
06 Jun 2023
TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding
Chenchi Zhang
Jun Xiao
Lei Chen
Jian Shao
Long Chen
VLM
LRM
34
2
0
19 May 2023
CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual Grounding
Linhui Xiao
Xiaoshan Yang
Fang Peng
Ming Yan
Yaowei Wang
Changsheng Xu
ObjD
VLM
35
30
0
15 May 2023
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
Yiwei Ma
Xiaioqing Zhang
Xiaoshuai Sun
Jiayi Ji
Haowei Wang
Guannan Jiang
Weilin Zhuang
Rongrong Ji
25
39
0
28 Mar 2023
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
Siyi Liu
Yaoyuan Liang
Feng Li
Shijia Huang
Hao Zhang
Hang Su
Jun Zhu
Lei Zhang
ObjD
50
26
0
28 Nov 2022
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data
Yangfan Zhan
Zhitong Xiong
Yuan Yuan
78
107
0
23 Oct 2022
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
Fengyuan Shi
Ruopeng Gao
Weilin Huang
Limin Wang
30
23
0
28 Sep 2022
TransVG++: End-to-End Visual Grounding with Language Conditioned Vision Transformer
Jiajun Deng
Zhengyuan Yang
Daqing Liu
Tianlang Chen
Wen-gang Zhou
Yanyong Zhang
Houqiang Li
Wanli Ouyang
ViT
35
50
0
14 Jun 2022
Training Vision-Language Transformers from Captions
Liangke Gui
Yingshan Chang
Qiuyuan Huang
Subhojit Som
Alexander G. Hauptmann
Jianfeng Gao
Yonatan Bisk
VLM
ViT
174
11
0
19 May 2022
A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension
Gen Luo
Yiyi Zhou
Jiamu Sun
Xiaoshuai Sun
Rongrong Ji
ObjD
21
10
0
17 Apr 2022
VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching
Chenchi Zhang
Wenbo Ma
Jun Xiao
Hanwang Zhang
Jian Shao
Yueting Zhuang
Long Chen
29
4
0
12 May 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
319
1,525
0
27 Feb 2021
A Real-Time Cross-modality Correlation Filtering Method for Referring Expression Comprehension
Yue Liao
Si Liu
Guanbin Li
Fei Wang
Yanjie Chen
Chao Qian
Bo-wen Li
ObjD
64
174
0
16 Sep 2019
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
167
1,465
0
06 Jun 2016