End-to-End Object Detection with Transformers


26 May 2020
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
    ViT
    3DV
    PINN

Papers citing "End-to-End Object Detection with Transformers"

50 / 5,124 papers shown
TAPNext: Tracking Any Point (TAP) as Next Token Prediction
Artem Zholus
Carl Doersch
Yi Yang
Skanda Koppula
Viorica Patraucean
Xu He
Ignacio Rocco
Mehdi S. M. Sajjadi
Sarath Chandar
Ross Goroshin
30
0
0
08 Apr 2025
PromptHMR: Promptable Human Mesh Recovery
Yufu Wang
Yu Sun
Priyanka Patel
Kostas Daniilidis
Michael J. Black
Muhammed Kocabas
3DH
54
0
0
08 Apr 2025
AD-Det: Boosting Object Detection in UAV Images with Focused Small Objects and Balanced Tail Classes
Zhenteng Li
Sheng Lian
Dengfeng Pan
Y. Wang
Wei Liu
56
0
0
08 Apr 2025
Transferable Mask Transformer: Cross-domain Semantic Segmentation with Region-adaptive Transferability Estimation
Enming Zhang
Z. Li
Yanru Wu
J. Wang
Yang Tan
Ruizhe Zhao
Guan Wang
Yang Li
ViT
33
0
0
08 Apr 2025
DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation
Sohyun Lee
N. Kim
Juwon Kang
Seong Joon Oh
Suha Kwak
89
0
0
07 Apr 2025
Studying Image Diffusion Features for Zero-Shot Video Object Segmentation
Thanos Delatolas
Vicky S. Kalogeiton
Dim P. Papadopoulos
DiffM
VOS
48
1
0
07 Apr 2025
Inverse++: Vision-Centric 3D Semantic Occupancy Prediction Assisted with 3D Object Detection
Zhenxing Ming
J. S. Berrio
Mao Shan
Stewart Worrall
3DPC
42
1
0
07 Apr 2025
Multimodal Agricultural Agent Architecture (MA3): A New Paradigm for Intelligent Agricultural Decision-Making
Zhuoning Xu
Jian Xu
M. Zhang
P. Wang
Chao Deng
Cheng-Lin Liu
26
0
0
07 Apr 2025
REVEAL: Relation-based Video Representation Learning for Video-Question-Answering
Sofian Chaybouti
Walid Bousselham
Moritz Wolter
Hilde Kuehne
110
0
0
07 Apr 2025
BoxSeg: Quality-Aware and Peer-Assisted Learning for Box-supervised Instance Segmentation
Jinxiang Lai
Wenlong Wu
Jiawei Zhan
Jian Li
Bin-Bin Gao
J. Liu
Jie Zhang
Song Guo
ISeg
39
0
0
07 Apr 2025
GAMDTP: Dynamic Trajectory Prediction with Graph Attention Mamba Network
Yunxiang Liu
Hongkuo Niu
Jianlin Zhu
27
0
0
07 Apr 2025
Inland Waterway Object Detection in Multi-environment: Dataset and Approach
Shanshan Wang
Haixiang Xu
Hui Feng
Xiaoqian Wang
Pei Song
Sijie Liu
Jianhua He
26
0
0
07 Apr 2025
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
Yunlong Tang
Jing Bi
Chao Huang
Susan Liang
Daiki Shimada
...
Jinxi He
Liu He
Zeliang Zhang
Jiebo Luo
Chenliang Xu
37
0
0
07 Apr 2025
Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection
Jiancheng Pan
Yanxing Liu
Xiao He
Long Peng
Jiahao Li
Yuze Sun
Xiaomeng Huang
33
0
0
06 Apr 2025
Multi-identity Human Image Animation with Structural Video Diffusion
Zhenzhi Wang
Y. Li
Yanhong Zeng
Yuwei Guo
D. Lin
Tianfan Xue
Bo Dai
VGen
24
0
0
05 Apr 2025
DocSAM: Unified Document Image Segmentation via Query Decomposition and Heterogeneous Mixed Learning
Xiao-Hui Li
Fei Yin
Cheng-Lin Liu
27
0
0
05 Apr 2025
Transformer representation learning is necessary for dynamic multi-modal physiological data on small-cohort patients
Bingxu Wang
Kunzhi Cai
Yuqi Zhang
Yachong Guo
Zeyi Zhou
Wenjiao Li
Yachong Guo
Wei Wang
Qing Zhou
MedIm
34
0
0
05 Apr 2025
A Modular Energy Aware Framework for Multicopter Modeling in Control and Planning Applications
Sebastian Gasche
Christian Kallies
Andreas Himmel
R. Findeisen
36
0
0
04 Apr 2025
Control Map Distribution using Map Query Bank for Online Map Generation
Ziming Liu
Leichen Wang
Ge Yang
Xinrun Li
Xingtao Hu
Hao Sun
Guangyu Gao
31
0
0
04 Apr 2025
Post-processing for Fair Regression via Explainable SVD
Zhiqun Zuo
Ding Zhu
Mohammad Mahdi Khalili
146
0
0
04 Apr 2025
ZFusion: An Effective Fuser of Camera and 4D Radar for 3D Object Perception in Autonomous Driving
Sheng Yang
Tong Zhan
Shichen Qiao
Jicheng Gong
Qing Yang
Jian Wang
Yanfeng Lu
3DPC
39
0
0
04 Apr 2025
Pyramid-based Mamba Multi-class Unsupervised Anomaly Detection
Nasar Iqbal
Niki Martinel
Mamba
53
0
0
04 Apr 2025
Mathematical Modeling of Option Pricing with an Extended Black-Scholes Framework
Nikhil Shivakumar Nayak
54
2
0
04 Apr 2025
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision
Xiaofeng Han
Shunpeng Chen
Zenghuang Fu
Zhe Feng
Lue Fan
...
Li Guo
Weiliang Meng
Xiaopeng Zhang
Rongtao Xu
Shibiao Xu
66
1
0
03 Apr 2025
APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers
Zhuguanyu Wu
Jiayi Zhang
Jiaxin Chen
Jinyang Guo
Di Huang
Yunhong Wang
MQ
45
1
0
03 Apr 2025
Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval
Boseung Jeong
Jicheol Park
Sungyeon Kim
Suha Kwak
36
0
0
03 Apr 2025
CornerPoint3D: Look at the Nearest Corner Instead of the Center
Ruixiao Zhang
Runwei Guan
X. Chen
Adam Prugel-Bennett
Xiaohao Cai
3DPC
50
0
0
03 Apr 2025
STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
Divya Velayudhan
A. Ahmed
Mohamad Alansari
Neha Gour
Abderaouf Behouch
...
Muzammal Naseer
Juergen Gall
Mohammed Bennamoun
Ernesto Damiani
N. Werghi
47
0
0
03 Apr 2025
GEOPARD: Geometric Pretraining for Articulation Prediction in 3D Shapes
Pradyumn Goyal
Dmitry Petrov
Sheldon Andrews
Yizhak Ben-Shabat
Hsueh-Ti Derek Liu
E. Kalogerakis
ViT
36
0
0
03 Apr 2025
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
82
0
0
03 Apr 2025
Data-Driven Object Tracking: Integrating Modular Neural Networks into a Kalman Framework
Christian Alexander Holz
Christian Bader
Markus Enzweiler
Matthias Drüppel
VOT
47
0
0
03 Apr 2025
v-CLR: View-Consistent Learning for Open-World Instance Segmentation
Chang-Bin Zhang
Jinhong Ni
Yujie Zhong
Kai Han
3DV
VLM
69
0
0
02 Apr 2025
COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking
Chunhui Zhang
Li Liu
Jialin Gao
Xin Sun
Hao Wen
Xi Zhou
Shiming Ge
Y. Wang
42
0
0
02 Apr 2025
Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment
Ziteng Cui
Xuangeng Chu
Tatsuya Harada
3DGS
60
0
0
02 Apr 2025
Coca-Splat: Collaborative Optimization for Camera Parameters and 3D Gaussians
Jiamin Wu
Hongyang Li
Xiaoke Jiang
Yuan Yao
Lei Zhang
3DGS
51
0
0
01 Apr 2025
Zero-Shot 4D Lidar Panoptic Segmentation
Yushan Zhang
Aljosa Osep
Laura Leal-Taixé
Tim Meinhardt
3DPC
47
1
0
01 Apr 2025
NeuRadar: Neural Radiance Fields for Automotive Radar Point Clouds
Mahan Rafidashti
Ji Lan
M. Fatemi
Junsheng Fu
Lars Hammarstrand
Lennart Svensson
39
0
0
01 Apr 2025
Archival Faces: Detection of Faces in Digitized Historical Documents
Marek Vaško
Adam Herout
Michal Hradiš
CVBM
65
0
0
01 Apr 2025
PolygoNet: Leveraging Simplified Polygonal Representation for Effective Image Classification
Salim Khazem
Jérémy Fix
Cédric Pradalier
41
0
0
01 Apr 2025
CoMatch: Dynamic Covisibility-Aware Transformer for Bilateral Subpixel-Level Semi-Dense Image Matching
Zizhuo Li
Yifan Lu
Linfeng Tang
S. Zhang
Jiayi Ma
52
1
0
31 Mar 2025
Bridge the Gap Between Visual and Linguistic Comprehension for Generalized Zero-shot Semantic Segmentation
Xiaoqing Guo
W. J. Li
Yixuan Yuan
55
0
0
31 Mar 2025
A Concise Survey on Lane Topology Reasoning for HD Mapping
Yi Yao
Miao Fan
Shengtong Xu
Haoyi Xiong
Xiangzeng Liu
Wenbo Hu
Wenbing Huang
3DV
26
0
0
31 Mar 2025
Spectral-Adaptive Modulation Networks for Visual Perception
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Paul Hongsuck Seo
Dong Hwan Kim
42
0
0
31 Mar 2025
A Benchmark for Vision-Centric HD Mapping by V2I Systems
Miao Fan
Shanshan Yu
Shengtong Xu
Kun Jiang
Haoyi Xiong
Xiangzeng Liu
3DV
46
0
0
31 Mar 2025
SmartScan: An AI-based Interactive Framework for Automated Region Extraction from Satellite Images
S. Nagendra
Kashif Rashid
38
0
0
31 Mar 2025
Video-based Traffic Light Recognition by Rockchip RV1126 for Autonomous Driving
Miao Fan
Xuxu Kong
Shengtong Xu
Haoyi Xiong
Xiangzeng Liu
ViT
46
0
0
31 Mar 2025
Towards Precise Action Spotting: Addressing Temporal Misalignment in Labels with Dynamic Label Assignment
Masato Tamura
34
0
0
31 Mar 2025
OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model
Xingcheng Zhou
Xuyuan Han
Feng Yang
Yunpu Ma
Alois C. Knoll
VLM
53
1
0
30 Mar 2025
EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing
Hongxiang Jiang
Jihao Yin
Qixiong Wang
Jiaqi Feng
Guo Chen
48
0
0
30 Mar 2025
Efficient Token Compression for Vision Transformer with Spatial Information Preserved
Junzhu Mao
Yang Shen
Jinyang Guo
Yazhou Yao
Xiansheng Hua
ViT
36
0
0
30 Mar 2025