Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

28 September 2022
Zhiyang Chen
Yousong Zhu
Zhaowen Li
Fan Yang
Wei Li
Haixin Wang
Chaoyang Zhao
Liwei Wu
Rui Zhao
Jinqiao Wang
Ming Tang
    VLM
    VOS

Papers citing "Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks"

10 / 10 papers shown
FreConv: Frequency Branch-and-Integration Convolutional Networks
Zhaowen Li
Xu Zhao
Peigeng Ding
Zongxin Gao
Yuting Yang
Ming Tang
Jinqiao Wang
26
2
0
10 Apr 2023
Token Boosting for Robust Self-Supervised Visual Transformer Pre-training
Tianjiao Li
Lin Geng Foo
Ping Hu
Xindi Shang
Hossein Rahmani
Zehuan Yuan
J. Liu
40
7
0
09 Apr 2023
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Antoine Yang
Arsha Nagrani
Paul Hongsuck Seo
Antoine Miech
Jordi Pont-Tuset
Ivan Laptev
Josef Sivic
Cordelia Schmid
AI4TS
VLM
34
220
0
27 Feb 2023
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
X. Wang
ViT
VLM
189
499
0
22 Feb 2022
DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
Shilong Liu
Feng Li
Hao Zhang
X. Yang
Xianbiao Qi
Hang Su
Jun Zhu
Lei Zhang
ViT
146
728
0
28 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,434
0
11 Nov 2021
Pix2seq: A Language Modeling Framework for Object Detection
Ting Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
241
344
0
22 Sep 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
317
5,775
0
29 Apr 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
277
3,622
0
24 Feb 2021
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
241
1,918
0
31 Dec 2020