ResearchTrend.AI
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

2 January 2023
Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
    SyDa

Papers citing "ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders"

50 / 328 papers shown
Title
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Min Shi
Fuxiao Liu
Shihao Wang
Shijia Liao
Subhashree Radhakrishnan
...
Andrew Tao
Zhiding Yu
Guilin Liu
MLLM
38
55
0
28 Aug 2024
GenFormer -- Generated Images are All You Need to Improve Robustness of Transformers on Small Datasets
Sven Oehri
Nikolas Ebert
Ahmed Abdullah
Didier Stricker
Oliver Wasenmüller
ViT
28
5
0
26 Aug 2024
VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models
Wentao Wu
Fanghua Hong
Xiao Wang
Chenglong Li
Jin Tang
VLM
62
1
0
23 Aug 2024
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
Sang-Hoon Lee
Ha-Yeong Choi
Seong-Whan Lee
OOD
DiffM
AI4TS
55
5
0
14 Aug 2024
Whitening Consistently Improves Self-Supervised Learning
András Kalapos
Bálint Gyires-Tóth
SSL
47
0
0
14 Aug 2024
CNN-JEPA: Self-Supervised Pretraining Convolutional Neural Networks Using Joint Embedding Predictive Architecture
András Kalapos
Bálint Gyires-Tóth
42
2
0
14 Aug 2024
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
72
7
0
13 Aug 2024
MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection
Zitian Wang
Zehao Huang
Yulu Gao
Naiyan Wang
Si Liu
3DPC
51
4
0
12 Aug 2024
Mixture-of-Noises Enhanced Forgery-Aware Predictor for Multi-Face Manipulation Detection and Localization
Changtao Miao
Qi Chu
Tao Gong
Zhentao Tan
Zhenchao Jin
Wanyi Zhuang
Man Luo
Honggang Hu
Nenghai Yu
CVBM
54
1
0
05 Aug 2024
Medical SAM 2: Segment medical images as video via Segment Anything Model 2
Jiayuan Zhu
Yunli Qi
A. El Abbadi
VLM
MedIm
42
68
0
01 Aug 2024
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
Richard Ren
Steven Basart
Adam Khoja
Alice Gatti
Long Phan
...
Alexander Pan
Gabriel Mukobi
Ryan H. Kim
Stephen Fitz
Dan Hendrycks
ELM
31
22
0
31 Jul 2024
DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention
Wei Wang
Jixing He
Xin Wang
52
0
0
30 Jul 2024
Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning
Xingchen Zeng
Haichuan Lin
Yilin Ye
Wei Zeng
57
15
0
29 Jul 2024
UNQA: Unified No-Reference Quality Assessment for Audio, Image, Video, and Audio-Visual Content
Yuhang Cao
Xiongkuo Min
Yixuan Gao
Wei Sun
Weisi Lin
Guangtao Zhai
51
2
0
29 Jul 2024
Flexible graph convolutional network for 3D human pose estimation
Abu Taib Mohammed Shahjahan
A. Ben Hamza
3DH
GNN
31
0
0
26 Jul 2024
Estimating Earthquake Magnitude in Sentinel-1 Imagery via Ranking
Daniele Rege Cambrin
Isaac Corley
Paolo Garza
Peyman Najafirad
33
0
0
25 Jul 2024
The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations
Tyler LaBonte
John C. Hill
Xinchen Zhang
Vidya Muthukumar
Abhishek Kumar
AAML
41
0
0
19 Jul 2024
GroupMamba: Efficient Group-Based Visual State Space Model
Abdelrahman M. Shaker
Syed Talal Wasim
Salman Khan
Juergen Gall
Fahad Shahbaz Khan
Mamba
59
0
0
18 Jul 2024
AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs
Yunling Zheng
Zeyi Xu
Fanghui Xue
Biao Yang
Jiancheng Lyu
Shuai Zhang
Y. Qi
Jack Xin
61
0
0
16 Jul 2024
SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
Hao Ding
Tuxun Lu
Yuqian Zhang
Ruixing Liang
Hongchao Shu
...
Bo Wang
Marcos Fernández-Rodríguez
Estevao Lima
João L. Vilaça
Mathias Unberath
65
4
0
16 Jul 2024
DANIEL: A fast Document Attention Network for Information Extraction and Labelling of handwritten documents
Thomas Constum
Pierrick Tranouez
Thierry Paquet
32
5
0
12 Jul 2024
Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness
Honghao Chen
Yurong Zhang
Xiaokun Feng
Xiangxiang Chu
Kaiqi Huang
AAML
42
5
0
12 Jul 2024
Adaptive Parametric Activation
Konstantinos Panagiotis Alexandridis
Jiankang Deng
Anh Nguyen
Shan Luo
43
2
0
11 Jul 2024
MNeRV: A Multilayer Neural Representation for Videos
Qingling Chang
Haohui Yu
Shuxuan Fu
Zhiqiang Zeng
Chuangquan Chen
38
0
0
10 Jul 2024
AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking
Yuheng Li
Tianyu Luan
Yizhou Wu
Shaoyan Pan
Yenho Chen
Xiaofeng Yang
40
5
0
09 Jul 2024
Isomorphic Pruning for Vision Models
Gongfan Fang
Xinyin Ma
Michael Bi Mi
Xinchao Wang
VLM
ViT
42
6
0
05 Jul 2024
reBEN: Refined BigEarthNet Dataset for Remote Sensing Image Analysis
Kai Norman Clasen
Leonard Hackel
Tom Burgert
Gencer Sumbul
Begüm Demir
Volker Markl
68
12
0
04 Jul 2024
Precision at Scale: Domain-Specific Datasets On-Demand
Jesús M. Rodríguez-de-Vera
Imanol G. Estepa
Ignacio Sarasúa
Bhalaji Nagarajan
Petia Radeva
45
2
0
03 Jul 2024
AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents
Yuxiang Chai
Siyuan Huang
Yazhe Niu
Han Xiao
Liang Liu
Dingyu Zhang
Peng Gao
Shuai Ren
Hongsheng Li
LLMAG
46
27
0
03 Jul 2024
ZEAL: Surgical Skill Assessment with Zero-shot Tool Inference Using Unified Foundation Model
Satoshi Kondo
18
0
0
03 Jul 2024
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
Jianzhu Guo
Dingyun Zhang
Xiaoqiang Liu
Zhizhou Zhong
Yuan Zhang
Pengfei Wan
Di Zhang
VGen
65
54
0
03 Jul 2024
HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model
Hieu T. Nguyen
Yiwen Chen
Vikram S. Voleti
Varun Jampani
Huaizu Jiang
58
0
0
28 Jun 2024
SimTxtSeg: Weakly-Supervised Medical Image Segmentation with Simple Text Cues
Yuxin Xie
Tao Zhou
Yi Zhou
Geng Chen
VLM
MedIm
29
1
0
27 Jun 2024
VIPriors 4: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
Robert-Jan Bruintjes
A. Lengyel
Marcos Baptista-Rios
O. Kayhan
Davide Zambrano
Nergis Tomen
Jan van Gemert
VLM
44
0
0
26 Jun 2024
Fuzzy Attention-based Border Rendering Network for Lung Organ Segmentation
Sheng Zhang
Yang Nan
Yingying Fang
Shiyi Wang
Xiaodan Xing
Zhifan Gao
Guang Yang
MedIm
35
0
0
23 Jun 2024
Liveness Detection in Computer Vision: Transformer-based Self-Supervised Learning for Face Anti-Spoofing
Arman Keresh
Pakizar Shamoi
54
5
0
19 Jun 2024
SwinStyleformer is a favorable choice for image inversion
Jiawei Mao
Guangyi Zhao
Xuesong Yin
Yuanqi Chang
ViT
43
0
0
19 Jun 2024
GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models
Yongtao Ge
Guangkai Xu
Zhiyue Zhao
Libo Sun
Zheng Huang
Yanlong Sun
Hao Chen
Chunhua Shen
MDE
42
3
0
18 Jun 2024
FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
Yuanjun Lv
Hai Li
Ying Yan
Junhui Liu
Danming Xie
Lei Xie
48
1
0
12 Jun 2024
Low-Complexity Acoustic Scene Classification Using Parallel Attention-Convolution Network
Yanxiong Li
Jiaxin Tan
Guoqing Chen
Jialong Li
Yongjie Si
Qianhua He
37
0
0
12 Jun 2024
A$^{2}$-MAE: A spatial-temporal-spectral unified remote sensing pre-training method based on anchor-aware masked autoencoder
Lixian Zhang
Yi Zhao
Runmin Dong
Jinxiao Zhang
Shuai Yuan
...
Weijia Li
Wei Liu
Wayne Zhang
Xue Jiang
Haohuan Fu
46
4
0
12 Jun 2024
SRC-Net: Bi-Temporal Spatial Relationship Concerned Network for Change Detection
Hongjia Chen
Xin Xu
Fangling Pu
57
6
0
09 Jun 2024
RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation
Jiaming Liu
Mengzhen Liu
Zhenyu Wang
Lily Lee
Kaichen Zhou
Pengju An
Senqiao Yang
Renrui Zhang
Yandong Guo
Shanghang Zhang
LM&Ro
LRM
Mamba
32
8
0
06 Jun 2024
Real-Time Spacecraft Pose Estimation Using Mixed-Precision Quantized Neural Network on COTS Reconfigurable MPSoC
Julien Posso
Guy Bois
Yvon Savaria
35
0
0
06 Jun 2024
JIGMARK: A Black-Box Approach for Enhancing Image Watermarks against Diffusion Model Edits
Minzhou Pan
Yi Zeng
Xue Lin
Ning Yu
Cho-Jui Hsieh
Peter Henderson
Ruoxi Jia
WIGM
48
3
0
06 Jun 2024
DenoDet: Attention as Deformable Multi-Subspace Feature Denoising for Target Detection in SAR Images
Yimian Dai
Minrui Zou
Yuxuan Li
Xiang Li
Kang Ni
Jian Yang
27
4
0
05 Jun 2024
Multi-Stage Speech Bandwidth Extension with Flexible Sampling Rate Control
Ye-Xin Lu
Yang Ai
Zheng-Yan Sheng
Zhen-Hua Ling
23
1
0
04 Jun 2024
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation
Hui-Peng Du
Ye-Xin Lu
Yang Ai
Zhen-Hua Ling
43
3
0
04 Jun 2024
fruit-SALAD: A Style Aligned Artwork Dataset to reveal similarity perception in image embeddings
Tillmann Ohm
Andres Karjus
Mikhail Tamm
Maximilian Schich
44
1
0
03 Jun 2024
Hybrid-Parallel: Achieving High Performance and Energy Efficient Distributed Inference on Robots
Zekai Sun
Xiuxian Guan
Junming Wang
Haoze Song
Yuhao Qing
Tianxiang Shen
Dong Huang
Fangming Liu
Heming Cui
34
0
0
29 May 2024