CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis

Main: 9 Pages
Bibliography: 3 Pages
Appendix: 5 Pages
6 Figures
5 Tables
Abstract

Spectral imaging offers promising applications across diverse domains, including medicine and urban scene understanding, and is already established as a critical modality in remote sensing. However, variability in channel dimensionality and captured wavelengths among spectral cameras impedes the development of AI-driven methodologies, leading to camera-specific models with limited generalizability and inadequate cross-camera applicability. To address this bottleneck, we introduce CARL, a model for Camera-Agnostic Representation Learning across RGB, multispectral, and hyperspectral imaging modalities. To enable the conversion of a spectral image with any channel dimensionality to a camera-agnostic embedding, we introduce wavelength positional encoding and a self-attention-cross-attention mechanism to compress spectral information into learned query representations. Spectral-spatial pre-training is achieved with a novel spectral self-supervised JEPA-inspired strategy tailored to CARL. Large-scale experiments across the domains of medical imaging, autonomous driving, and satellite imaging demonstrate our model's unique robustness to spectral heterogeneity, outperforming existing approaches on datasets with simulated and real-world cross-camera spectral variations. The scalability and versatility of the proposed approach position our model as a backbone for future spectral foundation models.
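
The idea of turning a pixel spectrum with any number of channels into a fixed-size embedding can be illustrated with a minimal sketch. This is not the authors' implementation: the sinusoidal form of the wavelength encoding, the intensity-scaled tokens, the single fixed query standing in for a learned one, and all function names, camera wavelengths, and values below are assumptions made purely for illustration.

```python
import math


def wavelength_encoding(wavelength_nm, dim=8, scale=10000.0):
    """Sinusoidal encoding keyed on wavelength (nm) instead of channel index.

    Assumption: a Transformer-style sinusoidal form; the abstract does not
    specify the actual encoding.
    """
    enc = []
    for i in range(dim // 2):
        freq = 1.0 / (scale ** (2 * i / dim))
        enc.append(math.sin(wavelength_nm * freq))
        enc.append(math.cos(wavelength_nm * freq))
    return enc


def cross_attention_pool(query, tokens):
    """Single-query softmax attention over channel tokens.

    Returns one vector of len(query) entries, whatever the channel count.
    """
    d = len(query)
    scores = [sum(q * t for q, t in zip(query, tok)) / math.sqrt(d) for tok in tokens]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(d)]


def embed_pixel(intensities, wavelengths_nm, query):
    """Build one token per channel, then compress all tokens into one vector.

    Assumption: a token is the wavelength encoding scaled by the measured
    intensity; a real model would use learned projections instead.
    """
    tokens = [
        [value * e for e in wavelength_encoding(wl, dim=len(query))]
        for value, wl in zip(intensities, wavelengths_nm)
    ]
    return cross_attention_pool(query, tokens)


# Two hypothetical cameras with different channel counts and wavelengths.
rgb_wavelengths = [460.0, 550.0, 640.0]
msi_wavelengths = [450.0, 500.0, 550.0, 600.0, 650.0, 700.0, 750.0, 800.0]

query = [0.1] * 8  # stand-in for a learned query vector

rgb_embedding = embed_pixel([0.2, 0.5, 0.3], rgb_wavelengths, query)
msi_embedding = embed_pixel([0.1] * 8, msi_wavelengths, query)

# Both embeddings have the same length despite 3 vs. 8 input channels.
print(len(rgb_embedding), len(msi_embedding))
```

Because the encoding depends only on the wavelength, a channel centered at 550 nm receives the same positional code on either camera, and the attention pooling removes the dependence on the number of channels.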

@article{baumann2025_2504.19223,
  title={CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis},
  author={Alexander Baumann and Leonardo Ayala and Silvia Seidlitz and Jan Sellner and Alexander Studier-Fischer and Berkin Özdemir and Lena Maier-Hein and Slobodan Ilic},
  journal={arXiv preprint arXiv:2504.19223},
  year={2025}
}