AHAP: Reconstructing Arbitrary Humans from Arbitrary Perspectives with Geometric Priors

27 February 2026

Xiaozhen Qiao

Wenjia Wang

Zhiyuan Zhao

Jiacheng Sun

Ping Luo

Hongyuan Zhang

Xuelong Li

3DH


Main:7 Pages

10 Figures

Bibliography:3 Pages

12 Tables

Appendix:9 Pages

Abstract

Reconstructing 3D humans from images captured from multiple perspectives typically requires pre-calibration, for example with checkerboards or MVS algorithms, which limits scalability and applicability in diverse real-world scenarios. In this work, we present AHAP (Reconstructing Arbitrary Humans from Arbitrary Perspectives), a feed-forward framework for reconstructing arbitrary humans from arbitrary camera perspectives without requiring camera calibration. The core of our approach is the effective fusion of multi-view geometry to assist human association, reconstruction, and localization. Specifically, a Cross-View Identity Association module uses learnable person queries and soft assignment, supervised by contrastive learning, to resolve cross-view human identity association. A Human Head fuses cross-view features and scene context for SMPL prediction, guided by cross-view reprojection losses that enforce body pose consistency. Additionally, multi-view geometry eliminates the depth ambiguity inherent in monocular methods, providing more precise 3D human localization through multi-view triangulation. Experiments on EgoHumans and EgoExo4D demonstrate that AHAP achieves competitive performance on both world-space human reconstruction and camera pose estimation, while being 180× faster than optimization-based approaches.
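
To make the multi-view triangulation step mentioned in the abstract concrete, the following is a minimal sketch of standard linear (DLT) triangulation in NumPy. It is not AHAP's implementation. The function names `triangulate_dlt` and `project`, the intrinsics `K`, and the two camera matrices are hypothetical, and the sketch assumes that 3×4 projection matrices are already available, presumably from estimated rather than pre-calibrated cameras.

```python
import numpy as np

def triangulate_dlt(projections, points_2d):
    """Triangulate one 3D point from two or more views (linear DLT).

    projections: list of 3x4 projection matrices (assumed known or estimated).
    points_2d:   list of (u, v) pixel observations, one per view.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous point X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The least-squares solution is the right singular vector
    # associated with the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point X to pixel coordinates with projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical two-camera setup: shared intrinsics, second camera shifted along x.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 4.0])            # e.g. a body joint in world space
obs = [project(P1, X_true), project(P2, X_true)]
X_est = triangulate_dlt([P1, P2], obs)
```

With noise-free observations the recovered point matches `X_true` up to numerical precision. A single view only constrains the point to a ray, which is the monocular depth ambiguity the abstract refers to, while a second view with a nonzero baseline fixes the depth.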
