arXiv:1904.00739
Through-Wall Pose Imaging in Real-Time with a Many-to-Many Encoder/Decoder Paradigm

15 March 2019
Kevin Meng, Yuxian Meng
Abstract

Overcoming the visual barrier and developing "see-through vision" has been one of mankind's long-standing dreams. Unlike visible light, Radio Frequency (RF) signals penetrate opaque obstructions and reflect strongly off the human body. This paper presents a deep-learning model that can be trained to reconstruct continuous video of a 15-point human skeleton even through visual occlusion. The training process adopts a student/teacher learning procedure inspired by the Feynman learning technique: video frames and RF data are first collected simultaneously using a co-located setup containing an optical camera and an RF antenna-array transceiver. Next, the video frames are processed with a computer-vision-based gait-analysis "teacher" module to generate a ground-truth human skeleton for each frame. Then, the same type of skeleton is predicted from the corresponding RF data by a "student" deep-learning model consisting of a Residual Convolutional Neural Network (CNN), a Region Proposal Network (RPN), and a Recurrent Neural Network with Long Short-Term Memory (LSTM), which 1) extracts spatial features from RF images, 2) detects all people present in a scene, and 3) aggregates information over many time steps, respectively. The model is shown to predict, both accurately and completely, the pose of humans behind visual obstruction solely from RF signals. Primary academic contributions include the novel many-to-many imaging methodology, the unique integration of RPN and LSTM networks, and the original training pipeline.
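To make the three-stage data flow concrete, below is a minimal PyTorch sketch of such a student network, not the authors' implementation: all layer sizes, the names (StudentPoseNet, ResidualBlock), and the simplification of the RPN to a single per-frame objectness score are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()
    def forward(self, x):
        # standard residual connection around two convolutions
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))

class StudentPoseNet(nn.Module):
    def __init__(self, n_keypoints=15, feat=64, hidden=256):
        super().__init__()
        # 1) residual CNN: extract spatial features from each RF "image"
        self.encoder = nn.Sequential(
            nn.Conv2d(1, feat, 7, stride=2, padding=3), nn.ReLU(),
            ResidualBlock(feat), ResidualBlock(feat),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # 2) proposal head: score candidate person regions (reduced here,
        #    as an assumption, to one objectness score per frame)
        self.objectness = nn.Linear(feat * 4 * 4, 1)
        # 3) LSTM: aggregate information over many time steps
        self.lstm = nn.LSTM(feat * 4 * 4, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_keypoints * 2)  # (x, y) per joint

    def forward(self, rf_frames):
        # rf_frames: (batch, time, 1, H, W) sequence of RF heatmaps
        b, t = rf_frames.shape[:2]
        f = self.encoder(rf_frames.flatten(0, 1)).flatten(1)   # (b*t, F)
        obj = torch.sigmoid(self.objectness(f)).view(b, t, 1)  # person present?
        seq, _ = self.lstm(f.view(b, t, -1))                   # temporal context
        joints = self.head(seq).view(b, t, -1, 2)              # (b, t, 15, 2)
        return joints, obj

# Usage: a random 8-frame RF clip yields a 15-joint skeleton per frame.
model = StudentPoseNet()
clip = torch.randn(2, 8, 1, 64, 64)
poses, objectness = model(clip)
print(poses.shape)  # torch.Size([2, 8, 15, 2])

In this sketch the LSTM carries features across frames, so a joint that is ambiguous in a single RF snapshot can be inferred from neighboring time steps, which is the motivation the abstract gives for aggregating information over many time steps.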
