A Fast Foveated Fully Convolutional Network Model for Human Peripheral Vision

14 June 2017

Lex Fridman

Abstract

Visualizing the information available to a human observer in a single glance at an image provides a powerful tool for evaluating models of full-field human vision. The hard part is human-realistic visualization of the periphery. Degradation of information with distance from fixation is far more complex than a mere reduction of acuity that might be mimicked using blur with a standard deviation that increases linearly with eccentricity. Rather, behaviorally-validated models hypothesize that peripheral vision measures a large number of local texture statistics in pooling regions that overlap, grow with eccentricity, and tile the visual field. We propose a "foveated" variant of a fully convolutional network that approximates one such model. Our approach achieves a 21,000-fold reduction in average running time (from 4.2 hours to 0.7 seconds per image) while producing results statistically similar to those of the behaviorally-validated model.
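
For contrast, the acuity-only baseline that the abstract describes as insufficient (blur whose standard deviation grows linearly with eccentricity) can be sketched in a few lines. This is only an illustration of that baseline, not the proposed network or the texture-statistics model. The function names, the `slope` and `levels` parameters, and the approximation by a small stack of uniformly blurred copies are all assumptions made here for illustration.

```python
import numpy as np

def eccentricity_map(height, width, fixation):
    """Distance in pixels of every pixel from the fixation point (row, col)."""
    rows, cols = np.mgrid[0:height, 0:width]
    return np.hypot(rows - fixation[0], cols - fixation[1])

def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2-D array, using edge padding."""
    if sigma <= 0:
        return image.astype(float)
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image.astype(float), radius, mode="edge")
    # Blur along each row, then along each column.
    tmp = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, tmp)

def linear_blur_foveation(image, fixation, slope=0.05, levels=6):
    """Approximate blur with sigma = slope * eccentricity (assumed form).

    The image is blurred at a few discrete sigmas, and each pixel takes its
    value from the level whose sigma is closest to its target sigma.
    """
    ecc = eccentricity_map(image.shape[0], image.shape[1], fixation)
    target_sigma = slope * ecc
    sigmas = np.linspace(0.0, target_sigma.max(), levels)
    stack = np.stack([gaussian_blur(image, s) for s in sigmas])
    nearest = np.abs(target_sigma[None] - sigmas[:, None, None]).argmin(axis=0)
    return np.take_along_axis(stack, nearest[None], axis=0)[0]
```

Calling `linear_blur_foveation(img, fixation=(r, c))` on a 2-D grayscale array leaves the fixated pixel untouched and blurs progressively more toward the edges. The abstract's point is that this kind of acuity loss alone does not capture what peripheral vision discards.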
