An analysis of energy-optimized lattice-Boltzmann CFD simulations from the chip to the highly parallel level

Abstract

The lattice-Boltzmann method (LBM) is an algorithm for CFD simulations that has gained popularity due to its ease of implementation and suitability for complex geometries. Its scalability on multicore chips is often limited by its low computational intensity, which leads to interesting characteristics regarding optimal performance and energy to solution on the chip and highly parallel levels. In this paper we perform a thorough analysis of a two-relaxation-time (TRT) model in a sparse lattice representation on the Intel Sandy Bridge processor. Starting from a single-core performance model, we describe the intra-chip saturation characteristics of the implementation and its optimal operating point in terms of energy to solution as a function of the propagation method, the clock frequency, and the SIMD vectorization. We then show whether and how these findings may be extrapolated to the massively parallel level on a petascale-class machine, and quantify the energy-saving potential of various optimizations.
