Modeling and analyzing performance for highly optimized propagation steps of the lattice Boltzmann method on sparse lattices

Abstract
Two highly optimized implementations of the lattice Boltzmann algorithm with indirect addressing are presented, based on the two-step one-grid algorithm and the AA-pattern of Bailey et al. For certain access patterns, these implementations can avoid the indirect access, thus reducing the data transfer per lattice update (bytes/LUP). Furthermore, they feature partial vectorization, which is normally hindered by indirect access.
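
To make the bytes/LUP argument concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a D3Q19 stencil, 8-byte distribution values, 4-byte neighbor indices, and a write-allocate cache policy; none of these parameters are stated in the abstract. The helper names `contiguous_runs` and `bytes_per_lup` are hypothetical. The first function finds stretches of consecutive nodes whose neighbor indices are also consecutive, where a kernel could use a direct, unit-stride access instead of looking up each index. The second gives a rough traffic estimate for one lattice update.

```python
def contiguous_runs(adj):
    """Split a neighbor-index list (one direction) into runs.

    Returns (start_node, length) tuples. Within a run, adj[i + 1] == adj[i] + 1,
    so only the first index of the run would have to be loaded.
    """
    runs = []
    start = 0
    for i in range(1, len(adj) + 1):
        # Close the current run at the end of the list or when contiguity breaks.
        if i == len(adj) or adj[i] != adj[i - 1] + 1:
            runs.append((start, i - start))
            start = i
    return runs


def bytes_per_lup(q=19, pdf_bytes=8, idx_bytes=4,
                  write_allocate=True, indirect_fraction=1.0):
    """Rough data-traffic estimate for one lattice update (assumed model).

    indirect_fraction is the share of neighbor accesses that still need
    an index load (1.0 = fully indirect, 0.0 = no index loads at all).
    """
    loads = q * pdf_bytes                             # read q distributions
    stores = q * pdf_bytes                            # write q distributions
    allocate = q * pdf_bytes if write_allocate else 0 # write-allocate transfer
    indices = (q - 1) * idx_bytes * indirect_fraction # rest direction needs no index
    return loads + stores + allocate + indices


# Hypothetical neighbor list for six nodes in one direction.
print(contiguous_runs([5, 6, 7, 2, 3, 9]))
print(bytes_per_lup(), bytes_per_lup(indirect_fraction=0.0))
```

Under these assumptions, index loads add (q - 1) * 4 bytes per update on top of the distribution traffic, which is the portion that longer contiguous runs would reduce. The same runs are also the natural candidates for vectorized loads and stores.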