stencil

Measures the performance of different halo-exchange approaches in MPI for a regular stencil code.

Input Parameters

nrange
Range of problem sizes (see Notes)
energy
Amount of energy to input into the simulation
niters
Number of iterations to perform
px
Number of processes in the x-direction
py
Number of processes in the y-direction
filename
Name of output file (stdout if not provided)

Output Results

The output is printed as a tab-separated table giving the time for the different operations and halo-exchange choices, together with the achieved computation rate. Output goes to stdout by default; supply the filename parameter on the command line to write to a file instead.

Notes

The nrange parameter gives the size of a side of the mesh, and may be either a single value, a comma-separated list, or a range in the form start:end:increment.
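
As an illustration of the three accepted forms, the following is a minimal sketch of how such a specification could be expanded into a list of sizes. It is not taken from the program itself: the function name parse_nrange and the MAX_SIZES cap are hypothetical, and treating the end of a range as inclusive is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIZES 64 /* hypothetical cap on the number of problem sizes */

/* Expand an nrange specification into out[]. Returns the number of sizes
 * written, or -1 if the specification is malformed. Accepted forms:
 * "N", "N1,N2,...", or "start:end:increment".
 * Assumptions: sizes are positive, the end of a range is inclusive, and
 * entries beyond MAX_SIZES are silently dropped. */
int parse_nrange(const char *spec, int out[MAX_SIZES])
{
    assert(spec != NULL);
    int count = 0;
    char *rest;

    if (strchr(spec, ':') != NULL) {
        /* Range form: start:end:increment */
        long start = strtol(spec, &rest, 10);
        if (rest == spec || *rest != ':' || start <= 0) return -1;
        const char *p = rest + 1;
        long end = strtol(p, &rest, 10);
        if (rest == p || *rest != ':' || end > INT_MAX) return -1;
        p = rest + 1;
        long inc = strtol(p, &rest, 10);
        if (rest == p || *rest != '\0' || inc <= 0) return -1;

        long n = start;
        while (n <= end && count < MAX_SIZES) {
            out[count++] = (int)n;
            if (inc > end - n) break; /* next step would pass the end */
            n += inc;
        }
        return count;
    }

    /* Single value or comma-separated list */
    const char *p = spec;
    for (;;) {
        long n = strtol(p, &rest, 10);
        if (rest == p || n <= 0 || n > INT_MAX) return -1;
        if (count < MAX_SIZES) out[count++] = (int)n;
        if (*rest == '\0') break;
        if (*rest != ',') return -1;
        p = rest + 1;
    }
    return count;
}
```

Under these assumptions, "256" yields one size, "64,128,512" yields three, and "100:300:100" yields 100, 200 and 300.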

The program should be compiled with optimization and with vectorization enabled if the computation rates are to be used. By default, the code does not compute the heat at each iteration; to enable that computation, compile with -DCOMPUTE_HEAT_EACH_ITERATION=1. Including this code causes the Cray compiler (at least) to fail to vectorize the application of the stencil in the same loop, which results in a significant loss in performance.
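
To show why the heat computation sits in the same loop as the stencil, here is a rough sketch of a sweep guarded by that macro. Only the macro name COMPUTE_HEAT_EACH_ITERATION comes from this README; the function name stencil_sweep, the array layout, and the 5-point coefficients are hypothetical and are not the program's actual kernel.

```c
#include <assert.h>
#include <stddef.h>

#ifndef COMPUTE_HEAT_EACH_ITERATION
#define COMPUTE_HEAT_EACH_ITERATION 0 /* off unless set with -D at compile time */
#endif

/* One sweep of a 5-point stencil over the n x n interior of an
 * (n+2) x (n+2) row-major array with one halo cell on each side.
 * Returns the summed heat of the updated interior when the macro is
 * enabled, and 0.0 otherwise. Coefficients are illustrative only. */
double stencil_sweep(int n, const double *aold, double *anew)
{
    assert(n > 0 && aold != NULL && anew != NULL);
    const size_t stride = (size_t)n + 2;
    double heat = 0.0;

    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= n; j++) {
            const size_t k = (size_t)i * stride + (size_t)j;
            anew[k] = 0.5 * aold[k]
                    + 0.125 * (aold[k - 1] + aold[k + 1]
                             + aold[k - stride] + aold[k + stride]);
#if COMPUTE_HEAT_EACH_ITERATION
            /* The running sum adds a reduction to the inner loop, the
             * kind of code a compiler may decline to vectorize. */
            heat += anew[k];
#endif
        }
    }
    return heat;
}
```

With the macro left at 0, the inner loop contains only the stencil update, which is the form intended for measuring computation rates.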