#include <PSkelStencil.h>
Public Member Functions
void runSequential ()
void runCPU (size_t numThreads=0)
void runGPU (size_t GPUBlockSize=0)
void runTilingGPU (size_t tilingWidth, size_t tilingHeight, size_t tilingDepth, size_t GPUBlockSize=0)
void runAutoGPU (size_t GPUBlockSize=0)
void runIterativeSequential (size_t iterations)
void runIterativeCPU (size_t iterations, size_t numThreads=0)
void runIterativeGPU (size_t iterations, size_t GPUBlockSize=0)
void runIterativeTilingGPU (size_t iterations, size_t tilingWidth, size_t tilingHeight, size_t tilingDepth, size_t innerIterations=1, size_t GPUBlockSize=0)
void runIterativeAutoGPU (size_t iterations, size_t GPUBlockSize=0)
Protected Member Functions
virtual void runSeq (Array in, Array out)=0
virtual void runOpenMP (Array in, Array out, size_t numThreads)=0
void runCUDA (Array, Array, int)
void runIterativeTilingCUDA (Array in, Array out, StencilTiling< Array, Mask > tiling, size_t GPUBlockSize)
Protected Attributes
Array input
Array output
Args args
Mask mask
Base class that implements the basic functionality shared by the stencil skeletons.
void PSkel::StencilBase< Array, Mask, Args >::runAutoGPU (size_t GPUBlockSize = 0)
Executes a single iteration of the stencil computation on the GPU. If the data is larger than the available GPU memory, this function automatically switches to a tiled execution of the stencil computation.
[in] | GPUBlockSize | the GPU block size used to process the stencil kernel. If GPUBlockSize is 0, the block size is chosen automatically. |
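The automatic choice described above amounts to a capacity check. The following is a minimal sketch of such a check, not PSkel's actual implementation: `needsTiling` and its parameters are hypothetical names, and the sketch assumes that the input and output arrays must both reside on the device at the same time. In a CUDA program the free-memory figure could be obtained from `cudaMemGetInfo`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: decides whether a stencil run has to be tiled.
// Assumption: the device must hold one input and one output array of
// `elements` items each; overflow of the size computation is not handled.
bool needsTiling(std::size_t elements, std::size_t bytesPerElement,
                 std::size_t freeDeviceBytes) {
    assert(bytesPerElement > 0);
    const std::size_t required = 2 * elements * bytesPerElement; // input + output
    return required > freeDeviceBytes;
}
```

For example, 1000 four-byte elements need 8000 bytes for input and output together, so the check reports that tiling is required only when fewer than 8000 bytes are free.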
void PSkel::StencilBase< Array, Mask, Args >::runCPU (size_t numThreads = 0)
Executes a single iteration of the stencil computation on the CPU using multiple threads.
[in] | numThreads | the number of threads used to process the stencil kernel. If numThreads is 0, the number of threads is chosen automatically. |
void PSkel::StencilBase< Array, Mask, Args >::runGPU (size_t GPUBlockSize = 0)
Executes a single iteration of the stencil computation on the GPU. This function does not handle data larger than the available GPU memory (see runAutoGPU).
[in] | GPUBlockSize | the GPU block size used to process the stencil kernel. If GPUBlockSize is 0, the block size is chosen automatically. |
void PSkel::StencilBase< Array, Mask, Args >::runIterativeAutoGPU (size_t iterations, size_t GPUBlockSize = 0)
Executes multiple iterations of the stencil computation on the GPU. In every iteration after the first, the output of the previous iteration is used as input. If the data is larger than the available GPU memory, this function automatically switches to a tiled execution of the stencil computation and also chooses the number of iterations to be executed consecutively on the GPU.
[in] | iterations | the number of iterations to be computed. |
[in] | GPUBlockSize | the GPU block size used to process the stencil kernel. If GPUBlockSize is 0, the block size is chosen automatically. |
void PSkel::StencilBase< Array, Mask, Args >::runIterativeCPU (size_t iterations, size_t numThreads = 0)
Executes multiple iterations of the stencil computation on the CPU using multiple threads. In every iteration after the first, the output of the previous iteration is used as input.
[in] | iterations | the number of iterations to be computed. |
[in] | numThreads | the number of threads used to process the stencil kernel. If numThreads is 0, the number of threads is chosen automatically. |
void PSkel::StencilBase< Array, Mask, Args >::runIterativeGPU (size_t iterations, size_t GPUBlockSize = 0)
Executes multiple iterations of the stencil computation on the GPU. In every iteration after the first, the output of the previous iteration is used as input. This function does not handle data larger than the available GPU memory (see runIterativeAutoGPU).
[in] | iterations | the number of iterations to be computed. |
[in] | GPUBlockSize | the GPU block size used to process the stencil kernel. If GPUBlockSize is 0, the block size is chosen automatically. |
void PSkel::StencilBase< Array, Mask, Args >::runIterativeSequential (size_t iterations)
Executes multiple iterations of the stencil computation sequentially on the CPU. In every iteration after the first, the output of the previous iteration is used as input.
[in] | iterations | the number of iterations to be computed. |
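The reuse of the previous output as the next input can be pictured as a buffer swap between iterations. The sketch below is a standalone illustration and does not use PSkel: `averageStep` is a hypothetical 3-point averaging kernel on a 1D array, and `iterateStencil` is a hypothetical driver. How PSkel manages its buffers internally is not stated here, so the swap is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical kernel: each interior cell becomes the mean of itself and
// its two neighbours; the two boundary cells are copied unchanged.
void averageStep(const std::vector<double>& in, std::vector<double>& out) {
    assert(in.size() == out.size());
    const std::size_t n = in.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (i == 0 || i + 1 == n) {
            out[i] = in[i];
        } else {
            out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
        }
    }
}

// Hypothetical driver: runs `iterations` steps and swaps the buffers after
// each one, so the output of step k is the input of step k + 1.
std::vector<double> iterateStencil(std::vector<double> input,
                                   std::size_t iterations) {
    std::vector<double> output(input.size());
    for (std::size_t it = 0; it < iterations; ++it) {
        averageStep(input, output);
        std::swap(input, output);
    }
    return input; // after the last swap this holds the most recent output
}
```

Starting from {0, 0, 3, 0, 0}, one iteration yields {0, 1, 1, 1, 0}; a second iteration then averages those values rather than the original ones.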
void PSkel::StencilBase< Array, Mask, Args >::runIterativeTilingGPU (size_t iterations, size_t tilingWidth, size_t tilingHeight, size_t tilingDepth, size_t innerIterations = 1, size_t GPUBlockSize = 0)
Executes multiple iterations of the stencil computation on the GPU, tiling the input data. In every iteration after the first, the output of the previous iteration is used as input. This function is useful for processing data larger than the available GPU memory (see runIterativeAutoGPU).
[in] | iterations | the number of iterations to be computed. |
[in] | tilingWidth | the width of each (logical) tile of the input data. |
[in] | tilingHeight | the height of each (logical) tile of the input data. |
[in] | tilingDepth | the depth of each (logical) tile of the input data. |
[in] | innerIterations | the number of iterations to be executed consecutively on the GPU; a larger value increases the amount of memory required. |
[in] | GPUBlockSize | the GPU block size used to process the stencil kernel. If GPUBlockSize is 0, the block size is chosen automatically. |
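One way to see why a larger innerIterations needs more memory is the overlapped-tiling scheme, in which every tile carries extra halo cells for each iteration executed without returning to the host. The sketch below assumes that scheme; it is not taken from PSkel's source. `paddedTileElements` and `maskRange` are hypothetical names, the mask is assumed to reach the same distance in all three dimensions, and clipping at the array border is ignored.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical estimate of the elements stored per tile under overlapped
// tiling. `maskRange` is the reach of the stencil mask (1 for a mask that
// touches only direct neighbours). Each consecutive iteration adds one more
// ring of `maskRange` cells on both sides of every dimension.
std::size_t paddedTileElements(std::size_t tilingWidth, std::size_t tilingHeight,
                               std::size_t tilingDepth, std::size_t maskRange,
                               std::size_t innerIterations) {
    assert(innerIterations >= 1);
    const std::size_t halo = 2 * maskRange * innerIterations; // both sides
    return (tilingWidth + halo) * (tilingHeight + halo) * (tilingDepth + halo);
}
```

Under these assumptions an 8 x 8 x 8 tile with a mask range of 1 occupies 1000 elements for one inner iteration and 1728 elements for two.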
void PSkel::StencilBase< Array, Mask, Args >::runSequential ()
Executes a single iteration of the stencil computation sequentially on the CPU.
void PSkel::StencilBase< Array, Mask, Args >::runTilingGPU (size_t tilingWidth, size_t tilingHeight, size_t tilingDepth, size_t GPUBlockSize = 0)
Executes a single iteration of the stencil computation on the GPU, tiling the input data. This function is useful for processing data larger than the available GPU memory (see runAutoGPU).
[in] | tilingWidth | the width of each (logical) tile of the input data. |
[in] | tilingHeight | the height of each (logical) tile of the input data. |
[in] | tilingDepth | the depth of each (logical) tile of the input data. |
[in] | GPUBlockSize | the GPU block size used to process the stencil kernel. If GPUBlockSize is 0, the block size is chosen automatically. |
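To get a feel for how the three tiling parameters partition the data, the sketch below counts the logical tiles needed to cover an array. It is an illustration only: `tilesAlong` and `tileCount` are hypothetical helpers, and it assumes that a partial tile is used where the array extent is not a multiple of the tile size. PSkel's handling of such remainders is not documented here.

```cpp
#include <cassert>
#include <cstddef>

// Ceiling division: number of tiles of size `tile` needed to cover `extent`.
std::size_t tilesAlong(std::size_t extent, std::size_t tile) {
    assert(tile > 0);
    return (extent + tile - 1) / tile;
}

// Hypothetical helper: total number of logical tiles for an array of
// width x height x depth elements and the given tile dimensions.
std::size_t tileCount(std::size_t width, std::size_t height, std::size_t depth,
                      std::size_t tilingWidth, std::size_t tilingHeight,
                      std::size_t tilingDepth) {
    return tilesAlong(width, tilingWidth) *
           tilesAlong(height, tilingHeight) *
           tilesAlong(depth, tilingDepth);
}
```

For a 1000 x 1000 x 1 array and 256 x 256 x 1 tiles this gives 4 tiles along each of the first two dimensions and 16 tiles in total.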