Public Member Functions
| void | runSequential () |
| void | runCPU (size_t numThreads=0) |
| void | runGPU (size_t GPUBlockSize=0) |
| void | runTilingGPU (size_t tilingWidth, size_t tilingHeight, size_t tilingDepth, size_t GPUBlockSize=0) |
| void | runAutoGPU (size_t GPUBlockSize=0) |
| void | runIterativeSequential (size_t iterations) |
| void | runIterativeCPU (size_t iterations, size_t numThreads=0) |
| void | runIterativeGPU (size_t iterations, size_t GPUBlockSize=0) |
| void | runIterativeTilingGPU (size_t iterations, size_t tilingWidth, size_t tilingHeight, size_t tilingDepth, size_t innerIterations=1, size_t GPUBlockSize=0) |
| void | runIterativeAutoGPU (size_t iterations, size_t GPUBlockSize=0) |
Protected Member Functions
| virtual void | runSeq (Array in, Array out)=0 |
| virtual void | runOpenMP (Array in, Array out, size_t numThreads)=0 |
| void | runCUDA (Array, Array, int) |
| void | runIterativeTilingCUDA (Array in, Array out, StencilTiling< Array, Mask > tiling, size_t GPUBlockSize) |
Protected Attributes
| Array | input |
| Array | output |
| Args | args |
| Mask | mask |
template<class Array, class Mask, class Args = int>
class PSkel::StencilBase< Array, Mask, Args >
Class that implements the basic functionalities supported by the stencil skeletons.
template<class Array , class Mask , class Args >
void PSkel::StencilBase< Array, Mask, Args >::runIterativeAutoGPU (size_t iterations, size_t GPUBlockSize=0)
Executes multiple iterations of the stencil computation on the GPU. At each iteration except the first, the previous output is used as input. If the data is larger than the available GPU memory, this function automatically selects a tiled execution of the stencil computation, including the number of iterations to be executed consecutively on the GPU.
Parameters:
| [in] | iterations | the number of iterations to be computed. |
| [in] | GPUBlockSize | the block size used by the GPU to process the stencil kernel. If GPUBlockSize is 0, the block size is chosen automatically. |
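The documentation does not say how the automatic selection is made. As a rough illustration only, the sketch below shows one plausible way to decide between a whole-array run and a tiled run from the amount of free device memory. `TilingPlan`, `planTiling`, and all of their parameters are hypothetical names, not part of PSkel; the free-memory figure is passed in as a plain number (on a CUDA device it could come from `cudaMemGetInfo`), and the data is assumed to be 2D and tiled along rows only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type for this sketch; not part of PSkel.
struct TilingPlan {
    bool tiled;            // false when the whole data set fits on the device
    std::size_t tileRows;  // interior rows processed per tile (halo excluded)
};

// Decide whether a rows x cols data set must be tiled.
// `halo` is the number of extra rows a tile needs above and below it,
// and `freeBytes` is the device memory assumed to be available.
inline TilingPlan planTiling(std::size_t rows, std::size_t cols,
                             std::size_t bytesPerElem, std::size_t halo,
                             std::size_t freeBytes) {
    assert(rows > 0 && cols > 0 && bytesPerElem > 0);
    // Input and output buffers are both resident, hence the factor 2.
    const std::size_t rowBytes = 2 * cols * bytesPerElem;
    if (rows * rowBytes <= freeBytes) {
        return {false, rows};  // everything fits: no tiling needed
    }
    const std::size_t maxRows = freeBytes / rowBytes;  // rows that fit, halo included
    // A tile needs at least one interior row plus its halo on both sides.
    assert(maxRows > 2 * halo);
    return {true, maxRows - 2 * halo};
}
```

For example, a 100 x 10 array of 4-byte elements fits in 8000 bytes under these assumptions, while with 800 bytes the sketch falls back to tiles of 8 interior rows.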
template<class Array , class Mask , class Args >
void PSkel::StencilBase< Array, Mask, Args >::runIterativeCPU (size_t iterations, size_t numThreads=0)
Executes multiple iterations of the stencil computation on the CPU, using multiple threads. At each iteration except the first, the previous output is used as input.
Parameters:
| [in] | iterations | the number of iterations to be computed. |
| [in] | numThreads | the number of threads used to process the stencil kernel. If numThreads is 0, the number of threads is chosen automatically. |
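The iteration rule described above (each iteration reads what the previous one wrote) is commonly implemented by swapping two buffers. The sketch below illustrates that pattern on a 1D three-point average with an OpenMP loop. It is not PSkel code: `iterateStencil` is a hypothetical free function, the kernel is a stand-in, and mapping `numThreads == 0` to the hardware thread count is an assumption about what "chosen automatically" could mean.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Runs `iterations` steps of a 1D three-point average over `data`.
// Boundary cells are left unchanged. Illustrative only; not PSkel code.
inline std::vector<double> iterateStencil(std::vector<double> data,
                                          std::size_t iterations,
                                          std::size_t numThreads = 0) {
    // Assumed convention for this sketch: 0 means "pick the hardware thread count".
    if (numThreads == 0) {
        numThreads = std::max(1u, std::thread::hardware_concurrency());
    }
    const int threads = static_cast<int>(numThreads);
    (void)threads;  // unused when the compiler ignores the OpenMP pragma
    const long n = static_cast<long>(data.size());
    std::vector<double> next(data);  // output buffer; also carries the boundary values

    for (std::size_t it = 0; it < iterations; ++it) {
        // Without OpenMP enabled the pragma is ignored and the loop runs serially.
        #pragma omp parallel for num_threads(threads)
        for (long i = 1; i < n - 1; ++i) {
            next[i] = (data[i - 1] + data[i] + data[i + 1]) / 3.0;
        }
        // The output of this iteration becomes the input of the next one.
        std::swap(data, next);
    }
    return data;  // after the last swap, `data` holds the most recent output
}
```

Swapping the two vectors is a constant-time pointer exchange, so no element is copied between iterations. Each loop index writes a distinct element of `next` and only reads `data`, which keeps the parallel loop free of data races.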
template<class Array , class Mask , class Args >
void PSkel::StencilBase< Array, Mask, Args >::runIterativeTilingGPU (size_t iterations, size_t tilingWidth, size_t tilingHeight, size_t tilingDepth, size_t innerIterations=1, size_t GPUBlockSize=0)
Executes multiple iterations of the stencil computation on the GPU, tiling the input data. At each iteration except the first, the previous output is used as input. This function is useful for processing data larger than the available GPU memory (see runIterativeAutoGPU).
Parameters:
| [in] | iterations | the number of iterations to be computed. |
| [in] | tilingWidth | the width of each (logical) tile of the input data. |
| [in] | tilingHeight | the height of each (logical) tile of the input data. |
| [in] | tilingDepth | the depth of each (logical) tile of the input data. |
| [in] | innerIterations | the number of iterations to be executed consecutively on the GPU; increasing this number increases the amount of memory required. |
| [in] | GPUBlockSize | the block size used by the GPU to process the stencil kernel. If GPUBlockSize is 0, the block size is chosen automatically. |
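One common reason why more consecutive iterations need more memory is overlapped tiling: each tile is loaded with a halo wide enough for every inner iteration to find valid neighbours without fetching data from the host again. The sketch below estimates the resulting tile footprint. Whether PSkel sizes its tiles this way is an assumption; `paddedExtent`, `tileElements`, and `maskRange` (the farthest neighbour distance of the mask) are hypothetical names, and the sketch pads all three dimensions and ignores clipping at the borders of the data.

```cpp
#include <cassert>
#include <cstddef>

// Extent of one tile dimension once the halo is added.
// Each consecutive iteration consumes `maskRange` cells on both sides.
inline std::size_t paddedExtent(std::size_t tileExtent, std::size_t maskRange,
                                std::size_t innerIterations) {
    return tileExtent + 2 * maskRange * innerIterations;
}

// Number of elements one buffer must hold for a single 3D tile.
inline std::size_t tileElements(std::size_t tilingWidth, std::size_t tilingHeight,
                                std::size_t tilingDepth, std::size_t maskRange,
                                std::size_t innerIterations) {
    assert(tilingWidth > 0 && tilingHeight > 0 && tilingDepth > 0);
    assert(innerIterations > 0);
    return paddedExtent(tilingWidth, maskRange, innerIterations) *
           paddedExtent(tilingHeight, maskRange, innerIterations) *
           paddedExtent(tilingDepth, maskRange, innerIterations);
}
```

Under these assumptions, a 10 x 10 x 10 tile with a mask range of 1 holds 1728 elements for one inner iteration and 2744 for two, so the footprint grows with `innerIterations` even though the logical tile size stays the same.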