## Abstract

The sparse grid combination technique provides a framework to solve high-dimensional numerical problems with standard solvers by assembling a sparse grid from many coarse and anisotropic full grids called component grids. Hierarchization is one of the most fundamental tasks for sparse grids. It describes the transformation from the nodal basis to the hierarchical basis. In settings where the component grids have to be frequently combined and distributed in a massively parallel compute environment, hierarchization on component grids is relevant to minimize communication overhead.

We present a cache-oblivious hierarchization algorithm for component grids of the combination technique. It causes ∣∣Gℓ∣∣⋅(1B+(1M√d))

cache misses under the tall cache assumption M=ω(Bd). Here, Gℓ denotes the component grid, d the dimension, M the size of the cache and B the cache line size. This algorithm decreases the leading term of the cache misses by a factor of d compared to the unidirectional algorithm which is the common standard up to now. The new algorithm is also optimal in the sense that the leading term of the cache misses is reduced to scanning complexity, i.e., every degree of freedom has to be touched once. We also present a variant of the algorithm that causes ∣∣Gℓ∣∣⋅(2B+(1M⋅Bd−2√d−1)) cache misses under the assumption M=ω(B)

. The new algorithms have been implemented and outperform previously existing software. In several cases the measured performance is close to the best possible.

The dimension d is assumed to be constant in the

-notation.

We present a cache-oblivious hierarchization algorithm for component grids of the combination technique. It causes ∣∣Gℓ∣∣⋅(1B+(1M√d))

cache misses under the tall cache assumption M=ω(Bd). Here, Gℓ denotes the component grid, d the dimension, M the size of the cache and B the cache line size. This algorithm decreases the leading term of the cache misses by a factor of d compared to the unidirectional algorithm which is the common standard up to now. The new algorithm is also optimal in the sense that the leading term of the cache misses is reduced to scanning complexity, i.e., every degree of freedom has to be touched once. We also present a variant of the algorithm that causes ∣∣Gℓ∣∣⋅(2B+(1M⋅Bd−2√d−1)) cache misses under the assumption M=ω(B)

. The new algorithms have been implemented and outperform previously existing software. In several cases the measured performance is close to the best possible.

The dimension d is assumed to be constant in the

-notation.

Original language | English |
---|---|

Title of host publication | Sparse Grids and Applications - Stuttgart 2014 |

Publisher | Springer |

Publication date | 17 Mar 2016 |

Pages | 103-132 |

ISBN (Print) | 978-3-319-28260-2 |

ISBN (Electronic) | 978-3-319-28262-6 |

DOIs | |

Publication status | Published - 17 Mar 2016 |

Series | Lecture Notes in Computational Science and Engineering |
---|---|

Volume | 109 |

ISSN | 1439-7358 |