Abstract
Spreadsheets are used in industry and academia across all domains and their application ranges from simple accounting to sequencing of amino acids. Some businesses re-use and extend spreadsheets over years and not only complexity but also performance becomes a problem.
In the last two decades, parallel shared-memory multiprocessor computers have become abundant, but they remain notoriously difficult to program correctly. Therefore, it is often not possible for end-users to make full use of their parallel hardware. Declarative, functional programming languages are an alternative to imperative programming languages that makes programming shared-memory multiprocessors much easier: due to the absence of side effects in these languages, compilers and libraries can parallelize programs automatically. The formula language of spreadsheets is a declarative, first-order functional language and therefore potentially allows for an automatic parallelization of spreadsheet calculations.
This thesis explores the design space of automatically parallelizing spreadsheet recalculations. Often times, computations in spreadsheets are structured in a way that corresponds to declarative high-level programming on arrays. Therefore, the focus of this thesis lies on practical data-parallelism in a spreadsheet model of computations. This thesis makes the following contributions:
• we show how to automatically re-write high-level spreadsheet
structures into higher-order functional programs for parallel
execution;
• we contrast and combine our re-writing technique with a
recalculation algorithm that dynamically exploits dataflow
parallelism in spreadsheets;
• we describe a representation of two-dimensional arrays that
pragmatically caters to the needs of high-level array programming; and
• we explore a hypothetical approach to array fusion and laziness in a purely functional, strict spreadsheet language.
In the last two decades, parallel shared-memory multiprocessor computers have become abundant, but they remain notoriously difficult to program correctly. Therefore, it is often not possible for end-users to make full use of their parallel hardware. Declarative, functional programming languages are an alternative to imperative programming languages that makes programming shared-memory multiprocessors much easier: due to the absence of side effects in these languages, compilers and libraries can parallelize programs automatically. The formula language of spreadsheets is a declarative, first-order functional language and therefore potentially allows for an automatic parallelization of spreadsheet calculations.
This thesis explores the design space of automatically parallelizing spreadsheet recalculations. Often times, computations in spreadsheets are structured in a way that corresponds to declarative high-level programming on arrays. Therefore, the focus of this thesis lies on practical data-parallelism in a spreadsheet model of computations. This thesis makes the following contributions:
• we show how to automatically re-write high-level spreadsheet
structures into higher-order functional programs for parallel
execution;
• we contrast and combine our re-writing technique with a
recalculation algorithm that dynamically exploits dataflow
parallelism in spreadsheets;
• we describe a representation of two-dimensional arrays that
pragmatically caters to the needs of high-level array programming; and
• we explore a hypothetical approach to array fusion and laziness in a purely functional, strict spreadsheet language.
Original language | English |
---|
Publisher | IT-Universitetet i København |
---|---|
Number of pages | 199 |
ISBN (Print) | 978-87-7949-015-4 |
Publication status | Published - 2018 |
Series | ITU-DS |
---|---|
Number | 148 |
ISSN | 1602-3536 |