Abstract
In the two decades since the end of Dennard scaling, improving processing capabilities without increasing processor frequency has become a key challenge for computer science researchers. Concurrently, while storage device throughput has increased exponentially, the throughput between CPU and memory has improved only linearly. Computational storage, which moves data operations closer to where the data is physically stored, has gained considerable traction as a way to address this disparity. This interest has recently reached a critical milestone with standardisation efforts by the Storage Networking Industry Association (SNIA) and Non-Volatile Memory Express (NVMe), which propose eBPF, a vendor-neutral lightweight instruction set architecture, for program offload. Simultaneously, integrated data analysis (IDA) pipelines have emerged, combining various programming paradigms, cluster resource management systems, data formats, and execution strategies into unified data management frameworks, allowing for more efficient usage of computational storage.
Despite the standardisation efforts in computational storage, the question of how to utilise eBPF effectively within the storage layer remains open. Few empirical studies evaluate the performance implications and potential benefits of integrating eBPF with computational storage. In addition, many open questions exist concerning the implementation of a computational storage device running eBPF, including memory management and cache coherence.
In this thesis, we surveyed the current state of computational storage, identifying fundamental limitations, such as the historical lack of standardised interfaces and short-lived, non-stateful memory. To explore these issues, we designed and implemented Delilah, the first public eBPF-based computational storage processor. Through this implementation, we investigated the challenges, opportunities, and performance characteristics of computational storage. Our findings reveal significant issues related to memory management and cache coherence that impact performance. Despite these challenges, when optimised, Delilah demonstrated the potential for improved performance in specific operations, such as filtering.
Original language | English |
---|---|
Publisher | IT-Universitetet i København |
Number of pages | 132 |
ISBN (Print) | 978-87-7949-525-8 |
Status | Published - 25 Oct 2024 |
Series name | ITU-DS |
Series number | 228 |
ISSN | 1602-3536 |