Data Pipes: Declarative Control over Data Movement

Lukas Vogel, Daniel Ritter, Danica Porobic, Pinar Tözün, Tianzheng Wang, Alberto Lerner

Research output: Conference Article in Proceeding or Book/Report chapter > Article in proceedings > Research > peer-review

Abstract

Today’s storage landscape offers a deep and heterogeneous stack of technologies that promises to meet even the most demanding data-intensive workload needs. The diversity of technologies, however, presents a challenge. Parts of the stack, e.g., the cache layers, are not controlled directly by the application, and the parts that are controlled often require the programmer to deal with very different transfer mechanisms, such as disk and network APIs. Combining these different abstractions properly requires great skill, and even so, expert-written programs can lead to sub-optimal utilization of the storage stack and present performance unpredictability.
In this paper, we propose to combat these issues with a new programming abstraction called Data Pipes. Data Pipes offer a new API that can express data transfers uniformly, irrespective of the source and destination data placements. By doing so, they can orchestrate how data moves over the different layers of the storage stack explicitly and fluidly. We suggest a preliminary implementation of Data Pipes that relies mainly on existing hardware primitives to implement data movements. We evaluate this implementation experimentally and comment on how a full version of Data Pipes could be brought to fruition.
Original language: English
Title of host publication: Conference on Innovative Data Systems Research
Number of pages: 10
Publication date: 2023
Publication status: Published - 2023

Keywords

  • Storage Technologies
  • Data-Intensive Workloads
  • Programming Abstractions
  • Data Transfers
  • Storage Stack Optimization
