TY - JOUR
T1 - Apache Wayang: A Unified Data Analytics Framework
AU - Beedkar, Kaustubh
AU - Contreras-Rojas, Bertty
AU - Gavriilidis, Haralampos
AU - Kaoudi, Zoi
AU - Markl, Volker
AU - Pardo-Meza, Rodrigo
AU - Quiane Ruiz, Jorge Arnulfo
N1 - SIGMOD Record uses a custom ACM licensing agreement, not a standard CC license. ACM grants some permissions similar to those of non-commercial CC licenses (e.g., CC BY-NC), but the terms are specific to ACM's publications and digital library policies. Authors retain ownership and must be contacted for any commercial use. Under this agreement, users may copy and distribute the article for non-commercial, educational, or research purposes.
PY - 2023/11/2
Y1 - 2023/11/2
N2 - The large variety of specialized data processing platforms and the increased complexity of data analytics has led to the need for unifying data analytics within a single framework. Such a framework should free users from the burden of (i) choosing the right platform(s) and (ii) gluing code between the different parts of their pipelines. Apache Wayang (Incubating) is the only open-source framework that provides a systematic solution to unified data analytics by integrating multiple heterogeneous data processing platforms. It achieves that by decoupling applications from the underlying platforms and providing an optimizer so that users do not have to specify the platforms on which their pipeline should run. Wayang provides a unified view and processing model, effectively integrating the hodgepodge of heterogeneous platforms into a single framework with increased usability without sacrificing performance and total cost of ownership. In this paper, we present the architecture of Wayang, describe its main components, and give an outlook on future directions.
KW - Unified Data Analytics
KW - Data Processing Platforms
KW - Optimized Data Pipelines
KW - Heterogeneous Systems Integration
KW - Apache Wayang Architecture
U2 - 10.1145/3631504.3631510
DO - 10.1145/3631504.3631510
M3 - Journal article
SN - 0163-5808
VL - 52
SP - 30
EP - 35
JO - SIGMOD Record
JF - SIGMOD Record
IS - 3
ER -