TY - JOUR
T1 - Data-efficient performance learning for configurable systems
AU - Guo, Jianmei
AU - Yang, Dingyu
AU - Siegmund, Norbert
AU - Apel, Sven
AU - Sarkar, Atri
AU - Valov, Pavel
AU - Czarnecki, Krzysztof
AU - Wasowski, Andrzej
AU - Yu, Huiqun
PY - 2017/11
Y1 - 2017/11
AB - Many software systems today are configurable, offering customization of functionality by feature selection. Understanding how performance varies in terms of feature selection is key for selecting appropriate configurations that meet a set of given requirements. Due to a huge configuration space and the possibly high cost of performance measurement, it is usually not feasible to explore the entire configuration space of a configurable system exhaustively. It is thus a major challenge to accurately predict performance based on a small sample of measured system variants. To address this challenge, we propose a data-efficient learning approach, called DECART, that combines several techniques of machine learning and statistics for performance prediction of configurable systems. DECART builds, validates, and determines a prediction model based on an available sample of measured system variants. Empirical results on 10 real-world configurable systems demonstrate the effectiveness and practicality of DECART. In particular, DECART achieves a prediction accuracy of 90% or higher based on a small sample, whose size is linear in the number of features. In addition, we propose a sample quality metric and introduce a quantitative analysis of the quality of a sample for performance prediction.
KW - Parameter tuning
KW - Model selection
KW - Regression
KW - Configurable systems
KW - Performance prediction
U2 - 10.1007/s10664-017-9573-6
DO - 10.1007/s10664-017-9573-6
M3 - Journal article
SN - 1382-3256
JO - Empirical Software Engineering
JF - Empirical Software Engineering
ER -