TY - JOUR
T1 - AutoML classifier clustering procedure
AU - Koren, Oded
AU - Hallin, Carina Antonia
AU - Koren, Michal
AU - Issa, Amir A.
PY - 2021/10/23
Y1 - 2021/10/23
N2 - Recommendation systems are one of the main applications of machine learning (ML) used across different industries. This paper presents a new automated machine learning (AutoML) method that provides recommendations by processing data sets with ML algorithms, targeting, and offering cluster recommendations for new observations, and that serves as a new decision support method. The AutoML method conducts a complete procedure that includes analyzing the data and dividing it into an efficient number of clusters. We apply k-means, using the elbow method to calculate costs per cluster, and then analyze the allocation of the data into the clusters, thus providing a method for predicting and allocating new observations to the relevant clusters (k-NN). This study includes two experiments using the complete AutoML procedure, conducted on a data set with more than two million records and dozens of attributes. This was done to demonstrate how the AutoML method can be implemented and successfully run with a high-capacity analysis procedure. The motivation was to analyze, examine, assign, and integrate new observations into existing clusters that have been defined. The results showed that the AutoML method provided efficient recommendations for new observations with an accuracy rate of 99.99%. Hence, the AutoML procedure can offer a full system for any organization to efficiently split existing data into clusters, assign to clusters, and predict the cluster allocation of new observations. The significant contribution of this study is a simple method that can achieve fast, high-accuracy clustering for ongoing (new) classified data acquired by an organization.
AB - Recommendation systems are one of the main applications of machine learning (ML) used across different industries. This paper presents a new automated machine learning (AutoML) method that provides recommendations by processing data sets with ML algorithms, targeting, and offering cluster recommendations for new observations, and that serves as a new decision support method. The AutoML method conducts a complete procedure that includes analyzing the data and dividing it into an efficient number of clusters. We apply k-means, using the elbow method to calculate costs per cluster, and then analyze the allocation of the data into the clusters, thus providing a method for predicting and allocating new observations to the relevant clusters (k-NN). This study includes two experiments using the complete AutoML procedure, conducted on a data set with more than two million records and dozens of attributes. This was done to demonstrate how the AutoML method can be implemented and successfully run with a high-capacity analysis procedure. The motivation was to analyze, examine, assign, and integrate new observations into existing clusters that have been defined. The results showed that the AutoML method provided efficient recommendations for new observations with an accuracy rate of 99.99%. Hence, the AutoML procedure can offer a full system for any organization to efficiently split existing data into clusters, assign to clusters, and predict the cluster allocation of new observations. The significant contribution of this study is a simple method that can achieve fast, high-accuracy clustering for ongoing (new) classified data acquired by an organization.
KW - AutoML
KW - Clustering
KW - Classification
KW - Targeting
KW - k-means
U2 - 10.1002/int.22718
DO - 10.1002/int.22718
M3 - Journal article
SN - 0884-8173
JO - International Journal of Intelligent Systems
JF - International Journal of Intelligent Systems
ER -