TY - JOUR
T1 - Optimized CNN Architectures Benchmarking in Hardware-Constrained Edge Devices in IoT Environments
AU - Rosero-Montalvo, Paul D.
AU - Tözün, Pinar
AU - Hernandez, Wilmar
PY - 2024
Y1 - 2024
N2 - Internet of Things (IoT) and edge devices have grown in their application fields due to machine learning (ML) models and their capacity to classify images into previously known labels, working close to the end-user. However, the model might be trained with several convolutional neural network (CNN) architectures that can affect its performance when deployed in hardware-constrained environments, such as edge devices. In addition, new training trends suggest using transfer learning techniques to take an excellent feature extractor obtained from one domain and use it in a new domain that does not have enough images to train the whole model. In light of these trends, this work benchmarks the most representative CNN architectures on emerging edge devices, some of which have hardware accelerators. The ML models were trained and optimized using transfer learning and a small set of images obtained in IoT environments. Our results show that unfreezing up to the last 20 layers of the model’s architecture allows it to be fine-tuned correctly to the new set of IoT images, depending on the CNN architecture. Additionally, quantization is a suitable optimization technique to shrink the model by 2× or 3×, leading to a lighter memory footprint, lower execution time, and lower battery consumption. Finally, the Coral Dev Board can speed up the inference process by 100×, and the EfficientNet model architecture keeps the same classification accuracy even when the model is adapted to a hardware-constrained environment.
AB - Internet of Things (IoT) and edge devices have grown in their application fields due to machine learning (ML) models and their capacity to classify images into previously known labels, working close to the end-user. However, the model might be trained with several convolutional neural network (CNN) architectures that can affect its performance when deployed in hardware-constrained environments, such as edge devices. In addition, new training trends suggest using transfer learning techniques to take an excellent feature extractor obtained from one domain and use it in a new domain that does not have enough images to train the whole model. In light of these trends, this work benchmarks the most representative CNN architectures on emerging edge devices, some of which have hardware accelerators. The ML models were trained and optimized using transfer learning and a small set of images obtained in IoT environments. Our results show that unfreezing up to the last 20 layers of the model’s architecture allows it to be fine-tuned correctly to the new set of IoT images, depending on the CNN architecture. Additionally, quantization is a suitable optimization technique to shrink the model by 2× or 3×, leading to a lighter memory footprint, lower execution time, and lower battery consumption. Finally, the Coral Dev Board can speed up the inference process by 100×, and the EfficientNet model architecture keeps the same classification accuracy even when the model is adapted to a hardware-constrained environment.
KW - Convolutional Neural Networks (CNNs)
KW - Quantization (signal)
KW - Training
KW - Computational Modeling
KW - Transfer Learning
KW - Data models
KW - Benchmark testing
U2 - 10.1109/JIOT.2024.3369607
DO - 10.1109/JIOT.2024.3369607
M3 - Journal article
VL - 11
SP - 20357
EP - 20366
JO - IEEE Internet Things J.
JF - IEEE Internet Things J.
IS - 11
ER -