IoT devices play a fundamental role in the machine learning (ML) application pipeline, as they collect rich data for model training using sensors. However, this process can be affected by uncontrollable variables that introduce errors into the data, resulting in a higher computational cost to eliminate them. Thus, selecting the most suitable algorithm for this pre-processing step on-device can reduce ML model complexity and unnecessary bandwidth usage for cloud processing. Therefore, this work presents a new sensor taxonomy with which to deploy data pre-processing on an IoT device by using a specific filter for each data type that the system handles. We define statistical and functional performance metrics to perform filter selection. Experimental results show that the Butterworth filter is a suitable solution for invariant sampling rates, while the Savi–Golay and medium filters are appropriate choices for variable sampling rates.