Machine learning algorithms for pattern visualization in classification tasks and for automatic indoor temperature prediction
This thesis falls within the scope of machine learning; specifically, it deals with pattern classification and with regression (function approximation). Although there are many approaches to high-dimensional pattern classification, most of them behave as “black boxes” whose operation is difficult or even impossible to explain. This thesis develops dimensionality reduction methods that project, or map, high-dimensional classification problems into a two-dimensional space (i.e., a plane). Classifiers can thus be used to learn the mapped data and to create two-dimensional maps of the classification problems, whose graphical nature makes them intuitive and easy to understand. After reviewing the existing dimensionality reduction methods, several approaches are proposed to map high-dimensional data into 2D space while minimizing the class overlap. These methods also allow new patterns, not used to create the mapping, to be mapped. Eight types of linear, quadratic and polynomial mappings are combined with four class overlap measures. These mappings are compared with 34 other dimensionality reduction methods from the literature over a wide collection of 71 classification problems. The best results are achieved by the mapping named polynomial kernel discriminant analysis with degree 2 (PKDA2), which creates visual and self-explanatory maps of the classification problems on which a reference classifier (the support vector machine, or SVM) achieves an accuracy only slightly lower than with the original high-dimensional patterns. A web interface and a standalone graphical interface, developed in PHP and Matlab, respectively, are also provided to visualize the 2D maps for any classification problem. In the scope of regression, a wide collection of regressors has been applied to automatic temperature forecasting in household climate control (HVAC) systems.
These systems have a direct impact on both energy consumption and building comfort, so their accurate and reliable modeling is very important for the development of energy efficiency plans. Using regression approaches to forecast the temperature evolution from internal and external (climatic) conditions would make it possible to evaluate the impact of changes in HVAC systems from the point of view of comfort. In order to develop an efficient model for HVAC systems, this thesis evaluates 40 regressors using a real data set generated in a smart building, the Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS) of the Universidade de Santiago de Compostela. Moreover, different neural network models that allow automatic re-training have also been developed and compared. This feature makes the models more robust and allows them 1) to learn circumstances never seen during training, caused by exceptional climatic situations; and 2) to tolerate alterations in the system components caused by errors or changes in the sensor devices.
Keywords: dimensionality reduction, classification, regression, temperature forecasting
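
As an illustration of the idea of mapping patterns into 2D while minimizing class overlap, the following minimal sketch scores two candidate linear mappings on a toy problem. It is not the PKDA2 method, and none of the thesis's four overlap measures is specified in this abstract: the nearest-neighbour overlap measure, the toy data, and the names `project`, `nn_overlap`, `W_good` and `W_bad` are all assumptions made for illustration only.

```python
import math

def project(W, x):
    """Apply a linear mapping W (2 rows, d columns) to a d-dimensional pattern x."""
    return tuple(sum(w * v for w, v in zip(row, x)) for row in W)

def nn_overlap(points, labels):
    """Assumed overlap measure: fraction of mapped points whose nearest
    neighbour in the 2D plane belongs to a different class."""
    errors = 0
    for i, p in enumerate(points):
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        errors += labels[nearest] != labels[i]
    return errors / len(points)

# Toy 4-dimensional patterns (hypothetical data): the classes differ in the
# first feature, while the third feature is large but uninformative.
X = [
    (0.0, 0.1, 5.0, 1.0),
    (0.2, 0.0, -4.0, 1.1),
    (0.1, 0.2, 3.0, 0.9),
    (1.0, 0.1, 4.5, 1.0),
    (1.2, 0.0, -3.5, 1.1),
    (1.1, 0.2, 2.5, 0.9),
]
y = [0, 0, 0, 1, 1, 1]

# Two candidate linear mappings into the plane.
W_good = [(1, 0, 0, 0), (0, 1, 0, 0)]  # keeps features 1 and 2
W_bad = [(0, 0, 1, 0), (0, 0, 0, 1)]   # keeps features 3 and 4

overlap_good = nn_overlap([project(W_good, x) for x in X], y)
overlap_bad = nn_overlap([project(W_bad, x) for x in X], y)

# Because the mapping is an explicit function, a pattern not used to build
# the map can be placed on it afterwards.
new_point = project(W_good, (0.5, 0.3, 9.0, 1.0))

print(overlap_good, overlap_bad, new_point)
```

On this toy data the first mapping separates the two classes completely and the second mixes them completely, so a search over mappings guided by such a measure would prefer the first. The mappings described in the abstract are selected from linear, quadratic and polynomial families rather than from two hand-written candidates.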