PhD Defense: 'Automatic learning algorithms for the visualization of patterns in classification problems and the automatic prediction of temperatures in intelligent buildings'

This thesis develops and applies Machine Learning methods to classification and regression problems. For classification, it proposes new dimensionality reduction methods designed to project high-dimensional patterns into a 2D space where the classification problem can be visualized and understood. The 2D projections can then be used to train a classifier and build a classification map composed of the regions of the 2D space assigned to each class.
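As a rough illustration of the classification map described above, the following sketch labels every cell of a regular grid over the 2D space with the class predicted for it. It is only a sketch under stated assumptions: the 2D points are made-up stand-ins for projected patterns, and a simple nearest-centroid rule replaces whatever classifier is actually trained on the projections. All function names are hypothetical.

```python
# Illustrative only: a nearest-centroid classifier stands in for the
# classifier trained on the 2D-projected patterns.

def fit_centroids(points, labels):
    """Return {label: (mean_x, mean_y)} for 2D training points."""
    sums = {}
    for (x, y), label in zip(points, labels):
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def predict(centroids, point):
    """Assign a 2D point to the class with the closest centroid."""
    x, y = point
    return min(
        centroids,
        key=lambda label: (x - centroids[label][0]) ** 2
        + (y - centroids[label][1]) ** 2,
    )


def classification_map(centroids, x_range, y_range, steps):
    """Label every cell of a regular grid covering the 2D space."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    grid = []
    for row in range(steps):
        # The first row of the grid corresponds to the largest y value.
        y = y_max - (y_max - y_min) * row / (steps - 1)
        grid.append([
            predict(centroids, (x_min + (x_max - x_min) * col / (steps - 1), y))
            for col in range(steps)
        ])
    return grid


# Hypothetical 2D projections of patterns from two classes, "A" and "B".
points_2d = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (4.0, 4.0), (5.0, 4.5), (4.5, 5.0)]
labels = ["A", "A", "A", "B", "B", "B"]

centroids = fit_centroids(points_2d, labels)
class_map = classification_map(centroids, (0.0, 5.0), (0.0, 5.0), steps=6)
for row in class_map:
    print("".join(row))
```

Each printed character is the class assigned to one grid cell, so contiguous runs of the same letter correspond to the regions of the map. A finer grid or a different classifier would change the shape of the regions but not the idea.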

This map makes it possible to visualize and explain the original high-dimensional classification problem and the way the classifier operates, without a large loss in classification performance relative to the original high-dimensional problem. To assess the significance of the proposed 2D mappings, we train a state-of-the-art classifier, the Gaussian kernel Support Vector Machine (SVM), on both the original high-dimensional data set and the 2D mapped patterns. In both cases, we evaluate the SVM on test patterns that were seen neither during the SVM training (for the original patterns) nor during the mapping calculation (for the 2D mapped patterns).

The proposed mappings outperform state-of-the-art dimensionality reduction methods. The loss in classifier performance is fairly limited for most data sets and can be considered an affordable price for achieving data visualization.

With respect to regression, a wide collection of batch learning regressors is applied to indoor temperature forecasting in smart buildings, considering three forecasting horizons (1, 2 and 3 hours). The results show that extraTrees, a random-forest-based regressor, obtains the best performance, with limited degradation across the forecasting horizons. Moreover, several online machine learning models based on neural networks are also evaluated. Online methods allow the regressors to be updated at runtime in order to improve their future estimates.
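To make the three forecasting horizons concrete, the sketch below turns an hourly temperature series into supervised training pairs for horizons of 1, 2 and 3 hours. It rests on assumptions not stated in the abstract: the readings are invented, the inputs are simply the last few readings (`n_lags` is a hypothetical parameter), and one data set is built per horizon.

```python
def make_supervised(series, n_lags, horizon):
    """Pair each window of n_lags past readings with the reading `horizon` steps ahead."""
    inputs, targets = [], []
    # t is the index of the most recent reading available to the forecaster.
    for t in range(n_lags - 1, len(series) - horizon):
        inputs.append(series[t - n_lags + 1 : t + 1])
        targets.append(series[t + horizon])
    return inputs, targets


# Made-up hourly indoor temperatures in degrees Celsius.
hourly_temps = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5, 23.0, 23.5]

# One supervised data set per forecasting horizon (1, 2 and 3 hours).
datasets = {
    h: make_supervised(hourly_temps, n_lags=3, horizon=h) for h in (1, 2, 3)
}

for h, (X, y) in datasets.items():
    print(f"horizon={h}h samples={len(X)} first input={X[0]} first target={y[0]}")
```

Each resulting `(X, y)` pair could then be passed to a batch regressor, for instance scikit-learn's `ExtraTreesRegressor`. Longer horizons leave fewer usable samples, because the target must lie inside the recorded series.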

Experiments show that runtime updating makes the models more robust: it allows them to learn circumstances never seen during training, caused by exceptional climatic situations, and to cope with alterations in the system components. All the evaluated regressors are tested with data obtained from a USC building sensor network deployed within the scope of the European Life-Opere project.
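The effect of runtime updates can be sketched with a toy example. The code below is not one of the thesis models: a single linear unit trained by stochastic gradient descent stands in for the online neural networks, and the data stream is synthetic. Halfway through, the target is shifted by a constant offset, imitating an alteration in a system component that was never seen during the first phase.

```python
class OnlineLinearRegressor:
    """Single linear unit updated with one SGD step per new observation."""

    def __init__(self, n_features, learning_rate=0.01):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x, y_true):
        """Take one gradient step on the squared error; return the error before the step."""
        error = self.predict(x) - y_true
        self.weights = [
            w - self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error
        return error


def run_stream(model, n_steps, offset):
    """Feed n_steps observations of y = 2*x + offset; return the absolute errors."""
    errors = []
    for i in range(n_steps):
        x = [(i % 10) / 10.0]
        y = 2.0 * x[0] + offset
        errors.append(abs(model.update(x, y)))
    return errors


model = OnlineLinearRegressor(n_features=1, learning_rate=0.3)
before = run_stream(model, 1000, offset=0.0)  # initial regime
after = run_stream(model, 1000, offset=3.0)   # regime with a shifted target

print(f"error at the end of the first regime: {before[-1]:.4f}")
print(f"error right after the change:         {after[0]:.4f}")
print(f"error after adapting online:          {after[-1]:.4f}")
```

The error jumps when the relationship changes and shrinks again as the model keeps updating on incoming observations. A batch regressor trained once on the first regime would keep the larger error until it was retrained.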