Performance Modeling of MPI Applications Using Model Selection Techniques

This paper presents a new method, based on statistical analysis, for obtaining performance models of parallel applications. The method relies on Akaike's information criterion (AIC), which provides an objective mechanism for ranking different models according to how well they fit experimental data. The input to the modeling process is a set of variables and parameters that may a priori influence the performance of the application; this set can be provided by the user. From this information, the method automatically generates a set of candidate models. Each model is fit to the experimental data and its AIC score is calculated, and the model with the best AIC score is selected. In addition, the AIC scores of all candidate models are used to provide statistical information that helps the user evaluate the quality of the selected model, together with indications of how to improve the modeling process interactively. As a first case study, statistical models obtained for different implementations of the broadcast collective communication in Open MPI are shown. These models are very accurate, fitting the experimental data better than theoretical approaches based on the LogGP model. Finally, the NAS Parallel Benchmarks are also characterized using this new method, with good results in terms of accuracy.
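The selection procedure outlined above can be illustrated with a minimal sketch. It is not the implementation described in this paper: the candidate terms, the synthetic "broadcast time" data, and all function names are assumptions chosen for illustration. It also assumes the common least-squares form of the criterion, AIC = n ln(RSS/n) + 2k, and the standard Akaike weights, which may differ in detail from the formulation used here.

```python
import itertools
import math
import random

# Hypothetical candidate terms in the number of processes P and message size m.
TERMS = {
    "m": lambda P, m: m,
    "P": lambda P, m: P,
    "log2(P)": lambda P, m: math.log2(P),
    "m*log2(P)": lambda P, m: m * math.log2(P),
    "m*P": lambda P, m: m * P,
}

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def rss_of_fit(columns, y):
    """Least-squares fit through the normal equations; returns the residual sum of squares."""
    # Rescale each column to improve conditioning (the RSS is unaffected).
    cols = [[v / max(abs(u) for u in col) for v in col] for col in columns]
    k = len(cols)
    xtx = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)] for i in range(k)]
    xty = [sum(p * t for p, t in zip(cols[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(beta[i] * cols[i][r] for i in range(k)) for r in range(len(y))]
    return sum((t - p) ** 2 for t, p in zip(y, pred))

def aic(rss, n, n_coeffs):
    """AIC for a least-squares fit; k counts the coefficients plus the error variance."""
    return n * math.log(max(rss, 1e-300) / n) + 2 * (n_coeffs + 1)

# Synthetic measurements (assumed, not real data): T = 5 + 0.02 * m * log2(P) + noise.
rng = random.Random(0)
points = [(P, m) for P in (2, 4, 8, 16, 32, 64) for m in (1024, 4096, 16384, 65536)]
times = [5.0 + 0.02 * m * math.log2(P) + rng.gauss(0.0, 1.0) for P, m in points]

# Candidate models: an intercept plus every combination of one or two terms.
scores = {}
for size in (1, 2):
    for names in itertools.combinations(TERMS, size):
        columns = [[1.0] * len(points)]
        columns += [[TERMS[name](P, m) for P, m in points] for name in names]
        rss = rss_of_fit(columns, times)
        scores[names] = aic(rss, len(times), len(columns))

# Rank by AIC (lower is better) and turn AIC differences into Akaike weights.
best_terms = min(scores, key=scores.get)
raw = {names: math.exp(-(s - scores[best_terms]) / 2.0) for names, s in scores.items()}
total = sum(raw.values())
weights = {names: r / total for names, r in raw.items()}

for names in sorted(scores, key=scores.get)[:3]:
    print(" + ".join(names), round(scores[names], 2), round(weights[names], 3))
```

The printed Akaike weights indicate how strongly the data favor the top-ranked model over its competitors; a weight spread across several models would suggest that the candidate set or the measurements need refinement.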

keywords: performance models, MPI, model selection, Akaike's information criterion