Grid, cloud and high performance computing: scientific applications' use cases
Advances in technology and the falling cost of computer equipment have, in recent decades, made possible the development and continuous improvement of computer systems, especially the capacity of distributed computing systems, which are key to solving both commercial problems and scientific applications with great computational demands. This dissertation presents different distributed computing use cases that focus on three specific scientific applications. The work developed and the solutions achieved are applicable not only to these cases but also to other scientific applications with similar characteristics, both in research environments and in operational services. The overall objective has been to simplify the execution of scientific applications in distributed environments, with special emphasis on obtaining results in time, according to the requirements of each case. Different cases of distributed computing, each applied to a different problem, have been studied. In each case, the necessary developments were designed and implemented to address weaknesses and to improve the systems' functionality and the performance of the applications and workflows on the selected technology.
The first case study was developed within the framework of the RETELAB project [P.1]. Its objective was the creation of a collaborative and distributed work environment configured as a virtual laboratory for the development of interdisciplinary research projects related to oceanographic remote sensing, within the framework of the National Oceanographic Remote Sensing Network (RETEMAR). The aim of the virtual laboratory was to provide the computing capacity and storage required by this scientific community through the use of Grid technologies and a user-friendly interface.
The second case study was the experiment "VCOC - Virtual Clusters in Federated Sites", one of four experiments executed in the first part of the European project BonFIRE. The objective of this experiment was to investigate the feasibility of using multiple Cloud resource providers to deliver services that require the allocation of clusters of virtual computing nodes for running applications that can divide a problem into separate groups of tasks, a type of calculation that is well suited to the characteristics of a Cloud infrastructure.
The third case study focused on the development, commissioning and improvement of an operational oceanographic service within the framework of the MyOcean projects [54]. EuroGOOS [55] defines Operational Oceanography as the activity of systematic and long-term routine measurements of the seas and oceans and atmosphere, and their rapid interpretation and dissemination. An Operational Oceanography service usually starts with the transmission of observational data to data assimilation centres, where powerful computers process the data, run numerical forecasting models and produce outputs. These outputs are used to generate final products, such as warnings (coastal floods, ice and storm damage, harmful algal blooms and contaminants, etc.), predictions of primary productivity, ocean currents, etc. Finally, the forecast products should be distributed rapidly to industrial users, government agencies, and regulatory authorities.
keywords: Grid Computing, Cloud Computing, HPC, Distributed Computing, Virtualization, Operational Oceanographic Services