General Workload Manager: a Task Manager as a Service
In recent years, the demand for High Throughput Computing has been
increasing because of new scientific challenges. Since accessing
several computational resources to manage thousands of simulations can
be difficult for scientists, different initiatives have tried to
provide the scientific community with user-friendly interfaces to
multiple computational resources. Usually, these are designed for
specific codes and for a given research field, such as oceanography,
climate modeling, and physics, among others. To overcome this
limitation, we have developed the General Workload Manager (GWM), a
general-purpose, very lightweight management system capable of working
with different computing resources, such as HPC and HTC clusters,
standalone worker nodes, hypervisor-enabled servers, and cloud
platforms, with as little configuration as possible. The proposed
system is able to deploy thousands of different simulation tasks across
several computing resources and to collect the results easily.
Publication: Congress
June 18, 2021
G. Indalecio, F. Gomez-Folgar and A.J. Garcia-Loureiro - 10.1109/ICCW.2015.7247451