Filling Gaps for Faster Simulations
As part of the Federal Cluster of Excellence MERGE, computer scientists at TU Chemnitz are developing a new method for distributing parallel simulations across heterogeneous computer clusters
In the Federal Cluster of Excellence MERGE, scientists at TU Chemnitz are developing new lightweight structures and researching new materials and manufacturing processes.
In the MERGE research area “modelling, integrative simulation and optimisation” (IRD F), mechanical engineers, mathematicians, and computer scientists are currently working on computer-aided prediction of the complete product development process, in order to avoid costly trial-and-error test series. The simulation programs this requires should be fast and use as little memory as possible.
Most large computing centres in companies and research institutes consist of a variety of diverse, complex computer systems, so-called heterogeneous clusters, whose machines differ in computing power and storage capacity. In order to use the available resources simultaneously and efficiently, the simulation programs are distributed over the computers. Scheduling methods generate schedules that specify how the simulations are distributed over the available computers. The goal is to determine, before the calculation starts, a schedule that minimises the total running time of all simulations.
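To make the notions of a schedule and its total running time concrete, the following is a minimal sketch of a simple greedy scheduler for machines of different speeds. It is not the method described in this article; the names `Machine`, `speed` and `greedy_schedule`, and the idea of measuring each simulation as an abstract amount of work, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    speed: float  # assumed relative computing power (work units per second)


def greedy_schedule(work, machines):
    """Assign each simulation to the machine on which it would finish earliest.

    work: dict mapping a simulation name to its amount of work (assumed unit).
    Returns the plan (machine name -> list of simulations) and the total
    running time, i.e. the latest finish time over all machines.
    """
    finish = {m.name: 0.0 for m in machines}
    plan = {m.name: [] for m in machines}
    # Place the largest simulations first, a common heuristic.
    for sim, w in sorted(work.items(), key=lambda kv: -kv[1]):
        best = min(machines, key=lambda m: finish[m.name] + w / m.speed)
        finish[best.name] += w / best.speed
        plan[best.name].append(sim)
    return plan, max(finish.values())


machines = [Machine("fast", 2.0), Machine("slow", 1.0)]
plan, total = greedy_schedule({"a": 8, "b": 4, "c": 4, "d": 2}, machines)
print(plan, total)
```

In this toy example the faster machine receives more work than the slower one, and both finish at the same time, which is what a good schedule on a heterogeneous cluster aims for.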
Existing methods are not well suited to distributing such simulation calculations over several computer systems in a cluster. The actual runtime behaviour of the simulations often differs significantly from the assumptions made by the scheduling methods. This is because the large volume of data in the simulations limits the speedup that additional computing resources (e.g. processor cores) can provide. Furthermore, the optimisation of compounds in lightweight structures for the Federal Cluster of Excellence MERGE requires many different simulations, which further increases the running time of the calculations. As a result, in practice the total running time is often significantly longer than expected and the clusters are used inefficiently.
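The limited benefit of additional cores can be illustrated with a simple runtime model. The sketch below uses Amdahl's law, which is an assumption chosen for illustration; the article does not state which runtime model the researchers use, and the parameter `serial_fraction` is a hypothetical stand-in for the share of a simulation that does not get faster with more cores.

```python
def runtime(t1, cores, serial_fraction):
    """Runtime on `cores` cores under Amdahl's law (illustrative assumption).

    t1: runtime on a single core.
    serial_fraction: share of the work that cannot be parallelised.
    """
    return t1 * (serial_fraction + (1.0 - serial_fraction) / cores)


def speedup(cores, serial_fraction):
    """Speedup relative to a single core; bounded by 1 / serial_fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


for p in (1, 4, 16, 64):
    print(p, round(speedup(p, 0.2), 2))
```

With a non-parallelisable share of 20 percent, the speedup in this model can never exceed 5, no matter how many cores are added. A scheduler that assumes linear speedup would therefore underestimate the real running time.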
Under the leadership of Prof. Dr. Gudula Rünger, a team from the Professorship of Practical Computer Science at TU Chemnitz investigated precisely these shortcomings and developed a new method. The result is the so-called Water-Level scheduling. The name derives from the way the order and distribution of the individual simulations are represented in a Gantt chart. Robert Dietze, a member of the research group, illustrates the approach: "Metaphorically speaking, the existing gaps are filled with simulations that have not yet been distributed. In order to make use of all the capacities, the landscape of the Gantt chart is flooded."
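The flooding metaphor can be sketched with a strongly simplified calculation. The code below is not the published Water-Level algorithm; it assumes, for illustration only, that the remaining work can be split arbitrarily, and it computes the time level up to which the gaps in the chart would be filled. The function name `water_level` and its parameters are hypothetical.

```python
def water_level(busy_until, speeds, remaining_work):
    """Return the time level up to which the remaining work fills the gaps.

    busy_until[i]: time at which machine i becomes free (the "landscape").
    speeds[i]: assumed relative computing power of machine i.
    Solves sum(speeds[i] * max(0, level - busy_until[i])) == remaining_work,
    treating the work as arbitrarily divisible (a simplifying assumption).
    """
    order = sorted(range(len(busy_until)), key=lambda i: busy_until[i])
    level = busy_until[order[0]]
    total_speed = 0.0
    work = remaining_work
    for k, i in enumerate(order):
        total_speed += speeds[i]  # machine i is now below the water line
        if k + 1 < len(order):
            next_level = busy_until[order[k + 1]]
        else:
            next_level = float("inf")
        capacity = total_speed * (next_level - level)
        if work <= capacity:
            return level + work / total_speed
        work -= capacity
        level = next_level


print(water_level([4.0, 1.0, 2.0], [1.0, 1.0, 1.0], 6.0))
```

In this example the least loaded machines are filled first, and all three machines end at the same level once the remaining six units of work have been poured in.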
In addition to distributing the simulations over the computer systems, the method can also identify whether and when it is favourable to run a large number of simulations at the same time with low parallelism (i.e. each one slowly) or a small number of simulations with high parallelism (i.e. each one accelerated). "This way, our Water-Level method not only models the total running time, but also takes the parallel running times of the individual simulations into account," Dietze adds.
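The trade-off between many slow simulations and few accelerated ones can be illustrated with a small comparison. This is a sketch under stated assumptions, not the researchers' method: all simulations are taken to be identical, the per-simulation runtime follows Amdahl's law, and the names `total_time`, `best_width` and `serial_fraction` are hypothetical.

```python
import math


def total_time(n_sims, total_cores, cores_per_sim, t1, serial_fraction):
    """Total running time if every simulation uses `cores_per_sim` cores.

    Simulations run in rounds; each round holds as many simulations as fit
    on the available cores. The runtime model is Amdahl's law (assumption).
    """
    concurrent = total_cores // cores_per_sim
    rounds = math.ceil(n_sims / concurrent)
    per_sim = t1 * (serial_fraction + (1.0 - serial_fraction) / cores_per_sim)
    return rounds * per_sim


def best_width(n_sims, total_cores, t1, serial_fraction):
    """Pick the cores per simulation that minimises the total running time."""
    widths = [c for c in range(1, total_cores + 1) if total_cores % c == 0]
    return min(
        widths,
        key=lambda c: total_time(n_sims, total_cores, c, t1, serial_fraction),
    )


print(best_width(8, 8, 100.0, 0.2))  # many simulations on 8 cores
print(best_width(2, 8, 100.0, 0.2))  # few simulations on 8 cores
```

Under these assumptions, eight simulations on eight cores are best run side by side on one core each, whereas two simulations are best given four cores each. Which choice is favourable therefore depends on how many simulations are waiting.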
With this newly developed method, the deficiencies of existing scheduling methods can be remedied. The predicted running time corresponds more closely to the time actually achieved. In addition, the Water-Level principle takes the varying computing power of the computer systems into account and therefore plans more efficiently. This makes it possible to use a variety of different computer systems, such as heterogeneous clusters, flexibly and with optimal utilisation for the simulation-based optimisation of compounds.
The scientists have published their results in the Journal of Supercomputing:
Dietze, R.; Hofmann, M.; Rünger, G.: Water-Level scheduling for parallel tasks in compute-intensive application components. Journal of Supercomputing, Special Issue on Sustainability on Ultrascale Computing Systems and Applications 2016, pp. 1-22
DOI: 10.1007/s11227-016-1711-1
(Author: Jana Mischke, Translation: Alissa Hölzel)
Mario Steinebach
26.01.2017