Lessons Learnt Porting Parallelisation Techniques for Irregular Codes to NUMA Systems
This work presents a study undertaken to characterise the behaviour of some parallelisation techniques for irregular codes, previously developed for SMP architectures, on a multi-node SMP NUMA system. The main objective is to determine the performance effect of bus contention and cache coherency in such a complex architecture. Results show that: (1) cores which share a socket can be considered as independent processors in this context; (2) for large data sizes, sharing a bus degrades performance but masks the cache coherency effects; and (3) the NUMA ratio is a critical factor for irregular codes. These results allow us to study the effect on performance of the thread-to-core mappings and memory allocation policies.
keywords: Irregular Codes, Itanium2, Hardware Counters
Publication: Congress
June 18, 2021
/research/publications/lessons-learnt-porting-parallelisation-techniques-for-irregular-codes-to-numa-systems
Juan A. Lorenzo, Juan C. Pichel, David LaFrance-Linden, Francisco F. Rivera, David E. Singh - 10.1109/PDP.2010.66