Hardware Counters Based Analysis of Memory Accesses in SMPs
Modern microprocessors incorporate Hardware Counters (HCs) that provide useful information with low overhead. HCs are not commonly used because there are few tools that make their information easy to obtain. This paper presents a set of tools that simplify accessing and programming the Intel Itanium 2 EARs (Event Address Registers). The aim of these tools is to characterise the memory accesses of parallel codes in multicore systems, where the cache hierarchy can greatly influence performance. The first tool allows the user to insert into the code, in a simple and transparent way, the instructions needed to monitor and manage hardware counters. Two versions of this tool have been implemented. The first is a command-line tool that takes as input a C source file annotated with appropriate directives and outputs it with the monitoring code added. The other is a graphical interface that allows the user to select the parts of the code to analyse. The second tool takes the hardware counter information gathered while running the monitored parallel code and displays it graphically. This tool shows the information in a comprehensive but simple way, allowing the user to adjust the level of detail. These tools were used to carry out a study of parallel irregular codes. Although this study was carried out in a specific environment, the tools presented here can be used on any system that provides hardware counters of the kind found in current processors.
Publication: Congress
June 18, 2021
Oscar G. Lorenzo, Tomás F. Pena, José C. Cabaleiro, Juan C. Pichel, Juan A. Lorenzo, Francisco F. Rivera - 10.1109/ISPA.2012.89