In Partial Fulfillment of the Requirements for the Degree of
Master of Science
Will defend his thesis
OpenMP is the de facto standard for shared-memory parallel programming. It offers a straightforward approach to utilizing multicore platforms without the need to explicitly manage multiple threads. The performance of OpenMP applications is rather sensitive to the executing threads' memory accesses at both the cache level and the page level. Large thread counts magnify the negative impact of memory bottlenecks at both levels. However, many novice OpenMP developers are not aware of performance problems due to memory bottlenecks or of how to overcome them.
To address these problems, and particularly to support the untrained OpenMP application developer, we have developed transparent support for overcoming memory bottlenecks in OpenMP applications. To achieve this, we have extended DARWIN, our dynamic optimization framework for OpenMP code. We have enhanced its design to increase its modularity and have added features to support performance analysis. We have demonstrated how this framework can be used by creating and deploying strategies to overcome data locality problems in OpenMP applications running on a ccNUMA platform and to detect and overcome performance problems caused by false sharing. The experimental results show that we were able to gather meaningful performance-related information with low overhead and that we can utilize this information to find the source of the performance bottleneck for both problems. The results also show that most applications experience noticeable performance improvements as a result of the optimizations performed.
Date: Thursday, December 1, 2011
Time: 3:30 PM
Place: 550-PGH
Faculty, students, and the general public are invited.
Advisor: Prof. Barbara Chapman