Loop Transformations for Performance and Message Latency Hiding in Parallel Object-oriented Frameworks
Author:
Publisher:
Published: 2006
Total Pages:
ISBN-13:
DOWNLOAD EBOOKApplication codes reliably achieve performance far less than the advertised capabilities of existing architectures, and this problem is worsening with increasingly-parallel machines. For large-scale numerical applications, stencil operations often impose the greater part of the computational cost, and the primary sources of inefficiency are the costs of message passing and poor cache utilization. This paper proposes and demonstrates optimizations for stencil and stencil-like computations for both serial and parallel environments that ameliorate these sources of inefficiency. Additionally, the authors argue that when stencil-like computations are encoded at a high level using object-oriented parallel array class libraries these optimizations, which are beyond the capability of compilers, may be automated.