HPCToolkit
A toolkit that supports measurement, analysis, attribution, and inspection of application performance on CPU and GPU-accelerated architectures
Area: Development tools
CASS member: STEP
Description
HPCToolkit is an integrated suite of tools for measurement and analysis of program performance on computers ranging from multicore desktop systems to GPU-accelerated supercomputers. By using statistical sampling of timers and hardware performance counters on CPUs, HPCToolkit measures a program’s CPU work, resource consumption, and inefficiency, and attributes these metrics to the full calling context in which they occur. By monitoring GPU operations, gathering instruction-level metrics within GPU kernels, and attributing the costs of GPU work to heterogeneous calling contexts, HPCToolkit provides insight into the performance of GPU-accelerated codes. HPCToolkit works with multilingual, fully optimized, dynamically linked applications and is designed for use on large parallel systems. Its presentation tools enable rapid analysis of a program’s execution costs, inefficiency, and scaling characteristics both within and across nodes of a parallel system. HPCToolkit supports measurement and analysis of serial codes, multithreaded codes (e.g., pthreads, OpenMP), MPI codes, and hybrid (MPI+threads) parallel codes, as well as GPU-accelerated codes that offload computation to AMD, Intel, or NVIDIA GPUs.
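To make the idea of attributing sampled metrics to full calling contexts concrete, the following is a minimal Python sketch. It is a toy illustration only, not HPCToolkit's implementation or API: the `CCTNode` class, the `attribute_samples` function, the sample call stacks, and the 10,000-microsecond sampling period are all hypothetical. Each sample is a call stack captured when a timer fires, and its cost is charged to the tree node for that exact call path.

```python
from dataclasses import dataclass, field


@dataclass
class CCTNode:
    """One node of a toy calling context tree (hypothetical, for illustration)."""
    name: str
    exclusive_us: int = 0                      # cost of samples that landed in this frame
    children: dict = field(default_factory=dict)

    def child(self, name):
        # Reuse the child for this callee if it exists, otherwise create it.
        if name not in self.children:
            self.children[name] = CCTNode(name)
        return self.children[name]

    def inclusive_us(self):
        # Inclusive cost = own cost plus the cost of everything called from here.
        return self.exclusive_us + sum(c.inclusive_us() for c in self.children.values())


def attribute_samples(samples, period_us):
    """Charge one sampling period to the full call path of each sample.

    `samples` is an iterable of call stacks ordered from outermost to innermost
    frame; `period_us` is the assumed time between samples in microseconds.
    """
    root = CCTNode("<root>")
    for stack in samples:
        node = root
        for frame in stack:
            node = node.child(frame)
        node.exclusive_us += period_us
    return root


# Hypothetical samples: "dot" is reached through two different call paths.
samples = [
    ("main", "solve", "dot"),
    ("main", "solve", "dot"),
    ("main", "solve", "axpy"),
    ("main", "io", "dot"),
    ("main", "solve"),
]
root = attribute_samples(samples, period_us=10_000)
solve = root.child("main").child("solve")
print(solve.inclusive_us(), solve.child("dot").exclusive_us)
```

Because costs are keyed by the whole call path rather than by function name alone, the time spent in `dot` when called from `solve` stays separate from the time spent in `dot` when called from `io`. This is the distinction that calling-context attribution is meant to preserve.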
Target audience
HPCToolkit is designed for use by developers working on parallel applications, frameworks, runtime systems, and tools for CPU and GPU-accelerated systems.
Package links
- E4S: hpctoolkit
- Spack: hpctoolkit