Alumni Project

PERC Performance Modeling

PI: David H. Bailey (LBNL); Co-PIs: Bronis de Supinski (LLNL), Jack Dongarra (U. Tenn.), Thomas Dunigan (ORNL), Paul Hovland (ANL), Jeffrey Hollingsworth (U. Mar.), Boyanna Norris (ANL), Daniel Quinlan (LLNL), Celso Mendes (U. Ill.), Shirley Moore (U. Tenn.), Daniel Reed (U. Ill.), Allan Snavely (SDSC), Erich Strohmaier (LBNL), Jeffrey Vetter (LLNL), Patrick Worley (ORNL); SciDAC ISICs: David Brown (TSTT), Phil Colella (APDEC), David Keyes (TOPS); SciDAC Applications: Donald Batchelor (WPI), Mark Gordon (EST), Kwok Ko (AST), Anthony Mezzacappa (TSI), Robert Malone (CCSM), Robert Sugar (QCD)

Summary

PERC is developing performance models to predict and explain the performance of SciDAC applications. Our aim is to advance a framework for automated, time-tractable performance modeling and to use the resulting understanding to improve application performance and to inform machine procurement and design. PERC researchers are pursuing several distinct performance modeling and analysis strategies, including machine profiles, application signatures, statistical modeling, and performance bound analysis.

The performance modeling activity of the PERC project seeks to:

  • Explore a range of modeling methodologies, from statistical methods to advanced “convolution” schemes.
  • Develop a framework with tools to enable automated, time-tractable performance modeling.
  • Use this framework to model several SciDAC kernels and applications.
  • Conduct basic research into the factors that affect performance, and validate the underlying modeling methodologies.
  • Extend the framework to model long-running applications and to extrapolate the performance of future systems.

PERC researchers have developed a framework of tools to enable automated, time-tractable performance modeling. The components of the framework are:

Machine Profiles: characterizations of the rates at which a machine can (or is projected to) carry out fundamental operations, abstracted from any particular application.

Application Signatures: characterizations of an application, independent of any specific host machine, resulting in detailed summaries of the fundamental operations to be carried out on behalf of the application code.

Convolution Methods: algebraic mappings of the Application Signatures onto the Machine Profiles to arrive at a set of performance predictions or bounds.
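The relationship among the three components can be illustrated with a minimal sketch. It is not the PERC implementation: the class names, the operation labels, and the two combination rules (summing per-operation times, or taking the slowest operation class when perfect overlap is assumed) are hypothetical choices made only to show the idea of mapping operation counts onto machine rates.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MachineProfile:
    """Sustained rates for fundamental operations (assumed unit: operations/second)."""
    rates: Dict[str, float]


@dataclass
class ApplicationSignature:
    """Machine-independent operation counts for one run of an application."""
    counts: Dict[str, float]


def convolve(signature: ApplicationSignature, profile: MachineProfile,
             overlap: bool = False) -> float:
    """Map operation counts onto machine rates to estimate runtime in seconds.

    overlap=False: per-operation times are summed (no overlap assumed).
    overlap=True: only the slowest operation class counts, which gives a
    lower bound under the assumption of perfect overlap.
    """
    times = []
    for op, count in signature.counts.items():
        if op not in profile.rates:
            raise KeyError(f"machine profile has no rate for operation {op!r}")
        times.append(count / profile.rates[op])
    if not times:
        return 0.0
    return max(times) if overlap else sum(times)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
signature = ApplicationSignature({"flop": 2.0e9, "mem_load": 1.0e9})
profile = MachineProfile({"flop": 1.0e9, "mem_load": 0.5e9})
predicted_seconds = convolve(signature, profile)
bound_seconds = convolve(signature, profile, overlap=True)
```

Because the signature is independent of the machine, the same signature can be convolved with profiles of several machines, including projected ones, without rerunning the application.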

For Machine Profiles, a set of low-level benchmark “probes” has been developed to measure key machine performance metrics. Data have been gathered across a set of interesting machines (focusing on large DOE systems), and the results applied to modeling SciDAC applications. One such tool is MAPS (Memory Access Pattern Signature), which determines the rates at which a system can sustain loads and stores as a function of problem size and memory access pattern.
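One way to picture how such measurements might be consumed by a model is as a curve of sustained bandwidth against working-set size, one curve per access pattern. The sketch below assumes that representation; the function name, the interpolation rule (linear in the base-2 logarithm of the size), and the data points are all hypothetical placeholders and are not measurements of any machine or output of the MAPS tool.

```python
import bisect
import math

# Hypothetical curve for one access pattern:
# (working-set size in bytes, sustained bandwidth in MB/s).
# Placeholder values for illustration only.
UNIT_STRIDE_CURVE = [
    (16 * 1024, 4000.0),
    (1024 * 1024, 2000.0),
    (64 * 1024 * 1024, 500.0),
]


def sustained_bandwidth(curve, working_set_bytes):
    """Interpolate bandwidth at a working-set size, linearly in log2(size).

    Sizes outside the sampled range are clamped to the nearest endpoint.
    """
    sizes = [size for size, _ in curve]
    if working_set_bytes <= sizes[0]:
        return curve[0][1]
    if working_set_bytes >= sizes[-1]:
        return curve[-1][1]
    hi = bisect.bisect_right(sizes, working_set_bytes)
    (s0, b0), (s1, b1) = curve[hi - 1], curve[hi]
    t = (math.log2(working_set_bytes) - math.log2(s0)) / (math.log2(s1) - math.log2(s0))
    return b0 + t * (b1 - b0)


# A 128 KiB working set falls halfway (in log2) between the first two samples.
bandwidth_128k = sustained_bandwidth(UNIT_STRIDE_CURVE, 128 * 1024)
```

A model would typically hold one such curve per access pattern (for example unit-stride and random) and select the curve that matches each block of the application being modeled.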

Figure 1 below shows MAPS results for an IBM Power3 processor. Even prior to carrying out the modeling step, MAPS curves can reveal the basic capabilities of a machine’s memory subsystem. This performance is often the determining factor in how well an application will perform on the machine.

Figure 1: A MAPS curve for the IBM Power3 processor

For Application Signatures, tools have been developed to trace applications and summarize their operations. There is special emphasis on acquiring and summarizing details of the application’s memory and communication patterns, which are often on the critical path for performance yet have historically not been well supported in tracing tools. Application signatures have been gathered using the PERC tools for several SciDAC kernels and application codes from Terascale Optimal PDE Simulation (TOPS), the General Atomic and Molecular Electronic Structure System (GAMESS), and the Parallel Ocean Program (POP). The results were used to model, predict, and understand the performance of these codes. Tools and techniques for generating performance bounds based on source code analysis have also been developed.
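As a rough illustration of what a source-derived performance bound can look like, the sketch below counts the floating-point operations and memory traffic of a simple loop by inspection and divides them by peak machine rates. This is a generic compute-versus-memory bound, not the PERC bound-generation technique; the function names, the choice of a `y[i] = a*x[i] + y[i]` loop, and the peak rates are assumptions made for the example.

```python
def axpy_counts(n, bytes_per_element=8):
    """Operation counts for the loop y[i] = a*x[i] + y[i] over n elements.

    Read off the source: one multiply and one add per iteration, and two
    loads plus one store of one element each (cache reuse is ignored).
    """
    flops = 2 * n
    bytes_moved = 3 * n * bytes_per_element
    return flops, bytes_moved


def runtime_lower_bound(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s):
    """Lower bound on runtime in seconds.

    The loop can finish no sooner than the slower of its compute time and
    its memory time, each evaluated at the machine's peak rate.
    """
    if peak_flops_per_s <= 0 or peak_bytes_per_s <= 0:
        raise ValueError("peak rates must be positive")
    compute_time = flops / peak_flops_per_s
    memory_time = bytes_moved / peak_bytes_per_s
    return max(compute_time, memory_time)


# Hypothetical machine: 1e9 flop/s and 1e9 bytes/s peak rates.
flops, bytes_moved = axpy_counts(1_000_000)
bound = runtime_lower_bound(flops, bytes_moved, 1.0e9, 1.0e9)
```

With these placeholder rates the memory term dominates, which is consistent with the emphasis above on memory patterns as a frequent critical-path factor.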

Convolution models have been successfully developed to rapidly estimate the performance of sparse-matrix kernels from PETSc (a TOPS toolkit) and GAMESS. A model that accounts for the reduction in time step size caused by reduced grid spacing has been developed for the Enhanced Virginia Hydrodynamics (EVH1) code. The rapidity of our modeling approach stands in contrast to traditional methods, which have rendered full-application performance prediction intractable. Rapidity enables parameter sweeps that explore multiple algorithms and/or target machines for an application. Likewise, our automation for gathering and combining machine and application information stands in contrast to approaches that require in-depth analysis of codes by teams of computer and domain scientists.
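To see why time step size matters when grid spacing is reduced, consider a minimal sketch that assumes the stable time step of an explicit hydrodynamics scheme shrinks in proportion to the grid spacing (a CFL-type limit). That proportionality and the function below are assumptions introduced for illustration; the actual EVH1 model is not described here.

```python
def relative_cost(refinement, dimensions):
    """Cost multiplier when grid spacing shrinks by `refinement` per dimension.

    Assumes total work is (number of cells) x (number of time steps) and
    that the time step is proportional to the grid spacing, so simulating
    the same physical time takes `refinement` times as many steps.
    """
    if refinement <= 0:
        raise ValueError("refinement must be positive")
    if dimensions < 1:
        raise ValueError("dimensions must be at least 1")
    cell_factor = refinement ** dimensions  # more cells in every dimension
    step_factor = refinement                # smaller steps, so more of them
    return cell_factor * step_factor


# Halving the grid spacing in three dimensions.
cost_3d = relative_cost(2, 3)
```

Under this assumption, a model that scaled work only with the number of cells would underestimate the cost of the refined run by the refinement factor.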

Convolution-based performance estimates have proven accurate across a diverse set of architectures, including machines built on Power3, Power4, and Alpha processors, with predictions falling within a few percent in almost all tested cases. The tools that embody this framework are publicly available from the PERC web site, http://perc.nersc.gov.

Modeling methods have been conveyed to the wider community through a series of tutorials, including a recent SC’02 tutorial. Details of the performance modeling methodology and research results are published in workshops and conferences, including recent SC’02 technical papers. The PERC modeling activities are further coordinated with the computational science projects and with other Integrated Software Infrastructure Centers (ISICs). These collaborations provide motivating applications for modeling and are intended to provide feedback that informs code tuning, guides applications to the machines best suited for them, and informs the procurement process.

For further information please contact:
Dr. Allan Snavely, PERC Project Lead for Modeling
San Diego Supercomputer Center
Tel: 858-534-5158
Email: allans@sdsc.edu
