Alumni Project

TOPS Scalable Eigensolvers for Accelerator Design and Thermochemistry

PIs: P. Husbands2, S. Li2,3, M. Minkoff1, E. Ng2, C. Yang2, Principal affiliates: G. Golub (AST), K. Ko (AST), Y. Sun (AST), A. Wagner (ASCTKD)

1Argonne National Laboratory, 2Lawrence Berkeley National Lab, 3U. California-Berkeley

In support of the design of next generation accelerators and molecular quantum mechanical analysis, the Terascale Optimal PDE Simulations (TOPS) project is designing next generation large-scale eigenanalysis software.

The Next Linear Collider (NLC) represents a major investment by DOE in high-energy physics. Because of the costs involved in building the accelerating structure, reliable and accurate numerical simulation of its characteristics before construction is essential. To this end, researchers at the Stanford Linear Accelerator Center (SLAC) have developed several codes that simulate different aspects of the structure.

Figure 1

Figure 1. Model of a 47-cell section of the 206-cell Next Linear Collider structure

One such code, Omega3P, is used in accelerator cavity design. This code calculates cavity mode frequencies and field vectors by solving a generalized (real and symmetric) eigenvalue problem arising from a finite element discretization of Maxwell’s equations. The eigenvalues of interest in this problem are in the interior of the spectrum and are often tightly clustered. In order to confirm their results and improve the current software, TOPS has been working on an alternative eigensolver, an implementation of the Exact Shift-Invert Lanczos (ESIL) method. This method is known to be very reliable, but it presents several engineering challenges when applied to problems of the scale needed by SLAC.

At the heart of the ESIL eigensolver are the parallel factorization and triangular solution of sparse matrices. These operations are implemented in TOPS’ SuperLU package, a leading high-performance scalable solver for sparse linear systems. The software has been carefully designed for optimal performance in on modern architectures. The sequential version of the code has achieved up to 40% of the peak megaflop rate on hierarchical memory machines. The parallel version has also achieved high performance, up to a hundred-fold speedup on large matrices. SuperLU has been widely adopted in research and commercial use, with more than a thousand users. Several popular vendors, including Boeing BCSLIB-EXT, HP-Compaq, Mathematica, Matlab, NAG, and Sun, have incorporated SuperLU or are investigating its use in their mathematical libraries. One recent application is in the solution of a long-standing problem of scattering in a quantum system of three charged particles. Parallel SuperLU played a critical role in building the preconditioners for solving the complex, nonsymmetric, and very ill-conditioned linear systems, up to the order of 8 million, enabling a breakthrough result that was reported in a cover article of Science (24 Dec 1999).

SuperLU is paired with PARPACK, a parallel implementation of the Implicitly Restarted Arnoldi Method to obtain a parallel implementation of the ESIL algorithm. Depending on the size of the structure under study, the discretization of the equations, and the order of the elements used, the matrices involved can be very large, with up to 300 million nonzeros. On the IBM SP at NERSC, we have tested many problems in Omega3P, the largest being the 47-cell structure shown in Figure 1. For this structure we need to calculate a few (16) interior eigenvalues and associated eigenvectors of a matrix of order 1.3M with 20M nonzero entries (see Figure 2). Although the triangular factors of this matrix contain over 350M nonzeros each, the ESIL code found the eigenvalues to required accuracy on 48 processors in less than half the time of the Omega3P’s default eigensolver, an implementation of the Filtering Algorithm.

figure 2

Figure 2. City plot of the matrix arising from the 47-cell structure.

Our ESIL code is currently integrated as a run-time option in Omega3P and more comparisons on larger problems are in progress. Our method takes advantage not only of the high floating point execution rate of modern parallel machines but also of the tremendous amounts of available memory. (For memory-limited contexts, we are also exploring the improvement of the Filtering and Jacobi-Davidson algorithms.)
Current matrices are real and symmetric because we consider only lossless cavities. When losses are taken into account the matrices become complex and symmetric. Because both SuperLU and PARPACK provide options for complex matrices, our Exact Shift-Invert solver provides a quick solution path to solving these problems.

Future work on SuperLU includes improving the performance of the triangular solution phase. This is important in contexts similar to ours where only one factorization is performed, but many triangular solutions are needed. We also plan to leverage the memory hierarchy tuning techniques being developed in TOPS. For integration in PETSc, we will also add incomplete factorization, and fully parallelize the symbolic factorization routine, for better memory scalability.

TOPS is also supporting the SciDAC project "Advanced Software for the Calculation of Thermochemistry, Kinetics, and Dynamics" with the development of a multigrid-like preconditioner for extending the Davidson iterative eigenanalysis method. This approach, Subspace Projected Approximated Methods (SPAM), generates a sequence of approximating preconditioners. It is being applied to molecular quantum vibrational analysis. We have also developed software for Hermitian eigenvalue problems and real generalized eigenvalue problems based upon this approach.

The TOPS project webpage may be found at http://www.tops-scidac.org.

For further information on this subject contact:
Prof. David E. Keyes, Project Lead
Old Dominion University
Phone: 757-683-3882
dkeyes@odu.edu

back to project page

 


Home  |  ASCR  |  Contact Us  |  DOE disclaimer