Scalable Systems

To address the lack of software for the effective management and utilization of terascale computational resources, we have established a Scalable Systems Software Center. This virtual center is a multi-institution, multi-disciplinary group of experts from around the country working as a single team to develop an integrated suite of machine-independent, scalable systems software components needed for the Scientific Discovery through Advanced Computing (SciDAC) initiative. The goal is to provide open source solutions that work for small as well as large-scale systems.

High-end systems software is a key area of need on the large DOE systems. The systems software problems for teraop-class computers with thousands of processors are significantly more difficult than those for small-scale systems with respect to fault tolerance, reliability, manageability, and ease of use for systems administrators and users. Layered on top of these are the issues of security, heterogeneity, and scalability found in today’s large computer centers. The computer industry is not going to solve these problems because business trends push it toward smaller systems aimed at web serving, database farms, and department-sized systems. In the longer term, the operating system issues faced by next-generation petaop-class computers will require research into innovative approaches to systems software, and that research must start today in order to be ready when these systems arrive.

The Scalable Systems Software Center will produce an integrated suite of systems software and tools for the effective management and utilization of terascale computational resources, particularly those at the DOE facilities. The first step in this process will be to work together with vendors and system administrators to specify an agreed-upon set of interfaces between the system software components. The Center will make a standard software distribution, based on these standard interfaces, available as open source. Wherever possible, the components will be machine and operating system independent.

The status of the project can be found in the project notebooks.


URL http://www.scidac.org/ScalableSystems/index.html
Updated: Monday, 25-Apr-2005 11:36:29 EDT
Webmaster for this subtree: Al Geist gst@ornl.gov