Alumni Project

Optimizing Performance and Enhancing Functionality of Distributed Applications using Logistical Networking

Micah Beck, Univ. of Tennessee
Jack Dongarra, Univ. of Tennessee
James Plank, Univ. of Tennessee
Rich Wolski, Univ. of California at Santa Barbara

Summary

Logistical Networking (LN) is a new way of synthesizing networking and storage to create a communication infrastructure that provides superior control of data movement and state management for distributed applications of all kinds. This project is exploring Logistical Networking for the SciDAC community, and creating the advanced software technologies and storage-enabled network infrastructure to provide fast, efficient, and reliable data delivery to support the high-performance applications used by SciDAC collaborators.

Logistical Networking (LN) technologies offer a highly scalable means to manage distributed content using shared network storage. Our software tools allow users to deploy local IBP storage “depots” or utilize shared IBP storage deployed worldwide to easily accomplish long-haul data transfers, temporary storage of large data sets (on the order of terabytes), pre-positioning of data for fast on-demand delivery, and high-performance content distribution such as streaming video. Of immediate interest to SciDAC collaborators is the ease with which LN facilitates the transfer of massive data sets.

Logistical Networking software currently in use by the SciDAC community includes:

1.) Internet Backplane Protocol (IBP) : a low-level mechanism for managing remote storage as a sharable network resource through deployment and shared use of lightweight, time-limited storage allocations on servers called storage “depots.” IBP v1.4 utilizes multiple storage resources (disk, RAM, and the user-level file system) and persistent client-server connections to enhance performance. Version 1.4 also supports NFUs (Network Functional Units), special nodes that can perform operations such as XOR and merge on data passing through them.

2.) exNode : a generalized data structure which holds the metadata necessary to manage distributed content stored on IBP depots and allow file-like structuring of stored data. The exNode records information such as which IBP depots house replicas of data content.

3.) Logistical Backbone (L-Bone) : directory service for registered IBP depots, cataloguing an ever-growing international deployment of 330 depots that serve over 36 TB of storage as a shared resource for the scientific community. A second, private directory serves the SciDAC community exclusively (see below). Other recent deployments include Brazil's Rede Nacional de Pesquisa e Ensino's Digital Video Working Group, who are integrating LN into their video delivery service and have deployed IBP on the RNP backbone.

4.) Logistical Runtime System (LoRS Tools) : provides high-level file management capabilities, high-performance access, and end-to-end services such as data compression, checksums, and encryption. SciDAC collaborators use the LoRS Tools to store, manage, and retrieve data via the Logistical Network. LoRS v0.82 features stronger encryption with the AES algorithm, supports new functions like download resume, and is available in a native version for Windows. With the next release, LoRS will offer Reed-Solomon error-correction coding to ensure fault tolerance without the need to maintain multiple replicas of stored data, thereby reducing the storage space used from 3-5 times to 1.15-2 times the size of the stored data set.

5.) Data Movers : DataMovers are auxiliary IBP depot modules that support customized or special-purpose depot-to-depot transfers. The SABUL DataMover uses a reliable UDP transfer stream along with a TCP flow control channel to provide very high throughput over long-haul transfers. We are also integrating a NetStorager DataMover API, developed by YottaYotta.

6.) Logistical Distribution Network (LoDN) : LoDN is a one-click Java tool for downloading data from IBP storage. Content publishers can utilize LoDN by placing a link to a JNLP (Java Network Launching Protocol) file on their website. The JNLP file specifies an exNode (pointer to stored data) to the download tool. Clicking the link starts the LoDN application, which accepts user input (filename and filepath) and retrieves the data. LoDN makes LN accessible without the need to install or configure anything. LoDN runs on every major operating system, including Windows, Mac, and Linux, and even on web-enabled cell phones and PDAs.
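
To make the exNode and fault-tolerance descriptions in items 2 and 4 more concrete, the following Python sketch models an exNode as a list of byte-range mappings to depots and compares the storage factor of plain replication with that of a scheme using k data blocks plus m coding blocks, such as Reed-Solomon. It is an illustration only: the class and field names (Mapping, ExNode, depots_for, storage_factor), the helper functions, the depot host names, and the (k, m) values are assumptions introduced here, not the actual exNode format or the LoRS implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mapping:
    """One stored copy of a byte range of the file (hypothetical fields)."""
    offset: int   # starting byte within the logical file
    length: int   # number of bytes covered by this copy
    depot: str    # host name of the IBP depot holding the copy


@dataclass
class ExNode:
    """Toy stand-in for an exNode: file-level metadata plus its mappings."""
    filename: str
    size: int
    mappings: List[Mapping] = field(default_factory=list)

    def depots_for(self, offset: int) -> List[str]:
        """Return the depots that hold a replica of the byte at `offset`."""
        return [m.depot for m in self.mappings
                if m.offset <= offset < m.offset + m.length]

    def storage_factor(self) -> float:
        """Total bytes stored on depots divided by the logical file size."""
        return sum(m.length for m in self.mappings) / self.size


def replicated_exnode(filename: str, size: int, depots: List[str]) -> ExNode:
    """Build an exNode holding one full copy of the file on each depot."""
    node = ExNode(filename, size)
    for depot in depots:
        node.mappings.append(Mapping(offset=0, length=size, depot=depot))
    return node


def coded_storage_factor(k: int, m: int) -> float:
    """Storage factor when k data blocks are protected by m coding blocks.

    Any m of the k + m blocks can be lost without losing data, at a cost
    of (k + m) / k times the original size.
    """
    if k <= 0 or m < 0:
        raise ValueError("k must be positive and m non-negative")
    return (k + m) / k


if __name__ == "__main__":
    hosts = ["depot-a.example.org", "depot-b.example.org", "depot-c.example.org"]
    node = replicated_exnode("run42.dat", 1000, hosts)
    print(node.depots_for(10), node.storage_factor())
    print(coded_storage_factor(20, 3), coded_storage_factor(4, 4))
```

Under these assumptions, three full replicas give a storage factor of 3.0, while (k, m) = (20, 3) gives 1.15 and (4, 4) gives 2.0, which correspond to the two ends of the 1.15-2 range quoted in item 4.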

SciDAC Collaborations: A primary research driver has been our interaction with the Terascale Supernova Initiative (TSI) group. TSI uses Logistical Networking (LN) to share massive data sets between distant collaboration sites. With the LoRS tools, they can now transfer data at speeds up to 220 Mbps between key research sites at ORNL and NCSU. See Figure 1.
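
As a rough sense of scale for the rate quoted above, the sketch below estimates how long a data set would take to move at a sustained rate. It assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and a constant rate with no protocol overhead; the function name transfer_hours is introduced here for illustration only. Because 220 Mbps is the peak figure reported, the result should be read as a lower bound on transfer time.

```python
def transfer_hours(size_bytes: float, rate_mbps: float) -> float:
    """Hours needed to move size_bytes at a constant rate of rate_mbps."""
    if rate_mbps <= 0:
        raise ValueError("rate_mbps must be positive")
    seconds = (size_bytes * 8) / (rate_mbps * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    one_terabyte = 10**12  # decimal terabyte, an assumption of this sketch
    print(f"{transfer_hours(one_terabyte, 220):.1f} hours")
```

Under these assumptions, moving 1 TB at 220 Mbps takes roughly 10 hours.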

A new, private LN infrastructure is in place, designed specifically for TSI and other SciDAC research endeavors. Depots at ORNL, the San Diego Supercomputer Center, SUNY-Stony Brook, and NCSU provide 8 TB of storage and form the network's backbone. We are negotiating a placement agreement for a 7 TB disk array at ORNL to complement the other TSI nodes.

Figure 1: The interworking components of the Logistical Runtime System Tools 
(LoRS Tools) in the high performance distribution of data between IBP depots at 
Terascale Supernova Initiative sites at ORNL and NCSU.

In the next year we will be working with SciDAC's Fusion Energy Sciences community to address the challenges of large-scale collaboration. Typical fusion plasma experiments require real-time feedback for rapid tuning of experimental parameters, meaning data must be analyzed during the 15-minute intervals between plasma-generating pulses. Such rapid assimilation of data is achieved by a geographically dispersed research team. LN is an obvious solution for high-performance data distribution, and NFU-enabled depots can even provide preliminary data conditioning.

Next twelve months:

•  Asynchronous IBP to allow pipelining of requests between client and server.
•  Release IBP Protocol 2.
•  Determine the best practices for using Reed-Solomon error-correction codes.
•  Install 7 TB disk array at ORNL.
•  Deploy NFU-enabled depots at key Fusion Energy Sciences sites.

•  Further develop LoDN and integrate existing software into a streamlined publishing tool.
