Alumni Project
Particle Physics Data Grid
PIs: Richard Mount, SLAC, Miron Livny, Wisconsin, Harvey Newman, Caltech
Steering Committee: John Huth, Harvard (ATLAS), Tim Adye, RAL (BaBar), Lothar Bauerdick, FNAL (CMS),
Lee Lueking, FNAL (D0), Chip Watson, TJNAF, Jerome Lauret, BNL (STAR),
Miron Livny, Wisconsin (Condor), Jennifer Schopf, ANL (Globus), Ian Foster, ANL (Globus),
Reagan Moore, SDSC (SRB), Arie Shoshani, LBNL (SRM)
Coordinators: Ruth Pordes, FNAL, Doug Olson, LBNL
Liaisons: Paul Avery (iVDGL), Larry Price (HICB), Mike Wilde (GriPhyN), Torre Wenaus, Ian Bird (LCG)
(www.ppdg.net)
Summary
The Particle Physics Data Grid, PPDG, has been pioneering large-scale collaboration between
computer scientists and physicists, who have seemingly unbounded needs for data-intensive computing
that integrates worldwide resources. Distributed computing for high-energy and nuclear physics
experiments has become both a driver for, and an eager consumer of, computational data-grid
technology. As a SciDAC project, PPDG has focused on the Grid technology advances, ranging from
architecture to robustness, that can be driven by deployment and integration into the demanding
end-to-end applications of the physics scientific process.
The Particle Physics Data Grid, PPDG, is a collaboration of six physics
experiments[1] (ATLAS, BaBar, CMS, D0, STAR, TJNAF) with four leading
computer science projects[1] (Condor, Globus, SRB, SRM).
The PPDG collaboration was created in 1999, reflecting the bold belief that computer scientists
and physicists could work together to develop and deploy technology that will
revolutionize the way we do science. The physicists brought the drive of fundamental science,
seeking answers to "What is the origin of matter?", "Why is there much more matter than antimatter
in the universe?", and "Can we create and measure matter in the bizarre form of a quark-gluon plasma
that existed when the universe was billionths of a second old?" The physicists also brought vast
data collections to be analyzed by the worldwide intellect of hundreds or thousands of scientists
to answer these questions. Computer scientists brought the promise of Grid technology that could
liberate and empower the scientific intellect through a seamless and powerful computing environment.
Computer scientists in PPDG acknowledge that their not-always-painless introduction to the world
of thousand-scientist collaborations is itself a benefit. In the world of particle physics, the
Grid itself demands intra- and intercontinental collaboration between middleware projects.
Driven by the data volume, the geographically dispersed nature of the physics collaborations,
the long lifetime of the analysis process, and the number of scientists in the collaborations,
a distributed computing model has evolved as the most effective means to enable the scientific
productivity of the researchers and their students.
Leading computer scientists from the distributed computing community recognized the challenge
of these computing environments as a tremendous opportunity to both develop and demonstrate
data-grid technology as the next dominant form of global IT infrastructure. Several projects
within the US, as well as in other countries (especially in the European Union), are developing and
promoting grid technology. PPDG plays a unique role in stressing these technologies and
facilitating their early adoption by the experiments, which use them to provide production
services to their scientists. Bugs and missing features that rear their ugly heads only at scale
and under load are thereby identified and communicated back to the computer scientists.
Supported by SciDAC, DOE’s program for Scientific Discovery through Advanced Computing, PPDG
is vertically integrating advances in computer science (such as novel mechanisms and policies),
grid middleware, experiment-specific applications, and computing, storage, and network resources
to bring effective end-to-end capabilities to the scientist’s desktop.
Part of the PPDG mission is to facilitate the integration effort between several groups and
projects, each with its own set of timescales and driving forces. An effective approach to
this integration is to identify a specific piece of middleware to be deployed in each physics
collaboration that is likely to bring short-term benefits. Teams from the computer science and
physics groups then carry out specific development and integration tasks that deliver end-to-end capabilities.
Three such accomplishments of PPDG were highlighted in news briefings during the first
year of the project:
- Sustained transatlantic transfer of data (from D0, a major Fermilab physics experiment)
authenticated with internationally trusted digital IDs. [2]
- The first production simulation processing in a grid environment achieved in the US by
US scientists preparing for the CMS experiment at the CERN Laboratory in Switzerland. [3]
- Robust coast-to-coast terabyte-scale file replication achieved by the STAR experiment
studying quark-gluon-plasma physics at Brookhaven. [4]
Additional accomplishments include: US scientists preparing the ATLAS experiment
at CERN have integrated their data challenge application across a grid based on US middleware in
the US and on European middleware in Europe; BaBar has established managed data transfer from the
SLAC laboratory in California to Lyon, France, using the Storage Resource Broker, SRB; and the TJNAF
laboratory has established remote data access as a grid web service.
In keeping with the goal of achieving end-to-end production services, PPDG recognized a missing
connection with site computer-security personnel and so established a working group of the DOE
laboratory site security teams to identify issues and recommend improvements to the grid security
infrastructure.[5] Collaboration with DOE Science Grid has helped establish the public-key security
services for the US physics grid community and a trust relationship with European grid resources.
Each computer science group in PPDG has made improvements and extensions to its designs to meet
needs uncovered by PPDG. These improvements are already benefiting numerous other efforts within
DOE’s SciDAC program. To quote PPDG computer scientists: "Almost every new feature required
by Particle Physics turns out to be generic!"
PPDG collaborates on other joint activities with the NSF-funded Grid Physics Network, GriPhyN, and
International Virtual Data-Grid Laboratory, iVDGL, projects under the umbrella name Trillium.
In the next year, all of the experiment groups in PPDG are planning to roll out grid-enabled
job scheduling services. Many challenging areas of work require further research and development
to meet our goals: interactive science analysis services; troubleshooting the end-to-end functionality
and performance of grid applications; and multi-organization authorization and service provision.
1 More about PPDG participants at www.ppdg.net.
2 www.ppdg.net/docs/news/news-item-20feb02.pdf
3 www.ppdg.net/docs/news/news-update-cmstestgrid-17may02.pdf
4 www.ppdg.net/docs/news/news-25sep02.pdf
5 www.ppdg.net/pa/ppdg-pa/siteaa/