[tt] CHE: Data Deluge From Collider Prompts Next Big Information Revolution
Premise Checker
<checker at panix.com> on
Sun Sep 14 11:17:49 CEST 2008
Data Deluge From Collider Prompts Next Big Information Revolution
The Chronicle of Higher Education, 8.9.12
http://chronicle.com/weekly/v55/i03/03a01501.htm
By RICHARD MONASTERSKY
When the Large Hadron Collider revs up to full capacity near Geneva, it
will generate about 15 million gigabytes of data each year -- enough to
fill a stack of DVDs more than two miles high.
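
The DVD comparison can be checked with a back-of-the-envelope calculation. The sketch below takes the 15 million gigabytes from the article; the disc capacity (4.7 GB, single-layer) and disc thickness (1.2 mm) are assumptions not stated in the article, and the function name is purely illustrative.

```python
# Rough check of the "stack of DVDs" comparison.
# From the article: about 15 million gigabytes of data per year.
# Assumed (not in the article): single-layer 4.7 GB discs, 1.2 mm thick.
ANNUAL_DATA_GB = 15_000_000
DVD_CAPACITY_GB = 4.7
DVD_THICKNESS_MM = 1.2
METERS_PER_MILE = 1609.344


def dvd_stack_height_miles(data_gb: float,
                           capacity_gb: float = DVD_CAPACITY_GB,
                           thickness_mm: float = DVD_THICKNESS_MM) -> float:
    """Height in miles of a stack of bare discs holding data_gb gigabytes."""
    discs = data_gb / capacity_gb            # number of discs needed
    height_m = discs * thickness_mm / 1000   # millimeters to meters
    return height_m / METERS_PER_MILE


if __name__ == "__main__":
    print(f"{dvd_stack_height_miles(ANNUAL_DATA_GB):.2f} miles")
```

Under those assumptions the stack comes to a little under 2.4 miles, consistent with "more than two miles high."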
So much information will be pouring out that it will equal about 1 percent
of the total data produced each year throughout the world, says François
Grey, head of communications for information technology at CERN, the
European particle-physics laboratory where the collider is located.
The collider project will need to sort and store every single bit and then
make it all available to physicists on every continent except Antarctica.
To meet this grand challenge, CERN has built up the LHC Computing Grid to
handle the data and provide access for the 7,000 scientists from 500
universities and laboratories around the world who are participating in
the experiment.
Often called the Grid, the distributed computing network will eventually
link up 100,000 processors. About 20 percent of those CPUs sit in long
rows of racks at CERN, with the rest spread around the globe at national
labs and universities.
The computing facilities are distributed like the branches of a tree, with
CERN as the main trunk, or Tier 0. It sends copies of all of the collider
data to 11 major limbs called Tier 1 facilities.
The United States has two of these, at Brookhaven National Laboratory and
at Fermi National Accelerator Laboratory, which each serve one of the
major teams of researchers involved in the collider project.
The bulk of the computing power is spread out beyond these limbs, among
250 smaller branches called Tier 2 centers.
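
The tree described above can be written down as a small data structure. The sketch below uses only the figures quoted in the article (one Tier 0, 11 Tier 1 facilities, 250 Tier 2 centers, 100,000 processors, about 20 percent at CERN). The class and function names are hypothetical, and the article does not say how the remaining processors are divided among individual sites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    level: int
    sites: int
    role: str


# Figures as quoted in the article; role descriptions are paraphrases.
TIERS = [
    Tier(0, 1, "CERN, the main trunk; sends out copies of the collider data"),
    Tier(1, 11, "major limbs; receive copies of the data from Tier 0"),
    Tier(2, 250, "smaller branches; hold the bulk of the computing power"),
]

TOTAL_PROCESSORS = 100_000   # eventual size of the Grid
CERN_SHARE_PERCENT = 20      # "about 20 percent" sit at CERN


def processors_at_cern() -> int:
    return TOTAL_PROCESSORS * CERN_SHARE_PERCENT // 100


def processors_elsewhere() -> int:
    return TOTAL_PROCESSORS - processors_at_cern()


def total_sites() -> int:
    return sum(t.sites for t in TIERS)


if __name__ == "__main__":
    print("sites in the tree:", total_sites())
    print("processors at CERN:", processors_at_cern())
    print("processors elsewhere:", processors_elsewhere())
```

On these numbers, roughly 20,000 processors sit at CERN and roughly 80,000 are spread across the 261 Tier 1 and Tier 2 sites.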
The University of Texas at Arlington is the lead institution for one of
the Tier 2 centers in the United States. The university has devoted 1,000
processors and 500,000 gigabytes of storage to the project, says Kaushik
De, a professor of physics there and the center's coordinator.
When a physicist at a university wants to analyze some collider data, she
submits her job through her computer at her institution. The LHC Grid
software then goes out looking for the data, the programs, and the
computing power she needs for the job.
The request might land at a local Tier 2 facility or it might travel to a
Tier 1 halfway around the world. Once the available processors have
finished the analysis, the Grid sends back the results to her own
computer. "The best analogy for the Grid is a farming cooperative," says
Mr. Grey. "By sharing resources, we can use them more efficiently."
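
The job flow described above (find a site that has the data and the spare processors, run the analysis there, return the results) can be illustrated with a toy scheduler. This is a minimal sketch and not the actual LHC Grid software: the Site and Job classes, the site and dataset names, and the "most free processors wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Site:
    name: str                 # hypothetical site label
    tier: int                 # 1 or 2
    free_cpus: int            # processors currently idle
    datasets: Set[str] = field(default_factory=set)


@dataclass
class Job:
    dataset: str              # data the physicist wants to analyze
    cpus_needed: int


def pick_site(job: Job, sites: List[Site]) -> Optional[Site]:
    """Return a site holding the dataset with enough idle processors.

    Assumed policy: among eligible sites, take the one with the most
    idle processors. Returns None if no site qualifies.
    """
    eligible = [s for s in sites
                if job.dataset in s.datasets and s.free_cpus >= job.cpus_needed]
    if not eligible:
        return None
    return max(eligible, key=lambda s: s.free_cpus)


def run_job(job: Job, sites: List[Site]) -> str:
    """Reserve processors, 'run' the analysis, release them, report back."""
    site = pick_site(job, sites)
    if site is None:
        return "queued: no site has both the data and enough free processors"
    site.free_cpus -= job.cpus_needed   # reserve
    # ... the analysis itself would run here ...
    site.free_cpus += job.cpus_needed   # release
    return f"results from {site.name} (Tier {site.tier})"


if __name__ == "__main__":
    grid = [
        Site("tier2-local", 2, 50, {"run-A"}),
        Site("tier1-remote", 1, 400, {"run-A", "run-B"}),
    ]
    print(run_job(Job("run-A", 100), grid))
```

In this toy setup a 100-processor job on "run-A" cannot fit at the local Tier 2 site, so it is sent to the remote Tier 1 site, mirroring the article's point that a request may travel halfway around the world.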
Unlike the World Wide Web, which was developed at CERN, the idea of a grid
for distributed computing was conceived by researchers in the United
States in the 1990s. The fields of astronomy, biomedicine, and earth
sciences are already using computing grids, as are companies like IBM and
Hewlett-Packard.
But the LHC Grid will be the biggest test of this strategy yet, says Mr.
Grey. "It's really putting the grid into practice."