“Grids and eScience”
Mark Hayes, Technical Director - Cambridge eScience Centre
GEFD Summer School 2004
Outline of this talk
1. What is “eScience”?! (As opposed to just plain science.)
2. A brief history of the Internet
3. Some examples of successful eScience
4. Environmental eScience
5. Where all this is heading…
eScience - a definition
“eScience is about global collaboration in key areas of science and the next generation of infrastructure that will enable it.”
Dr. John Taylor, Director General of the Research Councils 1998-2003
eScience - my definition
“eScience is research into new ways of using the Internet to do science.”
In the beginning…
The computer as a communication device
“The collection of people, hardware, and software... will become a node in a geographically distributed computer network…. Through the network... all the large computers can communicate with one another. And through them, all the members of the community can communicate with other people, with programs, with data, or with a selected combination of those resources.”
J.C.R. Licklider, “The Computer as a Communication Device”, Science and Technology, April 1968
The ARPAnet in 1970
A brief history of the Internet
1962 – Paul Baran of RAND invents packet-switched networking
1968 – Licklider’s vision
1969 – ARPAnet goes online
1973 – Bob Kahn & Vint Cerf invent TCP/IP
1979 – Usenet & MUDs invented
1983 – TCP/IP established as a standard
1987 – number of hosts > 10,000
1989 – number of hosts > 100,000
1989 – Tim Berners-Lee invents the World Wide Web
1992 – number of hosts > 1,000,000
http://www.isoc.org/internet/history/
International connectivity - 1991
International connectivity - 1997
International bandwidth
From “3D geographic network displays” - Cox et al, ACM Sigmod Record - December 1996
What does the Internet look like?
http://www.cybergeography.org/
Using the Internet to do science
• Online publication of papers and pre-prints, e.g. http://www.arxiv.org http://www.pubmedcentral.org/
• CPU cycle scavenging, e.g. SETI@home, climateprediction.net
• The Human Genome Project: free access to data
• Sloan Digital Sky Survey: online database of astronomical data http://www.sdss.org/
Early distributed computing
1.2 million CPU years so far...
Brute force attempt to crack strong encryption
Protein folding
SETI@home
The world’s most powerful distributed supercomputer delivered 65 Tflop/s yesterday (the Earth Simulator delivers 35 Tflop/s)
Latest stats, 14th September 2004
http://setiathome.ssl.berkeley.edu/totals.html

                 Users        Results received   Total CPU time   Floating point operations
Total            5,170,918    1.5x10^9           2x10^6 years     5.6x10^21 ops (5.6 zetta ops)
Last 24 hours    1,934        1.4x10^6           1,115 years      5.6x10^18 flops/day (65 Tflop/s)
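The 65 Tflop/s rate follows directly from the last-24-hours operation count in the statistics above. A minimal sketch of that arithmetic, using only the figures quoted on the slide; the function and constant names are illustrative, and the 365.25-day year is an assumption:

```python
# Back-of-envelope check of the SETI@home figures quoted on the slide
# (totals page, 14th September 2004). All names here are illustrative.

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25  # assumption: Julian year

OPS_LAST_24H = 5.6e18         # floating point operations in the last 24 hours
CPU_YEARS_LAST_24H = 1115     # CPU time returned in the last 24 hours
EARTH_SIMULATOR_TFLOPS = 35   # comparison figure quoted on the slide


def sustained_tflops(ops_per_day: float) -> float:
    """Convert an operation count per day into a sustained rate in Tflop/s."""
    return ops_per_day / SECONDS_PER_DAY / 1e12


def equivalent_full_time_cpus(cpu_years_per_day: float) -> float:
    """Number of CPUs that must run flat out for one day to deliver that CPU time."""
    return cpu_years_per_day * DAYS_PER_YEAR


if __name__ == "__main__":
    rate = sustained_tflops(OPS_LAST_24H)
    cpus = equivalent_full_time_cpus(CPU_YEARS_LAST_24H)
    print(f"Sustained rate: {rate:.1f} Tflop/s")
    print(f"Ratio to Earth Simulator: {rate / EARTH_SIMULATOR_TFLOPS:.2f}")
    print(f"Equivalent full-time CPUs: {cpus:,.0f}")
```

Run as a script, this gives a rate of roughly 65 Tflop/s, a little under twice the Earth Simulator figure, delivered by the equivalent of about 400,000 CPUs working full time.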
It’s not just compute cycles...
An exponential growth in data from many areas of science.
Human genome project
1995-2003: five institutions sequenced the bulk of the human genome, depositing raw data on public FTP servers within 24 hours of it being sequenced. Three copies of the data are mirrored, in the UK, US & Japan.
Annotating the data is an ongoing world-wide collaborative effort. See e.g.
http://www.biodas.org/
http://www.ensembl.org/
For more on the human genome project:
http://www.sanger.ac.uk/HGP/overview.shtml
http://www.genome.gov/
Environmental eScience
http://www.climateprediction.net
http://ndg.badc.rl.ac.uk/ (NERC DataGrid)
http://www.earthsystemgrid.org/
Where all this is heading
New science, carried out by “virtual organisations” enabled by the Internet.
VO = distributed data, compute resources, people
Technology:
Globus - http://www.globus.org/
Condor - http://www.cs.wisc.edu/condor/
Access Grid - http://www.accessgrid.org/
The Access Grid
High-end video conferencing and collaboration technology.
O(100) nodes worldwide.
Figure labels: presenter mic, presenter camera, ambient mic (tabletop), audience camera
“...one of the most compelling glimpses into the future I’ve seen since I first saw NCSA Mosaic.”
Larry Smarr
Real-time “what if” scenarios
• An explosion!
• A dangerous chemical escapes!
• Where is the pollutant headed?
• Who needs to be evacuated?
The gViz project, Ken Brodlie et al, Leeds University
http://www.visualization.leeds.ac.uk/gViz/
http://www.allhands.org.uk/proceedings/papers/67.pdf
Coupled models
• flexibly couple together “component” models to form a unified Earth System Model (ESM),
• execute the resulting ESM across a computational Grid,
• share the distributed data produced by simulation runs, and
• provide high-level open access to the system, creating and supporting virtual organisations of Earth System modellers.
The GENIE project, Paul Valdes et al http://www.genie.ac.uk/
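To make the idea of coupling “component” models concrete, here is a toy sketch. It is not GENIE code: the `Box` component, the `run_coupled` driver, the bulk exchange law and all parameter values are hypothetical, and nothing here runs on a Grid. It only shows two independent components advanced together while exchanging a flux through a shared interface.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical component model: a well-mixed box with one temperature."""
    name: str
    temperature: float    # K
    heat_capacity: float  # J/K

    def absorb(self, energy: float) -> None:
        """Update the temperature after gaining (or losing) some energy in J."""
        self.temperature += energy / self.heat_capacity


def run_coupled(a: Box, b: Box, exchange_coeff: float, dt: float, steps: int):
    """Advance two components together, exchanging heat at every step.

    The flux from a to b is assumed proportional to the temperature
    difference (a simple bulk exchange law chosen for illustration).
    """
    history = []
    for _ in range(steps):
        flux = exchange_coeff * (a.temperature - b.temperature)  # W, from a to b
        a.absorb(-flux * dt)  # a loses what b gains, so energy is conserved
        b.absorb(flux * dt)
        history.append((a.temperature, b.temperature))
    return history


if __name__ == "__main__":
    atmosphere = Box("atmosphere", temperature=300.0, heat_capacity=1.0e3)
    ocean = Box("ocean", temperature=280.0, heat_capacity=4.0e3)
    run_coupled(atmosphere, ocean, exchange_coeff=10.0, dt=1.0, steps=100)
    print(f"{atmosphere.name}: {atmosphere.temperature:.2f} K")
    print(f"{ocean.name}: {ocean.temperature:.2f} K")
```

Because each component only exposes `absorb`, either one could be swapped for a more detailed model without changing the driver, which is the sense of “flexibly couple” intended in this sketch.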
How you can get involved...
• NIEeS - http://www.niees.ac.uk/
• National eScience Centre (Edinburgh) - http://www.nesc.ac.uk/
• Your local eScience Centre