“Grids and eScience”

Mark Hayes Technical Director - Cambridge eScience Centre

GEFD Summer School 2004

Outline of this talk

1. What is “eScience”?! (as opposed to just plain science)
2. A brief history of the Internet
3. Some examples of successful eScience
4. Environmental eScience
5. Where all this is heading…

eScience - a definition

“eScience is about global collaboration in key areas of science and the next generation of infrastructure that will enable it.”

Dr. John Taylor, Director General of the Research Councils 1998-2003

eScience - my definition

“eScience is research into new ways of using the Internet to do science.”

In the beginning…

The computer as a communication device

“The collection of people, hardware, and software... will become a node in a geographically distributed computer network…. Through the network... all the large computers can communicate with one another. And through them, all the members of the community can communicate with other people, with programs, with data, or with a selected combination of those resources.”

J.C.R. Licklider, “The Computer as a Communication Device”, Science and Technology, April 1968

The ARPAnet in 1970

A brief history of the Internet

1962 – Paul Baran of RAND invents packet switched networking
1968 – Licklider’s vision
1969 – ARPAnet goes online
1973 – Bob Kahn & Vint Cerf invent TCP/IP
1979 – Usenet & MUDs invented
1983 – TCP/IP established as a standard
1987 – number of hosts > 10,000
1989 – number of hosts > 100,000
1989 – Tim Berners-Lee invents the World Wide Web
1992 – number of hosts > 1,000,000

http://www.isoc.org/internet/history/

International connectivity - 1991

International connectivity - 1997

International bandwidth

From “3D geographic network displays”, Cox et al., ACM SIGMOD Record, December 1996

What does the Internet look like?

http://www.cybergeography.org/

Using the Internet to do science

• Online publication of papers and pre-prints, e.g. http://www.arxiv.org and http://www.pubmedcentral.org/
• CPU cycle scavenging, e.g. SETI@home, climateprediction.net
• The Human Genome Project: free access to data
• Sloan Digital Sky Survey: online database of astronomical data, http://www.sdss.org/

Early distributed computing

1.2 million CPU years so far...

• Brute force attempt to crack strong encryption
• Protein folding
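The cycle-scavenging idea behind these projects can be sketched as a task farm: a coordinator cuts a large search into independent work units, and volunteer machines each take a unit, process it and report back. The Python below is only a toy, single-process illustration of that pattern applied to a brute-force key search. The names (`make_work_units`, `search_unit`, `run_farm`), the 16-bit keyspace and the use of SHA-256 as a stand-in "cipher" are all assumptions made for the example, not how any of the projects named here is actually implemented.

```python
import hashlib
from collections import deque

KEY_BITS = 16      # toy keyspace (hypothetical); real challenges use far larger keys
UNIT_SIZE = 4096   # number of candidate keys per work unit (hypothetical)


def make_work_units(key_bits, unit_size):
    """Split the keyspace [0, 2**key_bits) into independent half-open ranges."""
    total = 1 << key_bits
    return deque((start, min(start + unit_size, total))
                 for start in range(0, total, unit_size))


def fingerprint(key):
    """Stand-in for 'encrypt a known plaintext with this key'."""
    return hashlib.sha256(key.to_bytes(4, "big")).hexdigest()


def search_unit(unit, target):
    """What one volunteer does: try every key in its assigned range."""
    start, stop = unit
    for key in range(start, stop):
        if fingerprint(key) == target:
            return key
    return None


def run_farm(target, n_volunteers=3):
    """Hand work units to volunteers round-robin until the key is found."""
    queue = make_work_units(KEY_BITS, UNIT_SIZE)
    units_done = [0] * n_volunteers
    volunteer = 0
    while queue:
        unit = queue.popleft()
        result = search_unit(unit, target)
        units_done[volunteer] += 1
        if result is not None:
            return result, units_done
        volunteer = (volunteer + 1) % n_volunteers
    return None, units_done


secret = 0xBEEF  # the key the farm is trying to recover
found, units_done = run_farm(fingerprint(secret))
print("key recovered:", found == secret, "| units per volunteer:", units_done)
```

Because the work units do not depend on one another, adding volunteers shortens the search without any communication between them, which is what makes this kind of problem a good fit for spare desktop cycles.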

SETI@home

The world’s most powerful distributed super-computer delivered 65 Tflop/s yesterday (Earth Simulator is 35 Tflop/s)

Latest stats, 14th September 2004
http://setiathome.ssl.berkeley.edu/totals.html

                             Total                        Last 24 hours
Users                        5,170,918                    1,934
Results received             1.5x10^9                     1.4x10^6
Total CPU time               2x10^6 years                 1,115 years
Floating point operations    5.6x10^21 (5.6 zetta ops)    5.6x10^18 per day (65 Tflop/s)
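As a quick check on the SETI@home figures above, the daily operation count can be converted into a sustained rate, and the totals into an average cost per result. This is a minimal sketch that uses only the numbers quoted on the slide. The variable and function names are made up for the example, and the year length of 365.25 days is an assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_YEAR = 365.25 * 24  # assumed year length


def sustained_rate_tflops(ops_per_day):
    """Convert floating point operations per day into Tflop/s."""
    return ops_per_day / SECONDS_PER_DAY / 1e12


# Figures quoted on the slide (14th September 2004)
ops_last_24h = 5.6e18          # floating point operations in the last 24 hours
total_cpu_years = 2e6          # total CPU time donated
total_results = 1.5e9          # total results received
earth_simulator_tflops = 35.0  # comparison figure from the slide

rate = sustained_rate_tflops(ops_last_24h)
cpu_hours_per_result = total_cpu_years * HOURS_PER_YEAR / total_results

print(f"sustained rate: {rate:.1f} Tflop/s")
print(f"ratio to Earth Simulator: {rate / earth_simulator_tflops:.2f}")
print(f"average CPU time per result: {cpu_hours_per_result:.1f} hours")
```

The rate comes out at just under 65 Tflop/s, which matches the headline figure, and each returned result represents roughly half a day of donated CPU time.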

It’s not just compute cycles...

An exponential growth in data from many areas of science.

Human genome project

1995-2003: five institutions sequenced the bulk of the human genome, depositing raw data on public FTP servers within 24 hours of sequencing. Three copies of the data are mirrored in the UK, US and Japan.

Annotating the data is an ongoing worldwide collaborative effort. See e.g. http://www.biodas.org/ and http://www.ensembl.org/

For more on the human genome project: http://www.sanger.ac.uk/HGP/overview.shtml

http://www.genome.gov/

Environmental eScience

http://www.climateprediction.net

http://ndg.badc.rl.ac.uk/ (NERC DataGrid) http://www.earthsystemgrid.org/

Where all this is heading

New science, carried out by “virtual organisations” enabled by the Internet.

VO = distributed data, compute resources, people

Technology:
• Globus - http://www.globus.org/
• Condor - http://www.cs.wisc.edu/condor/
• Access Grid - http://www.accessgrid.org/

The Access Grid

High-end video conferencing and collaboration technology.

O(100) nodes worldwide.

[Figure: Access Grid node layout, labelled with presenter mic, presenter camera, ambient mic (tabletop) and audience camera]

“...one of the most compelling glimpses into the future I’ve seen since I first saw NCSA Mosaic.”

Larry Smarr

Real-time “what if” scenarios

• An explosion!

• A dangerous chemical escapes!

• Where is the pollutant headed?

• Who needs to be evacuated?

The gViz project, Ken Brodlie et al., Leeds University
http://www.visualization.leeds.ac.uk/gViz/
http://www.allhands.org.uk/proceedings/papers/67.pdf

Coupled models

• flexibly couple together “component” models to form a unified Earth System Model (ESM),
• execute the resulting ESM across a computational Grid,
• share the distributed data produced by simulation runs, and
• provide high-level open access to the system, creating and supporting virtual organisations of Earth System modellers.

The GENIE project, Paul Valdes et al http://www.genie.ac.uk/
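To make the idea of coupling “component” models concrete, here is a deliberately tiny Python sketch: two box models that only ever see each other's boundary temperature, advanced in lockstep by a coupler. It is a toy under stated assumptions and not the GENIE framework. The class `Box`, the function `run_coupled`, the heat capacities and the exchange coefficient are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A toy component model: a single well-mixed box with one temperature."""
    name: str
    temperature: float
    heat_capacity: float
    exchange_coeff: float = 1.0  # same value in both boxes, so heat is conserved

    def step(self, partner_temperature, dt):
        # Heat flows towards this box when the partner is warmer.
        flux = self.exchange_coeff * (partner_temperature - self.temperature)
        self.temperature += dt * flux / self.heat_capacity


def run_coupled(a, b, n_steps, dt):
    """Coupler: exchange boundary values first, then advance both components."""
    for _ in range(n_steps):
        temp_a, temp_b = a.temperature, b.temperature  # snapshot before stepping
        a.step(temp_b, dt)
        b.step(temp_a, dt)


atmosphere = Box("atmosphere", temperature=10.0, heat_capacity=1.0)
ocean = Box("ocean", temperature=4.0, heat_capacity=50.0)

heat_before = sum(c.heat_capacity * c.temperature for c in (atmosphere, ocean))
run_coupled(atmosphere, ocean, n_steps=2000, dt=0.1)
heat_after = sum(c.heat_capacity * c.temperature for c in (atmosphere, ocean))

print(f"atmosphere: {atmosphere.temperature:.3f}, ocean: {ocean.temperature:.3f}")
print(f"total heat before/after: {heat_before:.3f} / {heat_after:.3f}")
```

The design point is that each component exposes only a `step` method and a boundary value, so one component could be swapped for a more detailed model, or run on a different machine, without changing the other.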

How you can get involved...

• NIEeS - http://www.niees.ac.uk/
• National eScience Centre (Edinburgh) - http://www.nesc.ac.uk/
• Your local eScience Centre