Dissemination of simulations in the Virtual Observatory Gerard Lemson

German Astrophysical Virtual Observatory, Max-Planck Institute for extraterrestrial physics
Overview
• The Virtual Observatory
• Theory/simulations in the VObs
• Case study: Millennium database
– Storing trees in a relational database
• Virtual telescope prototypes
• Outlook
Virtual Observatory I
• Broad goal
– Make results of astronomical research, data and
applications, more readily available to larger
community, and create value-adding services.
(Alex’s talk yesterday)
• Facilitate, for results:
– communication
– checking
– (re)use
– comparison
– combination
Combination: a multi-wavelength view of a galaxy merger
[Figure: radio, optical and X-ray images. Credits: John Hibbard, http://www.cv.nrao.edu/~jhibbard/n4038/n4038.html; NASA/CXC/SAO/G. Fabbiano et al.]
The problem
[Figure: surveys FIRST, ROSAT, 2MASS, GAIA, SDSS]
Work on a solution
[Figure: surveys FIRST, ROSAT, 2MASS, GAIA, SDSS]
Virtual Observatory II
• Approach:
– online availability of datasets and applications
– standardized publication and discovery mechanisms
– standardized description through common
(meta-)data models
– standardized selection mechanisms
– standardized formats for transmitted data
– value added services
– introduce new technologies
– find clever algorithms
• Organized in International VO Alliance (IVOA)
Observations in the VO
• Most VO efforts concentrate on observational
data sets
– simple observables: photons detected at a certain
time from a certain area on the sky
– long history of archiving
– pre-existing standards (FITS)
– valuable over long time (digitising 80 yr old plates)
• Standards observationally biased
– common sky: cone search, SIAP, region
– common objects: XMatch
– data models: characterisation of sky/time/energy (no polarisation yet)
Theory in the VO: issues
• Simulations not so simple
– complex observables
– no standardisation (not even HDF5)
– archiving ad hoc, for local use
• Moore’s law makes useful lifetime relatively
short: few years later can do better
• Current IVOA standards somewhat irrelevant
– no common sky
– no common objects
– requires data models for content, physics, code
“Moore’s law” for N-body simulations
Courtesy Simon White
History of simulations
Toomre & Toomre, 1972
Courtesy Volker Springel
Di Matteo, Springel and Hernquist, 2005
So why bother publishing simulations?
• Simulations are interesting:
– For many cases only way to see processes in action
– Complex observations require sophisticated models
for interpretation
• Bridging gap in specialisations: not everyone
has required expertise to create simulations,
though they can analyse them.
• Many use cases do not require the
latest/greatest
– exposure time calculator
– survey design
A possible formation scenario
Courtesy Volker Springel
Detailed observations
[Figure panels: electron density, gas pressure, gas temperature]
Courtesy Alexis Finoguenov, Ulrich Briel, Peter Schuecker, (MPE)
Detailed predictions
Courtesy Volker Springel
Case study: Simulations in a
relational database
• Goal: investigate the use of RDB and web
services for disseminating results of
cosmological N-body simulations.
• Why a database?
– encapsulation of data in terms of logical structure, no
need to know about internals of data storage
– standard query language for finding information
– advanced query optimizers
– forces one to think carefully about data structure
– speeds up path from science question to answer
– facilitates communication
– new ways of thinking about results
– links to other efforts (Sloan SkyServer)
The Virgo consortium’s
Millennium simulation
• Millennium simulation
– 10 billion particles, dark matter only
– 500 Mpc (~2Gly) periodic box
– “concordance model” (as of 2004) initial conditions
– 64 snapshots
– 350000 CPU hours
– O(30 TB) raw + post-processed data
• Postprocessing:
– dark matter density fields smoothed at various scales (45 * 256^3 grid cells)
– dark matter cluster merger trees (~750 million)
– galaxy merger trees (~1 billion/catalogue)
• De Lucia & Blaizot, 2006
• Bower et al, 2006
Evolution
Dark matter and galaxies
Halos and galaxies
the Millennium database + web server
• Post-processing results only
• SQLServer database
– MPA: 2000, soon + 2005
– Durham: 2005
• Web application (Java in Apache Tomcat web server)
– portal: http://www.mpa-garching.mpg.de/millennium/
– public DB access: http://www.g-vo.org/Millennium
– private access: http://www.g-vo.org/MyMillennium
– MyDB
• Access methods
– browser with plotting capabilities through VOPlot applet
– wget + IDL, R
– TOPCAT plugin
Database design: “20 queries”
1. Return the galaxies residing in halos of mass between 10^13 and 10^14 solar masses.
2. Return the galaxy content at z=3 of the progenitors of a halo identified at z=0.
3. Return the complete halo merger tree for a halo identified at z=0.
4. Find properties of all galaxies in halos of mass 10^14 at redshift 1 which have had a major merger (mass-ratio < 4:1) since redshift 1.5.
5. Find all the z=3 progenitors of z=0 red ellipticals (i.e. B-V > 0.8, B/T > 0.5).
6. Find the descendants at z=1 of all LBGs (i.e. galaxies with SFR > 10 Msun/yr) at z=3.
7. Find all z=3 galaxies which have NO z=0 descendant.
8. Return all the galaxies within a sphere of radius 3 Mpc around a particular halo.
9. Find all the z=2 galaxies which were within 1 Mpc of an LBG (i.e. SFR > 10 Msun/yr) at some previous redshift.
10. Find the multiplicity function of halos depending on their environment (overdensity of density field smoothed on certain scale).
11. Find the dependency of halo formation times on environment.
Time evolution: merger trees
Efficient storage of trees in a
relational database
• Goal: allow queries for the formation
history of any object
• No recursion possible, or desired
• Method:
– depth first ordering of trees
– label by rank in order
– pointer to “last progenitor” below each node
– all progenitors have label BETWEEN label of
root AND that of last progenitor
– cluster table on label
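The labelling method above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the Millennium database: the function name `label_depth_first`, the input format (a mapping from each node to its descendant) and the choice to visit progenitors in listed order are all assumptions made for the example.

```python
from collections import defaultdict


def label_depth_first(descendant_of):
    """Label a merger forest in depth-first order.

    descendant_of maps a node name to the name of its descendant
    (None for a root). Returns {name: (galaxy_id, last_progenitor_id)}.
    The input format is hypothetical, chosen only for this sketch.
    """
    progenitors = defaultdict(list)
    roots = []
    for node, desc in descendant_of.items():
        if desc is None:
            roots.append(node)
        else:
            progenitors[desc].append(node)

    labels = {}
    next_id = 0
    for root in roots:
        # Iterative walk, so very deep trees do not hit the recursion limit.
        stack = [(root, False)]
        while stack:
            node, subtree_done = stack.pop()
            if subtree_done:
                # Every progenitor below this node is labelled now, so the
                # last id handed out is this node's "last progenitor".
                labels[node] = (labels[node][0], next_id - 1)
            else:
                labels[node] = (next_id, None)
                next_id += 1
                stack.append((node, True))
                # Reversed, so the first-listed progenitor is visited first.
                for prog in reversed(progenitors[node]):
                    stack.append((prog, False))
    return labels


# Toy tree: B and C merge into A; D and E merge into B.
toy = {"A": None, "B": "A", "C": "A", "D": "B", "E": "B"}
toy_labels = label_depth_first(toy)
```

For the toy tree, A receives (0, 4), B (1, 3), D (2, 2), E (3, 3) and C (4, 4): every progenitor of a node has a label between the node's own label and its last-progenitor label, and a leaf has both labels equal.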
Merger trees:
select prog.*
  from galaxies des
     , galaxies prog
 where des.galaxyId = 0
   and prog.galaxyId between des.galaxyId and des.lastProgenitorId
Leaves:
select galaxyId as leaf
  from galaxies des
 where galaxyId = lastProgenitorId
Branching points :
select descendantId
from galaxies des
where descendantId != -1
group by descendantId
having count(*) > 1
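The three queries above can be tried on a toy tree with Python's built-in sqlite3 module. This is only a sketch: the real database runs on SQL Server, the five-row `galaxies` table is invented for the example, and only the three columns named on the slides are modelled.

```python
import sqlite3

# Toy tree in depth-first order: (galaxyId, lastProgenitorId, descendantId).
# Galaxies 1 and 4 merge into the root 0; galaxies 2 and 3 merge into 1.
ROWS = [(0, 4, -1), (1, 3, 0), (2, 2, 1), (3, 3, 1), (4, 4, 0)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table galaxies ("
    " galaxyId integer primary key,"
    " lastProgenitorId integer,"
    " descendantId integer)"
)
conn.executemany("insert into galaxies values (?, ?, ?)", ROWS)


def progenitors(root_id):
    """All galaxies in the tree below root_id (the root itself included)."""
    sql = """
        select prog.galaxyId
          from galaxies des, galaxies prog
         where des.galaxyId = ?
           and prog.galaxyId between des.galaxyId and des.lastProgenitorId
         order by prog.galaxyId
    """
    return [row[0] for row in conn.execute(sql, (root_id,))]


def leaves():
    """Galaxies without progenitors: their own id is their last progenitor."""
    sql = """
        select galaxyId
          from galaxies
         where galaxyId = lastProgenitorId
         order by galaxyId
    """
    return [row[0] for row in conn.execute(sql)]


def branching_points():
    """Galaxies with more than one direct progenitor, i.e. mergers."""
    sql = """
        select descendantId
          from galaxies
         where descendantId != -1
         group by descendantId
        having count(*) > 1
         order by descendantId
    """
    return [row[0] for row in conn.execute(sql)]
```

On this toy table `progenitors(1)` returns `[1, 2, 3]`, `leaves()` returns `[2, 3, 4]` and `branching_points()` returns `[0, 1]`. The progenitor lookup is a plain range condition on the primary key, which is why no recursion is needed.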
Main branches
• Roots and leaves:
select des.galaxyId as rootId
     , min(prog.lastProgenitorId) as leafId
  into rootLeaf
  from galaxies des
     , galaxies prog
 where des.galaxyId = 0
   and prog.galaxyId between des.galaxyId and des.lastProgenitorId
 group by des.galaxyId
• Main branch:
select rl.rootId, b.*
  from rootLeaf rl
     , galaxies b
 where b.galaxyId between rl.rootId and rl.leafId
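The main-branch lookup can be checked on a toy tree as well. The sketch below uses Python's sqlite3 module and swaps the SQL Server "select ... into rootLeaf" step for a common table expression, since SQLite has no "select into"; the table contents and the helper names `make_db` and `main_branch` are made up for the illustration.

```python
import sqlite3

# Toy tree in depth-first order: (galaxyId, lastProgenitorId, descendantId).
# Galaxies 1 and 4 merge into the root 0; galaxies 2 and 3 merge into 1.
ROWS = [(0, 4, -1), (1, 3, 0), (2, 2, 1), (3, 3, 1), (4, 4, 0)]


def make_db():
    """Create an in-memory galaxies table holding the toy tree."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table galaxies ("
        " galaxyId integer primary key,"
        " lastProgenitorId integer,"
        " descendantId integer)"
    )
    conn.executemany("insert into galaxies values (?, ?, ?)", ROWS)
    return conn


def main_branch(conn, root_id):
    """Ids on the main branch of root_id, from the root down to its first leaf.

    In depth-first order the first leaf reached from the root has the
    smallest lastProgenitorId in the tree, so the main branch is the id
    range [rootId, leafId].
    """
    sql = """
        with rootLeaf as (
            select des.galaxyId as rootId,
                   min(prog.lastProgenitorId) as leafId
              from galaxies des, galaxies prog
             where des.galaxyId = ?
               and prog.galaxyId between des.galaxyId and des.lastProgenitorId
             group by des.galaxyId
        )
        select b.galaxyId
          from rootLeaf rl, galaxies b
         where b.galaxyId between rl.rootId and rl.leafId
         order by b.galaxyId
    """
    return [row[0] for row in conn.execute(sql, (root_id,))]
```

With the toy data, `main_branch(make_db(), 0)` returns `[0, 1, 2]`: the root, its first-listed progenitor, and that progenitor's first-listed progenitor. Which progenitor counts as "first" is fixed when the tree is labelled, not by the query.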
More database design features
• Spatial indices
– Peano-Hilbert index links to field (256^3)
– Z-curve index (bit interleaved, 256^3)
• SQLServer2005 CLR integration with C# for range queries
– Zone index (ix/iy/iz, 50^3)
select *
  from galaxies
 where snapnum = 63
   and ix = 1 and iy = 5 and iz = 20
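To make the two simpler index types concrete, here is a small Python sketch of a bit-interleaved Z-curve key on a 256^3 grid and of a 50^3 zone index. The bit order (x in the lowest bit), the box length of 500 in the same units as the positions, and the equal-width zones are assumptions for the illustration; the slides do not specify how the database computes these columns.

```python
def z_curve_key(ix, iy, iz, bits=8):
    """Interleave the bits of three cell indices into one Z-curve key.

    With bits=8 each index runs over 0..255, matching a 256^3 grid.
    Assumed bit order: x lowest, then y, then z.
    """
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def zone_index(x, y, z, box_size=500.0, zones=50):
    """Map a position in a periodic box to integer zone indices (ix, iy, iz)."""
    cell = box_size / zones
    # Wrap into the box, then clamp to guard against floating-point edge cases.
    return tuple(
        min(zones - 1, int((c % box_size) // cell)) for c in (x, y, z)
    )
```

Under these assumptions a galaxy at position (15.0, 55.0, 205.0) falls in zone (1, 5, 20), the cell selected by the query above. Equality conditions on three small integers are easy for a standard index to serve, whereas range queries on the Z-curve key need extra logic, which is where the CLR integration comes in.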
• Random sampling
select *
  from galaxies
 where snapnum = 63
   and random between 1000 and 2000
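The idea behind the `random` column can be sketched as follows: each row stores a random integer drawn once at load time, so a range condition on that column picks a reproducible random subsample of predictable size. The value range (0 to 999999), the index and the helper names below are assumptions for the illustration, not the actual database layout.

```python
import random
import sqlite3


def build_table(n_rows, max_random=1_000_000, seed=42):
    """Create a toy galaxies table with a precomputed random column."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table galaxies ("
        " galaxyId integer primary key,"
        " snapnum integer,"
        ' "random" integer)'
    )
    # An index on the sampling column lets the range condition use a seek.
    conn.execute('create index ix_random on galaxies (snapnum, "random")')
    conn.executemany(
        "insert into galaxies values (?, ?, ?)",
        ((i, 63, rng.randrange(max_random)) for i in range(n_rows)),
    )
    return conn


def sample(conn, lo, hi):
    """Return (galaxyId, random) pairs whose random value lies in [lo, hi]."""
    sql = """
        select galaxyId, "random"
          from galaxies
         where snapnum = 63
           and "random" between ? and ?
    """
    return list(conn.execute(sql, (lo, hi)))
```

With values uniform on 0 to 999999, the range 1000 to 2000 selects about 0.1% of the rows, so `sample(build_table(100_000), 1000, 2000)` returns on the order of 100 galaxies. Repeating the query returns the same subsample, and disjoint ranges give independent ones.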
Under construction
• Batch processing through CAS jobs
• Mock catalogues
– pre-calculated in database
– online MoMaF
• Utilise PCA for storing photometric
predictions
• Tree comparisons: statistics of branch
lengths, node counts; tree edit distance.
Virtual telescopes
• Virtual observations of virtual universe
• Produce data products that are as similar
to observational results as possible:
– images
– spectra
– catalogues
• Include atmosphere and telescope effects
– predict
– analyse: easier to add problems than to
remove them
Prototype examples
• No realistic telescope yet
• Planck simulator
– http://www.g-vo.org/planck
• Mock catalogs through Millennium
– http://www.g-vo.org/mpasims/MoMaf2?
• Hydro simulations of galaxy clusters
– http://www.g-vo.org/hydrosims/
Mock Map Making Facility
Blaizot, J. et al
Mon.Not.Roy.Astron.Soc. 360 (2005) 159-175
Conclusions and outlook
• Simulation data valuable addition to VObs
• Especially with interfaces similar to observational ones
• IVOA theory interest group standards under
development: SNAP, Semantics, Simulation data model
• Virtual telescopes provide perfect use case for testing
VObs ideas:
– requires very different specialisations
– not co-located: needs distributed treatment
– requires standards for data structure and service APIs, as well
as models linking observations and theory
– high performance computational infrastructure for scientifically
meaningful results
• Distributed virtual telescope configuration
Acknowledgments
• Virgo consortium, in particular:
– Volker Springel, Simon White, Gabriella De Lucia, Jeremy Blaizot (MPA, Munich, Germany),
– Carlos Frenk, Richard Bower, John Helly (ICC, Durham, UK)
• Alex Szalay, Jan van den Berg (JHU)
• GAVO is funded by the German Federal Ministry for
Education and Research
Relevant references and links
• Springel, V., et al. (2005), Simulations of the formation, evolution and clustering of galaxies and quasars, Nature, 435, 629
• Lemson, G. and the Virgo Consortium (2006), Halo and Galaxy Formation Histories from the Millennium Simulation: Public release of a VO-oriented and SQL-queryable database for studying the evolution of galaxies in the LCDM cosmogony, http://xxx.lanl.gov/format/astro-ph/0608019
• Lemson, G. & Springel, V. (2005), Cosmological Simulations in a Relational Database: Modelling and Storing Merger Trees, ASPC, 351, Astronomical Data Analysis Software and Systems XV, http://aspbooks.org/custom/publications/paper/351-0212.html
• De Lucia, G. & Blaizot, J. (2006), The hierarchical formation of the brightest cluster galaxies, http://xxx.lanl.gov/format/astro-ph/0606519/
• Bower, R. et al. (2006), Breaking the hierarchy of galaxy formation, Mon.Not.Roy.Astron.Soc. 370 645-655
• http://www.mpa-garching.mpg.de/millennium and http://www.g-vo.org/Millennium