Transcript Document

Hello!
Virtual
Observatories
Ajit Kembhavi
IUCAA
Pune, India
Data Storage and Retrieval
The Astronomer
Vermeer
1632-1675
The Library of Alexandria
3rd Century BC
The Data Avalanche
Immense amounts of data are being
produced by large telescopes using
large area detectors.
Terabytes of data are now available, and
Petabytes will soon be available from
frequent all sky imaging.
Vast databases are also being
produced through simulations.
Astronomical Data Explosion
~ 100 Gb/night
P. Quinn
Data Explosion
Peter Quinn
Wavelength Coverage
The data spans the electromagnetic
spectrum from the radio to the
gamma-ray region.
Obtaining, analysing and interpreting
the data in different wavebands involves
highly specialised instruments and
techniques.
The astronomer needs new tools for
using this wealth of data in
multiwavelength studies.
Stars in the Milky Way
The Hertzsprung-Russell Diagram
The Alliance
Members of the IVOA
Interactions
Virtual Observatories
• Provide tools for data analysis, visualization and
mining.
• Develop interoperability concepts to make
different databases seamless.
• Manage vast data resources and provide these
online to astronomers and other users.
• Empower astronomers by providing sophisticated
query and computational tools, and computing
grids for producing new science.
IVOA Technology Initiatives
The IVOA has identified six major
technical initiatives to fulfill the scientific
goal of the VO concept.
IVOA-LISTS
REGISTRIES: These collect metadata about data
resources and information services into a
queryable database. The registry is distributed. A
variety of industry standards are being investigated.
DATA MODELS: This initiative aims to define the
common elements of astronomical data structures
and to provide a framework to describe their
relationships.
UNIFORM CONTENT DESCRIPTORS: These
will provide the common language for metadata
definitions for the VO.
DATA ACCESS LAYER: This provides
standardized access mechanisms for distributed data
objects. Initial prototypes are a Cone Search
Protocol and a Simple Image Access Protocol.
VO QUERY LANGUAGE: This will provide a
standard query language which will go beyond the
limitations of SQL.
VOTable: This is an XML mark-up standard for
astronomical tables.
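As a rough illustration of the Cone Search idea mentioned under DATA ACCESS LAYER, the sketch below builds a query URL from a sky position and a search radius. The RA, DEC and SR parameter names (decimal degrees) follow the IVOA Simple Cone Search convention; the service address and the function name are made-up placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode


def cone_search_url(base_url: str, ra_deg: float, dec_deg: float, radius_deg: float) -> str:
    """Build a cone search query URL (RA, DEC, SR in decimal degrees)."""
    if not 0.0 <= ra_deg < 360.0:
        raise ValueError("RA must be in [0, 360) degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must be in [-90, 90] degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must be non-negative")
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    # Append to an existing query string if the base URL already has one.
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{query}"


# Hypothetical service address, for illustration only.
url = cone_search_url("http://example.org/conesearch", 194.15, -2.41, 0.1)
print(url)
```

A real service would answer such a request with a VOTable listing the sources inside the cone.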
Science Initiatives
• Many IVOA projects have active Science Working
Groups consisting of astronomers from a broad
cross-section of the community representing all
wavelengths.
• The focus here is to develop a clear perception of
the scientific requirements of a VO.
• Projects within the working groups will develop
new capabilities for VO based analysis.
• This will enable the community to create new
research programs and to publish their data and
research in a more pervasive and scientifically
useful manner.
Virtual Observatory India
A collaboration between IUCAA
and PSPL,
with a grant from the Ministry
of Communications and
Information Technology
IUCAA
Persistent Systems
Pvt. Ltd., Pune
Virtual Observatory - India
Data Archives and Mirrors at VO-I
SDSS
2Mass
Chandra
2DFGRS
2QZ
FIRST
NVSS
Vizier, Aladin, ADS
Fast Computing
Four alpha server ES-45 nodes, each
with 4 processors, each node with 8
GB RAM
Fast, Low latency interconnect
Memory Channel Architecture
Trucluster clustering environment
(Tru64 Unix, DecMPI, openMP)
VO-India Software Projects
VOPlot Visualizer for catalogue data
VOTable C++ Parser
VOTable Streaming writer
Data Converters
Fits Browser
User interfaces and query tools
Applications beyond astronomy
All tools have web-based and stand alone versions
The VOPlot Collaboration
Visualization and simple
statistics of catalogue data.
Integration with sky atlases.
The VOPlot Tool
A VO-I + CDS collaboration
First conceived as a web-based tool for Vizier
Then integrated with Aladin
VOPlot is now also a stand alone system
It has been integrated with many data bases
Sonali Kale, K.D. Balaji et al.
VOPlot
Colour-magnitude diagram
parallax
Catalog Data Interface Tool
A tool to query catalog data.
Simple, customizable, graphic interface.
Not specific to type of data or catalogue.
SQL queries for expert users.
VO tools available for analysis:
VOPlot, Aladin, VOStat, SIMBAD, NED...
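To give a flavour of the "SQL queries for expert users" option, here is a minimal sketch of a box search with a magnitude cut. It uses Python's built-in sqlite3 module as a stand-in for the actual catalogue server; the table name, column names and sample rows are all invented for the example.

```python
import sqlite3

# In-memory stand-in for a catalogue database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE catalogue (name TEXT, ra REAL, dec REAL, mag REAL)")
conn.executemany(
    "INSERT INTO catalogue VALUES (?, ?, ?, ?)",
    [
        ("obj-1", 10.5, -2.0, 14.2),
        ("obj-2", 10.7, -2.1, 18.9),
        ("obj-3", 200.0, 45.0, 12.1),
    ],
)


def bright_sources_in_box(conn, ra_min, ra_max, dec_min, dec_max, mag_limit):
    """Return rows inside an RA/Dec box that are brighter than mag_limit."""
    sql = (
        "SELECT name, ra, dec, mag FROM catalogue "
        "WHERE ra BETWEEN ? AND ? AND dec BETWEEN ? AND ? AND mag < ? "
        "ORDER BY mag"
    )
    return conn.execute(sql, (ra_min, ra_max, dec_min, dec_max, mag_limit)).fetchall()


rows = bright_sources_in_box(conn, 10.0, 11.0, -3.0, -1.0, 16.0)
print(rows)
```

Passing the limits as bound parameters rather than pasting them into the SQL string keeps a form-driven interface and a direct SQL interface on the same safe code path.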
Data Organization and Architecture
Browse Server Database
Create Views
On-the-fly GUI
Query using a Form
Query using SQL Directly
Results in VOPlot
Results in Aladin
Himalayan Chandra Telescope Data Archives
SDSS J125637-022452
High proper motion L-subdwarf
Optical spectra of mixed late M and mid L type
Only the third L subdwarf known
Positions 1986-2000
Proper motion
0.617 arcsec / yr
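The proper motion quoted above implies a sizeable shift over the span of the position measurements. A quick arithmetic check, assuming the positions span exactly 1986 to 2000 and the motion is uniform (the function name is only for illustration):

```python
def total_displacement_arcsec(mu_arcsec_per_yr: float, start_year: float, end_year: float) -> float:
    """Angular shift accumulated at a constant proper motion."""
    return mu_arcsec_per_yr * (end_year - start_year)


shift = total_displacement_arcsec(0.617, 1986, 2000)
print(f"{shift:.3f} arcsec over {2000 - 1986} years")
```

That is 0.617 x 14, or roughly 8.6 arcsec, which is easy to see when comparing images from the two epochs.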
Thank You
AVO Prototype
Demo
Astrogrid:
Astronomy
Catalogue
Extractor
AVO:
Aladin+SED
VO-India:VOPlot
FITS Manager
View, create and add to FITS files
Convert to other formats
Pallavi Kulkarni
Fits-manager
VOTable Java Streaming Writer
Acts on a data array in memory
to convert it to the VOTable form,
which is streamed row by row
to an output file. Very large
VOTables can be written without
excessive memory.
Pallavi Kulkarni
VOTable-Java
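The row-by-row idea behind the streaming writer can be sketched as follows. This is a Python illustration of the technique only, not the API of the Java writer; the function name and argument layout are assumptions, and the emitted document is a stripped-down TABLEDATA VOTable without the optional metadata a real file would carry.

```python
import io
from xml.sax.saxutils import escape, quoteattr


def write_votable(out, fields, rows):
    """Stream a TABLEDATA VOTable to `out`, one row at a time.

    fields: list of (name, datatype) pairs.
    rows:   any iterable of row tuples (a list, or a generator).
    Returns the number of rows written.
    """
    out.write('<?xml version="1.0"?>\n<VOTABLE>\n<RESOURCE>\n<TABLE>\n')
    for name, datatype in fields:
        out.write(f"<FIELD name={quoteattr(name)} datatype={quoteattr(datatype)}/>\n")
    out.write("<DATA>\n<TABLEDATA>\n")
    count = 0
    for row in rows:
        # Only the current row is turned into XML text before being written.
        cells = "".join(f"<TD>{escape(str(value))}</TD>" for value in row)
        out.write(f"<TR>{cells}</TR>\n")
        count += 1
    out.write("</TABLEDATA>\n</DATA>\n</TABLE>\n</RESOURCE>\n</VOTABLE>\n")
    return count


# Usage: rows come from a generator, so no full XML tree is ever built.
buffer = io.StringIO()
written = write_votable(
    buffer,
    [("id", "int"), ("label", "char")],
    ((i, f"src<{i}>") for i in range(3)),
)
print(buffer.getvalue())
```

Because the XML is emitted as text while iterating, memory use for the output does not grow with the number of rows.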
VOTable
• This is a new data exchange standard
produced through efforts led by
François Ochsenbein of CDS,
Strasbourg and Roy Williams of
Caltech.
• VOTable is in XML format. Physical
quantities come with sophisticated
semantic information.
VOTable
• The format enables computers to easily parse the
information and communicate it to other computers.
• Federation and joining of information become possible
and Grid computing is easier.
• VOTable parsers have been developed in Perl, Java
and C++.
• Enhancements and extensions are being considered.
Streaming Parser
Non-streaming Parser
VOTable Data
The data part in a VOTable may be represented using
one of three different formats:
– FITS : VOTable can be used either to encapsulate FITS
files, or to re-encode the metadata.
– BINARY : Supported for efficiency and ease of
programming, no FITS library is required, and the
streaming paradigm is supported.
– TABLEDATA : Pure XML format for small tables.
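For a feel of the TABLEDATA form, the sketch below parses a tiny hand-written VOTable with Python's standard xml.etree.ElementTree module. The sample document is invented for the example and omits the XML namespace and most optional metadata found in real files; the helper name is an assumption.

```python
import xml.etree.ElementTree as ET

VOTABLE_XML = """<?xml version="1.0"?>
<VOTABLE version="1.0">
  <RESOURCE>
    <TABLE name="demo">
      <FIELD name="RA" datatype="double" unit="deg"/>
      <FIELD name="DEC" datatype="double" unit="deg"/>
      <FIELD name="Name" datatype="char" arraysize="*"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>10.68</TD><TD>41.27</TD><TD>M31</TD></TR>
          <TR><TD>23.46</TD><TD>30.66</TD><TD>M33</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def read_tabledata(xml_text):
    """Return the table rows as dictionaries keyed by FIELD name."""
    root = ET.fromstring(xml_text)
    fields = [(f.get("name"), f.get("datatype")) for f in root.iter("FIELD")]
    records = []
    for tr in root.iter("TR"):
        record = {}
        for (name, datatype), td in zip(fields, tr.findall("TD")):
            # Convert numeric columns; leave everything else as text.
            record[name] = float(td.text) if datatype == "double" else td.text
        records.append(record)
    return records


records = read_tabledata(VOTABLE_XML)
print(records)
```

The FIELD elements carry the metadata (name, type, unit) and the TR/TD elements carry the values, which is why this form suits small tables but becomes bulky for large ones.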
C++ VOTable Parser
Motivation:
– Provide a library for API based access to
VOTable files.
– APIs can be directly used to develop
VOTable applications without having to do
raw VOTable processing.
– Streaming and Non-streaming versions are
available.
Sonali Kale, Sudip Khanna
C++ VOTable Parser
Salient Features:
• Implemented as a wrapper over XALAN-C++.
• XALAN-C++ is a robust implementation of
the W3C recommendations for
XSL Transformations (XSLT) and the
XML Path language (XPath).
• XPath queries can be used to access the
VOTable data.
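To show what XPath access to a VOTable looks like, here is a small sketch. It uses the limited XPath subset built into Python's xml.etree.ElementTree rather than XALAN-C++, so it illustrates the query style only and says nothing about the C++ parser's actual interface; the sample table is invented.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<VOTABLE>
  <RESOURCE>
    <TABLE>
      <FIELD name="RA" unit="deg"/>
      <FIELD name="DEC" unit="deg"/>
      <FIELD name="Name"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>10.68</TD><TD>41.27</TD><TD>M31</TD></TR>
          <TR><TD>23.46</TD><TD>30.66</TD><TD>M33</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

root = ET.fromstring(SAMPLE)

# Attribute predicate: all fields whose unit is degrees.
angle_fields = [f.get("name") for f in root.findall(".//FIELD[@unit='deg']")]

# Positional predicate: every cell of the second row (XPath counts from 1).
second_row = [td.text for td in root.findall(".//TABLEDATA/TR[2]/TD")]

print(angle_fields)
print(second_row)
```

A full XPath engine supports richer expressions (functions, axes, comparisons on cell text) than the subset used here.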
Project Design
[Figure: class diagram. Labels: VTable; Metadata; Table Data; Field Collection; Field; Link Collection; Link; Values (minimum, maximum); Option Collection; Options; Row Collection; Row; Column Collection.]
IUCAA HPC Facility
Hercules
Co-proposed by: Ajit Kembhavi, T. Padmanabhan, Tarun Souradeep
HPC Team: Sarah Ponthratnam, Sunu Engineer, Rajesh Nayak, Anand Sengupta
• Four Alpha Server ES-45 machines
• Each with 4 Alpha (21264C) processors
• 1.25 GHz clock speed
• Cache on chip: 64 KB I, 64 KB D
• Cache: 16 MB ECC DDR
• RAM: 3 x 8 GB + 12 GB
• Fast, low latency interconnect
• Memory Channel Architecture (MCA)
• High volume storage
• 1 Terabyte SCSI
• Trucluster clustering environment (Tru64 Unix, DecMPI, openMP)
> 30 Gflops (preliminary HPL benchmark)
ES-45
Specfp2000: 1327
Linpack 1000x1000: 6847
Virtual Observatory - India
Persistent Systems
IUCAA
Caltech, Fermilab, JHU, NASA/HEASARC,
Microsoft, NCSA/UIUC, NOAO, NRAO,
Raytheon ITS, SDSC/UCSD, SAO/CXC,
STScI, UPenn, UPitts/CMU, UWis, USC,
USNO, USRA, CVO
NVO-People
Virtual Observatory - India
Ajit Kembhavi
Inter-University Centre for Astronomy and Astrophysics
Pune, India
Terapix
Jodrell Bank
Registry and DIS
High Volume Storage
Raid 5, 4 Terabyte
CVO Collaborations
• There are three major projects at the CVO involving
collaborations with other VOs.
• CVO is collaborating with the German Astrophysical VO
to incorporate ROSAT X-ray data and catalogues into the
CVO system.
• CVO is collaborating with the Australian VO to
incorporate 2QZ and 2DF galaxy spectra into the CVO
database.
• CVO is an associate member of NVO and has put in
place some components of the NVO galaxy morphology
demo.
Australian –VO Collaborations
• The distributed volume renderer (dvr) software is a
tool for rendering large volumetric data sets using the
combined memory and processing resources of
Beowulf-like clusters.
• A collaboration between the Melbourne site of Aus-VO
and AstroGrid aims to develop the existing dvr
software into a grid-based volume rendering service.
• Users will be able to select FITS-format cubes from a
number of "Data Centres", have the data transferred to
a chosen rendering cluster, and then proceed to
visualise the volume of data remotely (See Demo).
C++ VOTable Parser
• Initial version
- Released on May 31st, 2002.
- Support only for reading of tables.
- Support only for pure-XML TABLEDATA and not for BINARY or
FITS data streams.
- Runs on Windows NT 4.0, Windows 2000 and
RedHat Linux 7.1.
• Future enhancements
- Can be incorporated quickly and efficiently.
Parser Design
Class Details
• VTable: In-memory representation of a single <TABLE>
from the <RESOURCE> element in VOTable
• TableMetaData: Contains MetaData (Fields, Links and Description)
• Resource: Represents the <RESOURCE> element in the VOTable.
• TableData: Contains Rows
• Field: Representation of <FIELD> from VOTable
• Row: Representation of <TR> from VOTable
• Column: Representation of <TD> from VOTable
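The class list above maps naturally onto a small containment hierarchy. The following sketch renders it with Python dataclasses, purely to make the relationships concrete: the class names come from the slide, but every attribute and the `cell` helper are assumptions, not the C++ library's real members.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:
    """One <FIELD>: describes a column."""
    name: str
    datatype: str = "char"


@dataclass
class Column:
    """One <TD>: a single cell, kept as text."""
    value: str


@dataclass
class Row:
    """One <TR>: an ordered list of cells."""
    columns: List[Column] = field(default_factory=list)


@dataclass
class TableMetaData:
    """Fields, links and description of a table."""
    description: str = ""
    fields: List[Field] = field(default_factory=list)
    links: List[str] = field(default_factory=list)


@dataclass
class TableData:
    """Holds the rows of a table."""
    rows: List[Row] = field(default_factory=list)


@dataclass
class VTable:
    """One <TABLE>: metadata plus data."""
    metadata: TableMetaData
    data: TableData

    def cell(self, row_index: int, field_name: str) -> str:
        """Look up a cell by row number and field name (hypothetical helper)."""
        names = [f.name for f in self.metadata.fields]
        return self.data.rows[row_index].columns[names.index(field_name)].value


@dataclass
class Resource:
    """One <RESOURCE>: a list of tables."""
    tables: List[VTable] = field(default_factory=list)


meta = TableMetaData("demo", [Field("RA", "double"), Field("Name")])
data = TableData([Row([Column("10.68"), Column("M31")])])
resource = Resource([VTable(meta, data)])
value = resource.tables[0].cell(0, "Name")
print(value)
```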
Parser Design
API – Typical Operations
• File Level I/O Routines
– Open VOTable file
– Close VOTable file
• Table I/O Operations
– Get number of rows
– Get number of columns
– Get column (field) information (column name, column number,
etc.)
– Accessing table data
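The typical operations listed above can be walked through with a toy reader. This is a Python sketch built on xml.etree.ElementTree; the class and method names are hypothetical and chosen only to mirror the operation list, and it handles just the first pure-XML TABLEDATA table in a file.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


class VOTableReader:
    """Toy reader for the first TABLEDATA table in a VOTable file."""

    def __init__(self):
        self._fields = []
        self._rows = []

    def open(self, path):
        """File level: parse the file and locate the first <TABLE>."""
        table = ET.parse(path).getroot().find(".//TABLE")
        if table is None:
            raise ValueError("no <TABLE> element found")
        self._fields = table.findall("FIELD")
        self._rows = table.findall("DATA/TABLEDATA/TR")

    def close(self):
        """File level: drop the parsed content."""
        self._fields, self._rows = [], []

    def num_rows(self):
        return len(self._rows)

    def num_columns(self):
        return len(self._fields)

    def column_info(self, index):
        """Return the attributes of the FIELD describing a column."""
        return dict(self._fields[index].attrib)

    def cell(self, row, column):
        """Return the text of one table cell (zero-based indices)."""
        return self._rows[row].findall("TD")[column].text


SAMPLE = (
    "<VOTABLE><RESOURCE><TABLE>"
    '<FIELD name="RA" datatype="double"/><FIELD name="Name" datatype="char"/>'
    "<DATA><TABLEDATA>"
    "<TR><TD>10.68</TD><TD>M31</TD></TR>"
    "<TR><TD>23.46</TD><TD>M33</TD></TR>"
    "</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>"
)

# Write the sample to a temporary file so the open/close cycle is realistic.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write(SAMPLE)
    path = handle.name

reader = VOTableReader()
reader.open(path)
n_rows, n_cols = reader.num_rows(), reader.num_columns()
info = reader.column_info(0)
first_name = reader.cell(0, 1)
reader.close()
os.remove(path)
print(n_rows, n_cols, info, first_name)
```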
Parser Implementation
• Development on Windows NT 4.0 platform using VC++.
• Ported to RedHat Linux 7.1/gcc-2.96 with zero effort.
• 18 C++ classes representing various elements of the
VOTable format.
• 8500 lines of C++ code written for V1.1 release.
• Project start date: April 7th 2002
• V1.1 Release: May 31st 2002
• Current status: V1.2 design in progress
What is in Release V1.1
• Parser to serve as a building block for developing
VOTable based applications.
• Can be easily used by users of CFITSIO library.
• Supports powerful XPath queries against
VOTable files.
• The first version of the VOTable parser can now
be downloaded:
http://vo.iucaa.ernet.in/~voi/html/infopage.html
VOTable Parser Demo
• Serves as a tutorial to help understand the basic
APIs provided by the VOTable parser.
• Demonstrates how to access the data and
metadata elements of a VOTable file.
Future Work
• Develop APIs for writing data in VOTable
format.
• Develop APIs for supporting IMAGE data and
FITS files in VOTable.
• Enhance existing API set to allow more
elaborate and flexible operations on VOTable
files.
• Support future VOTable versions.
• Develop applications for conversion between
FITS and VOTable formats.
References
• The first version of the C++ parser can now be
downloaded from the VO-India website:
http://vo.iucaa.ernet.in/~voi
• VOTable Details:
http://vizier.u-strasbg.fr/doc/VOTable/
• XALAN:
http://xml.apache.org/xalan-c/index.html
• XPATH:
http://www.w3.org/TR/xpath
Virtual Observatory - India
SDSS Data Features
• Size: 900 GB
• DBMS: Microsoft SQL (MS-SQL)
• Data contains:
1) Spectroscopic data
2) Tiling data
SDSS Query Architecture
[Figure: query flow diagram. Labels: User; User Interface Client; Submit Query/Request; MS-SQL Server; Process Query; Search MS-SQL Database; Output.]
Output: 1) HTML
2) XML
3) CSV
Data Catalogs & Web services at IUCAA
Catalogs
Catalog       Description
2dfQSO        Size: 4 MB
2dfGRS        Size: 4 GB; organized as mSQL
2MASS         Size: 42 GB
Sky Survey    Size: 13 GB
FIRST         Size: 192 GB
Web Services
1) VizieR Services
• The most complete library of astronomical catalogues (e.g. Guide
Star catalogues, USNO-B1).
• Tools to select, extract and format records matching certain criteria.
2) Anglo-Australian 2DF System
• Query tool to select records from the 2DF catalogue.
• Display sky map and spectrum (FITS) of objects in the 2DF catalogue.
Star Positions
VO Schema