Krakow 2008 poster - FNWI


Cooperative experiments in VL-e: from scientific workflows
to knowledge sharing
Z. Zhao (1), V. Guevara (1), A. Wibisono (1), A. Belloum (1), M. Bubak (1,2), B. Hertzberger (1)
(1) Informatics Institute, University of Amsterdam, The Netherlands
(2) Institute of Computer Science AGH, Krakow, Poland
Complex scientific experiments
Key tools for supporting the lifecycle of a scientific experiment
Complex scientific experiments involve distributed scientific data and resources, and are
often shared among scientists from different domains. Cooperative experiments involve not
only coordination of resources and computing processes, but also the sharing and transfer of
knowledge among scientists.
Support for cooperative experiments has become an important requirement for e-Science middleware. Semantic technologies enhance the storage and querying of Grid
resources and the high-level searching and matching between different resources.
The mission of the Dutch Virtual Laboratory for e-Science project is to provide generic
functionalities that support a wide class of specific e-Science application environments. A
set of tools is being developed for:
• modeling and managing workflow templates
• browsing resources
• composing and executing workflows on Grid-enabled resources
• supporting workflow interoperability.
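As a rough illustration of what composing and executing a workflow involves, the sketch below models a workflow as components with dependencies and runs them in dependency order. It is a minimal sketch, not the VL-e implementation: the component names, the `run_component` helper, and the string results are all hypothetical, and no Grid resources are involved.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each component maps to the components it depends on.
workflow = {
    "load_data": set(),
    "normalize": {"load_data"},
    "analyze": {"normalize"},
    "report": {"analyze", "normalize"},
}


def run_component(name, dependencies, results):
    """Stand-in for real component execution: record which inputs were used."""
    inputs = ", ".join(results[dep] for dep in sorted(dependencies))
    return f"{name}({inputs})"


def execute(workflow):
    """Run every component after all of its dependencies have finished."""
    results = {}
    order = list(TopologicalSorter(workflow).static_order())
    for name in order:
        results[name] = run_component(name, workflow[name], results)
    return order, results


order, results = execute(workflow)
print(order)
print(results["report"])
```

A real engine would additionally schedule each component on a remote resource and stream data between components; the ordering step is the part shown here.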
[Figure: Common aspects of an experiment, illustrated with a DNA micro-array experiment and a micro-beam experiment. Stages: information gathering (access to information), experimentation (access to devices), interpretation (access to data). Caption: Process and data flow in an experiment.]
Annotations on an experiment
[Figure: UML class diagram of the experiment information model; a second panel is titled "Process Flow Template". Classes and attributes recoverable from the diagram:]
• COMMENT: date : DATE; comment : STRING; creator : PERSON
• EXPERIMENT: name, id, project_id, type, subject, description, published_in, literature, url : STRING; date : DATE
• EXP_ELEMENT: name, id, exp_id, description : STRING
• ADDRESS: street, postal_code, city, state, country : STRING
• PROCESS: date : DATE
• ORGANIZATION: name, activity_type, phone, fax, email, url : STRING
• PERSON: name, id, title, phone, fax, email, url : STRING
• PROTOCOL: name, id, description : STRING
• SOFTWARE: name, id, description : STRING
• PROPERTY: name : STRING; num_val : DOUBLE; text_val : STRING; unit : STRING
• TEMPLATE, HW_TOOL: class headers only (one further attribute group, name, id : STRING, appears without a legible class header)
Associations (with multiplicities 0..1, 0..*, 1..*, 1..1): project_of, has_exp, experiment_in, has_next_exp, has_related_exp, has_comment, has_element, element_of, has_sub_elm, has_super_elm, has_prev_elm, has_next_elm, has_submitter, has_submitted, has_contributor, has_contributed, has_property, has_address, has_employee, employee_of, performed_by, has_performed, has_project, has_vendor, has_protocol, has_defined, defined_by, has_hardware, has_parameter.
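To make the annotation schema more concrete, part of the class diagram can be sketched as plain Python data classes. This is only an illustrative sketch: the attribute names follow the diagram, but the association endpoints (EXPERIMENT to EXP_ELEMENT via has_element, EXP_ELEMENT to itself via has_sub_elm, and has_comment on both) are assumptions, since the extracted diagram does not show which classes each association connects. The sample values are invented.

```python
import datetime
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    name: str
    id: str
    email: str = ""


@dataclass
class Comment:
    date: datetime.date
    comment: str
    creator: Person


@dataclass
class ExpElement:
    name: str
    id: str
    exp_id: str
    description: str = ""
    # Assumed endpoints: an element may contain sub-elements and carry comments.
    sub_elements: List["ExpElement"] = field(default_factory=list)  # has_sub_elm
    comments: List[Comment] = field(default_factory=list)           # has_comment


@dataclass
class Experiment:
    name: str
    id: str
    project_id: str
    date: datetime.date
    description: str = ""
    # Assumed endpoints: an experiment owns elements and carries comments.
    elements: List[ExpElement] = field(default_factory=list)  # has_element
    comments: List[Comment] = field(default_factory=list)     # has_comment


# Hypothetical usage with invented values.
alice = Person(name="Alice", id="p1")
exp = Experiment(name="demo", id="e1", project_id="prj1",
                 date=datetime.date(2008, 1, 1))
step = ExpElement(name="scan", id="el1", exp_id=exp.id)
exp.elements.append(step)
exp.comments.append(Comment(date=exp.date, comment="first run", creator=alice))
```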
Further classes in the class diagram: PARAMETER; SW_TOOL; DATA_ELEMENT; HARDWARE (name, id, description : STRING); PROJECT (name, id, description, url : STRING; start_date, end_date : DATE). Further associations: has_prev_exp, has_software.
WS-VLAM is a workflow system that coordinates the
execution of distributed Grid-enabled software components.
• Developed following the OGSA/WSRF standards.
• Provides client-side applications that allow scientists
to design workflows and monitor their execution.
• Provides server-side applications, including a
workflow engine that schedules and executes the
workflow on the Grid.
• Provides tools to recognize different workflow
descriptions.
• Describes the meta information according to a
predefined schema.
• Launches workflows and monitors their execution.
The Framework for Interactive Parameter Sweep (FRIPS)
aims to support:
• interactive execution of applications
• monitoring of the experiment execution
• viewing intermediate results
• changing parameter ranges and setting new policies.
The Virtual Resource Browser (VBrowser) offers scientists
an environment in which they can:
• interactively access resources to manipulate data
(upload, download, search, annotate, and view)
• start applications and monitor resources
• access data files stored in different security domains,
file systems, and protocols, including the Storage
Resource Broker (SRB), GridFTP, SSH, and web
services.
The VLE-WFBus provides an interface to wrap and integrate
legacy scientific workflows.
VL-e Approach
[Figure: experimentation environment, model, data systems, and Grid-accessible infrastructure (apparatus, network).]
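The kind of interactive parameter sweep that FRIPS aims to support can be illustrated with a small sketch: results are produced one combination at a time so they can be inspected while the sweep runs, and the parameter ranges are narrowed for a second pass. This is a hedged, hypothetical example and not the FRIPS API; `sweep`, `toy_model`, and the ranges are invented for illustration, and the "user decision" to narrow the range is hard-coded.

```python
from itertools import product


def sweep(model, ranges):
    """Yield (params, result) for each parameter combination, one at a time,
    so intermediate results are available before the sweep finishes."""
    names = sorted(ranges)
    for values in product(*(ranges[name] for name in names)):
        params = dict(zip(names, values))
        yield params, model(**params)


def toy_model(a, b):
    """Hypothetical application: a cost that is lowest at a=3, b=1."""
    return (a - 3) ** 2 + (b - 1) ** 2


# Coarse pass over wide ranges; each result could be shown as it arrives.
coarse = list(sweep(toy_model, {"a": [0, 2, 4, 6], "b": [0, 1, 2]}))
best_coarse = min(coarse, key=lambda item: item[1])

# "Changing parameter ranges": narrow the range of `a` around the best value.
center = best_coarse[0]["a"]
fine = list(sweep(toy_model, {"a": [center - 1, center, center + 1], "b": [0, 1, 2]}))
best_fine = min(fine, key=lambda item: item[1])

print(best_coarse, best_fine)
```

In an interactive setting, the narrowing step would be driven by the scientist after viewing the intermediate results rather than by a fixed rule.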
Key phases in the lifecycle of a scientific experiment
The development of a scientific experiment involves different activities performed at different
times. These activities can be grouped into a general lifecycle with four phases:
• problem investigation: look for relevant problems, browse available tools, define the goal, decompose into steps
• experiment prototyping: design experiment workflows, develop necessary components
• experiment execution: execute experiment processes, control the execution, collect and analyze data
• results publication: annotate data, publish data
[Figure: lifecycle phases and shared repositories.]
In each phase of the lifecycle, support for cooperative
interactions is required. We describe this support along three
dimensions:
• information sharing
• communication
• coordination.
Discussion
• The importance of supporting cooperative experiments in e-Science is clear; the tools developed in the
VL-e project work in this direction.
• Compared to Web 2.0-based cooperative environments such as myExperiment, the VL-e tools focus on the
runtime issues of the workflow, which makes them complementary to these Web 2.0 environments.
• Close discussions are currently under way between VL-e and the myExperiment community, such as proposing
WS-VLAM workflows as a new workflow type shared between scientists, and making the workflow bus a
generic execution interface for the different workflows shared in the myExperiment environment.
Bioinformatics use case
(1) Problem investigation
To perform in silico experimentation with genomics data,
biologists need tools to process and compare datasets and
explore the obtained results interactively. The traditional
program used to identify RIDGEs in transcriptome maps:
• takes hours to run on a typical desktop computer
• lacks the versatility needed for interactive and explorative
analysis.
(2) Experiment prototyping
A modular and generalized version of the original program is
first developed and executed across grid-enabled resources.
Workflow templates are created and ready to use.
(3) Experiment execution
Scientists use the workflow template to perform the
experiment on grid-enabled resources. The details of this complex
infrastructure are hidden through the use of:
• the resource browser, which is used to browse distributed
storage resources and select the appropriate data sets
• an intuitive workflow composer, which is used to parameterize
and extend variants of the workflows.
(4) Publication and sharing
After the successful execution of the application workflow, the
workflow components are:
• semantically annotated
• stored in a shared repository.
The workflow components are available to all scientists in the VL-e
project and to others worldwide via a web application (HAMMER)
that allows them to:
• query and download descriptions of the workflow components to
the user space, so that they can create new versions of the application
workflow.
Contacts:
Zhiming Zhao, e-mail: [email protected]
Related Links:
• http://www.vl-e.nl
• http://www.science.uva.nl/~gvlam/wsvlam
Project Leader : L.O. Hertzberger
Phone: 020 525 7464
Fax: 020 525 74 90
e-mail: [email protected]
This work was carried out in the context of the
Virtual Laboratory for e-Science project. This
project is supported by a BSIK grant from the
Dutch Ministry of Education, Culture and Science
(OC&W) and is part of the ICT innovation
program of the Ministry of Economic Affairs (EZ).