
Data streaming, collaborative visualization and
computational steering using Styx Grid Services
Jon Blower (1), Keith Haines (1), Ed Llewellin (2)
http://www.resc.rdg.ac.uk
(1) Reading e-Science Centre, ESSC, University of Reading, RG6 6AL
(2) BP Institute for Multiphase Flow, Dept. of Earth Sciences, University of Cambridge, CB3 0EZ
Summary
We present the Styx Grid Service (SGS), a system that allows existing binary
executables to be wrapped and exposed as remote services. A major advantage of
the SGS architecture is that data can be streamed directly between service instances
without the need for encoding in XML and without the data passing through a
workflow enactor. Additionally, clients can monitor service data such as progress and
status asynchronously, without requiring any incoming ports to be open through the
firewall. As we shall show, the SGS architecture can be used in collaborative
visualization and computational steering.
This poster is a summary of Blower et al., 2005 [1]. Please see this paper and the
project website (http://jstyx.sf.net) for more details.
Key principles
In the development of the SGS we were guided by several key principles:
• The software should be easy to install and use, relying on as few dependencies as possible.
• In creating Styx Grid Services, one should not have to modify existing executables.
• The system should be able to interoperate with other service types.
Figure 1: The “namespace” (virtual file system) of a Styx Grid Service server. The SGS appears as a hierarchy of files
on a file server. To create a new service instance, read from the “clone” file: this returns the ID of the service. Set the
parameters of the service by writing to the files in the “params/” directory. Run the service by writing “start” to the
“ctl” file. The output of the service is read from the files in the “io/out” directory. The service can be steered by
writing to the files in the “steering/” directory.
• The software should be lightweight and responsive.
• The system should be platform-independent as far as possible.
Following experience with the Inferno operating system [2], we decided that the Styx protocol for distributed systems was an appropriate base for the software. We decided to implement the Styx protocol in Java and use this as the foundation for the SGS system.
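The lifecycle described in the Figure 1 caption (set parameters, write "start" to the ctl file, read output from io/out) can be sketched in ordinary Java file I/O. This is an illustration only, not the JStyx API: a local temp directory stands in for a mounted Styx namespace (Inferno and Plan 9 can mount Styx servers into the regular file tree), and the "server" side is mocked.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the Figure 1 lifecycle. The directory and file names follow
// Figure 1; the server behaviour is mocked with plain local files.
public class SgsLifecycleSketch {

    // Build a fake service-instance tree shaped like Figure 1's namespace.
    static Path mockInstance(Path root) {
        try {
            Path inst = root.resolve("instances").resolve("1");
            Files.createDirectories(inst.resolve("params"));
            Files.createDirectories(inst.resolve("io").resolve("out"));
            Files.createDirectories(inst.resolve("steering"));
            Files.writeString(inst.resolve("ctl"), "");
            return inst;
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    static String runOnce(Path inst) {
        try {
            // Set a parameter by writing a file under params/.
            Files.writeString(inst.resolve("params").resolve("input.sim"),
                              "viscosity 0.1");
            // Start the service by writing "start" to the ctl file.
            Files.writeString(inst.resolve("ctl"), "start");
            // (Mock) the real server would now run the executable; here we
            // pretend it wrote some output to io/out/stdout.
            Path stdout = inst.resolve("io").resolve("out").resolve("stdout");
            Files.writeString(stdout, "step 1 done\n");
            // Read the output exactly as if it were an ordinary file.
            return Files.readString(stdout);
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    static Path mockRoot() {
        try { return Files.createTempDirectory("sgs"); }
        catch (IOException e) { throw new UncheckedIOException(e); }
    }

    public static void main(String[] args) {
        System.out.print(runOnce(mockInstance(mockRoot())));
        // prints: step 1 done
    }
}
```

The point of the sketch is that every interaction with the service is a file read or write, so no service-specific client protocol is needed.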
The Styx protocol – everything is a file!
The Styx protocol [3] is a well-established protocol for building distributed systems.
Styx is a key component of the Inferno and Plan 9 operating systems. In every Styx-based system, all resources are represented as a file, or a set of files. However, in a
Styx system the “files” are not always literal files on a hard disk. They can represent a
block of RAM, the interface to a program, a physical device, a data stream or indeed
anything. Styx can therefore be used as a uniform interface to access diverse
resource types. Styx files are organized in a hierarchical filesystem known as a
namespace. Since the Styx protocol only has to operate on files, it is very
lightweight, containing only 13 commands, such as “open”, “read”, “write” etc.
Each resource in a Styx system can be represented very naturally as a URL, e.g.
styx://myserver:9876/sensors/temperature might represent a file on a remote server
that can be read to provide temperature data from a sensor. These URLs are
effectively pointers to Styx resources.
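Because a Styx resource address has ordinary URL syntax, it can be pulled apart with standard tooling. The sketch below uses java.net.URI, which parses unrecognised schemes such as styx:// without any extra libraries; it only illustrates the "URL as pointer" idea, not how JStyx itself resolves addresses.

```java
import java.net.URI;

// Treating a styx:// URL as a pointer to a remote resource: scheme, host,
// port and path can be extracted with java.net.URI alone.
public class StyxUrl {
    static String describe(String url) {
        URI u = URI.create(url);
        return u.getScheme() + " resource '" + u.getPath()
                + "' on " + u.getHost() + ":" + u.getPort();
    }

    public static void main(String[] args) {
        System.out.println(describe("styx://myserver:9876/sensors/temperature"));
        // -> styx resource '/sensors/temperature' on myserver:9876
    }
}
```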
Styx systems typically use persistent connections.
The client initiates the
connection and then messages can pass freely between the client and server until the
connection is closed. This use of persistent connections means that clients can
receive asynchronous messages without requiring incoming ports to be open through
the firewall. This is the basis of the monitoring of service data (progress, status etc) in
the SGS architecture.
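The firewall-friendly pattern above can be sketched without any Styx machinery at all: the client opens the connection, then blocks reading from it, and the server pushes messages down the same connection whenever it likes. In the sketch below a piped stream stands in for the persistent TCP connection; this is an illustration of the pattern, not the JStyx API.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Client-initiated persistent connection: the client's read() blocks until
// the server pushes a status update, so no inbound port on the client is
// ever needed.
public class AsyncStatusSketch {
    static String watchStatus() {
        try {
            PipedOutputStream serverSide = new PipedOutputStream();
            BufferedReader clientSide = new BufferedReader(
                    new InputStreamReader(new PipedInputStream(serverSide)));

            // "Server": pushes a status change asynchronously.
            Thread server = new Thread(() -> {
                try {
                    Thread.sleep(50);                    // work happens...
                    serverSide.write("status: finished\n".getBytes());
                    serverSide.close();
                } catch (Exception e) { throw new RuntimeException(e); }
            });
            server.start();

            String msg = clientSide.readLine();          // blocks until pushed
            server.join();
            return msg;
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static void main(String[] args) {
        System.out.println(watchStatus());               // status: finished
    }
}
```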
Streaming data between services in workflows
The original motivation behind the development of the Styx Grid Service was the need
to handle large binary datasets in workflows. We wanted to be able to pipe data
directly between services, analogous to using the pipe operator on a Unix
command line, except that the data would be transferred across the Internet. The
methodology is as follows:
Using the JStyx software to create Styx Grid Services
Creating an SGS server
Running an SGS server is simply a matter of constructing an XML configuration file (below) containing details such as the locations of the executables to wrap, the input files required and the parameters taken. A Java program is then run that reads this configuration file and generates the server program.

Creating client programs
The SGSExplorer program (below) is a "universal client" for Styx Grid Services. It reads information from the server about the required inputs and outputs for a service and automatically generates a simple GUI.
<gridservice name="lbflow"
             command="/path/to/lbflow -i input.sim"
             description="Lattice Boltzmann sim.">
  <streams>
    <outstream name="stdout"/>
    <outstream name="stderr"/>
  </streams>
  <serviceData>
    <serviceDataElement name="status"/>
    <serviceDataElement name="exitCode"/>
  </serviceData>
  <inputfiles>
    <inputfile path="input.sim"/>
  </inputfiles>
</gridservice>
Figure 2 (left): Excerpt from XML configuration file for SGS server
(right): Screenshot of SGSExplorer application
Collaborative visualization
The SGS architecture allows several users simultaneously to view live output from a running program. Any number of clients can download from the same stream at the same time. This makes the construction of collaborative visualization applications straightforward.
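One simple way a server can let any number of clients read the same live stream is to buffer everything written so far and give each reader its own offset, so late joiners replay from the start. The class below is an illustrative sketch of that idea, not how JStyx implements it.

```java
// Broadcast buffer: the producer appends chunks; each client reads from
// its own offset, so all clients eventually see identical bytes even if
// they connect at different times.
public class BroadcastBuffer {
    private final StringBuilder data = new StringBuilder();

    public synchronized void append(String chunk) { data.append(chunk); }

    // A client passes (and then advances) its own read position.
    public synchronized String readFrom(int from) {
        return data.substring(Math.min(from, data.length()));
    }

    public static void main(String[] args) {
        BroadcastBuffer out = new BroadcastBuffer();
        out.append("frame 1\n");
        int clientA = 0;
        String seenA = out.readFrom(clientA);   // client A sees frame 1
        clientA += seenA.length();
        out.append("frame 2\n");
        // Client B joins late but still receives the whole stream.
        System.out.print(out.readFrom(0));      // frame 1 then frame 2
    }
}
```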
1) We create a binary executable that writes data to its standard output (and perhaps standard error too).
2) In the SGS namespace we create a virtual file that represents the standard output (Fig. 1). When clients read from this file over the network they receive the data.
3) This file can be uniquely identified by a URL, e.g. styx://server:port/sgs1/instances/1/io/out. This URL is a pointer (or reference) to the stream and can be passed around by workflow engines.
4) Downstream services can download data from this URL. They do not know the difference between downloading from a live data stream and downloading from a static file. They can then pass the data to the standard input of another executable.
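Step 4 amounts to copying an InputStream into the wrapped executable's standard input; the downstream code is identical whether the stream comes from a live styx:// URL or a static file. A minimal sketch, with a ByteArrayInputStream standing in for the remote stream:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// A downstream service just pumps its upstream InputStream into the
// standard input of its own executable; it never needs to know whether
// the bytes come from a file or a live stream.
public class PipeDownstream {
    static long pipe(InputStream upstream, OutputStream stdin) {
        try {
            // transferTo (Java 9+) copies until end-of-stream, which for a
            // live stream means "until the producer finishes".
            long n = upstream.transferTo(stdin);
            stdin.close();
            return n;
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    public static void main(String[] args) {
        InputStream upstream = new ByteArrayInputStream("data...".getBytes());
        ByteArrayOutputStream stdin = new ByteArrayOutputStream();
        System.out.println(pipe(upstream, stdin) + " bytes piped");
    }
}
```

In the real system the OutputStream would be the stdin of a Process started for the wrapped executable.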
Wrapping Styx Grid Services as Web Services
In order to interoperate with other remote service types, Styx Grid Services can be
wrapped, for example as a Web Service. The inputs to the Web Service would
specify the values of all the input parameters. The Web Service would then create a
new SGS instance, start it running and return the URL to the root of the new service
instance (e.g. styx://server:port/sgs1/instances/2/) as its output. This URL can then be
passed to other services in a workflow. Downstream services can get the output from
the SGS instance from styx://server:port/sgs1/instances/2/io/out.
SGSs could similarly be wrapped as WS-Resources: this is currently being
investigated by Andrew Harrison of Cardiff University.
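The paper does not define a concrete interface for such a wrapper, but its shape follows from the description above: take the input parameters, create and start an instance, and return the styx:// URL of the instance root. The class below is an entirely hypothetical stub that only demonstrates that shape (real code would read the "clone" file to obtain the instance ID).

```java
import java.util.Map;

// Hypothetical Web Service facade over an SGS: the operation's output is
// the styx:// URL of the newly created instance root, which downstream
// services can dereference. Instance creation is stubbed out.
public class SgsWebServiceFacade {
    private final String serverRoot;      // e.g. "styx://server:9876"
    private int nextInstance = 1;         // real code: read the "clone" file

    public SgsWebServiceFacade(String serverRoot) {
        this.serverRoot = serverRoot;
    }

    /** Would create and start a new SGS instance; here it only builds
     *  the URL that the Web Service would return. */
    public String createAndStart(String service, Map<String, String> params) {
        int id = nextInstance++;
        return serverRoot + "/" + service + "/instances/" + id + "/";
    }

    public static void main(String[] args) {
        SgsWebServiceFacade ws = new SgsWebServiceFacade("styx://server:9876");
        System.out.println(ws.createAndStart("sgs1", Map.of("input.sim", "...")));
        // -> styx://server:9876/sgs1/instances/1/
    }
}
```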
Alternatively, the API functions provided in the JStyx software allow custom client programs to be written with little knowledge of the underlying mechanisms.
Figure 3: Computational steering of a Lattice Boltzmann
simulation. The slider in the top right is used to
dynamically vary the pressure gradient driving the flow.
Computational steering
Some programs allow parameters to be changed while the executable is running, allowing the user to see the effects of changing the parameters in a live setting. This is known as "steering". One way to achieve this is to have the executable read the values of these parameters from local files, continuously polling the files for updates as the executable runs. In the SGS framework, these files can be manipulated via the Styx interface (see Figure 1), allowing the computation to be steered remotely, usually simultaneously with visualizing the results.
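The polling side of this scheme is plain file I/O. The sketch below is illustrative only: a temp file stands in for a file under the steering/ directory, and the parameter name (a pressure gradient, as in Figure 3) is chosen for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// File-based steering sketch: the running executable re-reads a small
// parameter file on every polling cycle; a remote client steers the run
// by rewriting that file (via the Styx interface in the real system).
public class SteeringPollSketch {

    // One polling step: parse the current value of the steering file.
    static double readPressureGradient(Path file) {
        try { return Double.parseDouble(Files.readString(file).trim()); }
        catch (IOException e) { throw new UncheckedIOException(e); }
    }

    // Stands in for the remote client's write to the steering file.
    static Path writeParamFile(Path file, String value) {
        try { return Files.writeString(file, value); }
        catch (IOException e) { throw new UncheckedIOException(e); }
    }

    static Path tempParamFile(String value) {
        try {
            return writeParamFile(Files.createTempFile("pressure", ".txt"), value);
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    public static void main(String[] args) {
        Path steer = tempParamFile("1.0");
        double before = readPressureGradient(steer);  // simulation uses 1.0
        writeParamFile(steer, "2.5");                 // "remote" client steers
        double after = readPressureGradient(steer);   // next poll sees 2.5
        System.out.println(before + " -> " + after);  // 1.0 -> 2.5
    }
}
```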
References
[1] J. Blower, K. Haines and E. Llewellin: Data streaming, workflow and firewall-friendly Grid Services with Styx. Proceedings of the UK e-Science All Hands Meeting, September 2005.
[2] J. Blower, K. Haines and A. Santokhee: Composing workflows in the environmental sciences using Inferno. Proceedings of the UK e-Science All Hands Meeting, September 2004.
[3] R. Pike and D. M. Ritchie: The Styx architecture for distributed systems. http://www.vitanuova.com/inferno/papers/styx.html