Building Problem Solving Environments with Application Web Service Toolkits Choonhan Youn and Marlon Pierce Computer Science, Syracuse University And Community Grid Labs, Indiana University.


Presentation Outline
• Introduction
– What is a computational Web portal?
– Gateway: computing web portal
– Limitations of traditional approach
• Web Service-Based Computing Portal Architecture
• Core Web services for Computing Portals
– Job submission
– File Manipulation
– Context Management
– Script Generation
– Job monitoring
• Application Web services
• Web service negotiation.
• Conclusion
Computational Web Portals
• Computational Web Portals provide seamless access to HPC resources
– You can log in anywhere through any general web browser.
• Portals simplify the use of HPCs for novice users.
– Basics: batch script generation, job submission and monitoring, file
service and ……
– Computational grid services: Globus, Condor
• Portals can simplify the use of unfamiliar codes.
– GEM code: disloc, simplex
• Provide a work management environment for all users.
– You can see what you did last week.
• Other PSE Web portals
– NASA Information Power Grid LaunchPad
– NPACI Hotpage
– Pacific Northwest National Laboratory’s Ecce system, UNICORE
– Our own Gateway/ServoGrid projects
Gateway project
• Gateway is a computational web portal project funded through:
– DoD HPC MO PET Portal: Kerberos security in computational web portal
– GEM science: Support codes developed by earthquake modeling
consortium
– Alliance: Contribute to NCSA portal
– SciDAC (Scientific Discovery through Advanced Computing): DOE
project to build portal services for Plasma physics
• Our goal is to provide building block components that can be used to
build specific portals.
• We also develop browser-based interfaces for basic services and
specific science codes.
• Developed to support typical, if simple, high performance computing
services
– Batch script generation, job submission and monitoring, file management
and transfer.
– Do it all securely
Problems with Traditional Portal Architecture
• Portals access heterogeneous back ends and grids through a particular middle tier.
• Most portal projects are not interoperable
– Middle tier software incompatible
– Wide range of protocols.
• Why do we need portal interoperability?
– Portal developers don’t have to reinvent every single important service (lesson from GGF GCE).
– Users will have access to more services than any one project can provide.
– Users will be able to pick up the best available implementation of a service.
[Figure: Web browsers reach back end resources through separate middle-tier services; a "?" marks the missing link between one browser and another portal's services.]
Web Service-Based Computing
Portal Architecture
[Figure: architecture diagram. Labeled elements: Web Browser (HTTP); User Interface Server / Portal Server with SOAP clients, in a Middle Tier (Web Server); Web Services Providers that publish to a Service Repository over SOAP; AWS, SG, CM and other HIS; HSS instances (JS, JM, FT) in front of HPC Backend Resources and a Data Base; Simulation Component and Data Component.]
Legend:
JS: Job Submission
JM: Job Monitoring
FT: File Transfer
CM: Context Manager
SG: Script Generation
AWS: Application Web Service
HIS: Host Independent Service
HSS: Host Specific Service
Core Web services – 1
• Given WSDL and SOAP, what can you build?
• Host-Specific Services (HSS)
– Instances of these services are bound to particular hosts.
– Job Submission
– File Transfer
– Job & Host Monitoring
• Host-Independent Services (HIS)
– Informational services that are not tied to specific service points
– The service provided does not depend on the location.
– Context Management
– Script Generation
• These core services are simple and stateless.
Core Web services - 2
• Job Submission
– Allow users to execute scientific applications
– May execute operating system calls directly or interact with Grid
services through, for example, the CoG client API to Globus.
– We use Java Runtime processes to run external (non-Java) commands,
for example, PBS qsub.
• File Manipulation
– Allow users to upload and download files between their desktops and
various backend destinations.
– Allow users to transparently move, rename, and copy files on remote
back-ends and crossload between different backend sites.
– The file uploading and downloading services illustrate the use of SOAP
messages with attachments in the RPC messaging style.
– SOAP attachments are non-XML files that are appended to the SOAP
message and are useful for sending binary data and files with known
MIME formats.
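
As a rough illustration of the job submission bullet above, the following sketch runs an external (non-Java) command from Java and collects its exit code and output. It is not the Gateway implementation: the class and method names are hypothetical, and it uses ProcessBuilder, the standard alternative to Runtime.exec, rather than the Runtime call the slide mentions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a job submission helper; names are illustrative only. */
class JobSubmissionSketch {

    /** Exit code and merged stdout/stderr lines of a finished command. */
    static final class Result {
        final int exitCode;
        final List<String> output;

        Result(int exitCode, List<String> output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /** Builds the argument list for a PBS submission: qsub followed by the script path. */
    static List<String> qsubCommand(String scriptPath) {
        return Arrays.asList("qsub", scriptPath);
    }

    /** Runs an external command and waits for it to finish. */
    static Result run(List<String> command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so one reader drains both
        try {
            Process process = builder.start();
            List<String> lines = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
            return new Result(process.waitFor(), lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }
}
```

On a host where PBS is installed, `JobSubmissionSketch.run(JobSubmissionSketch.qsubCommand(path))` would submit the script at `path`; a Web service method could wrap this call and return the scheduler's reply to the portal.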
Core Web services - 3
• Context Management (CM)
– Archives interactions with the computational portal and stores all of the
metadata associated with user sessions.
– Provides simplest possible data model
• CM provides an easy interface to an arbitrarily deep and complex tree-shaped
data structure.
• Context data nodes are defined by recursive schema that hold optional,
unbounded name/value pairs and child nodes.
– We use CM to store locations of job scripts, miscellaneous file URIs,
user’s application instance XML files, etc.
– CM metadata stored on file systems, XML-native databases, ….
• Actual data may be anywhere.
– Actual service interface for manipulating contexts and the context data
• Add one or more contexts.
• Search and store the context data with XPath queries.
• Remove the specified context.
• List the child contexts.
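
The four operations above can be sketched over an in-memory XML tree using the DOM and XPath APIs that ship with the JDK. This is a minimal sketch under stated assumptions: the element names `context` and `property`, the `name` and `value` attributes, and the class itself are hypothetical and are not taken from the actual Context Manager schema or WSDL.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical sketch of a tree of contexts holding name/value pairs and child contexts. */
class ContextTreeSketch {
    private final Document doc;
    private final XPath xpath = XPathFactory.newInstance().newXPath();

    ContextTreeSketch() {
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("context");
        root.setAttribute("name", "root");
        doc.appendChild(root);
    }

    /** Returns the context element selected by the XPath expression, or fails if none matches. */
    private Element require(String expr) {
        try {
            Node node = (Node) xpath.evaluate(expr, doc, XPathConstants.NODE);
            if (!(node instanceof Element)) {
                throw new IllegalArgumentException("no context matches " + expr);
            }
            return (Element) node;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expr, e);
        }
    }

    /** Adds a child context under the context selected by parentExpr. */
    void addContext(String parentExpr, String name) {
        Element child = doc.createElement("context");
        child.setAttribute("name", name);
        require(parentExpr).appendChild(child);
    }

    /** Stores a name/value pair on the selected context. */
    void setProperty(String contextExpr, String name, String value) {
        Element prop = doc.createElement("property");
        prop.setAttribute("name", name);
        prop.setAttribute("value", value);
        require(contextExpr).appendChild(prop);
    }

    /** Looks up a name/value pair on the selected context; returns null if absent. */
    String getProperty(String contextExpr, String name) {
        for (Node n = require(contextExpr).getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && "property".equals(n.getNodeName())
                    && name.equals(((Element) n).getAttribute("name"))) {
                return ((Element) n).getAttribute("value");
            }
        }
        return null;
    }

    /** Removes the selected context together with its subtree; the root cannot be removed. */
    void removeContext(String contextExpr) {
        Element ctx = require(contextExpr);
        if (ctx == doc.getDocumentElement()) {
            throw new IllegalArgumentException("cannot remove the root context");
        }
        ctx.getParentNode().removeChild(ctx);
    }

    /** Lists the names of the direct child contexts of the selected context. */
    List<String> listChildren(String contextExpr) {
        List<String> names = new ArrayList<>();
        for (Node n = require(contextExpr).getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && "context".equals(n.getNodeName())) {
                names.add(((Element) n).getAttribute("name"));
            }
        }
        return names;
    }
}
```

A portal might, for example, add a context per user session and store the URI of a generated job script as a property on it; the metadata tree only records where things are, while the actual files may live anywhere.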
Context Manager Architecture
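[Figure: a Client calls the Context Manager over SOAP/HTTP through a shared WSDL interface exposed by the Axis Servlet; the Context Manager uses internal communication to reach Context Data stored in a file system (FS) or XML database (XMLDB).]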
Core Web services - 4
• Script Generation
– For users who are unfamiliar with HPC systems.
– The choices a user makes while interacting with the portal are
stored in the user’s application instance XML document.
– Generate the job script which could be broken down into two parts:
a queue script for a particular queuing system such as PBS, LSF
and LoadLeveler and a user script for running the application code.
• Job monitoring
– Built using a polling method.
– Monitors the execution of a job running in a queuing system.
– Given the user name and the type of queuing system as input
parameters, the job monitoring method returns an array of a
generated WSDL complex type, effectively an XML data object
that contains the job status reported by the scheduler.
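
To make the two-part job script concrete, here is a small sketch that assembles a PBS queue header and a user script from a handful of parameters. The class, method, and parameter names are hypothetical, and the sketch covers only PBS (in the Torque/OpenPBS `nodes=N:ppn=M` style); it is not the portal's actual script generator, which also targets LSF and LoadLeveler.

```java
/** Hypothetical sketch of script generation for a PBS-style queuing system. */
class PbsScriptSketch {

    /** Queue part: scheduler directives that request resources for the job. */
    static String queueScript(String jobName, String queue, int nodes,
                              int cpusPerNode, String memory, String wallTime) {
        StringBuilder sb = new StringBuilder();
        sb.append("#!/bin/sh\n");
        sb.append("#PBS -N ").append(jobName).append('\n');
        sb.append("#PBS -q ").append(queue).append('\n');
        sb.append("#PBS -l nodes=").append(nodes).append(":ppn=").append(cpusPerNode).append('\n');
        sb.append("#PBS -l mem=").append(memory).append('\n');
        sb.append("#PBS -l walltime=").append(wallTime).append('\n');
        return sb.toString();
    }

    /** User part: change to the working directory and run the application code. */
    static String userScript(String workDir, String executable, String arguments) {
        return "cd " + workDir + "\n" + executable + " " + arguments + "\n";
    }

    /** Full job script: queue part followed by user part. */
    static String jobScript(String queuePart, String userPart) {
        return queuePart + userPart;
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        String script = jobScript(
                queueScript("simplex", "workq", 1, 4, "512mb", "01:00:00"),
                userScript("$PBS_O_WORKDIR", "./simplex", "input.dat"));
        System.out.print(script);
    }
}
```

Keeping the queue part separate from the user part means that supporting another scheduler only requires a different `queueScript` implementation; the user script can stay the same.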
File manipulation service: lists user files on the selected host, Solar. File operations include upload, download, copy, rename, and crossload.
Job monitoring service: lists the user’s job status on the selected host, Solar, which is running the PBS queuing system.
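
A job status listing like this one could be produced by parsing the scheduler's status output into an array of status objects, filtered by user name. The sketch below assumes whitespace-separated columns in the style of the default PBS `qstat` listing (job id, name, user, time used, state, queue); real output varies between PBS versions and sites, and the class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: turns qstat-style text lines into job status objects. */
class JobStatusSketch {

    /** One row of scheduler status, analogous to the XML data object the service returns. */
    static final class JobStatus {
        final String jobId;
        final String name;
        final String user;
        final String state;
        final String queue;

        JobStatus(String jobId, String name, String user, String state, String queue) {
            this.jobId = jobId;
            this.name = name;
            this.user = user;
            this.state = state;
            this.queue = queue;
        }
    }

    /** Parses status lines and keeps only the jobs owned by userName. */
    static JobStatus[] parse(List<String> lines, String userName) {
        List<JobStatus> jobs = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            // Skip blank lines, the column header, and the dashed separator row.
            if (line.isEmpty() || line.startsWith("Job id") || line.startsWith("-")) {
                continue;
            }
            String[] cols = line.split("\\s+");
            if (cols.length < 6) {
                continue; // not a job row in the assumed six-column layout
            }
            if (cols[2].equals(userName)) {
                // cols[3] is the time-used column, which this sketch does not keep.
                jobs.add(new JobStatus(cols[0], cols[1], cols[2], cols[4], cols[5]));
            }
        }
        return jobs.toArray(new JobStatus[0]);
    }
}
```

Since the monitoring service is built on polling, a client would call the service at intervals and redisplay the returned array each time.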
Application Web Services (AWS)
• Application: specifically some code developed by the
scientific community.
– Example: Finite element codes, grid generation codes and so on.
• AWS are designed to make scientific applications (e.g.,
earthquake modeling codes) into Grid resources.
• An actual application is wrapped by a Java program.
• We need a meaningful metadata model for applications
– Describe application-specific requirements
– Describe bindings of applications to host environments and to Web
services in a general way that is independent of the particular
portal.
• Scientific applications are composed from several core Web services.
– Get files to the right place, generate script submission instructions,
submit the job, and get notified at various states.
AWS Lifecycle
• Applications can exist in four stages:
– Abstract state: describes optional choices and
configurations that are available.
– Ready state: Specific choices are made
– Submitted: Application is running
– Completed: Application is finished, but we
need to archive information about it.
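
The four stages can be modeled as a small state machine. The sketch below assumes that an application instance moves through the stages strictly in the order listed; the slides do not state which transitions are allowed, so that rule and all names here are illustrative assumptions.

```java
/** Hypothetical lifecycle states, in the order listed on the slide. */
enum AwsLifecycleState {
    ABSTRACT, READY, SUBMITTED, COMPLETED;

    /** Assumption: the only legal move is to the next stage in the list. */
    boolean canMoveTo(AwsLifecycleState next) {
        return next.ordinal() == ordinal() + 1;
    }
}

/** Hypothetical application instance that tracks its current lifecycle state. */
class ApplicationInstanceSketch {
    private AwsLifecycleState state = AwsLifecycleState.ABSTRACT;

    AwsLifecycleState state() {
        return state;
    }

    /** Advances to the given state, rejecting skipped or backward transitions. */
    void advance(AwsLifecycleState next) {
        if (!state.canMoveTo(next)) {
            throw new IllegalStateException("cannot move from " + state + " to " + next);
        }
        state = next;
    }
}
```

A portal could call `advance(AwsLifecycleState.READY)` once the user has made specific choices, `advance(AwsLifecycleState.SUBMITTED)` when the job is handed to the queue, and `advance(AwsLifecycleState.COMPLETED)` before archiving the run.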
AWS Schema Structure
• Two sets of XML schema:
– Application Descriptors:
• describe abstract state.
• describe application options. Used by the application developer
to deploy his/her service into the portal.
– Application Instance Descriptors:
• describe particular instance states (ready, running, archived).
• describe particular user choices and archive them for later
browsing and resubmission.
• Schema sets are arranged hierarchically
– Applications contain hosts
– Schema are designed to be pluggable
• Don’t like my queue description schema? Plug in your own.
AWS XML Descriptors
• Application description schema
– A “basic information” element that contains information such as
application name, version, option flags.
– An “internal communication” element that contains child elements for
describing input, output, and error fields for the code.
– An “execution environment” element that contains a list of core services
needed to execute the application.
– An optional, generic parameter to hold arbitrary information about the
application.
• Host description schema
– Contains information about the resource such as DNS name and IP
address
– All of the information needed to invoke the parent application on that
resource such as location of the executable, location of the workspace or
scratch directory, and so on.
• Queue description schema
– Contains information needed to perform queue submissions such as
memory size, number of CPUs and so on (in the case of PBS).
Example: deploying an application code, Simplex, on a particular host as a service. This form is used to edit the Application XML descriptor file.
Sample generated user view of the application code Simplex. This form is
generated from the Application XML descriptor and describes a particular application
run: the input files used, the location of the output, the resources used for the
computation, etc.
Portal Stack
[Figure: portal stack, layers as listed: Aggregate Portals; User Interfaces; Application Web Services and Workflow; Core Web Services; Message Security, Information.]
• Core services provide the
basic connection to back
end “Grid” services.
• Application services
combine core services and
application metadata.
• User interface portlets are
built for each service.
• Portals aggregate portlet
components into portals.
Portlets for User Interface
Components
• Web services define XML interfaces for accessing
services.
• User interface components (such as JSPs) combine service
stubs into useful objects for human interaction.
• So we actually have two points of interoperability:
– At the WSDL interface
– At the user interface
• Portlets combine HTML (and other) user interfaces into
aggregate portal interfaces.
– Example: Jetspeed from Jakarta
Reliability of Distributed
Services
• Distributed service systems have some important reliability
problems
– Information must be up to date.
• The system must adjust when servers become available or unavailable.
• Service metadata should match the actual capabilities of the system.
– Messages should reach the services.
• We are automating application service metadata through
publish/subscribe mechanisms.
– Servers contain embedded publisher/subscriber clients
– Information aggregators publish requests for information to JMS-style brokers.
– All available servers subscribed to the request topic publish their
information back to the aggregator.
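
The request/reply pattern described above can be sketched with a tiny in-process topic broker. This is only an illustration: delivery is synchronous and local, whereas a real deployment would use a JMS-style messaging system, and the class names, topic names, and message strings are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical in-process stand-in for a JMS-style topic broker. */
class TopicBrokerSketch {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    void publish(String topic, String message) {
        // Iterate over a copy so handlers may subscribe while a message is being delivered.
        List<Consumer<String>> handlers =
                new ArrayList<>(subscribers.getOrDefault(topic, new ArrayList<>()));
        for (Consumer<String> handler : handlers) {
            handler.accept(message);
        }
    }
}

/** Hypothetical aggregator: publishes a request and gathers whatever servers answer. */
class MetadataAggregatorSketch {
    static final String REQUEST_TOPIC = "metadata.request";
    static final String REPLY_TOPIC = "metadata.reply";

    private final TopicBrokerSketch broker;
    private final List<String> replies = new ArrayList<>();

    MetadataAggregatorSketch(TopicBrokerSketch broker) {
        this.broker = broker;
        broker.subscribe(REPLY_TOPIC, replies::add);
    }

    /** Asks all subscribed servers for their metadata and returns the replies received. */
    List<String> collect() {
        replies.clear();
        broker.publish(REQUEST_TOPIC, "describe-services");
        return new ArrayList<>(replies);
    }
}

/** Hypothetical server-side client: answers each request with its own service description. */
class ServiceHostSketch {
    ServiceHostSketch(TopicBrokerSketch broker, String description) {
        broker.subscribe(MetadataAggregatorSketch.REQUEST_TOPIC,
                request -> broker.publish(MetadataAggregatorSketch.REPLY_TOPIC, description));
    }
}
```

Because only servers that are currently subscribed reply, each call to `collect()` reflects the servers available at that moment, which is the property the reliability bullets ask for.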
Bridging Between Client-Server
and Messaging Services
[Figure: a Browser connects over HTTP to a Tomcat server hosting a dynamic user interface component, which sends a Web service request for information over SOAP to the Aggregator. Peer Tomcat servers run Narada notifiers and register themselves to the Aggregator through the Broker.]
Conclusions
• Traditional portals have “stovepipes” with interoperability problems.
• By designing and implementing several core portal services and Application Web
Services around Web services, we gain interoperability and reusability.
• The emphasis is on the development of reusable services that can form the basis for
multiple PSEs.
• The portal developer can construct specific implementations and composites of
primitive service components and can also provide services that may be shared
among different portals.
• Application-specific services and data models can be used to encapsulate entire
applications independently of the portal implementation.
• User interfaces to application services become distributed portlets.
• Everything is distributed
– Core Web Services -> Application Web Services -> User Interface Portlets -> Portals
– Uses HTTP, SOAP, WSDL, ….
• It all has to be secured.
– A flexible, message-based security system that can be bound to multiple mechanisms and
multiple message formats.
– The general approach: to use assertions
– SAML, WS-Security
– Kerberos, PKI