The Open Grid Computing Environments Project Marlon Pierce Community Grids Laboratory Indiana University Acknowledgements Funding from NSF NMI (2003-2007) and OCI SDCI (2007-2010). Current participants Indiana University (Pierce,


The Open Grid Computing
Environments Project
Marlon Pierce
Community Grids Laboratory
Indiana University
Acknowledgements
Funding from NSF NMI (2003-2007) and OCI
SDCI (2007-2010).
Current participants
Indiana University (Pierce, Gannon)
RENCI (Kandaswamy)
RIT (von Laszewski)
SDSC (Wilkins-Diehr)
SDSU (Thomas, Edwards)
TACC (Dahan)
Outline
Web Portals and Science Gateways
OGCE efforts
OGCE Portal Software
Portal tools
Java COG, GTLAB
OGCE Gateway Services
GFAC, GPIR
Software Engineering Issues
What is next?
OGCE Goals
To provide easily installable, well-tested
software for building Web client and service
components that constitute a Grid
Computing Environment.
Science Web Portal --> GCE --> Science
Gateway
To support developing groups through
training, outreach, and divine intervention.
Gateways have many needs that can’t be
solved by downloadable software alone.
What Is a Web Portal?
• Aggregate content from multiple sources into a single display.
• Typically consume RSS/Atom news feeds.
• More powerful versions these days support Flickr, calendars, games, etc.
• Gadgets, widgets
• Examples: iGoogle, Netvibes, My Yahoo!
Science Portals and
Gateways
Science portals resemble standard portals, but
must also
Support access to computing and storage
resources.
Allow users remote, Unix-like access to these
resources.
Provide access to science applications and data
sets.
So security is crucial.
And we must provide value added services as
well as user interfaces.
A Comprehensive Gateway Architecture
[Figure: gateway architecture diagram. Labels: User's Desktop (User's Browser, Workflow Composer); Grid Portal Server; Security Services; Gateway Services (User Data & Metadata Catalogs, Data Services, Application Resource Catalogs, Information Services, Workflow/Application Execution Engine, Job Management, Resource Broker and Scheduling Services); Globus-TeraGrid "OGSA-Like" Services; Security Services.]
Components for Science
Portals
OGCE is founded on the principle that portals
should be built out of reusable parts.
Key standard in our first phase: the JSR 168
portlet specification.
Portlets can run in multiple containers:
uPortal, Sakai, GridSphere, Liferay, etc.
This allows us to build Grid-specific components and
deploy them alongside other goodies: Sakai
collaboration tools, contributed portlets, etc.
OGCE Portal Software
OGCE GPIR portlet can interoperate
with TeraGrid and your own GPIR
services.
Manage TeraGrid MyProxy
credentials with the OGCE
ProxyManager portlets.
OGCE file management client portlets
interact with TeraGrid GridFTP servers.
General purpose batch and interactive job submission to
GRAM, WS-GRAM is supported.
Dashboard Portlet
The dashboard portlet allows users to track jobs on the
selected resource. The user can view either his own set
of jobs or get information on all submitted jobs.
Queue forecasting portlets work
with the NWS QBETS to predict
wait times and deadlines.
PURSe portlets manage user requests for
portal accounts and Grid credentials.
Condor and Condor-G
OGCE IFrame Portlet can be
used to integrate external sites.
Building Your Own Grid
Portlets
Coding Portlets
Portlets are just servlet-like Java classes.
Basic API key methods:
doView(), processAction().
These are coupled to JSP pages (typically)
through tag libraries and request dispatchers.
OGCE supports Velocity portlets
So we must provide the coding logic for
processAction().
COG abstraction layers provide this.
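The split between processAction() and doView() can be pictured without a portlet container. The sketch below is a simplified model, not the JSR 168 API: MiniPortlet, JobFormPortlet, and the plain parameter maps are hypothetical stand-ins for the real portlet base class and its request and response objects.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a portlet; not the real javax.portlet API. */
interface MiniPortlet {
    /** Action phase: handle a form submission and record state for rendering. */
    void processAction(Map<String, String> formParams, Map<String, String> renderParams);

    /** Render phase: produce the markup fragment for this portlet window. */
    String doView(Map<String, String> renderParams);
}

/** Illustrative job form: the action phase does the work, the render phase only displays. */
final class JobFormPortlet implements MiniPortlet {
    @Override
    public void processAction(Map<String, String> formParams, Map<String, String> renderParams) {
        String executable = formParams.getOrDefault("executable", "").trim();
        if (executable.isEmpty()) {
            renderParams.put("message", "No executable given");
        } else {
            // Assumption: a real Grid portlet would call the CoG abstraction layer here.
            renderParams.put("message", "Submitted " + executable);
        }
    }

    @Override
    public String doView(Map<String, String> renderParams) {
        return "<p>" + escape(renderParams.getOrDefault("message", "Enter a job")) + "</p>";
    }

    /** Minimal HTML escaping so user input cannot inject markup. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

final class PortletPhasesDemo {
    public static void main(String[] args) {
        MiniPortlet portlet = new JobFormPortlet();
        Map<String, String> form = new HashMap<>();
        form.put("executable", "/bin/ls");
        Map<String, String> render = new HashMap<>();
        portlet.processAction(form, render);          // runs once per user action
        System.out.println(portlet.doView(render));   // may run again on every page refresh
    }
}
```

The point of the separation is that rendering has no side effects, so a container can redraw the page as often as it likes without resubmitting the job.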
CoG Abstraction Layers
[Figure: CoG Kit layers. Top: applications (nanomaterials, bioinformatics, disaster management) and portals, with development support. Below: CoG Gridfaces Layer and CoG GridIDE; CoG Data and Task Management Layer; CoG Abstraction Layer. Bottom: CoG providers for GT2, GT3 (X), GT4 WS-RF, Condor, Unicore, SSH, and others.]
The class diagram is the same for all grid tasks (running jobs, modifying files, moving data).
Classes also abstract toolkit provider differences. You set these as parameters: GT2, GT4, etc.
[Class diagram: TaskHandler, Task, Specification, Service, SecurityContext, ServiceContact.]
Task and Specification
// Create a job-submission task and choose the toolkit provider.
Task task = new TaskImpl("mytask", Task.JOB_SUBMISSION);
task.setProvider("GT2");
// Describe the remote command to run.
JobSpecification spec = new JobSpecificationImpl();
spec.setExecutable("rm");
spec.setBatchJob(true);
spec.setArguments("-r");
…
task.setSpecification(spec);
Service and Security Context
// The service object carries the provider and the security context.
Service service = new ServiceImpl(Service.JOB_SUBMISSION);
service.setProvider("GT2");
SecurityContext securityContext =
    CoreFactory.newSecurityContext("GT2");
// Use cred object from ProxyManager
securityContext.setCredentials(cred);
service.setSecurityContext(securityContext);
Service Contact and Submit
// Point the service at the remote host, attach it to the task, and submit.
ServiceContact serviceContact =
    new ServiceContact("myhost.myorg.org");
service.setServiceContact(serviceContact);
task.setService(
    Service.JOB_SUBMISSION_SERVICE,
    service);
TaskHandler handler = new GenericTaskHandler();
handler.submit(task);
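One way to picture how a provider parameter such as "GT2" or "GT4" can select toolkit-specific behavior behind a single interface is a small registry keyed by provider name. This is an illustrative model only: SubmitHandler, ProviderRegistry, and the strings they return are hypothetical and are not classes or output from the Java CoG Kit.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical toolkit-specific submitter; one implementation per provider. */
interface SubmitHandler {
    String submit(String host, String executable);
}

/** Hypothetical registry mapping provider names to handlers. */
final class ProviderRegistry {
    private final Map<String, SubmitHandler> handlers = new HashMap<>();

    void register(String provider, SubmitHandler handler) {
        handlers.put(provider.toUpperCase(Locale.ROOT), handler);
    }

    /** Client code stays the same; only the provider parameter changes. */
    String submit(String provider, String host, String executable) {
        SubmitHandler handler = handlers.get(provider.toUpperCase(Locale.ROOT));
        if (handler == null) {
            throw new IllegalArgumentException("Unknown provider: " + provider);
        }
        return handler.submit(host, executable);
    }
}

final class ProviderDemo {
    static ProviderRegistry defaultRegistry() {
        ProviderRegistry registry = new ProviderRegistry();
        // Placeholder behaviors: a real provider would talk to the Grid middleware.
        registry.register("GT2", (host, exe) -> "GT2 job " + exe + " on " + host);
        registry.register("GT4", (host, exe) -> "GT4 job " + exe + " on " + host);
        return registry;
    }

    public static void main(String[] args) {
        System.out.println(defaultRegistry().submit("GT4", "myhost.myorg.org", "/bin/date"));
    }
}
```

Switching toolkits then means changing one string, which is the property the abstraction classes above are described as providing.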
Coupling CoG Tasks
The COG
abstractions also
simplify creating
coupled tasks.
Tasks can be
assembled into task
graphs with
dependencies.
“Do Task B after
successful Task A”
Graphs can be
nested.
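The "Do Task B after successful Task A" rule and the nesting of graphs can be sketched in plain Java. The names GridTask and TaskGraph are hypothetical and are not the CoG Kit's own classes; the sketch runs tasks sequentially and only models the dependency logic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical unit of work; returns true on success. */
interface GridTask {
    boolean run();
}

/** A graph of named tasks. It is itself a GridTask, so graphs can be nested. */
final class TaskGraph implements GridTask {
    private final Map<String, GridTask> tasks = new LinkedHashMap<>();
    private final Map<String, Set<String>> prerequisites = new HashMap<>();

    void add(String name, GridTask task) {
        tasks.put(name, task);
        prerequisites.putIfAbsent(name, new HashSet<>());
    }

    /** Declares "do task after successful prerequisite". */
    void addDependency(String task, String prerequisite) {
        if (!tasks.containsKey(task) || !tasks.containsKey(prerequisite)) {
            throw new IllegalArgumentException("Unknown task in dependency");
        }
        prerequisites.get(task).add(prerequisite);
    }

    /** Runs every task whose prerequisites succeeded; true only if all tasks succeeded. */
    @Override
    public boolean run() {
        Set<String> succeeded = new HashSet<>();
        Set<String> pending = new LinkedHashSet<>(tasks.keySet());
        boolean progressed = true;
        while (!pending.isEmpty() && progressed) {
            progressed = false;
            for (String name : new ArrayList<>(pending)) {
                if (succeeded.containsAll(prerequisites.get(name))) {
                    pending.remove(name);
                    progressed = true;
                    if (tasks.get(name).run()) {
                        succeeded.add(name);
                    }
                    // A failed task never enters 'succeeded', so its dependents stay blocked.
                }
            }
        }
        return succeeded.size() == tasks.size();
    }
}

final class TaskGraphDemo {
    public static void main(String[] args) {
        TaskGraph inner = new TaskGraph();
        inner.add("A", () -> { System.out.println("run A"); return true; });
        inner.add("B", () -> { System.out.println("run B"); return true; });
        inner.addDependency("B", "A");

        TaskGraph outer = new TaskGraph();
        outer.add("inner-graph", inner);   // a nested graph used as a single task
        outer.add("report", () -> { System.out.println("run report"); return true; });
        outer.addDependency("report", "inner-graph");
        System.out.println("all succeeded: " + outer.run());
    }
}
```

Because a cycle or a failed prerequisite leaves tasks pending with no progress, run() returns false in both cases instead of looping forever.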
Problems with Portlet Development
• Grid portlets typically wrap each single Grid capability in a separate portlet.
• The problem is that Grid portlets need to combine these operations.
• Portlets are entire web applications, so we need a component model for portlets: reusable portlet parts.
• Even with the COG Abstraction Layer, we must still do a lot of coding to build new applications.
• To address these problems we have adopted Java Server Faces.
• Provides several nice Model-View-Controller features.
• JSF provides an extensible framework (tag libraries) for making reusable components.
• The Apache JSF portlet bridge allows you to convert standalone JSF applications (development phase) into portlets (deployment phase).
Grid Tag Libraries and Beans (GTLAB)
• GTLAB provides common components for building portlets using tags and reusable parts.
• The goal of GTLAB is to simplify Grid portlet development.
• Enable rapid development.
• GTLAB capabilities include Grid operations with XML-based tags within the Java Server Faces (JSF) framework.
• Grid tag libraries are built using JSF custom component development techniques.
• Grid tags are interfaces to backing Grid beans.
• End users pass values to Grid beans by using tag attributes.
• We build on Java CoG 4's abstraction layer.
• Each backing Grid bean provides the same capability as an entire portlet application does in the one-portlet-per-capability approach.
GTLAB Example
• Grid tags are associated with Grid services via Grid beans
• Grid Beans wrap the Java COG Kit (version 4)
• We show an example JSF page section below.
• This allows you to develop new Grid portlets with no additional Java code.
<html>
<body>
<f:form>
<o:submit id="test" action="next_page">
<o:myproxy id="pr" hostname="gf1.ucs.indiana.edu" port="7512"
lifetime="2" username="mnacar" password="***" />
<o:jobsubmit id="task" hostname="cobalt.ncsa.teragrid.org"
provider="GT4" executable="/bin/ls" stdout="tmp/result"
stderr="tmp/error" />
</o:submit>
</f:form>
</body>
</html>
Grid Tags | Associated Grid Beans | Features
<submit/> | ComponentBuilderBean | Creating components, job handlers, submitting jobs
<handler/> | MonitorBean | Handling monitoring page actions
<multitask/> | MultitaskBean | Constructing simple workflow
<dependency/> | MultitaskBean | Defining dependencies among sub jobs
<myproxy/> | MyproxyBean | Retrieving myproxy credential
<fileoperation/> | FileOprationBean | Providing Gridftp operations
<jobsubmission/> | JobSubmitBean | Providing GRAM job submissions
<filetransfer/> | FileTransferBean | Providing Gridftp file transfer
(no tag) | ResourceBean | Describes common properties among all tags and beans. Passes values given by standard visual JSF components.
How to prepare application pages
• Developers embed Grid tag snippets into a JSF page.
• These components are non-visual and are not displayed in HTML.
• The Resource bean bridges form inputs and the GTLAB framework.
<h:outputText value="Taskname: "/>
<h:inputText value="#{resource.taskname}" />
<o:multitask id="multi" persistent="true" taskname="#{resource.taskname}" />
• Dynamic values for Grid tag attributes are provided by the Resource bean.
• The only visual component is the <o:submit/> tag, which is associated with the action method of GTLAB.
GTLAB Dashboard Portlet
Example
<o:submit id="track" action="list_page">
<o:multitask id="dashboard" taskname="track" persistent="true">
<o:myproxy id="proxy" hostname="gf1.ucs.indiana.edu"
lifetime="2"
username="#{resource.username}" password="#{resource.password}" />
<o:jobsubmit id="jobA" hostname="cobalt.ncsa.teragrid.org"
provider="GT4" executable="/bin/whoami"
stdout="tmp/result"
stderr="tmp/error" />
<o:jobsubmit id="jobB" hostname="cobalt.ncsa.teragrid.org"
provider="GT4" executable="/bin/showq"
stdin="tmp/result" stdout="tmp/list"
stderr="tmp/error" />
<o:dependency id="depend" task="jobB" dependsOn="jobA" />
</o:multitask>
</o:submit>
Tracking and Managing Jobs
• GTLAB manages the lifecycles of jobs and monitors their status.
• Grid operations are usually batch processes.
• We provide a callback mechanism to follow up on jobs.
• GTLAB creates handlers for jobs and stores them persistently.
• GTLAB handlers manage job events such as stopping, canceling, or resuming running jobs.
• GTLAB provides an archive for job metadata and allows managing the archive.
• The handler tag helps to organize the user's job repository.
<o:handler id="delete" action="#{monitor.delete}">
<f:param id="task" name="taskname" value="#{task}"/>
</o:handler>
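The callback and archive ideas above can be sketched as a handler that notifies listeners on each status change, plus a store keyed by task name. All names here (JobStatus, JobHandler, JobArchive) are hypothetical and are not GTLAB classes, and an in-memory map stands in for persistent storage.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

enum JobStatus { SUBMITTED, ACTIVE, COMPLETED, FAILED, CANCELED }

/** Hypothetical handler: tracks one job and notifies listeners on every status change. */
final class JobHandler {
    private final String taskName;
    private JobStatus status = JobStatus.SUBMITTED;
    private final List<BiConsumer<String, JobStatus>> listeners = new ArrayList<>();

    JobHandler(String taskName) { this.taskName = taskName; }

    String taskName() { return taskName; }

    JobStatus status() { return status; }

    /** Registers a callback so the portal need not block while a batch job runs. */
    void onStatusChange(BiConsumer<String, JobStatus> listener) { listeners.add(listener); }

    /** Called by the (simulated) middleware when the job changes state. */
    void update(JobStatus newStatus) {
        status = newStatus;
        for (BiConsumer<String, JobStatus> listener : listeners) {
            listener.accept(taskName, newStatus);
        }
    }

    void cancel() { update(JobStatus.CANCELED); }
}

/** Hypothetical archive; an in-memory map stands in for persistent storage. */
final class JobArchive {
    private final Map<String, JobHandler> handlers = new LinkedHashMap<>();

    JobHandler create(String taskName) {
        JobHandler handler = new JobHandler(taskName);
        handlers.put(taskName, handler);
        return handler;
    }

    JobHandler find(String taskName) { return handlers.get(taskName); }

    /** Mirrors the "delete" action in the handler tag example. */
    boolean delete(String taskName) { return handlers.remove(taskName) != null; }

    List<String> taskNames() { return new ArrayList<>(handlers.keySet()); }
}

final class JobTrackingDemo {
    public static void main(String[] args) {
        JobArchive archive = new JobArchive();
        JobHandler handler = archive.create("track");
        handler.onStatusChange((name, status) -> System.out.println(name + " is now " + status));
        handler.update(JobStatus.ACTIVE);
        handler.update(JobStatus.COMPLETED);
        archive.delete("track");
    }
}
```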
OGCE Gateway Services
Web Services in Scientific Communities (G. Kandaswamy)
• Web services are used to "wrap" scientific applications in order to
  - describe, publish, discover, and consume scientific applications in a standard way
  - compose complex workflows from scientific applications
  - run and monitor complex workflows on distributed resources
• Web services that "wrap" scientific applications are called "application services".
A Simple Application Service
[Figure: a web service client sends command-line arguments in a SOAP request to the application service, which runs the command-line application and returns the output results in a SOAP response. Hosts shown: Host1, Host2.]
Things Are Usually More Complicated
[Figure: weather forecasting workflow. Inputs: initial boundary conditions, lateral boundary conditions, satellite data, Level II data, Level III data, decoded data from other programs (sfc, rwh, etc.). Stages: ARPS-TRN, ARPS-SFC, EXT2ARPS, MCI2ARPS, 88D2ARPS, NIDS2ARPS, ADAS, ARPS2WRF, WRF, ARPS-PLOT. Annotations: run once per forecast region; run once per day; run for each forecast and/or ADAS analysis.]
The Problem
• Application services may not be available during a workflow execution
  - Unreliable resources (software, computers, networks)
  - Heavy load on service
  - Does not meet QoS or security requirements of client
• Workflows cannot complete unless all services are available
GFAC Solution
• A Generic Application Factory
  - A persistent web service that knows how to create instances of any application service
• Use a Generic Application Factory to create instances of application services on demand from workflows
Implementation
• The Generic Application Factory (GFac)
• The Generic Service Toolkit: a toolkit that "wraps" any command-line application as an application service
  - Without writing any web service code
  - Without modifying the application in any significant way
Creating an Application Service (1/2)
• Write a "ServiceMap" document to describe your service
• Write an "Application Deployment Description" document to describe a deployment of your application
• Upload the above two documents to a Registry service
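The register-then-create flow can be sketched as a registry holding the two descriptions and a factory that builds a service wrapper from them on demand. This is a simplified model under stated assumptions: the classes, their fields, and the example paths are hypothetical and do not reflect the actual GFac document schemas or API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for a ServiceMap document. */
final class ServiceMap {
    final String serviceName;
    final List<String> inputNames;

    ServiceMap(String serviceName, List<String> inputNames) {
        this.serviceName = serviceName;
        this.inputNames = new ArrayList<>(inputNames);
    }
}

/** Hypothetical stand-in for an Application Deployment Description. */
final class DeploymentDescription {
    final String host;
    final String executablePath;

    DeploymentDescription(String host, String executablePath) {
        this.host = host;
        this.executablePath = executablePath;
    }
}

/** Hypothetical registry: both documents are uploaded under the service name. */
final class Registry {
    private final Map<String, ServiceMap> serviceMaps = new HashMap<>();
    private final Map<String, DeploymentDescription> deployments = new HashMap<>();

    void upload(ServiceMap map, DeploymentDescription deployment) {
        serviceMaps.put(map.serviceName, map);
        deployments.put(map.serviceName, deployment);
    }

    ServiceMap serviceMap(String name) { return serviceMaps.get(name); }

    DeploymentDescription deployment(String name) { return deployments.get(name); }
}

/** Wraps one command-line application; turns named inputs into a command line. */
final class ApplicationService {
    private final ServiceMap map;
    private final DeploymentDescription deployment;

    ApplicationService(ServiceMap map, DeploymentDescription deployment) {
        this.map = map;
        this.deployment = deployment;
    }

    String host() { return deployment.host; }

    List<String> commandLine(Map<String, String> inputs) {
        List<String> command = new ArrayList<>();
        command.add(deployment.executablePath);
        for (String name : map.inputNames) {
            String value = inputs.get(name);
            if (value == null) {
                throw new IllegalArgumentException("Missing input: " + name);
            }
            command.add(value);
        }
        return command;
    }
}

/** Creates application service instances on demand from registered descriptions. */
final class GenericFactory {
    private final Registry registry;

    GenericFactory(Registry registry) { this.registry = registry; }

    ApplicationService create(String serviceName) {
        ServiceMap map = registry.serviceMap(serviceName);
        DeploymentDescription deployment = registry.deployment(serviceName);
        if (map == null || deployment == null) {
            throw new IllegalStateException("Service not registered: " + serviceName);
        }
        return new ApplicationService(map, deployment);
    }
}
```

A caller would upload the two descriptions once, then ask the factory for a fresh service whenever a workflow needs it; nothing application-specific is coded into the factory itself.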
Creating an Application Service (2/2)
[Figure: service creation sequence. Components: Service Provider; Portal with Generic Service Portlet (Host1); Certificate & Capabilities Vault (MyProxy Service, Capability Manager Service); GFac; Generic Application Service; Web Service (Host2); Registry Service. Step labels: 1. Create service request; 2. Get ServiceMap & Host Description; 3. Create service; 4. Configure service; 5. Register capabilities; 5. Register WSDL.]
Invoking an Application Service
[Figure: service invocation sequence. Components: User; Portal with Generic Service Portlet; Application Service (Host2); Registry Service; Certificate & Capabilities Vault (MyProxy Service, Capability Manager Service); Application (Host3). Step labels: 2. Access service; 3. Return user interface; 4. Invoke service; 5. Get Application Deployment Description and Host Description; 4. Run application; 6. Send notifications; 7. Return results.]
Software Engineering Issues
OGCE Code Repository
We use SourceForge, SVN
http://sourceforge.net/projects/ogce
Other SourceForge tools are useful.
Replaced old OGCE bugzilla with SF
bugzilla recently after we were attacked by
robots.
Portal Build System
• The portal download gives you everything you need to get started except Java.
• Includes Tomcat, GridSphere, Ant, and Maven.
• Assumes you have a Grid somewhere.
• The build system (recently revised) is designed to build everything in one command.
• "mvn clean install"
• Also designed to support extensibility (e.g., replacing GridSphere with Sakai) and simple updates of portlets.
• We use Maven 2 exclusively.
• Nice for managing third-party jar dependencies.
• It can call Ant as necessary.
• Testing portals is another matter.
• Normal unit test systems like JUnit are not really appropriate.
JMeter Test Suite
Create lots of unit
tests, run, and see
results in a dashboard
File Transfer portlet
unit tested in JMeter
UI: check for valid
HTML response
Nightly Builds and Tests
on NMI Testbed
What’s Next?
Some Future Issues
Better support for science tools, not just bare
grids.
Experiment builder, XBaya workflow manager,
metadata repository services and clients.
Better support for TeraGrid Science
Gateways
Logging, auditing, integration with GridShib
JavaScript Grid abstraction layers and agent
services to support non-portlet clients.
More projects: obviously we are interested in
working with the OSG
What About Web 2.0?
This is another talk entirely.
http://grids.ucs.indiana.edu/ptliupages/presentations/Web20Tutorial_CTS.ppt
http://grids.ucs.indiana.edu/ptliupages/publications/Web20ChapterFinal.pdf
See also recent OGF 19 and 21 Workshops.
Join us at SC07 for the GCE07 Science
Gateway Workshop
~20 peer-reviewed or invited talks, with focus on
Web 2.0.
More Information
OGCE Web Site:
www.collab-ogce.org
Announcements Atom Feed
http://collab-ogce.blogspot.com/atom.xml
Contact me: [email protected]
Some Example Portals
LEAD Gateway Portal
NSF Large ITR and TeraGrid Gateway
- Adaptive response to mesoscale weather events
- Supports data exploration, Grid workflow
TeraGrid User Portal
User Portal Sharable Portlets
• Account Management
  - view projects and allocation usage
  - view system account usernames
  - view DNs registered for account
  - add users to projects
  - supports >3500 users
• Resource
  - view comprehensive list of TG resources and their attributes
  - view job queues, load, status of resources
• Documentation
  - current User Info documentation
  - contextual help for all interfaces
• Consulting
  - TG help desk information
  - portal feedback channel
• Allocation
  - Info about how to apply for/renew allocations
North Carolina Bioportal
• Principal collaborators: John McGee and Lavanya Ramakrishnan
• Features
  - access to common bioinformatics tools
  - extensible toolkit and infrastructure
  - OGCE and National Middleware Initiative (NMI)
  - leverages emerging international standards
  - remotely accessible or locally deployable
  - packaged and distributed with documentation
• National reach and community
  - TeraGrid deployment
  - Portals hosted at RENCI and NCSA
  - Education and training
UNC-Charlotte
Visual Grid
Portal
Project Lead: Prof. Barry Wilkinson
Portal Developer: Jeremy Villalobos