OGCE Briefing to NSF OCI: An overview of the NMI-funded OGCE project.


Why Are We Here?

Discuss OGCE successes and influence on the Web portal and science gateway community.
  OGCE is funded to build standards-compliant portal components and services to support user interactions with CI middleware.
Discuss portal and gateway research opportunities.
  What are the future directions of portal and gateway technologies?
  What is the role of portals within Cyberinfrastructure?
Discuss the future of portal software.
  Proposed Software Service Provider: TeraGrid Science Gateway Software Center
Summary of Successes

Supporting Science
  RENCI, TeraGrid User Portal, LEAD, DES/LSST, CIMA, and other portals represent a significant amount of NSF and other funding.
  Collectively supporting hundreds of codes (from RENCI alone), potentially thousands of users (from TUP alone), and access to terabytes of data (just from CIMA and DES).
Software
  Over 1800 IP-unique downloads of the OGCE portal software.
  COG Kit downloads: more than 4000 over the last year.
  Developed a comprehensive set of portlets and science gateway services.
Outreach, Leadership, Convergence
  Annual GCE Portal Workshop with a special issue in the journal Concurrency and Computation.
  Over 80 presentations, tutorials, and classes.
  9 book chapters, 19 journal articles, 38 peer-reviewed conference papers.
  1 book in preparation.
OGCE Success Stories
Exemplary portal projects supported by OGCE
LEAD Gateway Portal
NSF Large ITR and TeraGrid Gateway
- Adaptive response to mesoscale weather events
- Supports data exploration and Grid workflow
LEAD Gateway Architecture
Portal server composed of portlets and supported by scalable, persistent web services.
Typical of gateways.
NCAR tests with 2 groups of 25 concurrent users, each launching forecast workflows and visualizing results.
Goal is to support hundreds of users.
10+ applications in various workflow combinations.
Services and portlets flow between LEAD and OGCE.

Diagram labels: User’s Browser; User’s Desktop; Workflow Composer; Portal Server; GFAC, PURSe, Proxy Management, etc.; Security Services; Gateway Services; User Data & Metadata Catalogs; Data Services; Application Resource Catalogs; Information Services; Workflow/Application Execution Engine; Job Management, Resource Broker and Scheduling Services; Globus-TeraGrid “OGSA-Like” Services; Grid.
TeraGrid User Portal
User Portal Sharable Portlets
Account Management
  view projects and allocation usage
  view system account usernames
  view DNs registered for account
  add users to projects
  supports >3500 users
Resource
  comprehensive list of TG resources and their attributes
  view job queues, load, status of resources
Documentation
  current User Info documentation
  contextual help for all interfaces
Consulting
  TG help desk information
  portal feedback channel
Allocation
  Info about how to apply for/renew allocations
North Carolina Bioportal


Principal collaborators: John McGee and Lavanya Ramakrishnan
Features
  access to common bioinformatics tools
  extensible toolkit and infrastructure
  remotely accessible or locally deployable
  packaged and distributed with documentation
National reach and community
  OGCE and NSF Middleware Initiative (NMI)
  leverages emerging international standards
  TeraGrid deployment
  Portals hosted at RENCI and NCSA
Education and training
  hands-on workshops across North Carolina
  clusters, Grids, portals and bioinformatics
PittGrid: Portal



PittGrid Portal is built using the OGCE Portal Toolkit.
Supports PittGrid’s Globus 4 and Condor services.
PittGrid users can log in to the portal to submit and monitor their jobs.
  The job submission portlet and the Condor job submission portlet allow users to submit jobs online to Globus and Condor, respectively.
  GPIR is used to provide information services.
OGCE has worked closely with Senthil Natarajan (Pitt) and Matt Farrellee (UW) on enhancements to Condor portlets and BirdBath.
UNC-Charlotte Visual Grid Portal
Project Lead: Prof. Barry Wilkinson
Portal Developer: Jeremy Villalobos
Summary of OGCE Collaborations

Mixture of
  Portal builders for hire (TUP, TIGRE)
  Direct collaboration, consulting with existing projects (OGCE person on-site)
  Portal developer off the street
OGCE software used includes portlet components, libraries, and services.
Wide variety of projects, personnel, funding levels, and expectations
  Some have many full-time developers for all project aspects: from portal cosmetics to grid system admin
  Some have one tech person: a grad student or system admin
Interesting requirements:
  Expected: TeraGrid access, support for SRB, Condor and traditional schedulers.
  1-100 codes, 1-1000 users, 1-1,000,000,000,000 bytes of data
  Unexpected: AJAX, virtual workspaces, integration of multiple semi-independent portals (portal federation)
OGCE Portal Software and Services
OGCE Software Development Overview

Portlets are our central technology.
  JSR 168 standard.
  Standard-compliant portlets allow reuse of portal code between projects.
  This should definitely be used in the TeraGrid Science Gateway community.
OGCE devotes significant effort to services, tools, and libraries.
Diagram labels: Portlets; Portal Container; Grid Libraries; Service (three instances); Grid Infrastructure.
Why Portlets? Use of Standards


These are standard components for building (Java) portals out of reusable parts.
We work within a larger community.
  Commercial efforts
  Open Source: GridSphere, uPortal/JA-SIG, Pluto, StringBeans, Jetspeed2, eXo
  Supporting Apache portals efforts (Jetspeed2, Portlet Bridges efforts)
We participate in standards development.
  JSR 286 expert committee, GGF/OGF
But Grids have their own problems and requirements that we must solve.
Grid Portal Problem | OGCE Solution
Users don’t like to manage grid credentials. | Proxy manager portlet, PURSe portlets, SSO portal module.
Must interact with Globus Toolkit services. | GridFTP, MyProxy, and GRAM portlets support both GT2 and GT4.
Must support multiple versions of the toolkit. | Java COG portlet API allows dynamic binding to different versions of Globus.
We must support other Grid middleware pieces. | Developed SRB and Condor portlets.
Must support user collaboration. | OGCE-Sakai portlets allow access to Sakai collaboration services.
Grid portlets must be easier to develop. | We developed support for Velocity and Grid programming tag libraries.
Users need to monitor resources. | Developed GPIR Portlets to support GPIR service instances.
Problem | OGCE Service or Library
Need programming libraries to support diverse file-like systems including mass storage systems. | NCSA Trebuchet libraries can be used to build both portlets and services.
Need to support semantic portal metadata. | Tupelo metadata service developed.
Need to provide persistent storage for Grid resource information; information must be accessible programmatically. | Developed GPIR Web service.
Science applications must be easier to deploy as a Grid service with a portlet interface. | Developed GFAC application factory service.
Need to support coupled job execution. | Java COG workflow service developed.
Community Leadership, Outreach, and Participation
GCE Workshops at Supercomputing


Open calls to the portal community
GCE05 held in Seattle, November 18th, 2005
  http://pipeline0.acel.sdsu.edu/mtgs/gce05/
  5 invited talks (judged by tech committee)
  11 accepted posters
  50+ participants
  Expanded, re-reviewed papers to appear in Concurrency and Computation
GCE06 scheduled for Nov 12-13, 2006, in Tampa
  Selected as part of SC06 workshop peer-review process
  Call for papers just went out
  http://www.cogkit.org/GCE06/
Plenary Session
Poster Session
Science Portals in 2010 and Beyond
What we want to make happen and how
Virtualizing Grid Access


As TeraGrid expands, it will become a “utility” that extends our desktop with huge resources.
Portals and Gateways will provide access to:
  Virtualized Storage:
    An “infinite capacity” data and replica management service.
    Users will not manage data but have access through personal metadata catalogs.
  Virtualized Computation:
    Portal is a front-end to services that automatically allocate and schedule computational cycles as needed.
    User focuses on science … not resource management.
Knowledge Discovery and Delivery
Agents for Search
  Portals support data discovery.
  We will be able to pose queries for future discovery:
    “I am interested in all new data relating to chemical structures of the following form … When you find them, run the following analysis workflow against them and notify me if the result is interesting.”
“Mash Ups” show how to integrate data from multiple sources.
  Combine “big” data (Google) and “little” data (my GPS data)
Diagram labels: Search Agent; Chem Info Crawler; Candidate event; discovery event; Analysis workflow.
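The "query for future discovery" idea on this slide can be illustrated with a small agent loop. What follows is only a minimal Python sketch under stated assumptions: the names `StandingQuery`, `SearchAgent`, and `on_new_record`, the record fields, and the scoring rule are all hypothetical and do not come from any OGCE component. A real agent would be fed by a crawler and would launch an actual analysis workflow.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]  # hypothetical shape of a newly crawled data record


@dataclass
class StandingQuery:
    """A query registered now and evaluated against data that arrives later."""
    matches: Callable[[Record], bool]        # "data of the following form"
    analyze: Callable[[Record], float]       # stand-in for the analysis workflow
    is_interesting: Callable[[float], bool]  # when to notify the user
    notifications: List[str] = field(default_factory=list)


class SearchAgent:
    """Hypothetical agent that checks each new record against every query."""

    def __init__(self) -> None:
        self.queries: List[StandingQuery] = []

    def register(self, query: StandingQuery) -> None:
        self.queries.append(query)

    def on_new_record(self, record: Record) -> None:
        for query in self.queries:
            if not query.matches(record):
                continue  # record is not of the requested form
            result = query.analyze(record)
            if query.is_interesting(result):
                query.notifications.append(f"{record['id']}: score={result:.2f}")


def demo() -> List[str]:
    agent = SearchAgent()
    query = StandingQuery(
        matches=lambda r: r.get("kind") == "chemical-structure",
        analyze=lambda r: float(r["atoms"]) / 100.0,  # made-up scoring rule
        is_interesting=lambda score: score > 0.5,
    )
    agent.register(query)
    # Simulated crawler output (made-up records).
    for record in [
        {"id": "c1", "kind": "chemical-structure", "atoms": 80},
        {"id": "g1", "kind": "gps-track", "atoms": 0},
        {"id": "c2", "kind": "chemical-structure", "atoms": 20},
    ]:
        agent.on_new_record(record)
    return query.notifications


if __name__ == "__main__":
    print(demo())
```

In this sketch the query is stored once and applied to every record that arrives afterwards, which is the difference between a standing query and an ordinary search over existing data.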
Validating Scientific Discovery

The portal is an integral part of the process of computational science.
  Serves as an active repository of data provenance.
The portal records each computational experiment that a user initiates.
  Disks are cheap, so why not record everything?
  Provides a complete audit trail of the experiment or computation.
  Published results will include a link to provenance information for repeatability and transparency.
Many portals have done this on a smaller scale.
  CIMA, PubChem + NIH cancer screening centers, LEAD, SERVO/Quakesim, ...
  But this should be standard practice.
  Should be persistently stored in journal catalogs.
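As a rough illustration of "record everything" with an audit trail that published results can link to, here is a minimal Python sketch. It assumes an append-only log in which each entry is identified by a SHA-256 hash of its content and of the previous entry's identifier. The hash chaining is an illustrative choice, not something the slides describe, and `ProvenanceLog` and its field names are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


def _digest(body: Dict[str, Any]) -> str:
    """Hash a JSON-serializable entry body in a key-order-independent way."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class ProvenanceLog:
    """Append-only record of computational experiments (hypothetical design)."""
    entries: List[Dict[str, Any]] = field(default_factory=list)

    def record(self, user: str, application: str,
               inputs: Dict[str, Any], outputs: Dict[str, Any]) -> str:
        """Store one experiment and return an identifier a publication could cite."""
        previous = self.entries[-1]["id"] if self.entries else ""
        body = {"user": user, "application": application,
                "inputs": inputs, "outputs": outputs, "previous": previous}
        entry_id = _digest(body)
        self.entries.append({"id": entry_id, **body})
        return entry_id

    def verify(self) -> bool:
        """Check that no stored entry was altered or reordered after the fact."""
        previous = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "id"}
            if entry["id"] != _digest(body) or entry["previous"] != previous:
                return False
            previous = entry["id"]
        return True


if __name__ == "__main__":
    log = ProvenanceLog()
    run_id = log.record("alice", "forecast", {"grid_km": 4}, {"file": "out.nc"})
    print(run_id[:12], log.verify())
```

The identifier returned by `record` plays the role of the provenance link in a published result: anyone holding the log can recompute the hashes and confirm that the recorded inputs and outputs are the ones originally stored.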
Grid Portal Software Development
Science portal and gateway research offers exciting opportunities.
We must balance these research opportunities with nuts-and-bolts software development.
We can make an accurate short-term forecast for the next generation of portals.

Opportunity | Approach | Task
Current portlet standard needs enhancements to support JavaScript, inter-portlet communication, etc. | JSR 286 should address these shortcomings. | Upgrade current portal containers to support the new standard.
Portlets need better interactivity; need to support science mash-ups. | Encapsulate AJAX techniques in libraries. | Build high quality AJAX tag libraries for portlets; support JSR 286.
Portals need to bind to and share externally running portlets. | WSRP 2.0 standard serves this purpose. | Build a high quality WSRP 2.0 implementation.
PHP, Python, Ruby, and other popular languages are used to develop portals. | Both WSRP and Apache Portal Bridges projects allow language independence. | Build Ruby Grid programming libraries and portlet bridges.
Need portlet metadata standards for provenance. | Build from current community standards. | Build and release.
Need to move components seamlessly between desktops and portals. | Examine approaches such as WSRP desktops, JSF support for XUL, etc. | Portlet and container APIs will be generalized. Develop this within the GGF/OGF community.
Portal Software Center
Directly supporting science gateways as a software service provider
Supporting TeraGrid Gateways: a No-Cost Extension Activity
The NSF has a significant investment in the success of the TeraGrid Science Gateways.
Current Gateway efforts focus on integration of gateways with the TeraGrid.
  This is currently in a heroic phase that relies on on-site staff.
  This assumes there is an on-site staff person at the Gateway.
  True for large, well-funded projects but maybe not true for others.
  This has to be a potential success story: small colleges, MSIs, etc., need TeraGrid resources.
We think the Gateways effort should be expanded to include software support as well as integration.
  Directly support the common software base of many of the Gateways.
  Respond to gateway requirements, bugs, and feature requests with priority.
  Provide depth of support for smaller gateways.
  Portal hosting, custom development, training.
A TeraGrid Science Gateway Software Center
The center would focus on a common (but not required) software stack for gateways.
  Represents common practice and the “eigen” portal, at least for Java.
  Possible future eigen-portals for Python, PHP, Ruby, etc., and linear combinations thereof.
Center’s board of directors would consist of current TG Gateway leadership and representatives from active gateway projects.
  Those in charge now would still be in charge.
  We would give them more power.
How Is This Different from Now?


Current efforts focus on integration.
Two concerns:
  How do we support smaller groups?
  The Gateway bar is getting higher, not lower, as we think through the requirements.
  How do we help bridge between campus Grids and the TeraGrid?
A Gateway SSP will support integration through both common and Gateway-specific software.
  General portal/portlet software
  AND services (such as logging, auditing, accounting, shutdown) that all gateways need based on gateway requirements.
  AND hosting services to help smaller groups
  AND training on specific base software for new developers.
We are NOT the portal police.
  Existing gateways can maintain their own autonomous software bases.
  Not everything goes in the gateway stack.
Looking Forward


We obviously are positioning the OGCE project to be a Software Service Provider for the science gateways.
As we envision it, the Gateways SSP would
  Develop portlets and services to support gateways generally.
  “Tactical to strategic” approach as we ramp up.
  Collaborate with large gateways (RENCI) on specific problems.
  Package and integrate tools into a simple Gateway download.
  Support these tools through help desks.
How does this compare to the NSF vision for SSPs?
OGCE Project Participants
PI/Co-I | Institution | Major Contributions
Marlon Pierce, Dennis Gannon, Beth Plale, Geoffrey Fox | Indiana University | Packaging, Grid portlet development, GFAC, PURSe, GGF Leadership
Mary Thomas, Jay Boisseau (Eric Roberts) | San Diego State University/Texas Advanced Computing Center | Packaging, Grid portlet development, GPIR, CFT, OGCE Web Site Development; GCE05 organization; SRB Portlets
Jay Alameda, Joe Futrelle | National Center for Supercomputing Applications | Grid tool development (Trebuchet, OGRE), Tupelo metadata development, portlet development
Charles Severance, Joseph Hardin | University of Michigan | Sakai collaboration services and portlets, JSR 286 participation
Gregor von Laszewski | University of Chicago/Argonne National Lab | Grid portlet API and library development; GCE06 organization; GlobDev liaison
Additional Slides
GPIR Deployment and TIGRE Portal
VLAB Computational Chemistry Portal
DES and LSST Portals
  Monitor workflows
  Set up and launch pipelines