The Open Gateway Computing Environment: Experiences Developing Tools for Scientific Communities in the Apache Software Foundation Marlon Pierce Indiana University March 7, 2012


The Open Gateway Computing
Environment: Experiences
Developing Tools for Scientific
Communities in the Apache
Software Foundation
Marlon Pierce
Indiana University
March 7, 2012
March 7, 2012: The Competition
for Your Attention
IU Science Gateway Group Overview
• IU Science Gateway Group members
– Marlon Pierce: Project Leader
– Suresh Marru: Principal Software Architect
– Raminder Singh, Chathura Herath, Yu Ma, Lahiru Gunathilake:
Senior team members
– Research assistants and interns
• NSF SDCI funding of Open Gateway Computing
Environments project
– TACC (M. Dahan), SDSC (N. Wilkins-Diehr), SDSU (M. Thomas),
NCSA (S. Pamidighantam), UIUC (S. Wang), Purdue (C. Song),
UTHSCSA (E. Brookes)
• We participate in two Apache incubators
– Apache Rave: http://incubator.apache.org/rave/
• Web user interfaces to Grids, Clouds, and other scientific resources.
– Apache Airavata: http://www.airavata.org
• Scientific workflow composition and execution.
Cyberinfrastructure Layers
[Figure: layered diagram of cyberinfrastructure components.]
• Layers: User Interfaces; Gateway Software; Resource Middleware; Compute Resources
• Components shown: Web/Gadget Container; Web/Gadget Interfaces; Web Enabled Desktop Applications; Gateway Abstraction Interfaces; Application Abstractions; Fault Tolerance; Workflow System; Auditing & Reporting; Monitoring; User Management; Information Services; Security; Provenance & Metadata Management; Registry; Cloud Interfaces; Grid Middleware; SSH & Resource Managers; Computational Clouds; Computational Grids; Local Resources
• Color coding: OGCE Gateway Components; Complementary Gateway Components; Dependent resource provider components
Collaborations
Collaborating Team | Scientific Field
GridChem (Sudhakar Pamidighantam, NCSA) | Computational Chemistry
ParamChem (Alex Mackerell, Sudhakar Pamidighantam, Micheal Sheetz et al.) | Molecular Sciences
WIYN Consortium One Degree Imager (Pat Knezek, NOAO) | Astronomy
OLAM (Craig Mattocks, University of Miami) | Atmospheric and Environmental Modeling
UltraScan (Borries Demeler, University of Texas Health Science Center) | Experimental Biophysics
LCCI (James Vary, Iowa State) | Computational Nuclear Physics
Dark Energy Survey Simulation Working Group (August Evrard et al.) | Astrophysics, Astronomy
VLab (Renata Wentzcovitch, University of Minnesota) | Planetary Materials
Key Problems for Science Gateway,
Cyberinfrastructure Software
• Reusability
– Reuse or write your own gateway software?
• Sustainability
– The reason to reuse.
– Cyberinfrastructure Software Sustainability and
Reusability: Report from an NSF-funded workshop
• https://scholarworks.iu.edu/dspace/handle/2022/6701
• Governance
– How are design decisions made?
– Who decides if the software is suitable for release?
– How do you handle contributions?
– How do you add people to the development and project management teams?
OGCE Funds Software Lifecycle
Governance: Open Community
Software
• More than SourceForge, GitHub, Google Code, etc.
– Those provide excellent Web tools to help developers.
• Here we are concerned with community building.
– Diverse community of developers increases
probability of reusability and sustainability
– But diverse communities require governance
– Get governance right, and sustainability and
reuse will follow.
Some Open Model Examples in CI
• NSF-funded CDIGS project
– http://confluence.globus.org/display/CDIGS/CDIGS+Home+Page
• HUBzero Consortium, Sakai Foundation, Kuali Foundation
– Institutional level organizations
• Eclipse Parallel Tools Platform
– Jay Alameda, NCSA: NSF SI2 funding to develop HPC tools
workbench
– http://www.eclipse.org/ptp/
• Enzo Project
– Excellent talk at TG11 by Prof. Brian O’Shea on their open
community efforts
– http://enzo-project.org/
• Apache Software Foundation
– OODT and Tika Data Management Projects at NASA JPL
Two Apache Software
Foundation Case Studies
Apache Rave and Airavata Incubators
Apache Airavata
• Science Gateway software framework to
– Compose, manage, execute, and monitor
computational workflows on Grids and Clouds
– Web service abstractions for legacy command-line-driven scientific applications
– Modular software framework to use as individual
components or as an integrated solution.
• More Information
– Airavata Web Site: airavata.org
– Developer Mailing Lists: [email protected]
Apache Airavata High Level Overview
Example Workflow: Nuclear Physics
Courtesy of collaboration with Prof. James Vary and team, Iowa State
Apache Rave
• Open Community Software for Enterprise
Social Networking, Shareable Web
Components, and Science Gateways
• Founding members:
– Mitre Software
– SURFnet
– Hippo Software
– Indiana University
• More information
– Project Website: http://incubator.apache.org/rave/
– Mailing List: [email protected]
Gadget Dashboard View
Gadget Store View
Rave Building Blocks
• Rave is implemented in JavaScript and Java, using Spring MVC
– Bean initialization specified in XML configuration files.
– Inversion of Control makes it easy to swap out
implementations.
– Disciplined MVC through Java annotations
• Builds on Apache Shindig and Wookie
– Rave adds layout management, user management, administration tools, production backend data systems, etc.
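The inversion-of-control point can be illustrated with a minimal sketch in plain Java. Everything here is hypothetical: `WidgetStore`, `DefaultWidgetStore`, `ScienceWidgetStore`, and `Dashboard` are illustrative names, not Rave classes, and the wiring that Rave would express in Spring XML configuration is done by hand in `main`.

```java
import java.util.Arrays;
import java.util.List;

/** Inversion-of-control sketch; all type names are hypothetical, not Rave classes. */
final class IocSketch {

    /** The contract the rest of the portal codes against. */
    interface WidgetStore {
        List<String> listWidgets();
    }

    /** Stand-in for an out-of-the-box implementation. */
    static final class DefaultWidgetStore implements WidgetStore {
        @Override
        public List<String> listWidgets() {
            return Arrays.asList("calendar", "news");
        }
    }

    /** Stand-in for a replacement that a gateway developer might supply. */
    static final class ScienceWidgetStore implements WidgetStore {
        @Override
        public List<String> listWidgets() {
            return Arrays.asList("job-submission", "workflow-monitor");
        }
    }

    /** The consumer never names a concrete store; it is handed one from outside. */
    static final class Dashboard {
        private final WidgetStore store;

        Dashboard(WidgetStore store) {
            this.store = store;
        }

        String render() {
            return String.join(", ", store.listWidgets());
        }
    }

    public static void main(String[] args) {
        // Swapping implementations changes only this wiring, not Dashboard itself.
        System.out.println(new Dashboard(new DefaultWidgetStore()).render());
        System.out.println(new Dashboard(new ScienceWidgetStore()).render());
    }
}
```

Because `Dashboard` depends only on the interface, replacing the implementation is a configuration change rather than a code change, which is the property the slide attributes to Spring's bean configuration.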
Extending Rave for Science Gateways
• Two constraints
– Must work out of the box
– But must be flexible for developers to adapt it.
• Rave is designed to be extended.
– Good design (interfaces, easily pluggable implementations)
and code organization are required.
– It helps to have a diverse, distributed developer
community
• How can you work on it if we can’t work on it?
• Rave is also packaged so that you can extend it without
touching the source tree.
• GCE11 paper presented 3 case studies for Science
Gateways
Rave Extension General Steps
• Download and install Rave’s source
– “mvn clean install” puts JARs, WARs, and POMs
into your local Apache Maven repository.
– This step is needed only if you are building from a snapshot.
• Create a new Apache Maven project
– You’ll need the rave-portal-dependencies POM in your
<dependencies/>.
– Include any configuration files that you would like
to modify.
– Include the source code for your extensions.
The Apache Way and Science
Gateways
Why Apache for Gateway Software?
• Apache Software Foundation is a neutral playing field
– 501(c)(3) non-profit organization.
– Designed to encourage competitors to collaborate on
foundational software.
– Includes a legal cell for legal issues.
• Provides the social infrastructure for building communities.
• Opportunities to collaborate with other Apache projects
outside the usual CI world.
• Foundation itself is sustainable
– Incorporated in 1999
– Multiple sponsors (Yahoo, Microsoft, Google, AMD, Facebook,
IBM, …)
• Proven governance models
– Projects are run by Project Management Committees.
– New projects must go through incubation.
The Apache Way
• Projects start as incubators with 1 champion and several mentors.
– Choosing a good champion and good mentors is very important
• Graduation ultimately is judged by the Apache community.
– +1/-1 votes on the incubator list
• Good, open engineering practices required
– DEV mailing list design discussions, issue tracking
– Jira contributions
– Important decisions are voted on
• Properly packaged code
– Build out of the box
– Releases are signed
– Licenses, disclaimers, notices, change logs, etc.
– Releases are voted on
• Developer diversity
– Three or more unconnected developers
– The price is giving up sole ownership and replacing it with meritocracy
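As a rough illustration of the +1/-1 voting convention, the sketch below tallies a release vote. It assumes the commonly documented Apache rule for release votes (at least three binding +1 votes, and more binding +1 than -1 votes); other kinds of votes follow different rules, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Release-vote tally sketch; assumes the "three binding +1, more +1 than -1" rule. */
final class ReleaseVoteSketch {

    /**
     * @param bindingVotes one entry per binding voter: +1, 0 (abstain), or -1
     * @return true if the release vote passes under the assumed rule
     */
    static boolean passes(List<Integer> bindingVotes) {
        long plus = bindingVotes.stream().filter(v -> v == 1).count();
        long minus = bindingVotes.stream().filter(v -> v == -1).count();
        return plus >= 3 && plus > minus;
    }

    public static void main(String[] args) {
        System.out.println(passes(Arrays.asList(1, 1, 1, -1))); // three +1, one -1
        System.out.println(passes(Arrays.asList(1, 1, 0)));     // only two +1
    }
}
```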
Apache and Science Gateways
• Apache rewards projects for cross-pollination.
– Connecting with complementary Apache projects
strengthens both sides.
– New requirements, new development methods
• Apache methods foster sustainability
– Building communities of developers, not just users
– Key merit criterion
• Apache methods provide governance
– Incubators learn best practices from mentors
– Open, democratic procedures
– Processes for adding new committers and management
– Ex: Releases are peer-reviewed and voted on.
– All communications are archived.
Apache Contributions Aren’t Just
Software
• Apache committers and Project Management
Committee members aren’t just code writers.
• Successful communities also include
– Important users
– Project evangelists
– Content providers: documentation, tutorials
– Testers, requirements providers, and constructive complainers
• Using Jira and mailing lists
– Anything else that needs doing.
How To Get Involved
• Join the DEV mailing lists.
• Grab the software and start complaining.
• Post Jira tickets
– Add your patches to Jira if you want to solve a
problem.
– Request a review
– Frequent patch submission is the best way to get
voted in as a committer.
Case Study: GridShib and Community
Credentials
• XSEDE Science Gateways use shared
community credentials when accessing
backend resources.
– Many portal users map to one community
account.
• GridShib adds attributes to grid credentials
– Gateway membership, originating IP address, user
email, creation time, etc.
• For Rave, we’ll have to change the User
service implementation to support this.
GridShib Step By Step
• Install Rave in your Maven repo.
• Create a Maven project with standard directory layout for WAR packaging
• Create a new user service (ComUserService) for obtaining a community credential and adding GridShib attributes.
• Replace applicationContext-security.xml with your version
• In the XML, replace the default UserService with ComUserService.
• Place all GridShib resources in src/main/resources
• Place web.xml in src/main/webapp/WEB-INF
– You’ll need an additional listener to get the IP address.
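The idea behind the replacement user service can be sketched in plain Java. This is only a model of the data flow, under stated assumptions: the `UserService` interface below is a hypothetical stand-in rather than Rave's actual interface, no GridShib library calls are made, and the attribute keys are invented for illustration. Only the name `ComUserService` and the list of attributes (gateway membership, originating IP address, user email, creation time) come from the slides.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

/** Community-credential sketch; models the data only, with no real GridShib or Rave APIs. */
final class CommunityCredentialSketch {

    /** Hypothetical stand-in for the portal's user service contract. */
    interface UserService {
        Map<String, String> credentialFor(String portalUser, String email, String remoteIp);
    }

    /** Maps every portal user to one shared community account and records who asked. */
    static final class ComUserService implements UserService {
        private final String communityAccount;
        private final String gatewayName;

        ComUserService(String communityAccount, String gatewayName) {
            this.communityAccount = communityAccount;
            this.gatewayName = gatewayName;
        }

        @Override
        public Map<String, String> credentialFor(String portalUser, String email, String remoteIp) {
            // Attribute keys are illustrative; a real credential would carry them in its own format.
            Map<String, String> attrs = new LinkedHashMap<>();
            attrs.put("account", communityAccount);  // same for all portal users
            attrs.put("gateway", gatewayName);       // gateway membership
            attrs.put("portalUser", portalUser);     // the individual behind the request
            attrs.put("email", email);
            attrs.put("originatingIp", remoteIp);    // supplied by the extra listener in web.xml
            attrs.put("createdAt", Instant.now().toString());
            return attrs;
        }
    }

    public static void main(String[] args) {
        UserService service = new ComUserService("community", "ExampleGateway");
        System.out.println(service.credentialFor("alice", "alice@example.org", "192.0.2.10"));
        System.out.println(service.credentialFor("bob", "bob@example.org", "192.0.2.11"));
    }
}
```

Both calls yield the same `account` value while the remaining attributes differ, which is how many portal users can share one community account and still be told apart on the resource side.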