
Science Gateways

Marlon Pierce Science Gateway Group Indiana University

What I Want to Accomplish

• Invite you to participate in the Science Gateway Institute
  – NSF S2I2 planning grant
• Invite you to participate in Apache Airavata
  – Open community software for building scientific workflows for gateways


What Are Science Gateways?

• Web-based user interfaces and services that provide a science-centric view of cyberinfrastructure.
• Often centered around running science applications and workflows on grids and clouds.
• But can also be data-centric or information-centric
  – Earth System Grid
  – eBird and citizen science portals
• Or community-centric
  – Such as many HUBzero hubs.

Gateways Support Science

Gateway | Domain | Metrics
NanoHUB | Nanotechnology | Has supported nanotechnology simulations and data sharing among more than 250,000 users since 2000 and has been cited in more than 900 publications.
CIPRES | Bioinformatics | Made it possible for more than 6,000 biologists to run phylogenetic analyses on XSEDE computing resources over the past 3 years and has enabled more than 475 publications over that period.
UltraScan | Biophysics | Supported the data analysis needs of over 120 active scientists and has contributed to over 60 publications during the last 3 years.
GridChem | Chemistry | Provided access to computational chemistry tools for more than 800 users, enabling 47 publications between 2007 and 2012.

Science Gateways Institute

NSF vision for cyberinfrastructure in the 21st century

Software is critical to today’s scientific advances

• Science is all about connections
  – Instruments, sensor networks, HPC facilities, campus laboratories, visualization facilities, data stores
  – Connections are often made through software
• A critical, but often overlooked component

http://www.nsf.gov/pubs/2012/nsf12113/nsf12113.pdf

Software vision implemented in 2010

Software Infrastructure for Sustained Innovation (SI2) program
• Scientific Software Elements (SSE)
  – Small groups create software that advances one or more areas
• Scientific Software Integration (SSI)
  – Larger interdisciplinary teams, software frameworks
• Scientific Software Innovation Institutes (S2I2)

http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=504817

Institutes: Long term hubs of excellence

• Serve a research community of substantial size and disciplinary breadth
• Expertise, processes, architectures, resources and implementation mechanisms to transform research practices and productivity
• Support, outreach, workforce development, proactive approach to diversity
• Pathways to community involvement

http://www.nsf.gov/pubs/2011/nsf11589/nsf11589.htm


The Science Gateway Institute Partners

• Project Leadership
  – Nancy Wilkins-Diehr, SDSC
• Community Workshop Organization
  – Katherine Lawrence, University of Michigan
• Workforce Development
  – Linda Hayden, ECSU
• Gateway Providers
  – iPlant: Dan Stanzione, Rion Dooley
  – HUBzero: Michael McLennan, Michael Zentner
  – Apache Airavata: Marlon Pierce, Suresh Marru

Millions of dollars are spent on gateways, but developers face several challenges:

• They often work in isolation even though development can be quite similar across domain areas.

• They need to bridge cyberinfrastructure—locally, campus-wide, nationally, and sometimes internationally.

• They need foundational building blocks so they can focus on higher-level, grand-challenge functionality.

• They struggle to secure sustainable funding because gateways span the worlds of research and infrastructure.

http://sciencegateways.org/volunteer/

Incubator Service

Assist with the entire lifecycle of a gateway:
• Business plan development and review
• Build-and-test facilities
• Hosting service
• Development environment, consulting, documentation and software recommendations
• Software repositories
• Software engineering facilities
• Software assessment services
  – like Open Source Software Advisory Service, Apache assessment service, Software Sustainability Institute (UK)
• Offering gateways expertise in the following areas:
  – Usability assessment
  – Licensing
  – Sustainability
  – Project management
  – Security

Apache Airavata: Software for Scientific Workflows

http://airavata.apache.org

What Is Apache Airavata?

• Science Gateway software system to
  – Compose, manage, execute, and monitor distributed computational workflows
  – Wrap legacy command line scientific applications with Web services
  – Run jobs on computational resources ranging from local resources to computational grids and clouds
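As a rough illustration of the "wrap a command-line application, then compose it into a workflow" idea, here is a minimal Python sketch. It is not Apache Airavata's API: `JobResult`, `run_wrapped_app`, `run_workflow`, and the two stand-in "legacy applications" are hypothetical names invented for this example, and everything runs on the local machine rather than on a grid or cloud.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class JobResult:
    """Outcome of one wrapped command-line run (hypothetical type)."""
    exit_code: int
    stdout: str
    stderr: str


def run_wrapped_app(argv, stdin_text=""):
    """Run a command-line program and capture its exit code and output.

    A gateway would expose something like this behind a web service;
    here it is a plain function so the sketch stays runnable.
    """
    completed = subprocess.run(
        argv, input=stdin_text, capture_output=True, text=True, check=False
    )
    return JobResult(completed.returncode, completed.stdout, completed.stderr)


def run_workflow(steps, initial_input=""):
    """Run steps in order, feeding each step's stdout to the next step's stdin.

    Stops at the first failing step and returns the results collected so far.
    """
    results = []
    data = initial_input
    for argv in steps:
        result = run_wrapped_app(argv, data)
        results.append(result)
        if result.exit_code != 0:
            break
        data = result.stdout
    return results


# Two stand-in "legacy applications", written as Python one-liners so the
# example does not depend on any particular scientific code being installed.
UPPERCASE = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
REVERSE = [sys.executable, "-c",
           "import sys; sys.stdout.write(sys.stdin.read()[::-1])"]
```

Calling `run_workflow([UPPERCASE, REVERSE], "abc")` runs the two programs in sequence. A real gateway would add what this sketch leaves out: remote job submission, monitoring, and a registry of applications.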

Apache Airavata Architecture

[Architecture diagram. Labeled components: Core Developer, Message Box, Scientific Application, Apache Airavata API, Workflow Interpreter, Application Factory, Computational Resources, Registry]

Apache Airavata in Action

Domain | Description
Astronomy | Image processing pipeline for One Degree Imager instrument on XSEDE
Astrophysics | Supporting workflow of Dark Energy Survey simulations working group on XSEDE
Bioinformatics | Supported workflow executions on Amazon EC2 for BioVLAB project
Biophysics | Manage large scale data analysis of analytical ultracentrifugation experiments on XSEDE and campus resources
Computational Chemistry | Manage workflows to support computational chemistry parameter studies for ParamChem.org on XSEDE
Nuclear Physics | Workflows for nuclear structure calculations using Leadership Class Configuration Interaction (LCCI) computations on DOE resources

Cyberinfrastructure: How Open is Open Source Software?

• What’s missing?
  – Open source licensing
  – Open standards
  – Open codes (GitHub, SourceForge, Google Code, etc.)

We also need open governance

Open Community Software and Governance

• Open source projects need diversity, governance.
  – Reproducibility
  – Sustainability
• Incentives for projects to diversify their developer base.
• Govern
  – Software releases
  – Contributions
  – Credit sharing
  – Members are added
  – Project direction decisions
  – IP, legal issues
• Our approach: Apache Software Foundation

[Diagram labels: Compete, Collaborate]

More Information

• Science Gateway Institute: www.sciencegateways.org/
  – Nancy Wilkins-Diehr, PI
• Contact me: [email protected]

Apache Airavata: http://airavata.apache.org

You can contribute to Apache Airavata!

• Join the mailing list: [email protected]

YouTube presentation on Apache and NSF Cyberinfrastructure: http://www.youtube.com/watch?v=AN7LoQct17U

Are you building gateways that serve your science discipline?

Do you wish you could connect with and learn from others who are doing the same thing?

We are building an institute to serve you—and others like you—with resources, services, experts, and ideas for creating and sustaining science gateways.

Sign up to join the conversation: http://sciencegateways.org/volunteer/

science gateway /sī əns gāt wā′/ n.

1. an online community space for science and engineering research and education.

2. a Web-based resource for accessing data, software, computing services, and equipment specific to the needs of a science or engineering discipline.

NSF CI Advisory Committee commissions 6 task forces

Software task force recommends to NSF:

1. Multi-level, long term support (individual, team, institute)
2. Responsibility for verification, validation, reproducibility
3. Consistent policy on open source
4. Collaborations across divisions, agencies and industry
5. Use of ACCI to obtain community input on priorities

http://www.nsf.gov/od/oci/taskforces/TaskForceReport_Software.pdf

Figure 1. High-level architecture of software offerings and value-added services provided by the institute.

Science Gateways: Enabling & Democratizing Scientific Research

• Advanced Science Tools
• Computational Resources
• Scientific Instruments
• Algorithms and Models
• Archived Data and Metadata
• Knowledge and Expertise

http://sciencegateways.org/

Simultaneous NSF study identifies limitations to short-lived science portals or gateways

• Characteristics of short funding cycles
  – Build exciting prototypes with input from scientists
  – Work with early adopters to extend capabilities
  – Tools are publicized, more scientists interested
  – Funding ends
  – Scientists who invested their time to use new tools are disillusioned
    • Less likely to try something new again
  – Start again on new short-term project
• Need to break this cycle and fund for long-term success

Gateway-Building Support

Institute staff assigned to a project for months, up to a year
  – Assist with gateway development or implementation of advanced features
    • Workflows, fault tolerance, sensor feeds, HPC simulations
  – Teach research teams what it takes to build, enhance, operate, and maintain gateways after support ends
  – Peer-reviewed request process open to all


Gateway Forum

• Gathering place for scientific web developers across NSF directorates, agencies, and international boundaries
• Social forums, white papers, blogs, testimonials and user stories
• Annual conference
• Broad and engaging symposium series
• Gateway training program
  – Synchronous and asynchronous, video tutorials
  – Best practices, case studies
• Showcase of successful projects
• Environment that enables continuous community feedback


Gateway Framework

• Modular, layered approach
  – Supports community contributions
  – Grocery store approach allows developers to pick and choose the components they need
• Tiered architecture
  1. Value-added services
     • Publication channel for delivering content to a wider audience
     • Information repositories for good design practices
     • Information/code samples for best practices in user-interface and user experience design
  2. Core web framework which includes hosted site creation and content management
  3. Platform API to provide a cohesive set of RESTful web services upon which the previous two layers rely
  4. Systems layer where the hardware and low-level middleware reside
     • Clouds and cloud services, HPC systems, grid middleware, data warehouses, databases, instrumentation, and distributed data stores
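The four tiers can be pictured as a chain in which each tier talks only to the one directly below it. The Python sketch below is an assumption-laden illustration, not the institute's design: all class and method names (`SystemsLayer`, `PlatformAPI`, `CoreWebFramework`, `ValueAddedService`, `create_job`, and so on) are hypothetical, and the "RESTful" platform API is reduced to an in-process method call that returns a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class SystemsLayer:
    """Tier 4 stand-in for HPC systems, clouds, and data stores."""
    jobs: list = field(default_factory=list)

    def submit(self, application, resource):
        # Record the job and hand back a simple sequential identifier.
        job_id = len(self.jobs) + 1
        self.jobs.append(
            {"id": job_id, "application": application, "resource": resource}
        )
        return job_id


@dataclass
class PlatformAPI:
    """Tier 3: the service interface that the two upper tiers rely on."""
    systems: SystemsLayer

    def create_job(self, application, resource="local"):
        job_id = self.systems.submit(application, resource)
        return {"job_id": job_id, "status": "SUBMITTED"}


@dataclass
class CoreWebFramework:
    """Tier 2: site handling; reaches resources only through the API."""
    api: PlatformAPI

    def handle_run_request(self, form):
        return self.api.create_job(
            form["application"], form.get("resource", "local")
        )


@dataclass
class ValueAddedService:
    """Tier 1: a service built on top of the web framework."""
    web: CoreWebFramework

    def run_and_announce(self, form):
        response = self.web.handle_run_request(form)
        return (f"Job {response['job_id']} for {form['application']} "
                f"is {response['status']}")
```

Because each tier holds a reference only to the tier beneath it, a developer could swap one layer (for example, a different systems back end) without touching the others, which is the "pick and choose" property the slide describes.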


Workforce Development

• Terrific opportunities for students and IT professionals
  – Much science gateway development currently done by campus IT
• Gateway building training
  – Web development is a natural interest area for students
    • Very visual, see results of programming instantly
  – Builds cross-disciplinary communication skills
    • Talk to scientists, construct a gateway that meets their needs
  – Utilize existing programming opportunities such as Google Summer of Code
• Opportunities to proactively address diversity


Community engagement activities in conceptualization grant

• One-on-one interviews with community leaders
• Group-based data collection
  – Focus groups, BOFs, workshops
  – Broad online surveys
• Social-feedback services
  – Get Satisfaction, UserVoice, HUBzero
• Continued events in the full institute to stay in touch with the community
  – Annual conference
  – Rolling 5-minute polls