HIRDLS SIPS Aura Data Systems Working Group Sept. 13, 2006


Aura Science Meeting
Data Systems Working Group
HIRDLS SIPS Status
September 27, 2010

Vince Dean, Cheryl Craig, NCAR
Brendan Torpy, Greg Young, Univ. of Colorado, Boulder

Outline

Instrument Status

Processing Status and Data Releases

SIPS Status

Lessons Learned

Useful Links

HIRDLS Instrument Status

Chopper Anomaly remains

• March 17, 2008 – HIRDLS chopper stopped rotating
• Cannot collect radiance data
• Rest of instrument continues to function
• Attempts to recover:
  • Automatic attempt to restart every 3m 20s => Complete
  • Cycled instrument temperature => Complete
  • Varied chopper frequency over available range => Complete
  • Upload software change to increase rate of restart attempts => Complete
  • Run a macro to input maximum energy to chopper motor => In progress

Exploring other options to restart:

• Switch sides of telescope electronics unit (TEU) => Future

HIRDLS Processing Summary

Version 5 – HIRDLS version 5.00.00 (Level 2 Swaths)

• Adds Geopotential Height
• Algorithmic Improvements:
  • Improved cloud detection
  • Reduced number of temperature drop-outs
  • Improved temperature retrievals in lower mesosphere
  • Increased a priori weighting in cloudy regions
• Public release: May 2010

Delivered 1070 days of data to Goddard
  • 2005-01-22 through 2008-01-01

HIRDLS Processing Summary

Version 6 – Goals for next release

• Additional species
  • Water vapor
  • Methane
  • NO2
• Improvements for current species
  • Fewer gaps in recovered profiles
  • Improved accuracy

Target release: 1st Quarter 2011

HIRDLS SIPS System

Data Management System

3 Dell PowerEdge 2950 machines - CentOS 5.5 Linux

Java/MySQL web-based application

Processing and data store – “heavy iron”

SGI Altix 3000-series 80-processor Itanium

• SUSE Linux Enterprise 10
• 160 Gbytes of memory
• 63 Tbytes of online RAID storage (mirrored to backup disks in another building)

Estimated annual data growth

• Subscriptions: 700 Gbytes
• Derived products: 5 Tbytes
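As a rough illustration of what these growth estimates imply for capacity, the arithmetic can be sketched as below. The sketch assumes decimal units (1 Tbyte = 1000 Gbytes) and takes the 30 Tbytes of data reported on the statistics slide as the current usage of the 63 Tbytes of online storage. Both choices are assumptions made for illustration, not a projection stated in the slides.

```python
GB_PER_TB = 1000  # assumption: decimal units


def annual_growth_tb(subscriptions_gb, derived_tb):
    """Total estimated growth per year, in Tbytes."""
    return subscriptions_gb / GB_PER_TB + derived_tb


def years_until_full(capacity_tb, used_tb, growth_tb_per_year):
    """Years of headroom left at a constant growth rate."""
    if growth_tb_per_year <= 0:
        raise ValueError("growth rate must be positive")
    return max(capacity_tb - used_tb, 0.0) / growth_tb_per_year


# Figures from the slides: 700 Gbytes + 5 Tbytes per year, 63 Tbytes online.
growth = annual_growth_tb(subscriptions_gb=700, derived_tb=5)

# Assumption: the 30 Tbytes on the statistics slide is current online usage.
headroom_years = years_until_full(capacity_tb=63, used_tb=30,
                                  growth_tb_per_year=growth)

print(f"growth: {growth:.1f} Tbytes/year")
print(f"headroom: {headroom_years:.1f} years")
```

Under these assumptions the online store has several years of headroom, but a large reprocessing campaign could change the derived-product estimate considerably.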

HIRDLS SIPS System

New x86 processing server

Xeon-based 8-core server

• Added specifically for IDL codes.
• IDL is not officially supported on our Itanium systems, and it runs slowly there, in x86 emulation.
• Performance is much better on the new x86 server, as hoped.

HIRDLS SIPS Statistics

Data files:

• 2,200,000 files
• 30 Tbytes of data – with full off-site mirror

Versions of processing programs:

981 distinct programs installed

• Over 300 distinct variations on processing stream producing Level 2 products
• Most are for computational experiments.
• A few are promoted to released status for major reprocessing.

Processing tasks:

170,000 distinct executions of processing modules.

HIRDLS SIPS System Dual Role

Large-scale reprocessing

Works as designed.

Scheduling algorithm keeps processors fully utilized while running a mix of processing modules.
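The slides do not describe the scheduling algorithm itself. Purely as an illustration of how a scheduler can keep a fixed pool of processors busy with a mix of modules, here is a minimal greedy sketch: whenever processors are free, it starts every waiting job that fits, largest first. Only the 80-processor pool size comes from the slides; the job names, processor counts and durations are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str     # hypothetical module name
    cpus: int     # processors the job needs
    hours: float  # run time


def schedule(jobs, total_cpus):
    """Greedy fill: return {job name: start time in hours}."""
    for job in jobs:
        if job.cpus > total_cpus:
            raise ValueError(f"{job.name} needs more CPUs than the pool has")
    waiting = sorted(jobs, key=lambda j: j.cpus, reverse=True)
    running = []  # min-heap of (finish time, cpus held)
    free = total_cpus
    now = 0.0
    starts = {}
    while waiting:
        still_waiting = []
        for job in waiting:
            if job.cpus <= free:
                # Start the job now and reserve its processors.
                free -= job.cpus
                starts[job.name] = now
                heapq.heappush(running, (now + job.hours, job.cpus))
            else:
                still_waiting.append(job)
        waiting = still_waiting
        if waiting:
            # Nothing else fits: advance time to the next job completion.
            now, released = heapq.heappop(running)
            free += released
    return starts


jobs = [Job("A", 64, 2.0), Job("B", 16, 1.0),
        Job("C", 32, 1.0), Job("D", 16, 3.0)]
starts = schedule(jobs, total_cpus=80)
print(starts)
```

In this example the small job D is started as soon as B releases its processors, rather than waiting behind the larger job C, which is the kind of backfilling that keeps utilization high.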

Scientific experiments within SIPS system

Many experimental runs – mostly testing algorithms for Kapton correction – in what was meant to be a production system.

Provenance of each data file:

• when created
• by whom
• with which version of which program
• with which input files
• creating which output files
• …
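The provenance fields listed above map naturally onto a single record per processing run. The following is a minimal sketch of such a record; the class, field and file names are hypothetical, and the actual SIPS system stores this information in its Java/MySQL application rather than in Python.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceRecord:
    created_at: datetime   # when created
    created_by: str        # by whom
    program: str           # which program
    program_version: str   # which version of that program
    input_files: tuple     # with which input files
    output_files: tuple    # creating which output files


def record_run(created_by, program, program_version,
               input_files, output_files, created_at=None):
    """Build an immutable provenance record for one processing run."""
    return ProvenanceRecord(
        created_at=created_at or datetime.now(timezone.utc),
        created_by=created_by,
        program=program,
        program_version=program_version,
        input_files=tuple(input_files),
        output_files=tuple(output_files),
    )


# Hypothetical example values.
record = record_run(
    created_by="operator1",
    program="example_retrieval",
    program_version="1.2.3",
    input_files=["level1_example.dat"],
    output_files=["level2_example.dat"],
)
print(record)
```

Making the record immutable reflects the idea that provenance is written once when a file is created and never edited afterwards.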

HIRDLS SIPS and Data Production Lessons Learned

Lesson: Co-locate science, software development and production teams.

• The close teamwork was invaluable for our constant experimentation.
• Quick turnaround of processing experiments.
• Quick resolution of integration problems.
• Quick feedback on requirements from science users to SIPS development and operations staff.


HIRDLS SIPS and Data Production Lessons Learned

Lesson: Build for generality. Identify requirements up front, but expect that the needs of production will be different from what you expected.

• Good software practice – identify and allow for those things which are likely to change.
• Make as much configurable and table-driven as practical.
• This requires both foresight and luck!
• We had some of both.


HIRDLS SIPS and Data Production Lessons Learned

Lesson: Build for generality …

Changes accommodated:

• Number and sequence of processing steps.
• Number and types of intermediate files.
• Number of versions of processing software (many).
• Fixed-length vs. variable-length data granules.
• Custom, streamlined, data interfaces for science users.
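To make the "configurable and table-driven" idea concrete, here is a minimal sketch in which the number and sequence of processing steps live in a table rather than in code. The step names, the table layout and the `run_pipeline` helper are all hypothetical; the slides do not show how the SIPS configuration is actually structured.

```python
# Hypothetical processing steps; each takes and returns a list of tags
# standing in for a data granule.
def calibrate(granule):
    return granule + ["calibrated"]


def geolocate(granule):
    return granule + ["geolocated"]


def retrieve(granule):
    return granule + ["retrieved"]


# Registry mapping step names used in the table to the code that runs them.
STEP_REGISTRY = {
    "calibrate": calibrate,
    "geolocate": geolocate,
    "retrieve": retrieve,
}

# The table: changing the number or order of steps means editing rows,
# not rewriting the runner.
PIPELINE_TABLE = [
    {"step": "calibrate", "enabled": True},
    {"step": "geolocate", "enabled": True},
    {"step": "retrieve", "enabled": True},
]


def run_pipeline(table, registry, granule):
    """Apply each enabled step from the table, in table order."""
    for row in table:
        if not row.get("enabled", True):
            continue
        granule = registry[row["step"]](granule)
    return granule


result = run_pipeline(PIPELINE_TABLE, STEP_REGISTRY, ["raw"])
print(result)
```

Adding an experimental step then amounts to registering one function and inserting one row, which is the kind of change the list above describes.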


HIRDLS SIPS and Data Production Lessons Learned

Lesson: Build in an audit trail.

• HIRDLS Kapton anomaly pushed the limits of the data production system.
• It became an experimental test bed.
• We have run more than 300 variations on the processing stream which produces our Level 2 product … and we can track all of them.
• Detailed record of data provenance organized what could have been chaotic.
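One thing an audit trail makes possible is tracing any product back through every program and version that contributed to it. The sketch below shows that lookup over a small in-memory table; the table layout, file names, program names and version labels are hypothetical stand-ins for whatever the SIPS database actually records.

```python
from collections import deque

# Hypothetical audit-trail table: output file -> (program, version, inputs).
AUDIT = {
    "l2_example.he5": ("retrieval", "v300", ["l1_example.he5"]),
    "l1_example.he5": ("calibration", "v12", ["l0_raw.dat"]),
}


def trace_lineage(filename, audit):
    """Return (file, program, version) for each recorded ancestor step."""
    steps = []
    seen = set()
    queue = deque([filename])
    while queue:
        current = queue.popleft()
        if current in seen or current not in audit:
            # Already visited, or a raw input with no recorded producer.
            continue
        seen.add(current)
        program, version, inputs = audit[current]
        steps.append((current, program, version))
        queue.extend(inputs)
    return steps


lineage = trace_lineage("l2_example.he5", AUDIT)
for file_name, program, version in lineage:
    print(f"{file_name} <- {program} {version}")
```

With hundreds of processing-stream variations in play, a query like this is what lets two similar-looking Level 2 files be told apart by how they were made.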


HIRDLS SIPS and Data Production Lessons Learned

Lesson: Recognize and accommodate differences between research and production data processing.

• The science code is typically developed by one or two expert developers per module. Choose the size and number of modules accordingly. Design to allow insertion of additional modules as needed for new functionality.
• A larger team with collaborative development was well suited to the data management and process scheduling software, allowing us to create a complex system with rich metadata documenting the provenance of the data.
• Expect that the operators of the system and the users of the science data will require different interfaces and views of the data.


HIRDLS SIPS and Data Production Lessons Learned

Lesson: Standardize and test interfaces early.

Aura File Format Guidelines standardized file structure of data products from all Aura instruments. Arriving at consensus required much persistence and diplomacy, but it was repaid with easier collaboration after launch.

Processing modules were required to conform to a standard interface within the SIPS production system, making it easier to release, install and track new modules.
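A standard module interface of this kind can be sketched as an abstract base class plus an install step that rejects anything that does not conform. This is an illustration only: the class name, the `run` signature and the `install` helper are hypothetical, and the real SIPS interface is not described in the slides.

```python
from abc import ABC, abstractmethod


class ProcessingModule(ABC):
    """Hypothetical contract every processing module must satisfy."""

    name: str
    version: str

    @abstractmethod
    def run(self, input_files):
        """Process the inputs and return the list of output file names."""


class ExampleRetrieval(ProcessingModule):
    name = "example_retrieval"
    version = "0.1"

    def run(self, input_files):
        # Stand-in for real science code.
        return [path + ".out" for path in input_files]


INSTALLED = {}  # (name, version) -> module instance


def install(module):
    """Register a module, keyed by name and version, if it conforms."""
    if not isinstance(module, ProcessingModule):
        raise TypeError("module does not implement ProcessingModule")
    key = (module.name, module.version)
    if key in INSTALLED:
        raise ValueError(f"{key} is already installed")
    INSTALLED[key] = module
    return key


key = install(ExampleRetrieval())
print(key, INSTALLED[key].run(["granule1"]))
```

Keying installs on both name and version is what allows many versions of the same module to coexist and be tracked separately.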


HIRDLS Links

HIRDLS at Goddard DISC

• Data access
• Data quality documents

http://disc.sci.gsfc.nasa.gov/Aura/HIRDLS/index.shtml

HIRDLS at UCAR

http://www.eos.ucar.edu/hirdls/

HIRDLS at Oxford University

http://www.atm.ox.ac.uk/hirdls/