Reconstruction and Analysis
on Demand:
A Success Story
Christopher D. Jones
Cornell University, USA
Overview
• Describe “Standard” processing model
• Describe “On Demand” processing model
– Similar to GriPhyN’s “Virtual Data Model”
• What we’ve learned
• User reaction
• Conclusion
Standard Processing System
• Designed for reconstruction
– All objects are supposed to be created for each event
• Each processing step is broken into its own module
– E.g., track finding and track fitting are separate
• The modules are run in a user-specified sequence
• Each module adds its data to the ‘event’ when the module
is executed
• Each module can halt the processing of an event
[Diagram: Input Module → Track Finder → Track Fitter → Output Module]
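As a rough sketch of this model (hypothetical C++ interfaces, not the actual CLEO classes; Module, Event and runSequence are invented names), each processing step is a module and a driver runs the user-specified sequence for every event:

#include <memory>
#include <vector>

// Hypothetical sketch of the "standard" model described above: every module
// is run for every event, adds its results to the event, and may halt
// further processing of that event.
class Event { /* holds raw data, tracks, showers, ... */ };

class Module {
public:
   virtual ~Module() {}
   // Return false to halt processing of this event.
   virtual bool process(Event& event) = 0;
};

class TrackFinder : public Module {
public:
   bool process(Event& /*event*/) override {
      // ... find tracks and add them to the event ...
      return true;
   }
};

// The driver runs the modules in the user-specified order for each event.
void runSequence(const std::vector<std::unique_ptr<Module>>& modules,
                 Event& event)
{
   for (const auto& module : modules) {
      if (!module->process(event)) { break; }   // a module halted this event
   }
}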
Critique of Standard Design
• Good
– Simple mental model
• Users can feel confident they know how the program works
– Easy to debug
• Simple to determine which module had a problem
• Bad
– User must know inter-module dependencies in order to place the
modules in the correct sequence
• Users often run jobs with many modules they do not need in order to
avoid missing a module they might need
– Optimization of module sequence must be done by hand
– Reading back from storage is inefficient
• Must create all objects from storage even if job does not use them
On-demand System
• Designed for analysis batch processing
– Not all objects need to be created each event
• Processing is broken into different types of modules
– Providers
• Source: reads data from a persistent store
• Producer: creates data on demand
– Requestors
• Sink: writes data to a persistent store
• Processor: analyzes and filters ‘events’
• Data providers register what data they can provide
• Processing sequence is set by the order of data requests
• Only Processors can halt the processing of an ‘event’
[Diagram: Source → Processor A → Processor B → Sink]
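A minimal sketch of that taxonomy, with invented interfaces that mirror the slide's vocabulary rather than the actual CLEO declarations:

// Hypothetical sketch of the on-demand taxonomy; the names mirror the
// slide's vocabulary but the interfaces are invented for illustration.
class Record;   // holds the data for one 'event' (see the Data Model slide)

// Providers deliver data only when someone asks for it.
class Provider {
public:
   virtual ~Provider() {}
   // Announce (register proxies for) the data this provider can supply.
   virtual void registerProxies(Record& record) = 0;
};
class Source   : public Provider {};  // reads data from a persistent store
class Producer : public Provider {};  // creates data on demand via an algorithm

// Requestors are run in sequence for each new Record.
class Requestor {
public:
   virtual ~Requestor() {}
   // Only Processors may return false, halting processing of this Record.
   virtual bool process(Record& record) = 0;
};
class Sink      : public Requestor {};  // writes requested data to storage
class Processor : public Requestor {};  // analyzes and filters 'events'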
Data Model
A Record holds all data that are related by life-time
e.g., Event Record holds Raw Data, Tracks, Calorimeter Showers, etc.
A Stream is a time-ordered sequence of Records
A Frame is a collection of Records that describe the state of the detector at an
instant in time.
All data are accessed via the exact same interface and mechanism
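The access pattern might be sketched as follows; the Stream names and member functions are illustrative only, chosen to match the extract() call shown on the Callback Mechanism slide:

// Hypothetical sketch of the Record / Stream / Frame vocabulary; the stream
// names and member functions are illustrative, not the exact CLEO interfaces.
enum Stream { kEvent, kBeamEnergy, kDetectorAlignment /* , ... */ };

// A Record holds all data with the same lifetime, e.g. the Event Record
// holds Raw Data, Tracks, Calorimeter Showers, ...
class Record { /* ... */ };

// A Frame bundles one Record from each Stream: the state of the detector
// at an instant in time.  Every datum is reached the same way.
class Frame {
public:
   const Record& record(Stream stream) const { return m_records[stream]; }
private:
   Record m_records[3];   // one Record per Stream, for illustration only
};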
Data Flow: Frame as Data Bus
[Diagram: the Frame acts as a data bus. Data Providers (data returned when requested): Sources deliver data from storage (Event Database, Calibration Database); Producers deliver data from an algorithm (TrackFinder, TrackFitter). Data Requestors (run sequentially for each new Record from a source): Processors analyze and filter data (SelectBtoKPi, EventDisplay); Sinks store data (Event List).]
Callback Mechanism
• Provider registers a Proxy for each data type it can create
• Proxies are placed in the Record and indexed with a key
– Type: the object type returned by the Proxy
– Usage: an optional string describing use of object
– Production: an optional run-time settable string
• Users access data via a type-safe templated function call
List<FitPion> pions;                       // container for the requested data
extract( iFrame.record(kEvent), pions );   // fills 'pions' from the Event Record on demand
• (based on ideas from BaBar’s Ifd package)
• extract call builds the key and asks Record for Proxy
• Proxy runs algorithm to deliver data
– Proxy caches data in case of another request
– If a problem occurs, an exception is thrown
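The caching behaviour could be sketched like this (an invented ProxyOf template, reusing the Record name from the Data Model sketch; the real proxies are additionally keyed by the Type/Usage/Production strings above):

#include <memory>
#include <stdexcept>

class Record;   // as in the Data Model sketch above

// Hypothetical sketch of a proxy: the algorithm (or read from storage) runs
// only on the first request for a given Record, and failures surface as
// exceptions, as described above.
template <typename T>
class ProxyOf {
public:
   virtual ~ProxyOf() {}

   // Called from extract() once it has built the key and located this proxy.
   const T& get(const Record& record) {
      if (!m_cache) {
         m_cache = make(record);            // may itself request other data
         if (!m_cache) {
            throw std::runtime_error("Proxy could not deliver its data");
         }
      }
      return *m_cache;                      // cached for later requests
   }

protected:
   // Concrete proxies implement the algorithm or the read from storage.
   virtual std::unique_ptr<T> make(const Record& record) = 0;

private:
   std::unique_ptr<T> m_cache;   // reset when a new Record arrives (not shown)
};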
Callback Example: Algorithm
[Diagram: the Processor SelectBtoKPi at the top of the chain; Producers Track Fitter, Track Finder and HitCalibrator providing FitPionsProxy, FitKaonsProxy, …, TracksProxy and CalibratedHitsProxy; Sources Raw Data File and Calibration DB providing RawDataProxy, PedestalProxy, AlignmentProxy, ….]
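To make the self-organizing chain concrete, a hypothetical FitPionsProxy, reusing the names from the diagram and the ProxyOf sketch above (none of this is the actual CLEO code), could satisfy a request by extracting its own inputs:

#include <memory>

// Hypothetical continuation of the ProxyOf sketch: a producer's proxy pulls
// its own inputs with extract(), so requesting FitPions walks the whole chain
// down to RawData, Pedestals and Alignment without any user-specified module
// ordering.  Track, FitPion and List are names taken from the slides.
class FitPionsProxy : public ProxyOf< List<FitPion> > {
protected:
   std::unique_ptr< List<FitPion> > make(const Record& eventRecord) override {
      List<Track> tracks;
      extract(eventRecord, tracks);   // invokes TracksProxy, which in turn
                                      // asks for CalibratedHits, RawData, ...
      auto pions = std::make_unique< List<FitPion> >();
      // ... fit each track with a pion mass hypothesis, fill 'pions' ...
      return pions;
   }
};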
Callback Example: Storage
[Diagram: the same Processor SelectBtoKPi; a single Source, the Event Database, providing FitPionsProxy, FitKaonsProxy, RawDataProxy, ….]
In both examples, same SelectBtoKPi shared object can be used
Critique of On-demand System
• Good
– Can be used for all data access needs
• Online software trigger, online data quality monitoring, online event display, calibration, reconstruction, MC generation, offline event display, analysis
– Self organizes calling chain
• Users can add Producers in any order
– Optimizes access from Storage
• Sources only need to say when a new Record (e.g., event) is available
• Data for a Record is retrieved/decoded on demand
• Bad
– Can be harder to debug since no explicit call order
• Use of exceptions key to simplifying debugging
– Performance testing is more challenging
What We Have Learned
• First release of the system was September 1998
• Callback mechanism can be made fast
– Proxy lookup takes less than 1 part in 10⁷ of CPU time in a simple job that processed 2,000 events/s on a moderate computer
• Cyclical dependencies are easy to find and fix
– Only happened once and was found immediately on first test
• Do not need to modify data once it is created
– Preliminary versions of data are given their own key
• Automatically optimizes performance of reconstruction
– Trivially added filter to remove junk events by using FoundTracks
• Optimize analysis by storing many small objects
– Only need to retrieve and decode data needed for current job
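The junk-event filter might be sketched as a Processor that requests FoundTracks and halts the event when none are present (hypothetical FoundTrack type, reusing the interfaces sketched earlier; the actual CLEO filter is not shown in the slides):

// Hypothetical sketch of the junk-event filter mentioned above: because data
// are made on demand, only the track finder runs for rejected events; the
// downstream fitting and analysis code is never invoked for them.
class JunkEventFilter : public Processor {
public:
   bool process(Record& eventRecord) override {
      List<FoundTrack> tracks;              // 'FoundTracks' from the slide,
      extract(eventRecord, tracks);         //  made on demand by the finder
      return !tracks.empty();               // false halts this event
   }
};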
User Reactions
• In general, user response has been very positive
– Previously CLEO used a ‘standard system’ written in FORTRAN
• Reconstruction coders like the system
– We have code skeleton generators for Proxy/Producer/Processor
• Only need to add their specific code
– Easy for them to test their code
• Analysis coders can still program the ‘old way’
– All analysis code in the ‘event’ routine
• Some analysis coders are pushing bounds
– Place selectors (e.g. cuts for tracks) in Producers
• Users share selectors via dynamically loaded Producers
– Processor only used to fill Histograms/Ntuples
– If the selections are stored, only the Processor needs to be rerun when reprocessing data
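A sketch of that division of labour, with hypothetical GoodTrack, Histogram and selector names (the pattern only, not real CLEO code):

// Hypothetical sketch of the pattern described above: a shared, dynamically
// loaded selector Producer publishes the cut track list (here 'GoodTrack'),
// and the user's Processor only extracts it and fills histograms/ntuples.
class FillTrackHistograms : public Processor {
public:
   bool process(Record& eventRecord) override {
      List<GoodTrack> tracks;               // delivered by the selector Producer,
      extract(eventRecord, tracks);         //  or read back if it was stored
      for (const auto& track : tracks) {
         m_momentum.fill(track.momentum()); // hypothetical histogram interface
      }
      return true;                          // analysis never halts the event
   }
private:
   Histogram m_momentum;
};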
Conclusion
• It is possible to build an ‘on demand’ system that is
– efficient
– debuggable
– capable of dealing with all data (not just data in an event)
– easy to write components for
– good for reconstruction
– acceptable to users
• Some reasons for success
– Skeleton code generators
• User only has to write new code, not infrastructure ‘glue’
– Users do not need to register what data they may request
• Data reads occur more frequently than writes
– Simple rule for when algorithms run
• If you add a Producer, it takes precedence over a Source