ClaRA Framework - JLab Computer Center

CLAs Reconstruction and Analysis
Physics Data Processing with SOA based Framework
Vardan Gyurjyan on behalf of the CLAS12 software group
Outline
Problem statement
SOA based framework as a solution
Current status of the ClaRA project
Future plans
Conclusion
V. Gyurjyan
July 21, 2015
Computing power
Network Integration
Node/Rack Integration
CMOS Technology
Single Chip Integration
Integration
Chip: 4 cores
Compute Card: 1 chip, 13.6 GF/s
Node Card: 32 compute cards, 435 GF/s
Rack: 32 node cards, 13.9 TF/s
IBM Blue Gene: 72 racks, 1 PF/s
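The performance figures quoted for each level of integration are consistent with one another, which a short calculation can confirm. The sketch below only multiplies the numbers given on the slide; the variable names are illustrative and not taken from the presentation.

```python
# Figures quoted on the slide; all names below are illustrative assumptions.
GFLOPS_PER_CARD = 13.6        # one 4-core chip per card
CARDS_PER_BOARD = 32          # cards grouped on the next level up
BOARDS_PER_RACK = 32
RACKS_IN_SYSTEM = 72

# Aggregate peak performance at each level of integration.
board_gflops = GFLOPS_PER_CARD * CARDS_PER_BOARD
rack_tflops = board_gflops * BOARDS_PER_RACK / 1000.0
system_pflops = rack_tflops * RACKS_IN_SYSTEM / 1000.0

print(f"board:  {board_gflops:.1f} GF/s")
print(f"rack:   {rack_tflops:.2f} TF/s")
print(f"system: {system_pflops:.3f} PF/s")
```

The products round to the 435 GF/s, 13.9 TF/s, and 1 PF/s stated on the slide.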
Network Evolution
[Figure: normalized growth since 1980, logarithmic scale from 1 to 1,000,000, time axis 1980 to 2001.]
Network capacity: 2x / 7 months
User traffic: 2x / 12 months
Router capacity: 2.2x / 18 months
Moore’s Law: 2x / 18 months
Cabling milestones: CAT4, 10 Mbps (10base-T); CAT5, 100 Mbps (100base-T); CAT5e, 1 Gbps (1000base-T); 2003, CAT6, 10 Gbps; 2007, CAT7, 100 Gbps
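To compare the growth rates quoted in the figure on a common footing, each "factor per period" can be converted into a growth factor over a fixed interval. The following is a minimal sketch of that arithmetic; the function name and the ten-year horizon are assumptions chosen for illustration.

```python
def growth_factor(factor: float, period_months: float, years: float) -> float:
    """Compound growth over `years` when a quantity grows by `factor` every `period_months`."""
    return factor ** (years * 12.0 / period_months)

# (factor, period in months) as quoted in the figure.
TRENDS = {
    "network capacity": (2.0, 7),
    "user traffic": (2.0, 12),
    "router capacity": (2.2, 18),
    "Moore's Law": (2.0, 18),
}

YEARS = 10  # illustrative horizon, not taken from the slide
for name, (factor, period) in TRENDS.items():
    print(f"{name}: x{growth_factor(factor, period, YEARS):,.0f} in {YEARS} years")
```

Under these quoted rates, network capacity outgrows processor performance by several orders of magnitude over a decade.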
High Performance Computing Trends
1. Exponential growth in processor performance (coming to an end)
2. Power cost = System cost: invention required
3. Growth in level of parallelism (near term solution)
IBM Approach – Path to Petascale
Multiple modest cores on a single chip rather than one high-performance processor
  - Watts/FLOP will not improve much from future technologies.
Linux environment and MPI (Message Passing Interface, the standard messaging interface)
"The Network is the Computer."
John Gage
Specifics of the Offline Software
Lifetime of the software >= lifetime of the experiment.
Collaborative nature of the development.
Coexistence of parallel running applications for the single experiment.
Unprecedented scale and complexity of the physics computing environment.
Physics computing environment must keep up with fast growing computing technologies.
Large worldwide user base.
PDP (Physics Data Processing) Application
Conventional vs. parallel/distributed
Running Conventional Software Application
[Flowchart with steps Copy/checkout, Configure, Compile, Fix errors, Run; decision nodes “ok” and “Modified?” with yes/no branches; end states “Give up” and “Complain”.]
Programming Errors
Compile time
  - Program does not compile.
  - Compiler reports a “best guess” of the problem
  - Undeclared variables or functions
  - Missing semicolon or brace
  - Typos
  - Missing files or libraries
  - Type ambiguities
Run time
  - Executable crashes or has unexpected behavior
  - May not appear for all conditions or all data sets
  - Uninitialized variables
  - Memory errors
  - Numeric errors
  - Type errors in print statements
  - Closing a NULL file pointer
  - Accessing a NULL pointer
  - Variables out of scope
Challenges of the Conventional Approach
Difficult to organize and coordinate activities
Difficult to maintain
Inevitable fragmentation of the software
Poor scalability
Computing skills are required to use physics data processing applications
ABC
CLAS 6: groups A and B
CLAS 12: groups A, B, and C
A+B << C
C: requires few or no programming skills
One way to eat an elephant
A bite at a time
Where do we start?
Each bite is a clear, simple, single-purpose application, developed by a group B member.
Group A, in tight collaboration with groups B and C, shall control and manage the process, never losing its maniacal focus on the big picture (the elephant).
[Diagram: Define a piece of a big problem; Understand the problem; Distill the problem to its essence; solution; Test.]

“Things should be made as simple as
possible, but not simpler.”
Albert Einstein
Language and Architecture Evolution
[Timeline figure, axis from 1930 to 2020, with marked years 1954, 1964, 1972, 1983, 1991, and 2000. Labeled eras: Assembly Language; Structured and Procedural programming; Object Oriented programming; Service Oriented programming.]
SOA
SOA promotes the goal of separating service users from the service implementation.
Style of building reliable systems that deliver functionality as services
Loose coupling between interacting services
Directories and addressing mechanisms are at the center of SOA.
[Diagram: a Program is complex, with arbitrary input and output formats; a Service is specialized and simple, with standard input and output formats.]
Attributes of Services
Well defined, easy-to-use, somewhat standardized interface
Self-contained with no visible dependencies to other services
(almost) Always available but idle until requests come
Location transparency
Easily accessible and usable readily, no “integration” required
New services can be offered by combining existing services
Quantifiable quality of service
Service Interface
Standard message based
Highly polymorphic
  - Intent is enough
Implementation can be changed in ways that do not break all the service consumers
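A message-based, intent-driven interface of this kind can be sketched in a few lines. This is a minimal illustration, not ClaRA's actual API: the `Message` and `Service` classes, the `on` and `request` methods, and the "mean" intent are all hypothetical names introduced here.

```python
import statistics
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Message:
    """Hypothetical standard message: an intent plus a payload in an agreed format."""
    intent: str
    payload: Any


class Service:
    """Hypothetical service: consumers see only messages, never the implementation."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._handlers: Dict[str, Callable[[Any], Any]] = {}

    def on(self, intent: str, handler: Callable[[Any], Any]) -> None:
        # Registering again for the same intent swaps the implementation.
        self._handlers[intent] = handler

    def request(self, message: Message) -> Message:
        handler = self._handlers.get(message.intent)
        if handler is None:
            return Message("error", f"unsupported intent: {message.intent}")
        return Message("result", handler(message.payload))


stat = Service("stat")
stat.on("mean", lambda xs: sum(xs) / len(xs))
first_reply = stat.request(Message("mean", [1.0, 2.0, 3.0]))

# The provider replaces the implementation; the consumer's request is unchanged.
stat.on("mean", statistics.fmean)
second_reply = stat.request(Message("mean", [1.0, 2.0, 3.0]))
print(first_reply.payload, second_reply.payload)
```

Because the consumer states only its intent, the provider is free to change how the result is computed as long as the message format stays the same.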
Service Orientation is scalable
End users can consume and combine a lot of services since they don’t have to know or “learn” how the services are made.
Service providers (A+B) can offer their services to a lot more consumers by optimizing
  - The user interface
  - Access
  - Implementations
“On Demand” Physics Data Processing
Use software as you need it
Much lower setup time; forget about
  - Installation
  - Implementation
  - Training
  - Maintenance
Scalable and effective usage of resources
  - Parallelism (CPU, Storage, Bandwidth…)
What is ClaRA?
Framework that implements SOA.
Service development environment.
Toolbox of generic physics data processing services.
Network distributed platform.
The “Glue”, binding together services into an algorithmic data analysis application.
Design criteria
Framework services shall be simple to use and easy to learn.
Framework services shall be customizable so that they can adapt to different data processing tasks.
Framework shall provide context sensitive help and assistance, with many real world physics data processing application examples.
Framework shall provide ready to use services, encapsulating essential functionalities of the physics data processing system.
Services shall be reusable and easily replaceable.
Physics data processing application design and implementation shall require few or no programming skills.
Neither a specific computing environment nor compiling shall be necessary to build and run a physics data processing application.
Framework shall provide a graphical environment for physics data processing application development.
The framework's platform shall be network distributed, and shall have temporal continuity.
The new system shall provide World Wide Web access to the services for remote configuration and execution of the data processing applications. The necessary security considerations must be addressed.
Data and Algorithm
The framework advocates a clear separation between:
  a) data and algorithm
  b) transient and persistent data
Methods in a data object will be limited to manipulations of its internal data members only.
An algorithm will process one type of data and generate data objects of a different type.
[Diagram: Data -> Algorithm -> Data]
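The separation described on this slide can be illustrated with a small sketch: data objects carry only state and methods over that state, while an algorithm object consumes one data type and produces another. The class names `RawHits`, `SpacePoints`, and `SpacePointMaker`, and the toy conversion they perform, are hypothetical and are not part of ClaRA.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RawHits:
    """Data object: methods touch only its own members."""
    positions: List[Tuple[float, float]]

    def count(self) -> int:
        return len(self.positions)


@dataclass
class SpacePoints:
    """Data object of a different type, produced by an algorithm."""
    points: List[Tuple[float, float, float]]


class SpacePointMaker:
    """Algorithm object: takes RawHits in, returns SpacePoints out (toy logic)."""

    def __init__(self, z_plane: float) -> None:
        self.z_plane = z_plane

    def process(self, hits: RawHits) -> SpacePoints:
        # Illustrative only: place every 2D hit on a fixed z plane.
        return SpacePoints([(x, y, self.z_plane) for x, y in hits.positions])


hits = RawHits([(0.0, 1.0), (2.0, 3.0)])
space_points = SpacePointMaker(z_plane=1.5).process(hits)
print(hits.count(), space_points.points)
```

Keeping the physics logic out of the data classes means an algorithm can be replaced without touching the data definitions that other algorithms depend on.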
Persistent and Transient Data
Physics algorithm objects should not use data objects directly in the persistent storage.
Transient data storage as a means of communication between physics algorithms.
Two different optimization criteria for applications using persistent and transient data.
Being independent from the persistent storage technology.
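One way to picture these rules is a sketch in which algorithms exchange data only through a transient store, and a single function is the only code that touches persistent storage. Everything here is an assumption made for illustration: the `PersistentStore` and `TransientStore` classes, the key names, and the toy algorithms do not come from ClaRA.

```python
class PersistentStore:
    """Stand-in for files or a database; the storage technology is hidden behind load/save."""

    def __init__(self, records):
        self._records = dict(records)

    def load(self, key):
        return self._records[key]

    def save(self, key, value):
        self._records[key] = value


class TransientStore:
    """In-memory store that algorithms use to pass data to one another."""

    def __init__(self):
        self._data = {}

    def record(self, name, value):
        self._data[name] = value

    def retrieve(self, name):
        return self._data[name]


def coarse_track_finder(transient):
    # Toy selection: keep positive values as track candidates.
    hits = transient.retrieve("raw_data")
    transient.record("track_candidates", [h for h in hits if h > 0])


def track_fitter(transient):
    # Toy fit: scale each candidate.
    candidates = transient.retrieve("track_candidates")
    transient.record("final_tracks", [2 * c for c in candidates])


def run_event(persistent, event_id):
    """Only this function touches persistent storage; algorithms see transient data only."""
    transient = TransientStore()
    transient.record("raw_data", persistent.load(event_id))
    coarse_track_finder(transient)
    track_fitter(transient)
    persistent.save(f"{event_id}/tracks", transient.retrieve("final_tracks"))


store = PersistentStore({"event-1": [3, -1, 5]})
run_event(store, "event-1")
print(store.load("event-1/tracks"))
```

Swapping `PersistentStore` for a different backend would not require any change to the two algorithms, which is the independence the slide asks for.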
Data Object categories
Data: Event, Detector, Statistical
ClaRA Platform
[Architecture diagram: Service Containers on node-1, node-2, node-3, and node-N; a Front-End Container with the Normative Service; components labeled SCC; Web Service 1, Web Service 2, Web Service 3, and Web Service N; links labeled cMsg and SOAP; Users and the WWW.]
Current Status
[Diagram: ClaRA cMsg Platform (Thin Clients) with Geometry Service, Magnetic Field Map Service, GEMC Service, Tracking Service, bCNU Service, and Event Data Service; ClaRA WebServices Platform (WWW) with Math Service, Stat Service, Probability Service, Geometry Service, and Matrices Service.]
Examples
EVIO event producer and EVIO event consumer services (C++).
Data producer and data consumer services. The C examples use the cMsg payload (ASCII).
C++ geometry service client example
Java geometry service client example
Web services JSP clients
Tracking composite application
[Diagram of the ClaRA cMsg Platform with thin clients, transient data, and persistent data; services shown: Spacepoint maker, Coarse track finder, Cluster Analyzer, Ambiguity solver, Track fitter, Histogram builder.]

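The idea of a composite application, built by naming existing services rather than writing new code, can be sketched as follows. This is a hypothetical illustration: the registry, the `build_application` function, the service order, and the toy processing inside each service are assumptions and do not reflect ClaRA's implementation.

```python
from typing import Callable, Dict, List

Event = Dict[str, list]
ServiceFn = Callable[[Event], Event]


def cluster_analyzer(event: Event) -> Event:
    # Toy logic: keep hits at or above a threshold.
    return {**event, "clusters": [h for h in event["hits"] if h >= 1.0]}


def coarse_track_finder(event: Event) -> Event:
    # Toy logic: order clusters to form candidates.
    return {**event, "candidates": sorted(event["clusters"])}


def track_fitter(event: Event) -> Event:
    # Toy logic: scale candidates to produce "fitted" tracks.
    return {**event, "tracks": [c * 0.5 for c in event["candidates"]]}


# Hypothetical registry mapping service names to implementations.
REGISTRY: Dict[str, ServiceFn] = {
    "ClusterAnalyzer": cluster_analyzer,
    "CoarseTrackFinder": coarse_track_finder,
    "TrackFitter": track_fitter,
}


def build_application(service_names: List[str]) -> ServiceFn:
    """Compose registered services, in the given order, into one application."""
    services = [REGISTRY[name] for name in service_names]

    def application(event: Event) -> Event:
        for service in services:
            event = service(event)
        return event

    return application


tracking = build_application(["ClusterAnalyzer", "CoarseTrackFinder", "TrackFitter"])
result = tracking({"hits": [0.2, 3.0, 1.0]})
print(result["tracks"])
```

Here the application is described entirely by a list of service names, which is the sense in which composing an application needs few or no programming skills.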
Tracking application service decomposition
[Diagram: Tracking State machine. A Supervisor and the services SeadMaker, SpacePointFormation, CoarseTrackFinder, VertexFinder, ClusterAnalyzer, AmbiguitySolver, TrackFitter, and TrackScoring, connected by arrows labeled “start”, “retrieve”, and “record”. Data services: EvtData, DetectorData, StatData. Transient Storage: TransientEvtData, TransientDetData, TransientStatData. Data products: Raw Data, Space Points, Track candidates, Resolved Tracks, Final Tracks.]
Performance measurements