Transcript St. Clara - JLab Computer Center
Physics data processing with SOA
4/29/2020 V. Gyurjyan. CLAS12 software meeting 1
Architecture
Formal description
Purpose, functions, externally visible properties, interfaces
Internal components and their relationships
Scalability and maintenance
Evolution
What is SOA?
Style of building reliable systems that deliver functionality as services
Loose coupling between interacting services
Service
Atomic unit of an SOA.
Encapsulates logic, data, or process.
Autonomous.
Location transparent.
Defined by the messages it can accept and the responses it can give. Messages can be one-way, synchronous or asynchronous.
Services can be integrated to provide higher-level services.
A service can be completely self-contained, or it may depend on the availability of other services or other resources.
Service interaction
[Diagram: the Service Provider advertises to the Service Registry, the Service Consumer discovers the provider through the registry, and consumer and provider then interact.]
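The advertise, discover, and interact steps can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `ServiceRegistry`, `RegistryDemo`, and the string-in, string-out provider signature are hypothetical names chosen here, not part of ClaRA, and everything runs inside one process.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical in-process registry: maps a service name to a provider.
// A provider is modeled as a function from a request message to a response message.
class ServiceRegistry {
    private final Map<String, Function<String, String>> providers = new ConcurrentHashMap<>();

    // Advertise: a provider publishes itself under a name.
    void advertise(String name, Function<String, String> provider) {
        providers.put(name, provider);
    }

    // Discover: a consumer looks a provider up by name only.
    Optional<Function<String, String>> discover(String name) {
        return Optional.ofNullable(providers.get(name));
    }
}

class RegistryDemo {
    // Interact: the consumer talks to whatever provider the registry returned.
    // The "not-understood" fallback string is an assumption made for this sketch.
    static String request(ServiceRegistry registry, String name, String message) {
        return registry.discover(name)
                .map(provider -> provider.apply(message))
                .orElse("not-understood");
    }
}
```

The consumer depends only on the service name and the message format, so a provider can be replaced without the consumer changing.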
Loose Coupling
Flexibility
Scalability
Replaceability
Fault tolerance
Statelessness
[Diagram: Service Registry, Service Consumer, three Service Providers (labels 1, 2, 3), and Data storage.]
SOA is for distributed computing
SOA design can be used for a single system:
- individual processes as services
- communication through well-defined interfaces, using local channels or a high-speed interconnect
SO vs. OO
OO advocates tight coupling, i.e. users have static knowledge of the object.
Service has an internal thread of control.
SOA promotes system designs in which requests and responses need not be handled by the same set of communicating entities.
Do we need distributed physics data processing?
HEP and NEP go GRID.
GRID: distributed computing in which resources are often spread across different physical locations and administrative domains.
Two classifications for GRID:
- compute
- data
SO GRID
SOA Standards: WSDL, UDDI, BPEL, WS-Profile, WS-Security, WS-Choreography
Grid Standards: OGSI Extension to WSDL, WS-Resource, WS-ResourceLifetime, WS-ResourceProperties, WS-RenewableReferences, WS-ServiceGroup, WS-BaseFaults
CLAS12 mini-GRID infrastructure
[Diagram: mini-GRID sites CNU, ODU, JLAB, UVA.]
CLAs Reconstruction and Analyses framework
CLAS12 software framework and middleware
CLAS12 physics data processing services
Framework design choices
Separation between:
a) data and algorithm
b) transient and persistent data
Algorithm services
Algorithm services will process one type of data and generate data objects of a different type.
[Diagram: Data and Algorithm Service.]
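The idea that an algorithm service consumes one data type and produces another maps naturally onto a generic interface. The sketch below is an assumption made for illustration: `AlgorithmService`, `RawHit`, `SpacePoint`, and `SpacePointService` are hypothetical names, and the "algorithm" is a toy radial-distance calculation rather than real CLAS12 reconstruction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical contract: consume one data type, produce a different one.
interface AlgorithmService<I, O> {
    O process(I input);
}

// Illustrative data objects (not CLAS12 types).
class RawHit {
    final double x;
    final double y;

    RawHit(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

class SpacePoint {
    final double r;

    SpacePoint(double r) {
        this.r = r;
    }
}

// Toy algorithm: turns raw hits into space points holding the radial distance.
class SpacePointService implements AlgorithmService<List<RawHit>, List<SpacePoint>> {
    @Override
    public List<SpacePoint> process(List<RawHit> hits) {
        List<SpacePoint> points = new ArrayList<>();
        for (RawHit hit : hits) {
            points.add(new SpacePoint(Math.hypot(hit.x, hit.y)));
        }
        return points;
    }
}
```

Because the input and output types appear in the interface, the output of one service can only be fed to a service that declares the matching input type.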
Persistent and Transient Data
An algorithm service should not directly use data objects in persistent storage.
Transient data storage serves as the means of communication between physics algorithm services.
A data service translates data from persistent storage into transient storage.
Why not one data object?
Two different optimization criteria for applications using persistent and transient data.
Independence from the persistent storage technology.
Data service
[Diagram: Persistent Data, Adapter, Data Service, Data.]
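One way to picture the adapter and data service pairing is the sketch below. It assumes a byte-oriented store and a trivial one-byte-per-value decoding; `PersistentAdapter`, `TransientEvent`, `EventDataService`, and `InMemoryAdapter` are hypothetical names introduced here, not the framework's classes or the EVIO format.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical adapter: hides the persistent storage technology.
interface PersistentAdapter {
    byte[] read(String key);
}

// Hypothetical transient event object handed to algorithm services.
class TransientEvent {
    final String id;
    final int[] adcValues;

    TransientEvent(String id, int[] adcValues) {
        this.id = id;
        this.adcValues = adcValues;
    }
}

// Data service: translates persistent bytes into a transient object.
class EventDataService {
    private final PersistentAdapter adapter;

    EventDataService(PersistentAdapter adapter) {
        this.adapter = adapter;
    }

    TransientEvent load(String id) {
        byte[] raw = adapter.read(id);
        int[] adc = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            adc[i] = raw[i] & 0xFF; // treat each stored byte as one unsigned value
        }
        return new TransientEvent(id, adc);
    }
}

// In-memory stand-in for a real store; swapping it does not touch EventDataService.
class InMemoryAdapter implements PersistentAdapter {
    private final Map<String, byte[]> store = new HashMap<>();

    void put(String key, byte[] value) {
        store.put(key, value);
    }

    @Override
    public byte[] read(String key) {
        // Unknown keys yield an empty record rather than null.
        return store.getOrDefault(key, new byte[0]);
    }
}
```

Algorithm services would see only `TransientEvent`, so changing the storage technology means writing a new adapter and nothing else.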
Data Object categories
Event data
Detector data
Statistical data
Framework Services
One Supervisor service for each application.
Algorithm services (Tracking, Calibration, SegmentFinder, VertexReconstruction, TruthAssociation, etc.)
Data services (Geometry, PersistentEvtData, PersistentDetectorData, TransientEvtData, etc.)
JobOptions
Messaging
etc.
Hypothetical tracking
[Diagram: a Supervisor holding the Tracking state machine starts the algorithm services SpacePointFormation, CoarseTrackFinder, SeadMaker, VertexFinder, ClusterAnalyzer, AmbiguitySolver, TrackFitter, and TrackScoring. The services retrieve from and record to Transient Storage (TransientEvtData, TransientDetData, TransientStatData); EvtData, DetectorData, and StatData are also shown. Data stages: Raw Data, Space Points, Track candidates, Resolved Tracks, Final Tracks.]
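A supervisor that drives such a chain can be sketched as a small state machine. The states below are named after the data stages on the slide; everything else is an assumption for illustration: `TrackingSupervisor`, the strictly linear order of transitions, and the use of a plain map as the transient store are hypothetical and not taken from ClaRA.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// States named after the data stages on the slide.
enum TrackingState { RAW_DATA, SPACE_POINTS, TRACK_CANDIDATES, RESOLVED_TRACKS, FINAL_TRACKS }

// Hypothetical supervisor: one transition per state, each run by a service that
// retrieves from and records to a shared transient store.
class TrackingSupervisor {
    private final Map<TrackingState, Consumer<Map<String, Object>>> services =
            new EnumMap<>(TrackingState.class);
    private final Map<String, Object> transientStore = new HashMap<>();
    private final List<TrackingState> visited = new ArrayList<>();
    private TrackingState state = TrackingState.RAW_DATA;

    // Register the service that moves the chain out of the given state.
    void register(TrackingState from, Consumer<Map<String, Object>> service) {
        services.put(from, service);
    }

    Map<String, Object> store() {
        return transientStore;
    }

    List<TrackingState> visited() {
        return visited;
    }

    // Start services in order until the final state is reached.
    TrackingState run() {
        visited.add(state);
        while (state != TrackingState.FINAL_TRACKS) {
            Consumer<Map<String, Object>> service = services.get(state);
            if (service == null) {
                throw new IllegalStateException("no service registered for " + state);
            }
            service.accept(transientStore);
            state = TrackingState.values()[state.ordinal() + 1];
            visited.add(state);
        }
        return state;
    }
}
```

The services never call each other; they only read and write the transient store, and the supervisor alone decides what runs next.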
ClaRA implementation details
[Diagram: Platform with three Containers and an RMI link.]
ClaRA Service Categories
Normative services
NS - Administrative service: creates and manages all other services, environment administration
Supervisor service
SS - Contains the state machine of a particular data processing application
Data processing service
S
Communication between services
All service communications are through message transfer. Message format is ACL (Agent Communication Language) defined by FIPA (Foundation for Intelligent Physical Agents).
Each message is one of several predefined communication types.
Message structure:
- Sender
- Receiver
- Content (Data Object or Ontology Object)
- Language
- Ontology
- Protocol
Types:
- Inform-content
- Agree
- Refuse-reason
- Not understood
- Failure-reason
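The message structure and types listed above could be modeled roughly as follows. This is a plain stand-in written for illustration, not the FIPA or ClaRA message class: `MessageType`, `ServiceMessage`, and the `reply` helper are hypothetical names, and the field values in the usage note are made up.

```java
// Communication types from the slide, modeled as an enum (hypothetical names).
enum MessageType { INFORM, AGREE, REFUSE, NOT_UNDERSTOOD, FAILURE }

// Hypothetical message holder with the six fields from the slide plus its type.
final class ServiceMessage {
    final MessageType type;
    final String sender;
    final String receiver;
    final Object content;   // a data object or an ontology object
    final String language;
    final String ontology;
    final String protocol;

    ServiceMessage(MessageType type, String sender, String receiver, Object content,
                   String language, String ontology, String protocol) {
        this.type = type;
        this.sender = sender;
        this.receiver = receiver;
        this.content = content;
        this.language = language;
        this.ontology = ontology;
        this.protocol = protocol;
    }

    // Build a reply: sender and receiver swap, the conversation settings carry over.
    ServiceMessage reply(MessageType replyType, Object replyContent) {
        return new ServiceMessage(replyType, receiver, sender, replyContent,
                language, ontology, protocol);
    }
}
```

For example, a message built with sender `"supervisor"` and receiver `"tracker"` produces, via `reply(MessageType.AGREE, "ok")`, a message whose sender is `"tracker"` and whose receiver is `"supervisor"`.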
Data processing description
Service Creation
[Diagram: ClaRA platform with an administrative domain and container; labels NA, NC, NR; a Service container holding Services.]
Service: Java or C/C++ through JNI
Legacy software/hardware virtualization
A service abstracts a software/hardware component.
The service invokes actions that monitor or change the component state.
[Diagram: three services (S) on the ClaRA Platform, each connected through IPC to a legacy component: Fortran Program, Farm Node, Storage.]
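Wrapping a legacy component behind a service might look like the sketch below. It assumes the component accepts short text commands over some IPC channel; `LegacyComponent`, `LegacyService`, `FakeComponent`, and the `STATUS`/`START`/`STOP` commands are hypothetical and chosen only to show one monitoring action and two state-changing actions.

```java
// Hypothetical view of a legacy component reachable over some IPC channel.
interface LegacyComponent {
    String send(String command); // returns the component's state as text
}

// Service wrapper: callers see actions, not the IPC details.
class LegacyService {
    private final LegacyComponent component;

    LegacyService(LegacyComponent component) {
        this.component = component;
    }

    // Monitoring action: read the component state without changing it.
    String status() {
        return component.send("STATUS");
    }

    // Control actions: ask the component to change state, then report the new state.
    String start() {
        component.send("START");
        return status();
    }

    String stop() {
        component.send("STOP");
        return status();
    }
}

// Stand-in for a real Fortran program or farm node, used only to exercise the wrapper.
class FakeComponent implements LegacyComponent {
    private String state = "IDLE";

    @Override
    public String send(String command) {
        if (command.equals("START")) {
            state = "RUNNING";
        } else if (command.equals("STOP")) {
            state = "IDLE";
        }
        return state;
    }
}
```

A real implementation would replace `FakeComponent` with code that talks to the actual program or device, leaving `LegacyService` and its callers unchanged.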
Summary
SOA is widely seen as the basis for developing flexible, highly scalable software systems that can span management and ownership domains, regardless of the hardware and software platforms deployed in each.
SOA encourages modularity and encapsulation, and promotes a hierarchical system structure.
The normative services of the framework have been programmed and tested.
Legacy software/hardware virtualization as a service has been developed and tested.
System distribution and scalability have been tested.
Future plans
Develop PersistentEvtDataService, capable of handling the EVIO data format (based on JEVIO).
Define persistent detector data IO, and develop PersistentDetectorDataService.
Define persistent statistics data format, and develop PersistentStatisticsDataService.
Define transient data format and interface, and develop TransientDataServices.
Develop ClaRA-specific ontology concepts for use in service description and communication.