St. Clara - JLab Computer Center

Physics data processing with SOA

4/29/2020 V. Gyurjyan. CLAS12 software meeting 1

Architecture

- Formal description
- Purpose, functions, externally visible properties, interfaces
- Internal components and their relationships
- Scalability and maintenance
- Evolution

What is SOA?

- Style of building reliable systems that deliver functionality as services
- Loose coupling between interacting services

Service

- Atomic unit of an SOA.
- Encapsulates logic, data, or process.
- Autonomous
- Location transparency
- It is defined by the messages it can accept and the responses it can give. Messages can be one-way, synchronous or asynchronous.
- Services can be integrated to provide higher-level services.
- A service can be completely self-contained, or it may depend on the availability of other services or other resources.

Service interaction

[Diagram: a Service Provider advertises with the Service Registry; a Service Consumer discovers the service through the registry and then interacts with the provider.]
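The advertise, discover, interact cycle can be sketched in a few lines. This is a minimal in-process illustration, not ClaRA code: the names `ServiceRegistry`, `advertise`, `discover` and `echo_provider` are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical type: a service is anything that maps a request to a response.
Service = Callable[[str], str]


class ServiceRegistry:
    """Maps service names to providers so consumers never hold a direct reference."""

    def __init__(self) -> None:
        self._providers: Dict[str, Service] = {}

    def advertise(self, name: str, provider: Service) -> None:
        # Provider side: publish the service under a well-known name.
        self._providers[name] = provider

    def discover(self, name: str) -> Service:
        # Consumer side: look the service up by name only.
        try:
            return self._providers[name]
        except KeyError:
            raise LookupError(f"no provider advertised for {name!r}") from None


def echo_provider(request: str) -> str:
    """A trivial provider: replies with the request it received."""
    return f"echo: {request}"


registry = ServiceRegistry()
registry.advertise("echo", echo_provider)   # 1. advertise
service = registry.discover("echo")         # 2. discover
reply = service("hello")                    # 3. interact
print(reply)
```

Because the consumer only knows the name "echo", the provider behind it can be replaced without touching consumer code.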

Loose Coupling

- Flexibility
- Scalability
- Replaceability
- Fault tolerance

Statelessness

[Diagram: a Service Registry, a Service Consumer, several Service Providers (labels 1, 2, 3), and Data storage.]
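One possible reading of the statelessness diagram is that providers keep no per-consumer state, so successive requests can be served by different providers that share a data storage. The sketch below illustrates that reading only; `DataStorage` and `CountingProvider` are hypothetical names.

```python
from typing import Dict


class DataStorage:
    """Stand-in for the shared data storage (hypothetical API)."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def increment(self, key: str) -> int:
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


class CountingProvider:
    """A provider that keeps no per-client state of its own."""

    def __init__(self, name: str, storage: DataStorage) -> None:
        self.name = name
        self._storage = storage

    def handle(self, client_id: str) -> int:
        # All conversational state lives in the shared storage,
        # so any provider instance can serve any request.
        return self._storage.increment(client_id)


storage = DataStorage()
providers = [CountingProvider(f"provider-{i}", storage) for i in (1, 2, 3)]

# The same client is served by a different provider on every call.
results = [p.handle("consumer-A") for p in providers]
print(results)
```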

SOA is for distributed computing

SOA design can be used for a single system:
- individual processes as services
- communication through well-defined interfaces, using local channels or a high-speed interconnect

SO vs. OO

- OO advocates tight coupling, i.e. users have static knowledge of the object.
- A service has an internal thread of control.
- SOA promotes system designs in which requests and responses need not be handled by the same set of communicating entities.

Do we need distributed physics data processing?

- HEP and NEP go GRID.
- GRID: distributed computing in which resources are often spread across different physical locations and administrative domains.
- Two classifications for GRID: compute and data.

SO GRID

SOA Standards
- WSDL
- UDDI
- BPEL
- WS-Profile
- WS-Security
- WS-Choreography

Grid Standards
- OGSI
- Extension to WSDL
- WS-Resource
- WS-ResourceLifetime
- WS-ResourceProperties
- WS-RenewableReferences
- WS-ServiceGroup
- WS-BaseFaults

CLAS12 mini-GRID infrastructure

[Diagram: mini-GRID sites CNU, ODU, JLAB and UVA.]

CLAs Reconstruction and Analyses framework (ClaRA)

- CLAS12 software framework and middleware
- CLAS12 physics data processing services

Framework design choices

Separation between:
a) data and algorithm
b) transient and persistent data

Algorithm services

 Algorithm services will process one type of data and generate data objects of a different type.

[Diagram: Data and an Algorithm Service.]
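As a rough illustration of an algorithm service that consumes one data type and produces another, consider the sketch below. The types `RawHit` and `SpacePoint`, the class `SpacePointService`, and the linear geometry are assumptions made for the example, not part of the framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RawHit:
    """Input data type (hypothetical): a wire hit with layer and wire number."""
    layer: int
    wire: int


@dataclass(frozen=True)
class SpacePoint:
    """Output data type (hypothetical): a position in arbitrary units."""
    x: float
    z: float


class SpacePointService:
    """Algorithm service: consumes RawHit objects, produces SpacePoint objects."""

    def __init__(self, wire_pitch: float, layer_spacing: float) -> None:
        self.wire_pitch = wire_pitch
        self.layer_spacing = layer_spacing

    def process(self, hits: List[RawHit]) -> List[SpacePoint]:
        # Purely illustrative geometry: position scales linearly with the indices.
        return [
            SpacePoint(h.wire * self.wire_pitch, h.layer * self.layer_spacing)
            for h in hits
        ]


service = SpacePointService(wire_pitch=0.5, layer_spacing=2.0)
points = service.process([RawHit(layer=1, wire=4), RawHit(layer=2, wire=6)])
print(points)
```

The service holds only configuration, never the data itself, which keeps data and algorithm separate.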

Persistent and Transient Data

- Algorithm services should not directly use data objects in persistent storage.
- Transient data storage serves as the means of communication between physics algorithm services.
- A data service translates data from persistent storage into transient storage.

Why not one data object?

- Applications using persistent data and applications using transient data have two different optimization criteria.
- Algorithms stay independent of the persistent storage technology.

Data service

[Diagram: Persistent Data, an Adapter, the Data Service, and Data.]
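The adapter arrangement can be sketched as follows. This is a minimal illustration under stated assumptions: the `Event` type, the `PersistentDataAdapter` interface and the one-JSON-record-per-line format are invented for the example and say nothing about the actual CLAS12 data formats.

```python
import json
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Event:
    """Transient representation used by algorithm services (hypothetical)."""
    number: int
    adc: List[int]


class PersistentDataAdapter(Protocol):
    """Hides one persistent storage technology behind a common interface."""

    def read_events(self) -> List[Event]: ...


class JsonLinesAdapter:
    """Adapter for an illustrative format: one JSON record per line."""

    def __init__(self, text: str) -> None:
        self._text = text

    def read_events(self) -> List[Event]:
        records = (json.loads(line) for line in self._text.splitlines() if line.strip())
        return [Event(number=r["n"], adc=list(r["adc"])) for r in records]


class DataService:
    """Translates persistent data into transient objects via an adapter."""

    def __init__(self, adapter: PersistentDataAdapter) -> None:
        self._adapter = adapter

    def load(self) -> List[Event]:
        return self._adapter.read_events()


persistent = '{"n": 1, "adc": [10, 12]}\n{"n": 2, "adc": [7]}\n'
events = DataService(JsonLinesAdapter(persistent)).load()
print(events)
```

Switching storage technology then means writing another adapter; `DataService` and every algorithm service that consumes `Event` objects stay unchanged.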

Data Object categories

- Event data
- Detector data
- Statistical data

Framework Services

- One Supervisor service for each application.
- Algorithm services (Tracking, Calibration, SegmentFinder, VertexReconstruction, TruthAssociation, etc.)
- Data services (Geometry, PersistentEvtData, PersistentDetectorData, TransientEvtData, etc.)
- JobOptions
- Messaging
- etc.

Hypothetical tracking

[Diagram: hypothetical tracking application.
Supervisor: holds the tracking state machine and starts the algorithm services.
Algorithm services: SpacePointFormation, CoarseTrackFinder, SeadMaker, VertexFinder, ClusterAnalyzer, AmbiguitySolver, TrackFitter, TrackScoring.
Data services: EvtData, DetectorData, StatData, TransientEvtData, TransientDetData, TransientStatData.
Transient storage contents, retrieved and recorded by the services: Raw Data, Space Points, Track candidates, Resolved Tracks, Final Tracks.]
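The interplay of the Supervisor, the algorithm services and the transient storage might look roughly like the sketch below. It assumes a strictly linear state machine and uses placeholder string transformations instead of real tracking; only a subset of the named services is shown, and the function bodies are invented for illustration.

```python
from typing import Callable, Dict, List

# Transient storage: the only channel between algorithm services.
TransientStore = Dict[str, List[str]]


def space_point_formation(store: TransientStore) -> None:
    store["SpacePoints"] = [f"sp({hit})" for hit in store["RawData"]]


def coarse_track_finder(store: TransientStore) -> None:
    store["TrackCandidates"] = [f"cand({sp})" for sp in store["SpacePoints"]]


def ambiguity_solver(store: TransientStore) -> None:
    # Illustrative rule only: drop duplicate candidates, keep first occurrence.
    store["ResolvedTracks"] = list(dict.fromkeys(store["TrackCandidates"]))


def track_fitter(store: TransientStore) -> None:
    store["FinalTracks"] = [f"fit({trk})" for trk in store["ResolvedTracks"]]


class Supervisor:
    """Runs a linear state machine: each state starts one service."""

    def __init__(self, states: List[Callable[[TransientStore], None]]) -> None:
        self._states = states
        self.history: List[str] = []

    def run(self, store: TransientStore) -> TransientStore:
        for service in self._states:
            service(store)  # "start": the service retrieves its input and records its output
            self.history.append(service.__name__)
        return store


supervisor = Supervisor(
    [space_point_formation, coarse_track_finder, ambiguity_solver, track_fitter]
)
store = supervisor.run({"RawData": ["h1", "h2", "h1"]})
print(store["FinalTracks"])
```

No service calls another directly; each one only retrieves from and records to the store, so the Supervisor alone decides the processing order.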

ClaRA implementation details

[Diagram: a Platform made up of Containers that communicate over RMI.]

ClaRA Service Categories

Normative services (NS)
- Administrative service
- Create and manage all other services
- Environment administration

Supervisor service (SS)
- Contains the state machine of the particular data processing application

Data processing service (S)

Communication between services

- All service communications are through message transfer. The message format is ACL (Agent Communication Language), defined by FIPA (Foundation for Intelligent Physical Agents).
- Each message is one of several predefined communication types.
- Message structure:
  - Sender
  - Receiver
  - Content (Data Object or Ontology Object)
  - Language
  - Ontology
  - Protocol
- Types: Inform-content, Agree, Refuse-reason, Not understood, Failure-reason
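The message structure listed above maps naturally onto a small record type. The sketch below is a plain container for illustration only; it is not a class from any FIPA or agent library, and the field values (`"json"`, `"clara-tracking"`, `"fipa-request"`, the service names) are assumed example values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class MessageType(Enum):
    """The communication types named on the slide."""
    INFORM = "inform"
    AGREE = "agree"
    REFUSE = "refuse"
    NOT_UNDERSTOOD = "not-understood"
    FAILURE = "failure"


@dataclass(frozen=True)
class Message:
    """Plain container mirroring the listed fields (hypothetical class)."""
    kind: MessageType
    sender: str
    receiver: str
    content: Any  # a data object or an ontology object
    language: str
    ontology: str
    protocol: str

    def reply(self, kind: MessageType, content: Any) -> "Message":
        # A reply swaps sender and receiver and keeps the conversation context.
        return Message(kind, self.receiver, self.sender, content,
                       self.language, self.ontology, self.protocol)


request = Message(MessageType.INFORM, "supervisor", "TrackFitter",
                  {"event": 42}, "json", "clara-tracking", "fipa-request")
refusal = request.reply(MessageType.REFUSE, "no calibration loaded")
print(refusal.kind.value, refusal.sender, "->", refusal.receiver)
```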

Data processing description

Service Creation

- Administrative domain and container
- Service: Java or C/C++ through JNI

[Diagram: the ClaRA platform with components labeled NC, NR and NA, and a service container holding services.]

Legacy software/hardware virtualization

- A service abstracts a software/hardware component.
- The service invokes actions that monitor or change the component state.

[Diagram: three services (S) on the ClaRA Platform, each connected through IPC to a legacy component: a Fortran Program, a Farm Node, and Storage.]
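One way to picture a service that fronts a legacy program over IPC is a thin wrapper around a child process. The sketch below assumes the IPC channel is a pair of pipes (stdin/stdout) and uses a one-line Python command as a stand-in for the legacy executable; `LegacyProgramService` is a hypothetical name, and the mechanism ClaRA actually uses may differ.

```python
import subprocess
import sys
from typing import List


class LegacyProgramService:
    """Wraps an external executable so it looks like any other service."""

    def __init__(self, command: List[str]) -> None:
        self._command = command

    def invoke(self, request: str, timeout: float = 10.0) -> str:
        # IPC here is a pipe pair: request on stdin, response on stdout.
        completed = subprocess.run(
            self._command,
            input=request,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )
        return completed.stdout.strip()


# Stand-in for a legacy binary (e.g. a Fortran program): upper-cases its input.
legacy_command = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
service = LegacyProgramService(legacy_command)
response = service.invoke("status")
print(response)
```

Callers see only `invoke`, so the legacy program can later be replaced by a native service without changing them.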

Summary

- SOA is widely seen as the basis for developing flexible, highly scalable software systems that can span management and ownership domains, regardless of the hardware and software platforms deployed in each.
- SOA encourages modularity and encapsulation, and promotes a hierarchical system structure.
- Normative services of the framework are programmed and tested.
- Legacy software/hardware virtualization as a service has been developed and tested.
- System distribution and scalability have been tested.

Future plans

- Develop PersistentEvtDataService, capable of handling the EVIO data format (based on JEVIO).
- Define persistent detector data IO, and develop PersistentDetectorDataService.
- Define the persistent statistics data format, and develop PersistentStatisticsDataService.
- Define the transient data format and interface, and develop TransientDataServices.
- Develop ClaRA-specific ontology concepts, used in service description and communications.
