Architecture Nuts & Bolts
Vincenzo Innocente, BluePrint RTAG
Vincenzo Innocente CMS
No Flames
It is very difficult to use as a (good/bad) example any of those marvelous frameworks and toolkits that never made it into a popular product.
All my respect goes to those who developed products that have the misfortune to be used daily by thousands of people and are easy targets for my (positive/negative) criticisms…
AHisto.fill
TObject.draw
~G4RunManager
Please accept my apologies
CMS Data Analysis Model
[Diagram. Boxes: Environmental data; Detector Control; Online Monitoring; Quasi-online Reconstruction; Event Filter Object Formatter; Simulation; Persistent Object Store Manager; Database Management System; Data Quality; Calibrations; Group Analysis; User Analysis on demand; Physics Paper. Arrow labels: "store", "Request part of event", "Store rec-Obj", "Store rec-Obj and calibrations".]
Architecture Overview
[Diagram. Labels: Data Browser; Generic analysis Tools; GRID; Analysis job wizards; ORCA; OSCAR; COBRA; FAMOS; Detector/Event Display; Federation wizards; Software development and installation; Consistent User Interface; Objy tools; CMS tools; Distributed Data Store & Computing Infrastructure; Coherent set of basic tools and mechanisms.]
Simulation, Reconstruction & Analysis Software System
[Layered diagram. Layer labels: Physics modules (uploadable on the Grid); Specific Framework; adapters and extensions; Basic Services. Component labels: Event Filter; Reconstruction Algorithms; Physics Analysis; Data Monitoring; Calibration Objects; Configuration Objects; Event Objects; Grid-Aware Data-Products; ODBMS; Geant3/4; CLHEP; Paw Replacement; C++ standard library; Extension toolkit.]
Framework Dynamics
Framework:
  - Controls flow of execution
  - Defines object interaction (implementing design patterns)
  - Calls client (plug-in) functions
  - May offer a traditional “client API” for integration in more specialized frameworks
Clients specialize framework behavior:
  - Inheriting from framework classes
  - Overriding their methods
  - Instantiating other framework classes
  - Interacting directly with other, more general, frameworks
[Diagram labels: Flow of control; Call backs; Client API; Framework API; Customized Extension (client plug-in)]
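The division of labour on this slide can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `Module`, `Framework` and `EventCounter` are hypothetical names, not classes from COBRA or any other CMS package. The framework owns the event loop and calls back into plug-ins; the client only inherits and overrides.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical framework API: the extension point that clients inherit from.
class Module {
public:
    virtual ~Module() = default;
    virtual void beginRun() {}                    // optional hook, default does nothing
    virtual void processEvent(int eventId) = 0;   // mandatory call-back
};

// Hypothetical framework: it controls the flow of execution and calls the plug-ins.
class Framework {
public:
    void registerModule(std::unique_ptr<Module> module) {
        assert(module != nullptr);
        modules_.push_back(std::move(module));
    }

    void run(int nEvents) {
        for (auto& module : modules_) module->beginRun();
        for (int id = 0; id < nEvents; ++id) {
            for (auto& module : modules_) module->processEvent(id);
        }
    }

private:
    std::vector<std::unique_ptr<Module>> modules_;
};

// Client plug-in: specializes framework behavior by overriding one method.
class EventCounter : public Module {
public:
    explicit EventCounter(int& counter) : counter_(counter) {}
    void processEvent(int /*eventId*/) override { ++counter_; }

private:
    int& counter_;   // owned by the caller, which must outlive the module
};
```

A client would register an `EventCounter` with a `Framework` instance and call `run(n)`; the client code never drives the loop itself.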
Devil is in the Details
Build independent components. Avoid:
  - Dependencies among components at the same level
  - Gratuitous and exaggerated re-use (one hammer does not fit all screws)
  - Global states (even cout)
  - Exposure of internal relationships (a->b()->c(i)->d(“b”))
  - Assumptions on higher level behavior (lent pointers)
  - Interfaces that force your environment on user code
Balance inheritance (white box) vs composition (black box)
Distinguish Framework API, Client API and User API
These are Architectural issues, NOT coding guidelines:
I do not mind “#define int float” in your .cc; I do mind it in a .h
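As an illustration of the "exposure of internal relationships" item, here is a minimal sketch with hypothetical classes (`Layer`, `Tracker`, `Event` are invented for this example). Instead of letting user code walk a chain such as `event.tracker().layer(i).hitCount()`, the outer class answers the question itself, so the internal structure can change without touching clients.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical internal classes; user code should not need to know about them.
class Layer {
public:
    explicit Layer(std::size_t hits) : hits_(hits) {}
    std::size_t hitCount() const { return hits_; }

private:
    std::size_t hits_;
};

class Tracker {
public:
    explicit Tracker(std::vector<Layer> layers) : layers_(std::move(layers)) {}
    const Layer& layer(std::size_t i) const {
        assert(i < layers_.size());
        return layers_[i];
    }

private:
    std::vector<Layer> layers_;
};

// User-facing class: it does not hand out its Tracker, it answers directly.
class Event {
public:
    explicit Event(Tracker tracker) : tracker_(std::move(tracker)) {}

    // One call replaces the chain tracker().layer(i).hitCount() in user code.
    std::size_t hitsInLayer(std::size_t i) const {
        return tracker_.layer(i).hitCount();
    }

private:
    Tracker tracker_;
};
```

The trade-off is a wider interface on `Event`; the gain is that `Tracker` and `Layer` stay out of the User API.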
Examples
Exceptions
  - Throw an internal exception (avoid inheriting from std::exception?)
  - Catch it in the framework adapter and throw the appropriate framework exception
  - Algorithms do not throw a CARFSkipEventException deep inside
  - No one would even think of inheriting from Python exceptions
Do not hardcode cout, CobraOut, G4out
  - If really critical, implement a proper messenger:
    every package implements one based on some “pattern”;
    an adapter takes care of the communication with the framework
Use envelopes (not Proxies) and facades toward the user
Stick to the standard and the language (avoid being smarter)
  - In CMS we could add Architecture.h (config.h) on the fly at each .cc just before compiling
  - Do not use Cint or Python where native C++ suffices
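The exception advice can be sketched as follows. Everything here is an assumption made for illustration: the `tracking` and `framework` namespaces, `FitFailure`, `SkipEvent` and `fitTrackForFramework` are hypothetical and do not reproduce CARF or COBRA code. The algorithm package throws only its own exception type; a thin adapter is the single place that knows about the framework and translates.

```cpp
#include <cassert>
#include <string>

namespace tracking {   // hypothetical algorithm package, unaware of any framework

// Internal exception type, deliberately independent of framework classes.
struct FitFailure {
    std::string reason;
};

double fitTrack(int nHits) {
    assert(nHits >= 0);
    if (nHits < 3) throw FitFailure{"too few hits"};
    return 1.0 * nHits;   // placeholder for a real fit result
}

}  // namespace tracking

namespace framework {  // hypothetical framework

// The exception the framework understands as "skip this event".
struct SkipEvent {
    std::string message;
};

}  // namespace framework

// Adapter: the only code that depends on both the package and the framework.
double fitTrackForFramework(int nHits) {
    try {
        return tracking::fitTrack(nHits);
    } catch (const tracking::FitFailure& failure) {
        throw framework::SkipEvent{"tracking: " + failure.reason};
    }
}
```

With this layout the `tracking` package can be reused or tested without linking the framework at all.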
Package Metrics
Project      Release   Packages  Avg. # of direct deps  Cycles (packages involved)  # of levels  ACD*  CCD*   NCCD*  Size
Anaphe       3.6.1     31        2.6                    -                           8            5.4   167    1.3    630/170k
ATLAS        1.3.2     230       6.3                    2 (92)                      96           70    16211  10     1350k
ATLAS        1.3.7     236       7.0                    2 (92)                      97           77    18263  11     1350k
CMS/ORCA     4.6.0     199       7.4                    7 (22)                      35           24    4815   3.6    420k
CMS/COBRA    5.2.0     87        6.7                    4 (10)                      19           15    1312   2.7    180k
CMS/IGUANA   2.4.2     35        3.9                    -                           6            5.0   174    1.2    150/38k
Geant4       4.3.2     108       7.0                    3 (12)                      21           16    1765   2.8    680k
ROOT         2.25/05   30        6.4                    1 (19)                      22           19    580    4.7    660k

*) John Lakos, Large-Scale C++ Software Design

Size = total amount of source code (roughly; not normalised across projects!)
ACD = average component dependency (~ libraries linked in)
CCD = sum of single-package component dependencies over whole release; indicates testing/integration cost
NCCD = measure of CCD compared to a balanced binary tree
  A good toolkit’s NCCD will be close to 1.0
  < 1.0: structure is flatter than a binary tree (= independent packages)
  > 1.0: structure is more strongly coupled (vertical or cyclic)
Aim: Minimise NCCD for given software/functionality
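To make the three metrics concrete, here is a small C++ sketch that computes them from a package dependency graph. It follows Lakos's definitions: the component dependency of a package counts the package itself plus everything it reaches transitively, and the reference CCD of a balanced binary tree with N components is (N+1)·log2(N+1) − N. The `Graph` alias and the function names are hypothetical; this is not the tool that produced the table.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Adjacency list: g[i] holds the indices of the packages that package i uses directly.
using Graph = std::vector<std::vector<std::size_t>>;

// Packages needed to build/test `start`: itself plus all transitive dependencies.
std::size_t componentDependency(const Graph& g, std::size_t start) {
    assert(start < g.size());
    std::vector<bool> seen(g.size(), false);
    std::vector<std::size_t> stack{start};
    seen[start] = true;
    std::size_t count = 0;
    while (!stack.empty()) {
        const std::size_t node = stack.back();
        stack.pop_back();
        ++count;
        for (std::size_t dep : g[node]) {
            if (!seen[dep]) {
                seen[dep] = true;
                stack.push_back(dep);
            }
        }
    }
    return count;
}

// CCD: sum of the component dependencies over the whole release.
std::size_t ccd(const Graph& g) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < g.size(); ++i) total += componentDependency(g, i);
    return total;
}

// ACD: average component dependency.
double acd(const Graph& g) {
    assert(!g.empty());
    return static_cast<double>(ccd(g)) / static_cast<double>(g.size());
}

// CCD of a balanced binary tree with n components (the normalisation reference).
double balancedTreeCcd(std::size_t n) {
    const double size = static_cast<double>(n);
    return (size + 1.0) * std::log2(size + 1.0) - size;
}

// NCCD: CCD relative to the balanced binary tree of the same size.
double nccd(const Graph& g) {
    assert(!g.empty());
    return static_cast<double>(ccd(g)) / balancedTreeCcd(g.size());
}
```

As a cross-check against the table: for Anaphe, 31 packages give a reference CCD of 32·5 − 31 = 129, and 167/129 ≈ 1.3, which is the NCCD listed. A cycle pushes NCCD up quickly because every package in the cycle inherits the dependencies of all the others.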
Metrics: NCCD vs Cycles
[Scatter plot: NCCD (y-axis, 0 to 12) against Fraction of Packages in Cycles (x-axis, 0% to 70%). Labelled points: ATLAS, ORCA, COBRA, ROOT, Anaphe, IGUANA, G4. A further label on the plot reads "Toolkits & Frameworks".]
Toward a Project Praxis
Define the global software model
  - Granularity, role and nature of “Modules”
  - Physical vs logical modules (yesterday at the CMS plenary M. Livny concluded by asking for statically linked, check-pointable executables…)
  - Reuse model of sub-components
  - Which “glues” have to be used, where and how
Define THE set of basic components
Agree on Metrics to measure modularity
  - Not only Frameworks, also applications based on them