Analysis of CMS Heavy Ion Simulation
Data Using ROOT/PROOF/Grid
Jinghua Liu
for
Pablo Yepes, Jinghua Liu
Rice University, Houston, TX
Maarten Ballintijn, Gunther Roland,
Bolek Wyslouch, Jinlong Zhang
MIT, Cambridge, MA
Supported by NSF grants #0218603, #0219063
CHEP03
Outline
From data analysis user’s point of view
 Why: ROOT/PROOF/Grid
 How: Step by Step
What: Test Result
Summary
Other PROOF talks in this conference:
Fons Rademakers
Maarten Ballintijn
ROOT/PROOF
ROOT as a data analysis tool
PROOF: Parallel ROOT Facility, based on and part of ROOT
• parallel analysis of objects in a set of files
• parallel execution of scripts
on clusters of heterogeneous machines
Transparency, Scalability, Adaptability, Error
handling, Authentication
“Bring the KB to the PB not the PB to the KB”
KB: code-->CPU, PB: data
Use distributed CPUs to analyze distributed data
PROOF/Grid Interface





Use a Grid Resource Broker to detect which
nodes in a cluster can be used in the parallel
session
Use Grid File Catalogue and Replication Manager
Utilize Grid Monitoring Services
Support Globus Authentication
Abstract Grid interface
Step by Step






Set up PC cluster(s) (for PROOF/Grid)
Prepare the data files
Write analysis code (algorithm)
Compile a data set for PROOF
Run a PROOF job
Get the results
PC Clusters




Client machine (desktop)
P4 @ 1.8GHz /512MB/40GB
Cluster1:
2 Dual Xeon @ 2.4GHz /1GB/360GB
1 Dual Athlon @ 1.73GHz /1GB/240GB
8 Dual PIII @ 400MHz /512MB/60GB
Cluster 2:
3 Dual Athlon @ 1.67GHz /2GB/200GB
Operating systems:
RedHat 6.1, RedHat 7.3, Slackware 8.1
Globus version: 2.2
CMS Heavy Ion Simulation


Jet & high-pT particle angular correlation
Use Calorimeters only
CMS Heavy Ion Simulation


Pythia (event generator): 10,000 jet events
Hijing (heavy-ion event generator): 1,000 events



Each Hijing event (dN/dy~5000) was divided into
~500 sub-events
Randomly re-combine 500 sub-events (from different
events) to form a new Hijing event, a cheap way to
obtain more Monte Carlo events
CMSIM (GEANT 3 based simulation program
for CMS)
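
The sub-event recombination described above can be sketched in plain C++. This is only an illustration under stated assumptions: the `SubEvent` record, the `mixEvent` function, and the choice of taking a random slice from each parent are all hypothetical, since the slides do not show how the slices were selected. What the sketch does preserve is the stated constraint that the sub-events come from different parent events.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for one slice of a Hijing event.
struct SubEvent {
    int parentEvent;  // index of the original Hijing event
    int slice;        // position of this slice inside its parent
};

// Build one mixed event from nSub sub-events, each taken from a distinct,
// randomly chosen parent event. pool[e][s] is slice s of parent event e.
std::vector<SubEvent> mixEvent(const std::vector<std::vector<SubEvent>>& pool,
                               std::size_t nSub, std::mt19937& rng) {
    assert(nSub <= pool.size());  // need enough parents to avoid repeats
    std::vector<std::size_t> parents(pool.size());
    for (std::size_t i = 0; i < parents.size(); ++i) parents[i] = i;

    std::vector<SubEvent> mixed;
    mixed.reserve(nSub);
    for (std::size_t i = 0; i < nSub; ++i) {
        // Partial Fisher-Yates shuffle: position i receives a parent that
        // has not been used by positions 0..i-1.
        std::uniform_int_distribution<std::size_t> pickParent(i, parents.size() - 1);
        std::swap(parents[i], parents[pickParent(rng)]);

        const std::vector<SubEvent>& src = pool[parents[i]];
        assert(!src.empty());
        // Assumption: any slice of the chosen parent may be used.
        std::uniform_int_distribution<std::size_t> pickSlice(0, src.size() - 1);
        mixed.push_back(src[pickSlice(rng)]);
    }
    return mixed;
}
```

With the numbers quoted on the slide, `pool` would hold 1,000 parent events of about 500 slices each, and `nSub` would be 500.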
Data Production: Globus Jobs
[Diagram: Client PC -> Globus Gate Keeper (Condor) -> work nodes; Client PC -> Globus Gate Keeper (PBS) -> work nodes]
Globus used to submit & manage the jobs
No data replication (files were intentionally stored locally)
Build ROOT Tree
Superimpose jet events on top of Hijing events and
generate ROOT Tree
Standalone code linked with ROOT libraries
CMS: Ecal (Electromagnetic Calorimeter): barrel 61200 cells, endcap 14648 cells
Hcal (Hadronic Calorimeter): 14616 cells (multi-layer), 4032 towers
calotree: Ecal cells (energy, position), Hcal towers (energy, position)


10,000 events were split into 100 files, 100 events each,
file size ~160MB, total data 16GB
Data distributed, each node got some local files
TSelector – The Algorithms

Create TSelector from TTree
$ root
root[0] TFile f("heavyion001.root")
root[1] calotree->MakeSelector("myselector")
root[2] .q
$ ls
myselector.C
myselector.h

Add the analysis code (algorithm) into TSelector
$ vi myselector.h
$ vi myselector.C
TSelector – The Algorithms

myselector.h
class myselector : public TSelector {
public:
  TTree *fChain;
  .
  .
private:
  TH1F *hist1d;
  TH2F *hist2d;
  .
  .
  .
};
TSelector – The Algorithms

myselector.C
void myselector::Begin(TTree *tree) {
  hist1d = new TH1F("DeltaPhi","DeltaPhi",100,-180.,180.);
  hist2d = new TH2F("EtaPhi","EtaPhi",100,-5.,5.,100,-4.,4.);
  fOutput->Add(hist1d);
  fOutput->Add(hist2d);
}
Bool_t myselector::Process(Int_t entry) {
  // user's analysis code goes here!
  for (i=0; i<nclusters; i++) {
    if (Et1>5)
      for (j=i+1; j<nclusters; j++) {
        if (Et2>5) {
          DeltaPhi = …
          hist1d->Fill(DeltaPhi);
        }
      }
  }
  return kTRUE;
}
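
The DeltaPhi computation is elided on the slide. A minimal sketch of one common way to fill in that step is given here in standard C++ (no ROOT needed). The `Cluster` record, the function names, and the use of degrees are assumptions; degrees are suggested only by the DeltaPhi histogram axis, and the authors' actual code is not shown.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical cluster record; the real tree layout is not shown on the slides.
struct Cluster {
    double et;      // transverse energy
    double phiDeg;  // azimuthal angle in degrees (assumed unit)
};

// Wrap an azimuthal difference into the interval [-180, 180).
double deltaPhiDeg(double phi1, double phi2) {
    double d = std::fmod(phi1 - phi2, 360.0);  // result lies in (-360, 360)
    if (d >= 180.0) d -= 360.0;
    if (d < -180.0) d += 360.0;
    return d;
}

// Return DeltaPhi for every unordered pair in which both clusters pass the
// Et cut, mirroring the nested i/j loop in Process().
std::vector<double> pairDeltaPhi(const std::vector<Cluster>& clusters, double etMin) {
    std::vector<double> out;
    for (std::size_t i = 0; i < clusters.size(); ++i) {
        if (clusters[i].et <= etMin) continue;
        for (std::size_t j = i + 1; j < clusters.size(); ++j) {
            if (clusters[j].et > etMin) {
                out.push_back(deltaPhiDeg(clusters[i].phiDeg, clusters[j].phiDeg));
            }
        }
    }
    return out;
}
```

Inside a selector, each returned value would be passed to `hist1d->Fill(...)`. Wrapping keeps every entry inside the histogram range instead of losing pairs that straddle the 0/360 boundary.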
TDSet – Data Location

Specify a collection of TTrees or files
[] TDSet *ds = new TDSet("TTree", "calotree");
[] ds->Add("/data1/cms/cmsim/heavyion001.root");
[] ds->Add("/data1/cms/cmsim/heavyion002.root");
…
[] ds->Add("lfn://pcs21.rice.edu/data5/heavyion110.root");
[] ds->Add("lfn://pcs11.rice.edu/cms/cmsim/heavyion230.root");
…
[] ds->Print();

It’s better to put these into a macro

Returned by DB or File Catalog query etc
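
As a sketch of what such a macro might do, the helper below generates the numbered file names in standard C++. The function name `heavyIonFiles` and its parameters are hypothetical; only the `heavyionNNN.root` naming pattern and the example directory come from the slides. In a real ROOT macro, each returned string would be passed to `ds->Add(...)`.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Build the list "<dir>/heavyionNNN.root" for NNN = first..last (inclusive),
// zero-padded to three digits as in the file names shown on the slide.
std::vector<std::string> heavyIonFiles(const std::string& dir, int first, int last) {
    std::vector<std::string> names;
    for (int i = first; i <= last; ++i) {
        char buf[32];
        std::snprintf(buf, sizeof buf, "heavyion%03d.root", i);
        names.push_back(dir + "/" + buf);
    }
    return names;
}
```

For example, `heavyIonFiles("/data1/cms/cmsim", 1, 2)` yields the two local paths listed above. Files spread across several nodes or directories would need one call per location, or a catalogue query as the slide suggests.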
Running a PROOF Job
$ root
[] gROOT->Proof("proofmaster.rice.edu");
[] TDSet *ds = new TDSet("TTree", "calotree");
[] ds->Add(". . .");
. . .
[] ds->Process("myselector.C+", "options", nentries, first);
(note: options must be pre-coded in myselector.C)
[] TH1F *h1=(TH1F *)gProof->GetOutput("DeltaPhi");
[] h1->Draw();
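
The histogram retrieved above is a single object even though each worker filled its own copy, because objects placed in the selector's output list are merged before they are returned to the client. The toy below illustrates what that merge amounts to for a fixed-binning histogram. It is not PROOF code: the `Bins` type and `mergeBins` function are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy histogram: bin contents only, with identical binning on every worker.
typedef std::vector<double> Bins;

// Add the per-worker histograms bin by bin to obtain the combined result.
Bins mergeBins(const std::vector<Bins>& perWorker) {
    Bins total;
    for (std::size_t w = 0; w < perWorker.size(); ++w) {
        const Bins& h = perWorker[w];
        if (w == 0) total.assign(h.size(), 0.0);
        assert(h.size() == total.size());  // binning must match across workers
        for (std::size_t b = 0; b < h.size(); ++b) total[b] += h[b];
    }
    return total;
}
```

Because the merge is a plain sum, the result does not depend on how the events were divided among the workers, which is what lets the same selector run unchanged on one CPU or many.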
Angular Correlation
Scale plot
Analysis speed vs. CPUs (PIII 1GHz equivalent)
CPU power/data size balanced
CPU intensive calculations
Summary




CMS Heavy Ion Analysis implemented and
tested with PROOF
Scales well with CPUs
PROOF/Grid can provide the data analysis
power unavailable otherwise. This power
can be achieved without much extra effort
The PROOF/Grid interface is under rapid
development. The plan is to extend the
presented study to use the Grid interface

The End