ACAT05
May 22 - 27, 2005
DESY, Zeuthen, Germany
Search for the Higgs boson at LHC by
using Genetic Algorithms
Mostafa MJAHED
Ecole Royale de l’Air, Mathematics and Systems Dept.
Marrakech, Morocco
Search for the Higgs boson at LHC by using Genetic Algorithms

Outline
• Introduction
• Genetic Algorithms
• Search for the Higgs boson at LHC by using Genetic Algorithms
  • Optimization of discriminant functions
  • Optimization of neural weights
  • Hyperplane search
  • Hypersurface search
• Conclusion
M. Mjahed
ACAT 05, DESY, Zeuthen, 26/05/2005
2
Introduction

Higgs production at LHC
Several mechanisms contribute to the production of the SM Higgs boson in proton-proton collisions. The dominant mechanism is the gluon fusion process, gg → H.
Introduction

Decay Modes
• decay into quarks: H → bb̄ and H → cc̄
• leptonic decay: H → τ⁺τ⁻
• gluonic decay: H → gg
• decay into a virtual W boson pair: H → W⁺W⁻
Introduction

Main discovery modes
MH < 2MZ:
• H → bb̄ + X
• H → W⁺W⁻ → lνlν
• H → ZZ* → 4l
H → W⁺W⁻ → lνlν (1)
• The decay channel chosen:
  H → W⁺W⁻ → lνlν, with lepton pairs e⁺μ⁻, μ⁺e⁻, e⁺e⁻, μ⁺μ⁻.
• Signature:
  • Two oppositely charged leptons with large transverse momentum PT
  • Two energetic jets in the forward detectors
  • Large missing transverse momentum P'T
• Main background:
  • tt̄ production, with tt̄ → WbWb → lνj lνj
  • QCD W⁺W⁻ + jets production
  • Electroweak WWjj
H → W⁺W⁻ → lνlν (2)

Main Variables
• Δηll, Δφll: the pseudo-rapidity and azimuthal angle differences between the two leptons
• Δηjj, Δφjj: the pseudo-rapidity and azimuthal angle differences between the two jets
• Mll, Mjj: the invariant masses of the two leptons and of the two jets
• Mnm (n, m = 1, 2, 3, …): rapidity-weighted transverse momentum moments,
  Mnm = Σ_{i ∈ event} ηi^n · piT^m
  where ηi is the rapidity of the leptons or jets and piT their transverse momentum.
Genetic Algorithms
Pattern Recognition
Measurement → Feature Extraction / Feature Selection → Feature Classification → Decision
Pattern Recognition Methods
• Statistical Methods: PCA, Discriminant Analysis, Decision Trees, Clustering, Wavelets
• Connectionist Methods: Neural Networks
• Other Methods: Genetic Algorithms, Fuzzy Logic
Genetic Algorithms
• Based on Darwin's theory of "survival of the fittest": living organisms reproduce, individuals evolve/mutate, and individuals survive or die based on fitness
• The input is an initial set of possible solutions
• The output of a genetic algorithm is the set of "fittest solutions" that will survive in a particular environment
• The process:
  • Produce the next generation (by a cross-over function)
  • Evolve solutions (by a mutation function)
  • Discard weak solutions (based on a fitness function)
Genetic Algorithms
• Preparation:
  • Define an encoding to represent solutions (i.e., use a character sequence to represent a class)
  • Create possible initial solutions (and encode them as strings)
  • Perform the 3 genetic functions: Crossover, Mutation, Fitness Test
• Why Genetic Algorithms (GAs)?
  • Many real-life problems cannot be solved in a polynomial amount of time using a deterministic algorithm
  • Near-optimal solutions that can be generated quickly are sometimes more desirable than optimal solutions that require a huge amount of time
  • Such problems can be modeled as optimization problems
Genetic Functions
• Crossover: two parent bit strings exchange complementary segments at a crossover point to produce two offspring
• Mutation: individual bits of a string are flipped at random
• Roulette Wheel Selection: individuals are drawn with probability proportional to their fitness (a spin of the weighted wheel)
[Figure: bit-string crossover and mutation examples; roulette wheel diagram]
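The three operators above can be sketched on bit strings. This is a minimal illustration with our own function names, not the code used in the analysis:

```python
import random

def crossover(parent1, parent2, point):
    """One-point crossover: swap the tails of two bit strings at `point`."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]

def roulette_select(population, fitnesses, rng):
    """Roulette wheel: select one individual with probability
    proportional to its fitness."""
    total = sum(fitnesses)
    r = rng.uniform(0, total)
    acc = 0.0
    for ind, f in zip(population, fitnesses):
        acc += f
        if acc >= r:
            return ind
    return population[-1]

# Toy usage on 5-bit strings.
rng = random.Random(0)
c1, c2 = crossover([1, 0, 0, 1, 1], [0, 1, 1, 0, 1], point=2)
# c1 = [1, 0, 1, 0, 1], c2 = [0, 1, 0, 1, 1]
```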
GA Process
Initialize Population → Evaluate Fitness → Terminate?
  Yes → Output solution
  No → Perform selection, crossover and mutation → Evaluate Fitness → (back to Terminate?)
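The flow chart above can be written as a generic loop. This is a sketch under our own naming (and with tournament selection for brevity, where the slides show roulette-wheel selection); it is not the toolbox code used for the analysis:

```python
import random

def run_ga(init_population, fitness, select, crossover, mutate,
           n_generations, rng):
    """Generic GA loop: evaluate fitness, test termination, otherwise
    breed the next generation by selection, crossover and mutation."""
    population = list(init_population)
    for _ in range(n_generations):   # termination: fixed generation budget
        scored = [(fitness(ind), ind) for ind in population]
        next_gen = []
        while len(next_gen) < len(population):
            p1 = select(scored, rng)
            p2 = select(scored, rng)
            c1, c2 = crossover(p1, p2, rng)
            next_gen.extend([mutate(c1, rng), mutate(c2, rng)])
        population = next_gen[:len(init_population)]
    return max(population, key=fitness)   # output the fittest solution

# Toy usage: maximize the number of 1-bits in an 8-bit string.
rng = random.Random(0)
pop = [[rng.randint(0, 1) for _ in range(8)] for _ in range(20)]
best = run_ga(
    pop,
    fitness=sum,
    select=lambda scored, r: max(r.sample(scored, 3))[1],  # tournament
    crossover=lambda a, b, r: (a[:4] + b[4:], b[:4] + a[4:]),
    mutate=lambda s, r: [bit ^ 1 if r.random() < 0.05 else bit for bit in s],
    n_generations=30,
    rng=rng,
)
```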
GAs for Pattern Classification
Optimization of discriminant functions
Optimization of Neural weights
Hyperplane search
Hypersurface search
Efficiency and Purity of Classification
• Validation

Test events   Classified C1   Classified C2
C1: N1        N11             N12
C2: N2        N21             N22
Total         M1              M2

Efficiency for Ci classification: Ei = Nii / Ni
Purity for Ci events: Pi = Nii / Mi
Misclassification rate (error) for Ci: εi = Σ_{j ≠ i} Nij / Ni
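With the table's notation (Nii events of class Ci classified as Ci, Ni true totals, Mi classified totals), the figures of merit follow directly from the confusion matrix; a small sketch:

```python
def class_metrics(confusion):
    """confusion[i][j] = number of events of true class i classified as j.
    Returns, per class: efficiency Ei = Nii/Ni, purity Pi = Nii/Mi and
    misclassification rate err_i = sum_{j != i} Nij / Ni."""
    n = len(confusion)
    metrics = []
    for i in range(n):
        Ni = sum(confusion[i])                        # true total of class i
        Mi = sum(confusion[k][i] for k in range(n))   # total classified as i
        Nii = confusion[i][i]
        metrics.append({"efficiency": Nii / Ni,
                        "purity": Nii / Mi,
                        "error": (Ni - Nii) / Ni})
    return metrics

# Toy counts chosen so the rates match the discriminant-analysis table
# later in the talk (10000 events per class; the event-level counts are ours).
m = class_metrics([[6010, 3990],
                   [3900, 6100]])
```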
Search for the Higgs boson at LHC by using Genetic Algorithms
Signal:
pp → HX → W⁺W⁻X → l⁺νl⁻ν̄X
Background:
pp → tt̄X → WbWbX → lνj lνjX
pp → qq̄X → W⁺W⁻X
Events generated by the LUND MC PYTHIA 6.1 at √s = 14 TeV
MH = 115–150 GeV/c²
10000 Higgs events and 10000 background events are used
Search for discriminating variables
Variables
• Δηll, Δφll: the pseudo-rapidity and azimuthal angle differences between the two leptons
• Δηjj, Δφjj: the pseudo-rapidity and azimuthal angle differences between the two jets
• Mll: the invariant mass of the two leptons
• Mjj: the invariant mass of the two jets
• Rapidity-momentum weighted moments Mnm (n = 1, …, 6):
  Mnm = Σ_{i ∈ Jet} ηi^n · piT^m
  with the rapidity ηi = ½ Log((Ei + pi//) / (Ei − pi//))
Selected variables: Δηll, Δφll, Δηjj, Δφjj, Mll, Mjj, M11, M21, M31, M41
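The rapidity definition and the moments Mnm above can be computed as follows (a sketch with our own function names; inputs are per-particle energies and momentum components, and the toy kinematics are assumed, not real event data):

```python
import math

def rapidity(E, p_parallel):
    """eta = (1/2) * log((E + p_//) / (E - p_//))  (slide definition)."""
    return 0.5 * math.log((E + p_parallel) / (E - p_parallel))

def moment(particles, n, m):
    """M_nm = sum_i eta_i^n * pT_i^m over the leptons/jets considered.
    `particles`: list of (E, p_parallel, pT) tuples (units: GeV)."""
    return sum(rapidity(E, p_par) ** n * pT ** m
               for E, p_par, pT in particles)

# Toy usage with assumed kinematics.
evt = [(50.0, 30.0, 40.0), (80.0, 60.0, 52.9)]
M11 = moment(evt, 1, 1)
M21 = moment(evt, 2, 1)
```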
Optimization of Discriminant Functions (1)

Discriminant Analysis
F(x) = (ḡsignal − ḡback)ᵀ V⁻¹ x = Σi αi xi
The most separating discriminant function FHiggs/Back between the classes CHiggs and CBack is:
FHiggs/Back = −0.02 + 0.12 Δφll + 0.4 Δφjj + 0.35 Mll + 0.61 Mjj + 0.74 M11 + 1.04 M21
The classification of a test event x0 is then obtained according to the condition:
if FHiggs/Back(x0) ≥ 0 then x0 ∈ CHiggs else x0 ∈ CBack

Classification of test events:
Test events   Efficiency   Purity
CHiggs        0.601        0.606
CBack         0.610        0.604
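The linear discriminant F(x) = (ḡsignal − ḡback)ᵀ V⁻¹ x can be built directly from two samples. A sketch with NumPy: variable names are ours, and taking V as the pooled covariance of the two classes is an assumption the slides do not spell out:

```python
import numpy as np

def fisher_coefficients(signal, background):
    """a = V^{-1} (mean_signal - mean_back), so that F(x) = a . x.
    V is taken as the pooled covariance of the two samples (assumption).
    signal, background: arrays of shape (n_events, n_variables)."""
    mu_s = signal.mean(axis=0)
    mu_b = background.mean(axis=0)
    V = 0.5 * (np.cov(signal, rowvar=False) + np.cov(background, rowvar=False))
    return np.linalg.solve(V, mu_s - mu_b)

def classify(a, x, threshold=0.0):
    """Slide rule: F(x) >= threshold -> C_Higgs, else C_Back."""
    return "Higgs" if a @ x >= threshold else "background"

# Toy usage on two synthetic Gaussian samples (not the analysis data).
rng = np.random.default_rng(0)
sig = rng.normal(loc=[1.0, 0.5], scale=0.3, size=(500, 2))
back = rng.normal(loc=[-1.0, -0.5], scale=0.3, size=(500, 2))
a = fisher_coefficients(sig, back)
```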
Optimization of Discriminant Functions (2)
• GA Parameters
  • Chromosome: the discriminant function coefficients α0i, α1i, α2i, α3i, α4i, α5i, α6i (starting from −0.02, 0.12, 0.4, 0.35, 0.61, 0.74, 1.01)
  • Fitness function: misclassification rate ε
  • Number of generations
  • Selection, Crossover, Mutation
• GA Code: Matlab GA Toolbox

Classification of test events (before optimization):
Test events   Eff.    Purity
CHiggs        0.601   0.606
CBack         0.610   0.604
⟨Eff⟩ = 0.6055, ε = 0.3945
Optimization of Discriminant Functions (3)
• Generation of N solutions: each chromosome encodes the coefficients of F
  Chromosome 1: α01, α11, α21, α31, α41, α51, α61
  …
  Chromosome N: α0N, α1N, α2N, α3N, α4N, α5N, α6N
• Fitness function: misclassification rate ε
• Genetic Process:
  Initialize Population → Evaluate Fitness → Terminate?
    Yes → Output solution
    No → Perform selection, crossover and mutation → Evaluate Fitness
Optimization of Discriminant Functions (4)
• Optimization Results
  Number of generations = 10000
  CPU time: 120 s
• Optimal discriminant function: coefficients −0.02, 0.12, 0.4, 0.35, 0.61, 0.74, 1.01

Classification of test events:
Test events   Eff.    Purity
CHiggs        0.652   0.649
CBack         0.648   0.650
⟨Eff⟩ = 0.65, ε = 0.35
Optimization of Neural Weights (1)

Neural Analysis
Inputs: Δηll, Δφll, Δηjj, Δφjj, Mll, Mjj, M11, M21, M31, M41
NN Architecture: (10, 10, 10, 1)
h1i = f(Σj W^{xh1}_{ji} xj + θi)
h2i = f(Σj W^{h1h2}_{ji} h1j + θi)
o1 = Σi W^{h2o}_{1i} h2i

Classification of test events:
if o1(x) ≥ 0.5 then x ∈ CHiggs else x ∈ CBack

Test events   Eff.    Purity
CHiggs        0.654   0.663
CBack         0.669   0.659
⟨Eff⟩ = 0.661, ε = 0.338
Optimization of Neural Weights (2)
• GA Parameters
  • Connection weights + thresholds:
    W^{xh1}_{ij}, θ^{h1}_i: 100 + 10
    W^{h1h2}_{ij}, θ^{h2}_i: 100 + 10
    W^{h2o}_{ij}: 10
  • Total number of parameters to be optimized: 230
  • Fitness: misclassification rate ε
• Optimization Results
  Number of generations = 1000
  CPU time: 6 min

Test events   Eff.    Purity
CHiggs        0.691   0.696
CBack         0.699   0.693
⟨Eff⟩ = 0.695, ε = 0.305
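Optimizing the 230 weights and thresholds with a GA amounts to decoding each chromosome into the (10, 10, 10, 1) network and scoring it by misclassification rate. A minimal sketch; the exact parameter layout and the sigmoid activation are our assumptions:

```python
import math

def decode(chromosome):
    """Split a 230-parameter chromosome into the (10, 10, 10, 1) network:
    100 input->h1 weights + 10 thresholds, 100 h1->h2 weights + 10
    thresholds, 10 h2->output weights (layout is ours)."""
    assert len(chromosome) == 230
    W1 = [chromosome[10 * i:10 * i + 10] for i in range(10)]
    t1 = chromosome[100:110]
    W2 = [chromosome[110 + 10 * i:120 + 10 * i] for i in range(10)]
    t2 = chromosome[210:220]
    W3 = chromosome[220:230]
    return W1, t1, W2, t2, W3

def forward(params, x):
    """Slide formulas with f taken as a sigmoid (assumed activation);
    the output o1 is a plain weighted sum, as on the slide."""
    W1, t1, W2, t2, W3 = params
    f = lambda v: 1.0 / (1.0 + math.exp(-v))
    h1 = [f(sum(W1[i][j] * x[j] for j in range(10)) + t1[i]) for i in range(10)]
    h2 = [f(sum(W2[i][j] * h1[j] for j in range(10)) + t2[i]) for i in range(10)]
    return sum(W3[i] * h2[i] for i in range(10))

def fitness(chromosome, events, labels):
    """GA fitness: misclassification rate with the o1 >= 0.5 rule."""
    params = decode(chromosome)
    wrong = sum((forward(params, x) >= 0.5) != is_higgs
                for x, is_higgs in zip(events, labels))
    return wrong / len(events)
```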
Hyperplane search (1)
H(x) = β0 + Σi βi xi = 0, i = 1, …, 10
Hyperplane Hj: β0j, β1j, β2j, β3j, β4j, β5j, β6j, β7j, β8j, β9j, β10j
• 11 parameters to optimize

Classification rule:
if H(x) ≥ 0 then x ∈ CHiggs else x ∈ CBack

• Genetic Process
  • Generation of N hyperplanes Hj, j = 1, …, N = 20
  • Number of generations = 10000
  • CPU time: 4 min
  Initialize Population → Evaluate Fitness → Terminate?
    Yes → Output solution
    No → Perform selection, crossover and mutation → Evaluate Fitness
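Each chromosome here is simply the 11-vector (β0, …, β10). A sketch of the classification rule and of a misclassification-rate fitness used to score a hyperplane (function names are ours):

```python
def hyperplane(beta, x):
    """H(x) = beta0 + sum_i beta_i * x_i, for 10 input variables."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def classify(beta, x):
    """Slide rule: H(x) >= 0 -> C_Higgs, else C_Back."""
    return "Higgs" if hyperplane(beta, x) >= 0 else "background"

def fitness(beta, events, labels):
    """Misclassification rate of the hyperplane over a labelled sample,
    to be minimized by the GA."""
    wrong = sum((classify(beta, x) == "Higgs") != is_higgs
                for x, is_higgs in zip(events, labels))
    return wrong / len(events)

# Toy usage: a hyperplane that separates on the first variable only.
beta = [0.0, 1.0] + [0.0] * 9
```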
Hyperplane search (2)
• Hyperplane search results, classification of test events:

Test events   Eff.    Purity
CHiggs        0.661   0.655
CBack         0.651   0.657
⟨Eff⟩ = 0.656, ε = 0.344

Similar results to the discriminant function optimization:

Test events   Eff.    Purity
CHiggs        0.652   0.649
CBack         0.648   0.650
⟨Eff⟩ = 0.65, ε = 0.35
Hypersurface search (1)
S(x) = β0 + Σ_{i=1}^{10} (βi xi + γi xi² + δi xi³)
Hypersurface Sj: βij (i = 0, …, 10), γij (i = 1, …, 10), δij (i = 1, …, 10)
• 31 parameters to optimize

Classification rule:
if S(x) ≥ 0 then x ∈ CHiggs else x ∈ CBack

• Genetic Process
  • Generation of N hypersurfaces Sj, j = 1, …, N = 20
  • Number of generations = 10000
  • CPU time: 6 min
  Initialize Population → Evaluate Fitness → Terminate?
    Yes → Output solution
    No → Perform selection, crossover and mutation → Evaluate Fitness
Hypersurface search (2)
• Hypersurface search results, classification of test events:

Test events   Eff.    Purity
CHiggs        0.689   0.696
CBack         0.693   0.693
⟨Eff⟩ = 0.691, ε = 0.309

Similar results to the NN weights optimization:

Test events   Eff.    Purity
CHiggs        0.691   0.696
CBack         0.699   0.693
⟨Eff⟩ = 0.695, ε = 0.305
Conclusion

Methods
Importance of Pattern Recognition Methods
The improvement of any identification depends both on the multidimensional power offered by PR methods and on the discriminating power of the proposed variables.
The Genetic Algorithm method makes it possible to minimize the classification error and to improve the efficiencies and purities of the classifications.
The performances are on average 3 to 5% higher than those obtained with the other methods.
Discriminant Function Optimization: comparable to the hyperplane search approach
Neural Weights Optimization: comparable to the hypersurface search approach
Conclusion (continued)
Variables
Characterisation of Higgs boson events:
Other variables should be examined
Physics Processes
Other processes should be considered
Detector effects should be added to the simulated events