
Extreme Value Theory: A useful framework for modeling extreme OR events

Dr. Marcelo Cruz, Risk Methodology Development and Quantitative Analysis

Operational Risk Measurement

Agenda

• Database Modeling
• Measuring OR: Severity, Frequency
• Using Extreme Value Theory
• Causal Modeling: Using Multifactor Modeling
• Plans for OR Mitigation

Operational Risk Database Modelling

ABSTRACT PROBLEMS

Doubtful Legislation, Process Failures, Human Errors, Systems Problems, Poor Controls

PROCESS

Failures in the process

OBJECTIVE PROBLEMS

Legal suits, Interest expenses, Booking errors (P&L Adjustments)

Consequence = -$$$!

Data Model

CEFs: Volumes, Sensitivity, Data Quality, Control Gaps, Organization, Automation Levels, Business Continuity, IT Environment, Process & Systems Flux

Control Measure KCIs: Nostro Breaks, Depot Breaks, Intersystem Breaks, Intercompany Breaks, Interdesk Breaks, Control Account Breaks, Unmatched Confirmations, Fails

Operations loss data: Market Risk Adjustments, Error Financing Costs, Write-offs, Execution Errors

Risk Optimization

Operational Risk

Earnings Volatility

P&L

Market Risk, Credit Risk (Revenue)

Operational Risk (Costs)

For the first time, banks are considering impacts on the P&L from the cost side!

Measuring Operational Risk

Building the Operational VaR

1) Estimating Severity

2) Estimating Frequency

Choosing the distribution; Estimating Parameters; Testing the Parameters; PDFs and CDFs; Quantiles

3) Aggregating Severity and Frequency

Monte Carlo Simulation; Validation and Backtesting

Measuring Operational Risk

Figure: loss sizes (in $) over time: 120, 80, 52, 36, 24, 10, 2, 1, 15, 22, 7, 20, 18, 25

Location = Average = 34.6

Scale = Standard Deviation = 32.2

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

f(x) = 1.08% (PDF - probability density function)

F(x) = 30.3% (CDF - cumulative distribution function)

Measuring Operational Risk

What number will correspond to 95% of the CDF?

(How do I protect myself 95% of the time?)

Quantile Function = (CDF)^{-1}, the inverse of the CDF (solves the CDF for x)

In Excel:
Normal Quantile function = NORMINV function
Lognormal Quantile function = LOGINV function

In our example:

=NORMINV(95%,34.6,32.2) = 87.6

=LOGINV(95%,3.2,.78) = 92.7

Heavier tail!

(Not heavy enough, as our “VaR” would have 1 violation!)
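The PDF, CDF and quantile calculations on these two slides can be reproduced outside Excel. The sketch below uses Python's standard library only. The evaluation point x = 18 is an assumption: it is not stated on the slide, but it is the value for which the normal PDF and CDF come out at about 1.08% and 30.3%. With the rounded lognormal parameters shown on the slide (3.2 and 0.78) the 95% quantile evaluates to roughly 88.5 rather than 92.7, presumably because the slide used unrounded parameters.

```python
import math
from statistics import NormalDist

mu, sigma = 34.6, 32.2            # location and scale reported on the slide
normal = NormalDist(mu, sigma)

x = 18                            # assumed evaluation point (not stated on the slide)
pdf_at_x = normal.pdf(x)          # height of the normal density at x
cdf_at_x = normal.cdf(x)          # probability of a loss no larger than x

# Quantile function = inverse of the CDF. Excel: =NORMINV(95%,34.6,32.2)
normal_q95 = normal.inv_cdf(0.95)


def lognormal_quantile(p, log_mean, log_sd):
    """Lognormal quantile: exponential of the normal quantile of ln(X). Excel: LOGINV."""
    return math.exp(NormalDist(log_mean, log_sd).inv_cdf(p))


# Rounded parameters as printed on the slide
lognormal_q95 = lognormal_quantile(0.95, 3.2, 0.78)

print(f"PDF({x}) = {pdf_at_x:.2%}, CDF({x}) = {cdf_at_x:.1%}")
print(f"Normal 95% quantile    = {normal_q95:.1f}")
print(f"Lognormal 95% quantile = {lognormal_q95:.1f}")
```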

Measuring Operational Risk

EXTREME VALUE THEORY

Figure: loss sizes (in $) over time, with a threshold line

A model chosen for its overall fit to the whole database may not provide a particularly good fit to the large losses. We need to fit a distribution specifically for the extremes.

Measuring Operational Risk

Broadly two ‘types’ of extremes:

1) Peaks over Threshold (P.O.T.): fits the Generalised Pareto Distribution (G.P.D.)
2) Distribution of maxima over a certain period: fits the Generalised Extreme Value distribution (GEV)

Figure: two plots of loss sizes (in $) over time, the first with a threshold line
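The two ways of selecting extremes can be sketched in a few lines. This is a minimal illustration, not the deck's implementation: the loss values are those shown on the earlier slide, taken in the order they appear in this transcript, and both the threshold of 30 and the block length of 7 are hypothetical choices made only for the example.

```python
# Loss sizes (in $) from the slides, in the order they appear in the transcript
losses = [120, 80, 52, 36, 24, 10, 2, 1, 15, 22, 7, 20, 18, 25]


def peaks_over_threshold(data, threshold):
    """Return the excesses (loss minus threshold) for every loss above the threshold."""
    return [x - threshold for x in data if x > threshold]


def block_maxima(data, block_size):
    """Split the series into consecutive blocks and keep the maximum of each block."""
    return [max(data[i:i + block_size]) for i in range(0, len(data), block_size)]


# Hypothetical threshold and block length, chosen only for illustration
excesses = peaks_over_threshold(losses, threshold=30)   # candidates for a GPD fit
maxima = block_maxima(losses, block_size=7)             # candidates for a GEV fit

print(excesses)
print(maxima)
```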

Measuring Operational Risk

Extreme Value Theory

Figure: loss sizes (in $) over time, with a threshold line

Hill shape estimator, where x_1 ≥ x_2 ≥ ... ≥ x_k are the k largest losses:

$$\hat{\xi} = \frac{1}{k-1}\sum_{j=1}^{k-1} \ln x_j - \ln x_k$$

Graphical tests: QQ and ME-Plots

Choose distribution

Measuring Operational Risk

Back to the example, comparing the results: =NORMINV(95%,34.6,32.2) = 87.6

=LOGINV(95%,3.2,.78) = 92.7

1 violation (largest event = 120)

Using GEV (95%,3-parameter) =143.5

No violations !

Extreme Value Theory

Example: Frauds in a British Retail Bank

No.   1992      1993        1994        1995      1996
1     907,077   1,100,000   6,600,000   600,000   1,820,000
2     845,000   650,000     3,950,000   394,672   750,000
3     734,900   556,000     1,300,000   260,000   426,000
4     550,000   214,635     410,061     248,342   423,320
5     406,001   200,000     350,000     239,103   332,000
6     360,000   160,000     200,000     165,000   294,835
7     360,000   157,083     176,000     120,000   230,000
8     350,000   120,000     129,754     116,000   229,369
9     220,357   78,375      109,543     86,878    210,537
10    182,435   52,049      107,031     83,614    128,412
11    68,000    51,908      107,000     75,177    122,650
12    50,000    47,500      64,600      52,700    89,540

Extreme Value Theory

Hill method for the estimation of the shape parameter:

$$\hat{\xi}^{(H)}_{k,n} = \frac{1}{k-1}\sum_{j=1}^{k-1} \ln X_{j,n} - \ln X_{k,n}$$

1995

k     Loss         LogLoss       Hill
1     600,000.34   13.3046855
2     394,672.11   12.8858106    0.418875
3     260,000.00   12.46843691   0.626811
4     248,341.96   12.42256195   0.463749
5     239,102.93   12.38464941   0.385724
6     165,000.00   12.01370075   0.679528
7     120,000.00   11.69524702   0.884727
8     116,000.00   11.66134547   0.792239
9     86,878.46    11.37226541   0.982289
10    83,613.70    11.33396266   0.911449
11    75,177.00    11.22760061   0.926666
12    52,700.00    10.87237073   1.197653

Figure: Hill Plot (Hill estimate, axis from 0 to 1.4, against k = 1 to 12)
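The Hill column of the 1995 table can be recomputed as a check. The sketch below assumes the form that reproduces the tabulated values: the average log of the k-1 largest losses minus the log of the k-th largest. Function and variable names are illustrative.

```python
import math

# 1995 fraud losses, largest first (values from the slide's table)
losses_1995 = [600000.34, 394672.11, 260000.00, 248341.96, 239102.93, 165000.00,
               120000.00, 116000.00, 86878.46, 83613.70, 75177.00, 52700.00]


def hill_estimates(sorted_losses):
    """Hill estimate for k = 2..n: mean log of the k-1 largest losses minus log of the k-th."""
    logs = [math.log(x) for x in sorted_losses]
    estimates = {}
    running_sum = 0.0
    for k in range(2, len(logs) + 1):
        running_sum += logs[k - 2]                      # add ln of the (k-1)-th largest loss
        estimates[k] = running_sum / (k - 1) - logs[k - 1]
    return estimates


hill = hill_estimates(losses_1995)
for k, value in hill.items():
    print(k, round(value, 6))
```

Plotting these values against k gives the Hill plot; a region where the estimate is roughly stable is the usual guide for choosing k.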

Extreme Value Theory

QQ-Plots

Plotting:

$$\left\{\left(X_{k,n},\; F^{\leftarrow}(p_{k,n})\right) : k = 1, \ldots, n\right\}$$

where

$$p_{k,n} = \frac{n - k + 0.5}{n}$$

Approximate linearity suggests good fit

Figure: QQ-Plot 1995

Uses:
1) Compare distributions
2) Identify outliers
3) Aid in finding estimates for the parameters
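The points of a QQ-plot can be generated directly from the plotting positions. The sketch below is illustrative only: it pairs the log of the 1995 losses with unit-exponential quantiles, and that reference distribution is an assumption, since the slide does not say which F was used for its plot.

```python
import math

# 1995 fraud losses, largest first (rounded values from the data slide)
losses_1995 = [600000, 394672, 260000, 248342, 239103, 165000,
               120000, 116000, 86878, 83614, 75177, 52700]


def plotting_positions(n):
    """p_{k,n} = (n - k + 0.5) / n for k = 1..n, where k = 1 is the largest observation."""
    return [(n - k + 0.5) / n for k in range(1, n + 1)]


def exponential_quantile(p):
    """Quantile of a unit exponential, used here as the (assumed) reference distribution."""
    return -math.log(1.0 - p)


def qq_points(sorted_desc, quantile_fn):
    """Pair each ordered observation X_{k,n} with the reference quantile at p_{k,n}."""
    probs = plotting_positions(len(sorted_desc))
    return [(x, quantile_fn(p)) for x, p in zip(sorted_desc, probs)]


points = qq_points([math.log(x) for x in losses_1995], exponential_quantile)
for observed, reference in points:
    print(f"{observed:.4f}  {reference:.4f}")
```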

Extreme Value Theory

Parameter Estimation

Methods : 1) Maximum Likelihood (ML) 2) Probability Weighted Moments (PWM) 3) Moments

PWM works very well for small samples (the OR case!) and it is simpler. ML sometimes does not converge, and its bias is larger.

Extreme Value Theory

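PWM Method (based on order statistics), GEV

Plotting position:

$$p_{j,n} = \frac{n - j + 0.5}{n}$$

Auxiliaries, with U_{j,n} = p_{j,n}:

$$\hat{w}_r = \frac{1}{n}\sum_{j=1}^{n} X_{j,n}\, U_{j,n}^{\,r}, \qquad r = 0, 1, 2$$

$$c = \frac{2 w_1 - w_0}{3 w_2 - w_0} - \frac{\log 2}{\log 3}$$

Shape:

$$\hat{\xi} = 7.8590\,c + 2.9554\,c^{2}$$

Scale:

$$\hat{\sigma} = \frac{(2 w_1 - w_0)\,\hat{\xi}}{\Gamma(1+\hat{\xi})\,\left(1 - 2^{-\hat{\xi}}\right)}$$

Location:

$$\hat{\mu} = w_0 - \frac{\hat{\sigma}}{\hat{\xi}}\left\{1 - \Gamma(1+\hat{\xi})\right\}$$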

Extreme Value Theory

1994

No.   Loss           Plot Position (PP)   w1 term (Loss x PP)   PP^2       w2 term (Loss x PP^2)
1     6,600,000.00   0.958333333          6325000               0.918403   6061458.333
2     3,950,000.00   0.875                3456250               0.765625   3024218.75
3     1,300,000.00   0.791666667          1029166.667           0.626736   814756.9444
4     410,060.72     0.708333333          290459.6767           0.501736   205742.271
5     350,000.00     0.625                218750                0.390625   136718.75
6     200,000.00     0.541666667          108333.3333           0.293403   58680.55556
7     176,000.00     0.458333333          80666.66667           0.210069   36972.22222
8     129,754.00     0.375                48657.75              0.140625   18246.65625
9     109,543.00     0.291666667          31950.04167           0.085069   9318.762153
10    107,031.20     0.208333333          22298.16667           0.043403   4645.451389
11    107,000.00     0.125                13375                 0.015625   1671.875
12    64,600.00      0.041666667          2691.666667           0.001736   112.1527778

w0 = 1,125,332.41
w1 = 968,966.58
w2 = 864,378.56

c = -0.07731282
Shape = -0.5899362
Scale = 612,300.60
Location = 1,101,869.17

Hill = 1.56577
Gamma = 1.06
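The probability weighted moments, the auxiliary c and the PWM shape value in the 1994 worksheet can be recomputed from the twelve losses. The sketch below covers only those quantities; it does not reproduce the scale and location figures, which on the slide also depend on the Hill shape and the Gamma value. Names are illustrative.

```python
import math

# 1994 fraud losses, largest first (values from the slide's worksheet)
losses_1994 = [6600000.00, 3950000.00, 1300000.00, 410060.72, 350000.00, 200000.00,
               176000.00, 129754.00, 109543.00, 107031.20, 107000.00, 64600.00]


def pwm_moments(sorted_desc):
    """Return (w0, w1, w2): averages of loss x plotting_position^r for r = 0, 1, 2."""
    n = len(sorted_desc)
    positions = [(n - j + 0.5) / n for j in range(1, n + 1)]   # plotting positions
    return tuple(
        sum(x * u ** r for x, u in zip(sorted_desc, positions)) / n
        for r in range(3)
    )


w0, w1, w2 = pwm_moments(losses_1994)

# Auxiliary quantity and shape approximation with the constants shown on the slide
c = (2 * w1 - w0) / (3 * w2 - w0) - math.log(2) / math.log(3)
shape_pwm = 7.8590 * c + 2.9554 * c ** 2

print(f"w0 = {w0:,.2f}, w1 = {w1:,.2f}, w2 = {w2:,.2f}")
print(f"c = {c:.8f}, shape = {shape_pwm:.7f}")
```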

Extreme Value Theory

Parameter Estimation (PWM and Hill)

Parameters: ξ = Shape Parameter, μ = Location Parameter, σ = Scale Parameter

Year   Shape (ξ)
1992   0.959265
1993   0.994119
1994   1.56577
1995   0.679518
1996   1.07057

1994: Location = 1,101,869.17, Scale = 612,300.60

Location and scale values for the other years (year and row assignment not preserved in this transcript): 410,279.77; 432,211.40; 215,551.84; 147,105.40; 298,067.91; 25,379.83; 445,660.38; 361,651.03

The shape parameter was estimated by the Hill method and the scale and location by the PWM.

Testing the Model - Checking the Parameters

Based on simulation, techniques like bootstrapping and the jackknife help find confidence intervals and bias in the parameters.

Jackknife test for the GEV model: Shape Std Err = 0.4208, Scale Std Err = 116,122.0647, Location Std Err = 126,997.6469

Figure: jackknife estimates of Shape, Scale and Location against Loss Number Removed (Descending)

Bootstrapping

Let $\hat{\theta}$ be the estimate of a parameter vector $\theta$ based on a sample of operational loss events x = (x_1, …, x_n). An approximation to the statistical properties of $\hat{\theta}$ can be obtained by studying a sample of B bootstrap estimators $\hat{\theta}_m(b)$ (b = 1, …, B), each obtained from a sample of m observations, sampling with replacement from the observed sample x. The bootstrap sample size, m, may be larger or smaller than n. The desired sampling characteristic is obtained from properties of the sample {$\hat{\theta}_m(1)$, …, $\hat{\theta}_m(B)$}.
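Both resampling schemes are easy to sketch for a generic estimator. In the example below the estimator is the sample mean of the 1995 losses, used only as a simple stand-in for the GEV parameter estimators on the slide; the function names, the number of bootstrap replications and the seed are all illustrative assumptions.

```python
import math
import random
import statistics

# 1995 fraud losses (rounded values from the data slide)
losses = [600000, 394672, 260000, 248342, 239103, 165000,
          120000, 116000, 86878, 83614, 75177, 52700]


def jackknife_std_error(data, estimator):
    """Standard error from the n leave-one-out estimates of the statistic."""
    n = len(data)
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((t - centre) ** 2 for t in leave_one_out))


def bootstrap_estimates(data, estimator, n_boot, m=None, seed=0):
    """B estimates, each from m observations drawn with replacement from the sample."""
    rng = random.Random(seed)
    m = len(data) if m is None else m      # m may be larger or smaller than n
    return [estimator(rng.choices(data, k=m)) for _ in range(n_boot)]


jack_se = jackknife_std_error(losses, statistics.mean)
boot = bootstrap_estimates(losses, statistics.mean, n_boot=1000)
boot_se = statistics.stdev(boot)           # spread of the bootstrap estimates

print(f"jackknife SE = {jack_se:,.2f}")
print(f"bootstrap SE = {boot_se:,.2f}")
```

Swapping `statistics.mean` for a function that returns a Hill or PWM estimate would give the corresponding standard errors for those parameters.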

Frequency Distributions

Number of Frauds

Month         Number of frauds
January       95
February      82
March         114
April         74
May           79
June          160
July          110
August        115
(unlabeled)   118
(unlabeled)   126

λ = 102

Figure: Poisson PDF (probability from 0.00% to 4.50% against number of frauds from 0 to 200)

Figure: Poisson CDF (probability from 0.00% to 100.00% against number of frauds from 0 to 160), with the 91%, 95% and 99% levels marked

Poisson Distribution (cumulative form):

$$F(x) = \sum_{k=0}^{x} \frac{e^{-\lambda}\,\lambda^{k}}{k!}$$

Other popular distributions to estimate frequency are the geometric, negative binomial, binomial, Weibull, etc.
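The Poisson probabilities and the quantiles marked on the CDF chart can be computed with the standard library. This sketch assumes that the "102" on the slide is the Poisson rate; the probability mass is evaluated in logs so that large counts do not overflow.

```python
import math


def poisson_pmf(k, lam):
    """P(N = k) = exp(-lam) * lam**k / k!, evaluated in logs for numerical safety."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_cdf(x, lam):
    """P(N <= x): sum of the probability mass from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(0, x + 1))


def poisson_quantile(p, lam):
    """Smallest count whose cumulative probability reaches p."""
    k, cumulative = 0, poisson_pmf(0, lam)
    while cumulative < p:
        k += 1
        cumulative += poisson_pmf(k, lam)
    return k


lam = 102   # rate taken from the slide (assumed to be the Poisson parameter)
for level in (0.91, 0.95, 0.99):
    print(f"{level:.0%} quantile: {poisson_quantile(level, lam)} frauds per month")
```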

Measuring Operational Risk

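Figure: severity distribution (probability against loss sizes) and frequency distribution (probability against number of losses) combined into the aggregated loss distribution (probability against aggregated losses)

The aggregated loss distribution needs to be solved by simulation. With p_n the probability of n losses and F_X^{*n} the n-fold convolution of the severity distribution, it is:

$$\sum_{n=0}^{\infty} p_n\, F_X^{*n}(x)$$

No analytical solution!

1) Fast Fourier Transform
2) Panjer Algorithm
3) Recursion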
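A Monte Carlo version of the aggregation step can be sketched as follows. The inputs are assumptions chosen for illustration: a Poisson frequency with rate 102 and a lognormal severity with the parameters 3.2 and 0.78 quoted earlier in the deck. The slides do not say that these belong to the same data set, and the function names, number of simulations and seed are hypothetical.

```python
import math
import random


def poisson_draw(rng, lam):
    """One Poisson draw by Knuth's multiplication method (adequate for moderate rates)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_aggregate_losses(lam, log_mean, log_sd, n_sims, seed=0):
    """Simulate total losses per period: draw a loss count, then sum that many severities."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        n_losses = poisson_draw(rng, lam)
        totals.append(sum(rng.lognormvariate(log_mean, log_sd) for _ in range(n_losses)))
    return totals


def empirical_quantile(values, p):
    """Smallest simulated value with at least a fraction p of the sample at or below it."""
    ordered = sorted(values)
    index = max(0, min(len(ordered) - 1, math.ceil(p * len(ordered)) - 1))
    return ordered[index]


totals = simulate_aggregate_losses(lam=102, log_mean=3.2, log_sd=0.78, n_sims=2000)
op_var_95 = empirical_quantile(totals, 0.95)
print(f"mean aggregate loss = {sum(totals) / len(totals):,.1f}")
print(f"95% operational VaR = {op_var_95:,.1f}")
```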

Model Backtesting and Validation

Currently for Market / Credit Risks

$$MRC_{mt+1} = \max\left[\, VaR_{mt}(10,1);\; S_{mt} \times \frac{1}{60}\sum_{i=0}^{59} VaR_{mt-i}(10,1) \right] + \text{CreditCharge}$$

S_{mt}: multiplier based on backtests (between 3 and 4)
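The capital rule above reduces to a comparison between the latest VaR figure and a scaled 60-day average. The sketch below assumes a list of daily VaR(10, 1) figures ordered from oldest to newest; the function name, argument names and sample numbers are hypothetical.

```python
def market_risk_capital(var_history, multiplier, credit_charge=0.0):
    """Capital charge: max(latest VaR, multiplier x 60-day average VaR) + credit charge.

    var_history holds daily VaR(10, 1) figures, oldest first, most recent last.
    """
    if len(var_history) < 60:
        raise ValueError("need at least 60 daily VaR figures")
    latest = var_history[-1]
    average_60 = sum(var_history[-60:]) / 60
    return max(latest, multiplier * average_60) + credit_charge


# Hypothetical inputs: a flat VaR of 10 and the minimum multiplier of 3
flat_capital = market_risk_capital([10.0] * 60, multiplier=3.0)

# Hypothetical spike on the last day: the latest VaR dominates the scaled average
spike_capital = market_risk_capital([1.0] * 59 + [100.0], multiplier=3.0)

print(flat_capital, spike_capital)
```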

Model Backtesting and Validation

Kupiec Test

$$\Pr(x) = \binom{n}{x}\, p^{x} (1-p)^{n-x}$$

$$LR = 2\left[\ln\left(\alpha^{x}(1-\alpha)^{n-x}\right) - \ln\left(p^{x}(1-p)^{n-x}\right)\right]$$

where α = x/n is the observed exception rate

Exceptions can be modelled as independent draws from a binomial distribution
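The binomial probability and the likelihood ratio can be evaluated directly. The sketch below assumes that x is the number of exceptions in n observations and p the expected exception rate; the sample inputs are hypothetical. Under the null hypothesis the statistic is asymptotically chi-square with one degree of freedom, so values above about 3.84 reject at the 5% level.

```python
import math


def binomial_probability(x, n, p):
    """Probability of exactly x exceptions in n independent trials with rate p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def kupiec_lr(x, n, p):
    """Likelihood ratio comparing the observed exception rate x/n with the expected rate p."""
    def log_likelihood(rate):
        # x*ln(rate) + (n-x)*ln(1-rate), treating 0*ln(0) as 0
        total = 0.0
        if x > 0:
            total += x * math.log(rate)
        if n - x > 0:
            total += (n - x) * math.log(1 - rate)
        return total

    return 2 * (log_likelihood(x / n) - log_likelihood(p))


# Hypothetical backtest: 10 exceptions in 250 days against a 1% expected rate
lr = kupiec_lr(10, 250, 0.01)
print(f"Pr(10 exceptions) = {binomial_probability(10, 250, 0.01):.6f}")
print(f"LR = {lr:.2f}")
```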

Interval Forecast Method

$$I_{mt+1} = \begin{cases} 1 & \text{if } \varepsilon_{t+1} < VaR_{mt} \\ 0 & \text{if } \varepsilon_{t+1} \ge VaR_{mt} \end{cases}$$

The series must exhibit correct conditional coverage, that is, correct unconditional coverage and serial independence

Regulatory Loss Functions

$$C_{mt+1} = \begin{cases} f(\varepsilon_{t+1}, VaR_{mt}) & \text{if } \varepsilon_{t+1} < VaR_{mt} \\ g(\varepsilon_{t+1}, VaR_{mt}) & \text{if } \varepsilon_{t+1} \ge VaR_{mt} \end{cases}$$

$$C_m = \sum_{i=1}^{n} C_{mt+i}$$

Define benchmarks (some subjectivity)

Under very general conditions, accurate VaR estimates will generate the lowest possible numerical score
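The exception indicator and the loss-function score can be sketched together. The slide leaves f and g unspecified, so both choices below are assumptions: a binary loss (1 for an exception, 0 otherwise) and a magnitude loss that adds the squared size of the breach. VaR is treated as a negative P&L quantile, and the P&L series is hypothetical.

```python
def exception_indicators(pnl, var_forecasts):
    """I = 1 when the realised P&L falls below the VaR forecast for that day, else 0."""
    return [1 if eps < var else 0 for eps, var in zip(pnl, var_forecasts)]


def loss_score(pnl, var_forecasts, f, g):
    """Total score: f on exception days, g on all other days."""
    return sum(
        f(eps, var) if eps < var else g(eps, var)
        for eps, var in zip(pnl, var_forecasts)
    )


def binary_f(eps, var):
    """Assumed loss on an exception day: a flat penalty of 1."""
    return 1.0


def magnitude_f(eps, var):
    """Assumed loss on an exception day: 1 plus the squared size of the breach."""
    return 1.0 + (eps - var) ** 2


def zero_g(eps, var):
    """Assumed loss on a non-exception day: nothing."""
    return 0.0


# Hypothetical daily P&L and a constant VaR forecast of -2.0
pnl = [0.5, -2.5, 1.0, -0.3, -3.0]
var_forecasts = [-2.0] * len(pnl)

hits = exception_indicators(pnl, var_forecasts)
binary_score = loss_score(pnl, var_forecasts, binary_f, zero_g)
magnitude_score = loss_score(pnl, var_forecasts, magnitude_f, zero_g)
print(hits, binary_score, magnitude_score)
```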

Understanding the Causes - Multifactor Modeling

For Example:

Try to link causes to loss events. We are trying to explain the frequency and severity of frauds by using 3 different factors.

Month      Number of Op Errors   Losses ($$)   System Downtime   N. of Employees   No. of Transactions
January    95                    1,200,000     20                16                1,003
February   82                    920,000       17                16                910
March      114                   1,770,987     30                14                1,123
April      74                    652,000       15                17                903
May        79                    710,345       16                17                910
June       160                   2,100,478     41                13                1,250
July       110                   1,650,000     33                14                1,196

N. of Op Errors = 88.88 + 6.92 System Downtime + 5.32 Employees - 0.22 N. of Transactions

R² = 95%, F-test = 20.69, p-value = (0.01)

Losses = 4,597,086.21 - 7,300.01 System Downtime - 286,228.59 Employees + 1,193 N. of Transactions

R² = 97%, F-test = 42.57, p-value = (0.00)

Understanding the Causes - Multifactor Modeling

Benefits of the Model 1) Scenario Analysis / Stress Tests Ex: Using confidence intervals (95%) of the parameters to estimate the number of frauds and the losses ($$) for the next month.

2) Cost / Benefit Analysis Ex: If we hire 1 employee costing 100,000/year the reduction in losses is estimated to be 286,228.
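The two fitted equations can be evaluated for any scenario, which is all the scenario and cost/benefit examples above require. The sketch below uses the coefficients as printed on the slide and the first month's inputs (downtime 20, 16 employees, 1,003 transactions); the function names are illustrative, and the cost/benefit figure simply subtracts the slide's 100,000 cost from the employee coefficient.

```python
def predicted_op_errors(downtime, employees, transactions):
    """Fitted number of operational errors, coefficients as printed on the slide."""
    return 88.88 + 6.92 * downtime + 5.32 * employees - 0.22 * transactions


def predicted_losses(downtime, employees, transactions):
    """Fitted losses ($), coefficients as printed on the slide."""
    return 4597086.21 - 7300.01 * downtime - 286228.59 * employees + 1193 * transactions


# Scenario: the first month's inputs from the data table
january = {"downtime": 20, "employees": 16, "transactions": 1003}
errors_jan = predicted_op_errors(**january)
losses_jan = predicted_losses(**january)

# Cost/benefit: effect of one extra employee, all else equal
with_extra_hire = dict(january, employees=january["employees"] + 1)
loss_reduction = losses_jan - predicted_losses(**with_extra_hire)
net_benefit = loss_reduction - 100000      # employee cost quoted on the slide

print(f"predicted errors = {errors_jan:.2f}, predicted losses = {losses_jan:,.2f}")
print(f"loss reduction = {loss_reduction:,.2f}, net benefit = {net_benefit:,.2f}")
```

The same functions can be called with the upper and lower ends of a confidence interval for each factor to build stress scenarios.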

Developing an OR Hedging Program

OPERATIONAL RISK (MEASURED)

Internal: Mitigation (non-financial), Capital Allocation

Risk Transfer: Insurance, Securitization

Insurance
• General coverage rather than specific risks
• It would not pay immediately after a catastrophe (although some new products claim to do so)

Securitization
• Specific coverage
• Immediate protection against catastrophes

Developing an OR Hedging Program

AGENT: FINANCIAL INSTITUTION
• Instrument: Insurance policy
• Financial results: Pays a premium
• Risks: None up to the limit insured

AGENT: RISK TRANSFER COMPANY or SPV
• Instrument: Takes the risk and issues bonds linked to an operational event at the financial institution
• Financial results: Receives a commission
• Risks: None

AGENT: CAPITAL MARKET
• Instrument: Buys the bond
• Financial results: Receives high yield
• Risks: If the operational event described in the bond happens in the financial institution, loss of some or all of the principal or interest

Developing an OR Hedging Program

Figure: loss CDF divided into layers (Retain, Insurance, ORL Bond (OR insurance)), with the optimal point and the OpVar marked

Conclusion

• It is possible to use robust methods to measure OR
• OR-related events do not follow Gaussian patterns
• More than just finding an Operational VaR, it is necessary to relate the losses to some tangible factors, making OR management feasible
• Detailed measurement means that product pricing may incorporate OR
• Data collection is very important anyway!

My e-mail is [email protected]