WFM 5201: Data Management and Statistical Analysis Lecture-11: Frequency Analysis Akm Saiful Islam


WFM 5201: Data Management and Statistical Analysis © Dr. Akm Saiful Islam
WFM 5201: Data Management and
Statistical Analysis
Lecture-11: Frequency Analysis
Akm Saiful Islam
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)
June, 2008
Frequency Analysis

- Plotting Position Formula and Probability Plot
- Analytical Frequency Analysis
  - Normal and Log-normal distribution
  - Gumbel's Extreme Value distribution Type I
  - Log Pearson Type III distribution
Introduction to Frequency Analysis

- The magnitude of an extreme event is inversely related to its frequency of occurrence, very severe events occurring less frequently than more moderate events.
- The objective of frequency analysis is to relate the magnitude of extreme events to their frequency of occurrence through the use of probability distributions.
- Frequency analysis is defined as the investigation of population sample data to estimate recurrence or probabilities of magnitudes.
- It is one of the earliest and most frequent uses of statistics in hydrology and natural sciences.
- Early applications of frequency analysis were largely in the area of flood flow estimation.
- Today nearly every phase of hydrology and natural sciences is subjected to frequency analyses.
Methods

- Two methods of frequency analysis are described: one is a straightforward plotting technique to obtain the cumulative distribution and the other uses frequency factors.
- The cumulative distribution function provides a rapid means of determining the probability of an event equal to or less than some specified quantity. The inverse is used to obtain the recurrence intervals.
- The analytical frequency analysis is a simplified technique based on frequency factors, which depend on the distributional assumption that is made and on the mean, the variance and, for some distributions, the coefficient of skew of the data.
Plotting Position Formula

The frequency of an event can be obtained by use of “plotting position” formulas, where

P = the probability of occurrence
n = the number of values
m = the rank of descending values, with the largest equal to 1
T = 1/P = the recurrence interval (mean time between exceedances)
a = c = parameters depending on n




Plotting Position relationship
Parameters

n    10     20     30     40     50
a    0.448  0.443  0.442  0.441  0.440

n    60     70     80     90     100
a    0.440  0.440  0.440  0.439  0.439

a is generally recommended as 0.4.
For normal distribution a = 3/8
For Gumbel’s (EV1) distribution a = 0.4
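
The plotting position formula itself did not survive in this transcript, so the sketch below assumes the commonly used one-parameter family P = (m - a)/(n + 1 - 2a), which reduces to the Weibull form m/(n + 1) when a = 0. The function names `plotting_position` and `recurrence_interval` are hypothetical, chosen only for illustration.

```python
def plotting_position(m, n, a=0.4):
    """Exceedance probability for rank m (1 = largest) out of n values.

    Assumed general form (not taken from the slides): P = (m - a) / (n + 1 - 2a).
    a = 0 gives the Weibull formula; a = 3/8 is the value quoted for the
    normal distribution; a = 0.4 is the generally recommended value.
    """
    if not 1 <= m <= n:
        raise ValueError("rank m must satisfy 1 <= m <= n")
    return (m - a) / (n + 1 - 2 * a)


def recurrence_interval(p):
    """Recurrence interval T = 1/P for an exceedance probability P."""
    return 1.0 / p


# Compare the largest value of a 23-year record under three choices of a.
for a in (0.0, 0.375, 0.4):
    p = plotting_position(1, 23, a)
    print(f"a = {a}: P = {p:.4f}, T = {recurrence_interval(p):.1f} years")
```

With a = 0 the largest of 23 values gets P = 1/24, so T = 24 years. A nonzero a assigns the same observation a longer recurrence interval.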
Exercise-1:

Using the 23 years of annual precipitation depths for a station given in the table below, estimate the exceedance frequency and recurrence intervals of the highest ten values using the Weibull equation.
Here, n = 23

Year   Rain depth (in)   Rank, m   P = m/(n+1)   Tr (year)
1981   24                1         0.042         24
1986   23                3         0.125         8
1988   23                3         0.125         8
1978   21                4         0.167         6
1993   20                5         0.208         4.8
1999   19                8         0.333         3
1998   19                8         0.333         3
1980   19                8         0.333         3
1990   18                10        0.417         2.4
1983   18                10        0.417         2.4
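
The table can be reproduced with a few lines of code. This is a minimal sketch, assuming that tied depths share the largest rank in their group, which is what the ranks 3, 8 and 10 in the table suggest. The names `top_ten`, `N_YEARS` and `weibull_rows` are hypothetical.

```python
# Ten highest annual rain depths (in) from the Exercise-1 table, with years.
top_ten = [
    (1981, 24), (1986, 23), (1988, 23), (1978, 21), (1993, 20),
    (1999, 19), (1998, 19), (1980, 19), (1990, 18), (1983, 18),
]
N_YEARS = 23  # length of the full record


def weibull_rows(records, n):
    """Return (year, depth, rank m, P, Tr) rows using P = m / (n + 1).

    Assumption: tied depths share the largest rank of their tie group.
    """
    depths = [d for _, d in records]
    rows = []
    for year, depth in sorted(records, key=lambda r: r[1], reverse=True):
        m = sum(1 for d in depths if d >= depth)  # rank counted from the largest
        p = m / (n + 1)                           # Weibull exceedance probability
        rows.append((year, depth, m, p, 1.0 / p))  # Tr = 1 / P
    return rows


rows = weibull_rows(top_ten, N_YEARS)
for year, depth, m, p, tr in rows:
    print(f"{year}  {depth:>3}  {m:>2}  {p:.3f}  {tr:.1f}")
```

Because only the ten highest values are listed, the rank of the last tie group (18 in) is correct only if no further 18 in values occur in the remaining 13 years of the record.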
Probability plot
- A probability plot is a plot of a magnitude versus a probability.
- Determining the probability to assign to a data point is commonly referred to as determining the plotting position.


Plotting position may be expressed as a probability from 0 to 1 or a percent from 0 to 100. Which method is being used should be clear from the context. In some discussions of probability plotting, especially in hydrologic literature, the probability scale is used to denote $\mathrm{prob}(X > x)$ or $1 - P_X(x)$. One can always transform the probability scale $1 - P_X(x)$ to $P_X(x)$ or even $T_X(x)$ if desired.
Gumbel’s (1958) Criteria

- The plotting position must be such that all observations can be plotted.
- The plotting position should lie between the observed frequencies $(m-1)/n$ and $m/n$, where $m$ is the rank of the observation, beginning with $m = 1$ for the largest (smallest) value, and $n$ is the number of years of record (if applicable) or the number of observations.
- The return period of a value equal to or larger than the largest observation and the return period of a value equal to or smaller than the smallest observation should converge toward $n$.
- The observations should be equally spaced on the frequency scale.
- The plotting position should have an intuitive meaning, be analytically simple, and be easy to use.
Steps for probability plot

1. Rank the data from the largest (smallest) to the smallest (largest) value. If two or more observations have the same value, several procedures can be used for assigning a plotting position.
2. Calculate the plotting position of each data point from the plotting position relationship table presented in the earlier slide.
3. Select the type of probability paper to be used.
4. Plot the observations on the probability paper.
5. Draw the theoretical normal line. For the normal distribution, the line should pass through the mean plus one standard deviation at 84.1% and the mean minus one standard deviation at 15.9%.
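
The steps above can be sketched numerically without probability paper by converting each plotting position to a standard normal quantile. This is an illustrative sketch only: it assumes the plotting position form (m - a)/(n + 1 - 2a) with a = 3/8, gives tied values consecutive ranks (one of the several possible tie procedures), and uses only the five rainfall values listed in Exercise-2 as sample data. The function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def normal_plot_points(values, a=0.375):
    """Return (value, cumulative probability, z) for each observation."""
    n = len(values)
    ordered = sorted(values, reverse=True)        # step 1: rank, largest first
    points = []
    for m, x in enumerate(ordered, start=1):
        p_exceed = (m - a) / (n + 1 - 2 * a)      # step 2: assumed plotting position
        p_cum = 1.0 - p_exceed                    # probability of a value <= x
        z = NormalDist().inv_cdf(p_cum)           # steps 3-4: normal probability scale
        points.append((x, p_cum, z))
    return points


def normal_line(values, p_cum):
    """Step 5: theoretical normal line evaluated at cumulative probability p_cum."""
    return mean(values) + NormalDist().inv_cdf(p_cum) * stdev(values)


sample = [43, 44, 38, 31, 47]  # the five values listed in Exercise-2
for x, p_cum, z in normal_plot_points(sample):
    print(f"x = {x}, P(X <= x) = {p_cum:.3f}, z = {z:+.3f}")

# The line passes through mean + s at about 84.1% and mean - s at about 15.9%.
print(normal_line(sample, 0.841), mean(sample) + stdev(sample))
```

Plotting x against z on ordinary axes is equivalent to plotting x against probability on normal probability paper, so points from a normal sample should fall near the straight line.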
Analytical Frequency Analysis

Chow has proposed

$$x_T = \bar{x} + K_T s$$

where $K_T$ is the frequency factor, $s$ is the standard deviation and $\bar{x}$ is the mean value.
Methods of Analytical Frequency Analysis

- Normal distribution
- Log-normal distribution
- Gumbel’s Extreme Value distribution Type I
- Log Pearson Type III distribution
Normal Distribution
The probability that X is less than or equal to x when X is $N(\mu, \sigma^2)$ can be evaluated from

$$\mathrm{prob}(X \le x) = P_X(x) = \int_{-\infty}^{x} (2\pi\sigma^2)^{-1/2}\, e^{-(t-\mu)^2/(2\sigma^2)}\, dt \qquad (4.9)$$

The parameters $\mu$ (mean) and $\sigma^2$ (variance) are denoted as location and scale parameters, respectively. The normal distribution is a bell-shaped, continuous and symmetrical distribution (the coefficient of skew is zero).
Standard normal distribution

The probability that X is less than or equal to x when X is $N(\mu, \sigma^2)$ can be evaluated from

$$\mathrm{prob}(X \le x) = P_X(x) = \int_{-\infty}^{x} (2\pi\sigma^2)^{-1/2}\, e^{-(t-\mu)^2/(2\sigma^2)}\, dt \qquad (4.9)$$

Equation (4.9) cannot be evaluated analytically, so approximate methods of integration are required. If a tabulation of the integral was made, a separate table would be required for each value of $\mu$ and $\sigma$. By using the linear transformation $Z = (X - \mu)/\sigma$, the random variable Z will be N(0,1). The random variable Z is said to be standardized (has $\mu = 0$ and $\sigma^2 = 1$) and N(0,1) is said to be the standard normal distribution. The standard normal distribution is given by

$$p_Z(z) = (2\pi)^{-1/2}\, e^{-z^2/2}, \qquad -\infty < z < \infty \qquad (4.10)$$

and the cumulative standard normal is given by

$$\mathrm{prob}(Z \le z) = P_Z(z) = \int_{-\infty}^{z} (2\pi)^{-1/2}\, e^{-t^2/2}\, dt \qquad (4.11)$$

Figure 4.2.1.3: Standard normal distribution

- Figure 4.2.1.3 shows the standard normal distribution which, along with the transformation $Z = (X - \mu)/\sigma$, contains all of the information shown in Figures 4.1 and 4.2.
- Both $p_Z(z)$ and $P_Z(z)$ are widely tabulated.
- Most tables utilize the symmetry of the normal distribution so that only positive values of Z are shown.
- Tables of $P_Z(z)$ may show $\mathrm{prob}(Z \le z)$ or $\mathrm{prob}(0 \le Z \le z)$.
- Care must be exercised when using normal probability tables to see what values are tabulated.
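
Where a printed table is not at hand, the standard normal density and cumulative distribution can be evaluated directly. The sketch below uses the error function from Python's standard library and shows the two tabulation conventions mentioned above side by side. The function names are hypothetical.

```python
import math


def standardize(x, mu, sigma):
    """Linear transformation Z = (X - mu) / sigma."""
    return (x - mu) / sigma


def pdf_z(z):
    """Standard normal density p_Z(z) = (2*pi)^(-1/2) * exp(-z^2 / 2)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def cdf_z(z):
    """Cumulative standard normal P_Z(z) = prob(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_0_to_z(z):
    """Alternative table convention: prob(0 <= Z <= z) for z >= 0."""
    return cdf_z(z) - 0.5


z = 1.0
print(f"prob(Z <= {z})      = {cdf_z(z):.4f}")
print(f"prob(0 <= Z <= {z}) = {area_0_to_z(z):.4f}")
# Symmetry lets a table list only positive z: P_Z(-z) = 1 - P_Z(z).
print(f"prob(Z <= {-z})     = {cdf_z(-z):.4f}")
```

The two conventions differ by exactly 0.5 for positive z, which is why reading the wrong column convention gives a plainly wrong probability.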

Exercise-2: Assume the following data follow a normal distribution. Find the rain depth that would have a recurrence interval of 100 years.

Year    Annual Rainfall (in)
2000    43
1999    44
1998    38
1997    31
1996    47
…..     …..
Solution:
Mean = 41.5 in, St. Dev = 6.7 in (given)
x = Mean + St. Dev × z
x = 41.5 + z(6.7)
P(z) = 1/T = 1/100 = 0.01
F(z) = 1.0 - P(z) = 0.99
From interpolation using Table E.4, z = 2.33
x = 41.5 + (2.33 × 6.7) = 57.11 in
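
The same calculation can be checked without a table. For the normal distribution, the frequency factor in Chow's equation is the standard normal variate z at cumulative probability 1 - 1/T. The sketch below uses `statistics.NormalDist` from the Python standard library, and the function name `normal_design_value` is hypothetical.

```python
from statistics import NormalDist


def normal_design_value(mean, std_dev, return_period):
    """Return (x_T, K_T) for a normal distribution: x_T = mean + K_T * std_dev."""
    p_exceed = 1.0 / return_period                 # P = 1/T
    k_t = NormalDist().inv_cdf(1.0 - p_exceed)     # z at F = 1 - P
    return mean + k_t * std_dev, k_t


# Values given in the solution: mean 41.5 in, standard deviation 6.7 in.
x_100, k_100 = normal_design_value(41.5, 6.7, 100)
print(f"K_T = {k_100:.4f}, x_100 = {x_100:.2f} in")
```

The unrounded z is slightly below 2.33, so the computed depth comes out a few hundredths of an inch below the 57.11 in obtained with the rounded table value.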
Table: Area under standardized normal distribution
Linear Interpolation from Z-table

$$\frac{z - z_1}{p - p_1} = \frac{z_2 - z_1}{p_2 - p_1}$$

For $p = 0.99$, find $z$.
From the table, $z_1 = 2.32$, $p_1 = 0.9898$ and $z_2 = 2.33$, $p_2 = 0.9901$:

$$\frac{z - 2.32}{0.99 - 0.9898} = \frac{2.33 - 2.32}{0.9901 - 0.9898}$$
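
Solving the interpolation relation for z gives z = z1 + (p - p1)(z2 - z1)/(p2 - p1). A minimal sketch with the table values quoted above follows. The function name `interpolate_z` is hypothetical.

```python
def interpolate_z(p, z1, p1, z2, p2):
    """Linear interpolation of z at probability p between (z1, p1) and (z2, p2)."""
    if p2 == p1:
        raise ValueError("p1 and p2 must differ")
    return z1 + (p - p1) * (z2 - z1) / (p2 - p1)


z = interpolate_z(0.99, z1=2.32, p1=0.9898, z2=2.33, p2=0.9901)
print(f"z = {z:.4f}")
```

The interpolated value lies about two thirds of the way from 2.32 to 2.33, which rounds to the z = 2.33 used in the solution of Exercise-2.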