Basic Probability and Statistics


Outline: random variables, distribution functions, and various probability distributions.

Definitions

• An experiment is a process whose output is not known with certainty.

• The set of all possible outcomes of an experiment is called the sample space (S).

• The outcomes are called sample points in S.

• A random variable is a function that assigns a real number to each point in S.

• A distribution function F(x) of the random variable X is defined for each real number x as follows:

F(x) = Pr(X ≤ x).

Properties of distribution function

• 0 ≤ F(x) ≤ 1 for all x.

• F(x) is nondecreasing: for x₁ < x₂, F(x₁) ≤ F(x₂).

• lim_{x→∞} F(x) = 1;  lim_{x→−∞} F(x) = 0.

Random Variables

• A random variable (r.v.) X is discrete if it can take on at most a countable number of values x₁, x₂, x₃, …

• The probability that the discrete r.v. X takes on the value xᵢ is given by p(xᵢ) = Pr(X = xᵢ). The function p(x) is called the probability mass function (pmf).

• ∑_{i=1}^{∞} p(xᵢ) = 1.

• F(x) = ∑_{xᵢ ≤ x} p(xᵢ).
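As a concrete illustration (not part of the original slides), here is a minimal Python sketch with a made-up pmf, showing that the probabilities sum to 1 and how F(x) is obtained by summing p(xᵢ) over xᵢ ≤ x:

```python
# Minimal sketch: a hypothetical discrete pmf and its distribution function F(x).
pmf = {1: 0.2, 2: 0.5, 3: 0.3}   # p(x_i) for a made-up discrete r.v.

assert abs(sum(pmf.values()) - 1.0) < 1e-12   # probabilities sum to 1

def F(x):
    """Distribution function F(x) = sum of p(x_i) over all x_i <= x."""
    return sum(p for xi, p in pmf.items() if xi <= x)

print(F(0), F(1), F(2.5), F(3))   # approximately 0.0, 0.2, 0.7, 1.0
```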

Random Variables

• A r.v. X is said to be continuous if there exists a nonnegative function f(x), called the probability density function (pdf), such that for any set of real numbers B,

Pr(X ∈ B) = ∫_B f(x) dx,   and   ∫_{−∞}^{∞} f(x) dx = 1.

• Its distribution function is

F(x) = Pr(X ≤ x) = Pr(X ∈ (−∞, x]) = ∫_{−∞}^{x} f(y) dy.

Random Variables

• The mean or expected value of a r.v. X is denoted by E[X] or µ, and is given by:

E[X] = ∑_{j=1}^{∞} xⱼ p(xⱼ)   if X is discrete,
E[X] = ∫_{−∞}^{∞} x f(x) dx   if X is continuous.

• The variance of a r.v. X is denoted by Var(X) or σ², and is given by:

σ² = E[(X − µ)²] = E[X²] − µ².
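To make the discrete case concrete, the following sketch (with an assumed pmf, not from the slides) computes E[X] and Var(X) and checks the identity σ² = E[X²] − µ²:

```python
# Mean and variance of a discrete r.v. from an assumed pmf.
pmf = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}   # hypothetical p(x_j)

mean = sum(x * p for x, p in pmf.items())                 # E[X] = sum of x_j p(x_j)
ex2  = sum(x * x * p for x, p in pmf.items())             # E[X^2]
var  = sum((x - mean) ** 2 * p for x, p in pmf.items())   # E[(X - mu)^2]

print(mean, var, ex2 - mean ** 2)   # var and E[X^2] - mu^2 agree
```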

Properties of mean

• If X is a discrete random variable having pmf p(x), then:

E[g(X)] = ∑_x g(x) p(x).

• If X is continuous with pdf f(x), then:

E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx.

• Hence, for constants a and b,

E[aX + b] = a E[X] + b.

Property of variance

• For constants a and b,

Var(aX + b) = a² Var(X).

Joint Distribution

• If X and Y are discrete r.v.'s, then

p(x, y) = Pr(X = x, Y = y),   for all x, y,

is called the joint probability mass function of X and Y.

• Marginal probability mass functions of X and Y:

p_X(x) = ∑_y p(x, y),    p_Y(y) = ∑_x p(x, y).

• X and Y are independent if

p(x, y) = p_X(x) p_Y(y)   for all x, y.

Conditional probability

• Let A and B be two events. Pr(A|B) is the conditional probability of event A occurring given that B has already occurred.

• Bayes' theorem:

Pr(A|B) = Pr(A ∩ B) / Pr(B).

• If events A and B are independent, then Pr(A|B) = Pr(A).

• Hence, from Bayes' theorem: Pr(A ∩ B) = Pr(A) Pr(B).

Dependency

• Covariance is a measure of linear dependence between two r.v.'s Xᵢ and Xⱼ, denoted by C_ij or Cov(Xᵢ, Xⱼ):

C_ij = E[(Xᵢ − µᵢ)(Xⱼ − µⱼ)] = E[Xᵢ Xⱼ] − µᵢ µⱼ,   i = 1..n, j = 1..n.

• Another measure of linear dependence is the correlation factor:

ρ_ij = C_ij / √(σᵢ² σⱼ²),   i = 1..n, j = 1..n.

• The correlation factor is dimensionless, but covariance is not.

Two random numbers in simulation experiment

• Let X and Y be two random variates in a given simulation experiment that are not independent.

• Our performance parameter is X + Y. Then:

E[X + Y] = E[X] + E[Y],
Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).

• However, if the two r.v.'s are independent:

Cov(X, Y) = 0,   so   Var(X + Y) = Var(X) + Var(Y).
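A small simulation sketch (using hypothetical dependent variates, not from the slides) illustrates the last two slides: covariance and correlation computed with numpy, and the identity Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y):

```python
import numpy as np

# Hypothetical dependent variates: Y shares a component with X by construction.
rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.6 * x + rng.normal(size=100_000)

cov_xy = np.cov(x, y, ddof=0)[0, 1]     # sample covariance Cov(X, Y)
rho_xy = np.corrcoef(x, y)[0, 1]        # dimensionless correlation factor

# Var(X+Y) should match Var(X) + Var(Y) + 2*Cov(X, Y).
print(np.var(x + y), np.var(x) + np.var(y) + 2 * cov_xy)
print(cov_xy, rho_xy)
```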

Bernoulli trial

• An experiment with only two outcomes, "Success" and "Failure", where the chance of each outcome is known a priori.

• The distribution is characterized by the chance of success, p (the parameter of the distribution).

• Example: tossing a "fair" coin.

• Let us define a variable Xᵢ such that

Xᵢ = 1 if trial i is a success, and Xᵢ = 0 otherwise.

• Then, E[Xᵢ] = p and Var(Xᵢ) = p(1 − p).

Binomial r.v.

• A binomial r.v. arises from a series of n independent Bernoulli trials.

• If X is the number of successes that occur in the n trials, then X is said to be a binomial r.v. with parameters (n, p). Its probability mass function is:

Pr(X = x) = C(n, x) p^x (1 − p)^(n − x),   x = 0, 1, 2, …, n,

where C(n, x) = n! / (x! (n − x)!).

Binomial r.v.

• A binomial r.v. can be written as X = ∑_{i=1}^{n} Xᵢ, where

Xᵢ = 1 if trial i is a success, and Xᵢ = 0 otherwise.

• Therefore,

E[X] = ∑_{i=1}^{n} E[Xᵢ] = np,    Var(X) = ∑_{i=1}^{n} Var(Xᵢ) = np(1 − p).
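A quick simulation sketch (parameters chosen arbitrarily) illustrates that the sum of n Bernoulli(p) trials behaves like a Binomial(n, p) r.v. with mean np and variance np(1 − p):

```python
import numpy as np

n, p, reps = 20, 0.3, 200_000        # hypothetical parameters
rng = np.random.default_rng(1)

# Each row is one experiment: n independent Bernoulli(p) trials.
trials = rng.random((reps, n)) < p
x = trials.sum(axis=1)               # number of successes in each experiment

print(x.mean(), n * p)               # close to np = 6.0
print(x.var(), n * p * (1 - p))      # close to np(1-p) = 4.2
```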

Poisson r.v.

• A r.v. X which can take the values 0, 1, 2, … is said to have a Poisson distribution with parameter λ (λ > 0) if the pmf is given by:

pᵢ = Pr(X = i) = e^(−λ) λ^i / i!,   i = 0, 1, 2, …

• For a Poisson r.v., E[X] = Var(X) = λ.

• The probabilities can be found recursively:

p_{i+1} = (λ / (i + 1)) pᵢ,   i ≥ 0.
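The recursion avoids computing factorials directly. A minimal sketch (λ chosen arbitrarily) builds the pmf from p₀ = e^(−λ) and checks that the probabilities sum to about 1 with mean close to λ:

```python
import math

lam = 2.5                      # hypothetical rate parameter
p = math.exp(-lam)             # p_0 = e^{-lambda}
probs = [p]
for i in range(20):            # compute p_1 ... p_20 recursively
    p = lam / (i + 1) * p      # p_{i+1} = lambda/(i+1) * p_i
    probs.append(p)

print(sum(probs))                                # close to 1
print(sum(i * q for i, q in enumerate(probs)))   # mean, close to lambda
```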

Uniform r.v.

• A r.v. X is said to be uniformly distributed over the interval (a, b) when its pdf is:

f(x) = 1 / (b − a)   if a ≤ x ≤ b,   and 0 otherwise.

• Expected value:

E[X] = (1 / (b − a)) ∫_a^b x dx = (b² − a²) / (2(b − a)) = (a + b) / 2.

E[X²] = (1 / (b − a)) ∫_a^b x² dx = (b³ − a³) / (3(b − a)) = (a² + b² + ab) / 3.

Uniform r.v.

• Variance:

Var(X) = E[X²] − (E[X])² = (b − a)² / 12.

• The distribution function F(x) for a given x, a < x < b, is

F(x) = Pr(X ≤ x) = ∫_a^x (1 / (b − a)) dy = (x − a) / (b − a).

Normal r.v.

• pdf:

f(x) = (1 / (σ √(2π))) e^(−(x − µ)² / (2σ²)),   −∞ < x < ∞.

• The normal density is a bell-shaped curve that is symmetric about µ.

• It can be shown that for a normal r.v. X with parameters (µ, σ²),

E[X] = µ,   Var(X) = σ².

Normal r.v.

• If X ~ N(µ, σ²), then Z = (X − µ) / σ ~ N(0, 1).

• The probability distribution function of the "standard normal" is given by:

Φ(x) = (1 / √(2π)) ∫_{−∞}^{x} e^(−y²/2) dy,   −∞ < x < ∞.

• If X ~ N(µ, σ²), then:

F(x) = Φ((x − µ) / σ).
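Since Φ has no closed form, it is usually evaluated numerically. A minimal sketch (values chosen arbitrarily) uses the error function from the Python standard library together with the standardization F(x) = Φ((x − µ)/σ):

```python
import math

def phi(z):
    """Standard normal CDF via the error function: Phi(z) = (1 + erf(z/sqrt(2)))/2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """F(x) = Phi((x - mu)/sigma) for X ~ N(mu, sigma^2)."""
    return phi((x - mu) / sigma)

print(phi(0.0))                  # 0.5
print(phi(1.96))                 # roughly 0.975
print(normal_cdf(110, 100, 15))  # probability that N(100, 15^2) is below 110
```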

Central Limit Theorem

• Let X₁, X₂, X₃, …, X_n be a sequence of IID random variables having a finite mean µ and finite variance σ². Then:

lim_{n→∞} Pr( (X₁ + X₂ + … + X_n − nµ) / (σ √n) ≤ x ) = Φ(x).
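A simulation sketch (using an arbitrary non-normal population, Uniform(0, 1)) illustrates the theorem: the standardized sum of n IID uniforms has an empirical CDF close to Φ:

```python
import math
import numpy as np

rng = np.random.default_rng(2)
n, reps = 30, 100_000
mu, sigma2 = 0.5, 1.0 / 12.0                 # mean and variance of Uniform(0, 1)

sums = rng.random((reps, n)).sum(axis=1)     # X_1 + ... + X_n, repeated many times
z = (sums - n * mu) / math.sqrt(n * sigma2)  # standardized sums

for x in (-1.0, 0.0, 1.0):
    empirical = (z <= x).mean()
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    print(x, empirical, phi)                 # empirical CDF is close to Phi(x)
```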

Exponential r.v.

• pdf:

f(x) = λ e^(−λx),   0 ≤ x < ∞.

• cdf:

F(x) = ∫_0^x f(y) dy = ∫_0^x λ e^(−λy) dy = 1 − e^(−λx).

• E[X] = 1/λ;   Var(X) = 1/λ².

Exponential r.v.

• When multiplied by a constant, it still remains an exponential r.v.:

Pr(cX ≤ x) = Pr(X ≤ x/c) = 1 − e^(−λx/c),   so   cX ~ Expo(λ/c).

• Most useful property: memorylessness!

Pr(X > s + t | X > t) = Pr(X > s),   s, t ≥ 0.

• Analytical simplicity: if X₁ ~ Expo(λ₁) and X₂ ~ Expo(λ₂) are independent, then

P(X₁ < X₂) = λ₁ / (λ₁ + λ₂).
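The memoryless property can be checked empirically. A minimal sketch (rate and offsets chosen arbitrarily) compares Pr(X > s + t | X > t) with Pr(X > s) and with the exact value e^(−λs):

```python
import numpy as np

rng = np.random.default_rng(3)
lam, s, t = 0.5, 2.0, 3.0                  # hypothetical rate and offsets
x = rng.exponential(scale=1.0 / lam, size=1_000_000)

survived_t = x[x > t]                      # condition on X > t
lhs = (survived_t > s + t).mean()          # Pr(X > s + t | X > t)
rhs = (x > s).mean()                       # Pr(X > s)
print(lhs, rhs, np.exp(-lam * s))          # all approximately equal
```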

Poisson process

A counting process {N(t), t ≥ 0} is said to be a Poisson process with rate λ if:

– N(0) = 0.
– The process has independent increments.
– The number of events in any interval of length t is Poisson distributed with mean λt. That is, for all s, t ≥ 0,

Pr(N(t + s) − N(s) = n) = e^(−λt) (λt)^n / n!,   n = 0, 1, 2, …

If T_n, n ≥ 1, is the time between the (n − 1)st and nth events, then this interarrival time has an exponential distribution.
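Because the interarrival times are exponential, a Poisson process can be simulated by cumulatively summing exponential interarrival times. The sketch below (rate and horizon chosen arbitrarily) checks that the count in [0, t] has mean and variance close to λt:

```python
import numpy as np

rng = np.random.default_rng(4)
lam, t_end, reps = 2.0, 10.0, 20_000        # hypothetical rate and horizon

counts = []
for _ in range(reps):
    # Generate exponential interarrival times until the horizon is exceeded.
    arrival, n_events = 0.0, 0
    while True:
        arrival += rng.exponential(scale=1.0 / lam)
        if arrival > t_end:
            break
        n_events += 1
    counts.append(n_events)

counts = np.array(counts)
print(counts.mean(), counts.var(), lam * t_end)   # both close to lambda*t = 20
```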

Useful property of Poisson process

• Let S₁¹ denote the time of the first event of the first Poisson process (with rate λ₁), and S₁² denote the time of the first event of the second Poisson process (with rate λ₂). Then:

P(S₁¹ < S₁²) = λ₁ / (λ₁ + λ₂).

Covariance stationary processes

• For a covariance-stationary process, the covariance between two observations Xᵢ and X_{i+j} depends only on j and not on i.

• Let Cⱼ be the covariance at lag j for this process.

• So the correlation factor is given by:

ρⱼ = C_{i,i+j} / √(σᵢ² σ_{i+j}²) = Cⱼ / σ²,   j = 1, 2, ….

Point Estimation

• Let X₁, X₂, X₃, …, X_n be a sequence of IID random variables (observations) having a finite population mean µ and finite population variance σ².

• We are interested in estimating these population parameters from the sample values.

• The sample mean is

X̄(n) = ∑_{i=1}^{n} Xᵢ / n.

• This sample mean is an unbiased point estimator of µ; that is to say,

E[X̄(n)] = µ.

Point Estimation

• The sample variance

S²(n) = ∑_{i=1}^{n} (Xᵢ − X̄(n))² / (n − 1)

is an unbiased point estimator of σ².

• Variance of the mean:

Var(X̄(n)) = σ² / n.

• We can estimate this variance of the mean by:

S²(n) / n.

• This is true only if X₁, X₂, X₃, …, X_n are IID.
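A small sketch (with arbitrary IID data) of the three estimators on this slide: the sample mean X̄(n), the unbiased sample variance S²(n), and the estimated variance of the mean S²(n)/n:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(loc=10.0, scale=2.0, size=50)   # hypothetical IID observations
n = len(x)

xbar = x.sum() / n                             # sample mean Xbar(n)
s2 = ((x - xbar) ** 2).sum() / (n - 1)         # unbiased sample variance S^2(n)
var_of_mean = s2 / n                           # estimate of Var(Xbar(n)) = sigma^2/n

print(xbar, s2, var_of_mean)
```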

Point Estimation

• However, most often in a simulation experiment the data are correlated.

• In that case, estimation using the sample variance is dangerous, because it underestimates the actual population variance:

E[S²(n)] ≠ σ²,   and   E[S²(n)/n] ≠ Var(X̄(n)).

Interval Estimation

• Let X₁, X₂, X₃, …, X_n be a sequence of IID random variables (observations) having a finite population mean µ and finite population variance σ² (σ² > 0).

• We want to construct a confidence interval for the mean µ.

• Let Z_n be a random variable with probability distribution F_n(z):

Z_n = (X̄(n) − µ) / √(σ² / n),    F_n(z) = Pr(Z_n ≤ z).

Interval Estimation

• The Central Limit Theorem states that:

F_n(z) → Φ(z)   as n → ∞,

where Φ is the standard normal distribution function with mean 0 and variance 1.

• Often, we don't know the population variance σ².

• It can be shown that the CLT still applies if we replace σ² by the sample variance S²(n):

t_n = (X̄(n) − µ) / √(S²(n) / n).

• The variable t_n is approximately standard normal as n increases.

Standard Normal distribution

• The standard normal distribution is N(0, 1).

• The cumulative distribution function (CDF) at any given value z can be found using standard statistical tables.

• Conversely, if we know the probability, we can compute the corresponding value z₁ such that

F(z₁) = Pr(Z ≤ z₁) = 1 − α/2.

• This value z₁ = z_{1−α/2} is called the critical point for N(0, 1).

• Similarly, the other critical point, z₂ = −z_{1−α/2}, is such that

F(z₂) = Pr(Z ≤ z₂) = α/2.

Interval Estimation

• It follows that for large n:

Pr(−z_{1−α/2} ≤ Z_n ≤ z_{1−α/2})
  = Pr(−z_{1−α/2} ≤ (X̄(n) − µ) / √(S²(n)/n) ≤ z_{1−α/2})
  = Pr(X̄(n) − z_{1−α/2} √(S²(n)/n) ≤ µ ≤ X̄(n) + z_{1−α/2} √(S²(n)/n))
  ≈ 1 − α.

Interval Estimation

• Therefore, if n is sufficiently large, an approximate 100(1 − α) percent confidence interval for µ is given by:

X̄(n) ± z_{1−α/2} √(S²(n) / n).

• If we construct a large number of independent 100(1 − α) percent confidence intervals, each based on n different observations (n sufficiently large), the proportion of these confidence intervals that contain µ should be 1 − α.

Interval Estimation

• What if n is not "sufficiently large"?

• If the Xᵢ's are normal random variables, the random variable t_n has a t-distribution with n − 1 degrees of freedom.

• In this case, the 100(1 − α) percent confidence interval for µ is given by:

X̄(n) ± t_{n−1, 1−α/2} √(S²(n) / n).
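A sketch that computes both forms of the 100(1 − α) percent confidence interval for µ on arbitrary data: the large-sample (z) interval and the small-sample (t) interval. The normal critical point comes from the Python standard library; the t critical point uses scipy.stats, which is assumed to be available:

```python
import math
import statistics
import numpy as np
from scipy import stats   # assumed available for the t critical point

rng = np.random.default_rng(6)
x = rng.normal(loc=5.0, scale=1.5, size=25)     # hypothetical observations
n, alpha = len(x), 0.05

xbar = x.mean()
s2 = x.var(ddof=1)                               # unbiased S^2(n)
scale = math.sqrt(s2 / n)                        # sqrt(S^2(n)/n)

z = statistics.NormalDist().inv_cdf(1 - alpha / 2)   # z_{1-alpha/2}
t = stats.t.ppf(1 - alpha / 2, df=n - 1)             # t_{n-1, 1-alpha/2}

print("z interval:", xbar - z * scale, xbar + z * scale)
print("t interval:", xbar - t * scale, xbar + t * scale)   # slightly wider
```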

Interval Estimation

• In practice, the distribution of the Xᵢ's is rarely normal, and the confidence interval (with the t-distribution) will be approximate.

• Since t_{n−1, 1−α/2} > z_{1−α/2}, the CI with "t" is larger than the one with "z".

• Hence, it is recommended that we use the CI with "t". Why? The wider interval is more conservative, so its actual coverage is closer to the desired 1 − α.

• However, t_{n−1, 1−α/2} → z_{1−α/2} as n → ∞.

Hypotheses testing

• Assume that X₁, X₂, X₃, …, X_n are normally distributed (or approximately normal) and that we would like to test whether µ = µ₀, where µ₀ is a fixed, hypothesized value of µ.

• If |X̄(n) − µ₀| is large, then our hypothesis is probably not true.

• To conduct such a test (of whether the hypothesis is true or not), we need a statistic whose distribution is known when the hypothesis is true.

• It turns out that if our hypothesis is true (µ = µ₀), then the statistic t_n has a t-distribution with n − 1 degrees of freedom.

Hypothesis testing

• We form our two-tailed hypothesis (H₀) to test for µ = µ₀ as:

If |t_n| > t_{n−1, 1−α/2}, reject H₀; otherwise, "accept" H₀.

• The portion of the real line that corresponds to rejection of H₀ is called the critical region for the test.

• The probability that the statistic t_n falls in the critical region given that H₀ is true, which is clearly equal to α, is called the level of the test.

• Typically, if t_n does not fall in the rejection region, we "do not reject" H₀.
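A sketch of the two-tailed test on arbitrary data: compute t_n for a hypothesized µ₀ and compare |t_n| with the critical point t_{n−1, 1−α/2} (scipy.stats assumed available for the quantile):

```python
import math
import numpy as np
from scipy import stats   # assumed available for the t critical point

rng = np.random.default_rng(7)
x = rng.normal(loc=10.4, scale=2.0, size=30)    # hypothetical sample
mu0, alpha = 10.0, 0.05                         # hypothesized mean and test level
n = len(x)

t_n = (x.mean() - mu0) / math.sqrt(x.var(ddof=1) / n)
crit = stats.t.ppf(1 - alpha / 2, df=n - 1)     # t_{n-1, 1-alpha/2}

if abs(t_n) > crit:
    print(f"|t_n| = {abs(t_n):.3f} > {crit:.3f}: reject H0")
else:
    print(f"|t_n| = {abs(t_n):.3f} <= {crit:.3f}: do not reject H0")
```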

Hypothesis testing

• Type I error: rejecting H₀ when it is true. Its probability is again equal to α. This error is under the experimenter's control.

• Type II error: accepting H₀ when it is false. Its probability is denoted by β.

• We call δ = 1 − β the power of the test; it is the probability of rejecting H₀ when it is false.

• For a fixed α, the power of the test can only be increased by increasing n.