Econ 3780: Business and Economics Statistics


Econ 3790: Business and
Economics Statistics
Instructor: Yogesh Uppal
Email: [email protected]
Lecture Slides 5


Random Variables
Probability Distributions
Discrete Distributions
    Discrete Uniform Probability Distribution
    Binomial Probability Distribution
Continuous Distribution
    Normal Distribution
Random Variables
A random variable is a numerical description of the
outcome of an experiment.
A discrete random variable may assume either a
finite number of values or an infinite sequence of
values.
A continuous random variable may assume any
numerical value in an interval or collection of
intervals.
Example: JSL Appliances

Discrete random variable with a finite
number of values
Let x = number of TVs sold at the store in one day,
where x can take on 5 values (0, 1, 2, 3, 4)
Example: JSL Appliances

Discrete random variable with an infinite sequence
of values
Let x = number of customers arriving in one day,
where x can take on the values 0, 1, 2, . . .
We can count the customers arriving, but there is no
finite upper limit on the number that might arrive.
Random Variables
Question                      Random Variable x                                    Type
Family size                   x = Number of dependents reported on tax return      Discrete
Distance from home to store   x = Distance in miles from home to the store site    Continuous
Own dog or cat                x = 1 if own no pet;                                 Discrete
                                = 2 if own dog(s) only;
                                = 3 if own cat(s) only;
                                = 4 if own dog(s) and cat(s)
Discrete Probability Distributions
The probability distribution for a random variable
describes how probabilities are distributed over
the values of the random variable.
We can describe a discrete probability distribution
with a table, graph, or equation.
Discrete Probability Distributions
The probability distribution is defined by a
probability function, denoted by p(x), which provides
the probability for each value of the random variable.
The required conditions for a discrete probability
function are:
$p(x) \ge 0$
$\sum p(x) = 1$
Discrete Probability Distributions


Using past data on TV sales, a tabular representation of the probability
distribution for TV sales was developed.

Units Sold (x)    Number of Days    p(x)
0                 80                .40    (= 80/200)
1                 50                .25
2                 40                .20
3                 10                .05
4                 20                .10
Total             200               1.00
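As a quick check, the table above can be rebuilt from the day counts. The following is a minimal Python sketch (the variable names are my own, not part of the slides) that turns the counts into p(x) and checks the two required conditions for a discrete probability function:

```python
# Number of days on which x TVs were sold (counts from the table above)
days = {0: 80, 1: 50, 2: 40, 3: 10, 4: 20}

total_days = sum(days.values())  # 200 days in total

# Relative frequency of each value of x, used as p(x)
pmf = {x: count / total_days for x, count in days.items()}

# Required conditions: every p(x) is non-negative and the p(x) sum to 1
all_nonnegative = all(p >= 0 for p in pmf.values())
sums_to_one = abs(sum(pmf.values()) - 1.0) < 1e-9

for x, p in pmf.items():
    print(x, p)
print("valid probability function:", all_nonnegative and sums_to_one)
```

Using a small tolerance instead of an exact comparison with 1 avoids false alarms from floating-point rounding.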
Discrete Probability Distributions
Graphical Representation of Probability Distribution
[Figure: bar chart of the probability distribution. Horizontal axis: Values of Random Variable x (TV sales), 0 to 4. Vertical axis: Probability, .10 to .50.]
Expected Value and Variance
The expected value, or mean, of a random variable
is a measure of its central location.
$E(x) = \mu = \sum x \, p(x)$
The variance summarizes the variability in the
values of a random variable.
$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 \, p(x)$
The standard deviation, $\sigma$, is defined as the positive
square root of the variance.
Expected Value
x     p(x)    x*p(x)
0     .40     .00
1     .25     .25
2     .20     .40
3     .05     .15
4     .10     .40
E(x) = 1.20 (expected number of TVs sold in a day)
Variance and Standard Deviation
x     $x - \mu$    $(x - \mu)^2$    p(x)    $p(x)(x - \mu)^2$
0     -1.2         1.44             .40     .576
1     -0.2         0.04             .25     .010
2     0.8          0.64             .20     .128
3     1.8          3.24             .05     .162
4     2.8          7.84             .10     .784
Variance of daily sales = $\sigma^2$ = 1.660
Standard deviation of daily sales = $\sigma$ = 1.2884 TVs
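The two tables above can be reproduced in a few lines. This is only a sketch in Python (the names `pmf`, `mean`, `variance`, and `std_dev` are mine), applying the expected value and variance formulas directly to the TV sales distribution:

```python
import math

# Probability distribution for daily TV sales (from the JSL Appliances example)
pmf = {0: 0.40, 1: 0.25, 2: 0.20, 3: 0.05, 4: 0.10}

# Expected value: sum of x * p(x)
mean = sum(x * p for x, p in pmf.items())

# Variance: sum of p(x) * (x - mean)^2
variance = sum(p * (x - mean) ** 2 for x, p in pmf.items())

# Standard deviation: positive square root of the variance
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 3), round(std_dev, 4))
```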
Types of Discrete Probability
Distributions:


Uniform
Binomial
Discrete Uniform Probability Distribution
The discrete uniform probability distribution is the
simplest example of a discrete probability
distribution given by a formula.
The discrete uniform probability function is
$p(x) = 1/n$
where:
n = the number of values the random
variable may assume
The values of the random variable are equally likely.
Discrete Uniform Probability Distribution

Suppose that, instead of looking at past TV sales,
I assume that TV sales have a uniform probability
distribution. The example above would then change
as follows:
Expected Value
x     p(x)    x*p(x)
0     .2      .00
1     .2      .20
2     .2      .40
3     .2      .60
4     .2      .80
E(x) = 2.0 (expected number of TVs sold in a day)
Variance and Standard Deviation
x     $x - \mu$    $(x - \mu)^2$    p(x)    $p(x)(x - \mu)^2$
0     -2.0         4.0              .2      0.8
1     -1.0         1.0              .2      0.2
2     0.0          0.0              .2      0.0
3     1.0          1.0              .2      0.2
4     2.0          4.0              .2      0.8
Variance of daily sales = $\sigma^2$ = 2.0
Standard deviation of daily sales = $\sigma$ = 1.41 TVs
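The uniform version of the example can be checked the same way. A minimal Python sketch, assuming the five equally likely values 0 to 4 (variable names are my own):

```python
import math

# The five values TV sales can take, each assumed equally likely
values = [0, 1, 2, 3, 4]
n = len(values)
p = 1 / n  # discrete uniform probability function: p(x) = 1/n

mean = sum(x * p for x in values)
variance = sum(p * (x - mean) ** 2 for x in values)
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```

Compared with the distribution built from past sales, the uniform assumption moves the mean from 1.2 to 2.0 because it puts more probability on the larger values.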
Example: I am bored


Imagine this situation. There is a heavy
snowstorm. Everything is shut down. You and
everybody in your family have to stay home. You
are utterly bored. You catch hold of your sibling
and get him or her to play this game.
The game is to bet on the toss of a coin.
Example: I am bored




If heads turns up exactly once in three tosses, you win;
otherwise you lose.
Let's call the event of getting heads on any one trial a
success. Similarly, the event of getting tails is a failure.
Suppose the probability of getting heads (a success)
is 0.6.
The big question: what is the probability of getting
exactly 1 head in three tosses?
Tree Diagram
Outcomes (Trial 1, Trial 2, Trial 3) and their probabilities:
HHH = $(0.6)^3 (0.4)^0 = 0.216$
HHT = $(0.6)^2 (0.4)^1 = 0.144$
HTH = $(0.6)^2 (0.4)^1 = 0.144$
HTT = $(0.6)^1 (0.4)^2 = 0.096$
THH = $(0.6)^2 (0.4)^1 = 0.144$
THT = $(0.6)^1 (0.4)^2 = 0.096$
TTH = $(0.6)^1 (0.4)^2 = 0.096$
TTT = $(0.6)^0 (0.4)^3 = 0.064$
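The tree can also be enumerated by brute force. The sketch below is one way to do it in Python (the names `outcome_probs` and `heads_dist` are my own); it lists all eight outcomes, multiplies the probabilities along each path, and then groups the outcomes by number of heads:

```python
from itertools import product

p_heads = 0.6          # probability of a success (heads) on any one toss
p_tails = 1 - p_heads  # probability of a failure (tails)

# Probability of each of the 2^3 = 8 outcomes, e.g. "HHT"
outcome_probs = {}
for tosses in product("HT", repeat=3):
    prob = 1.0
    for toss in tosses:
        prob *= p_heads if toss == "H" else p_tails
    outcome_probs["".join(tosses)] = prob

# Distribution of the random variable x = number of heads in three tosses
heads_dist = {}
for outcome, prob in outcome_probs.items():
    k = outcome.count("H")
    heads_dist[k] = heads_dist.get(k, 0.0) + prob

for k in sorted(heads_dist):
    print(k, round(heads_dist[k], 3))
```

Multiplying along a path relies on the tosses being independent.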
So, what is the probability of you
winning the game?



What is the random variable here?
What is the probability of getting 2 heads in
three tosses?
= P(HHT) + P(HTH) + P(THH)
= 0.144 + 0.144 +0.144
= 0.432
Or 43.2%
How does the probability distribution of our
Random variable look?
Binomial Distribution
1. The experiment consists of a sequence of n
identical trials.
2. Two outcomes, success and failure, are possible
on each trial.
3. The probability of a success, denoted by p, does
not change from trial to trial.
4. The trials are independent.
Binomial Distribution
Our interest is in the number of successes
occurring in the n trials.
We let x denote the number of successes
occurring in the n trials.
Binomial Distribution is highly useful
when the number of trials is large.
Binomial Distribution

Binomial Probability Function
$p(x) = (\text{\# of ways}) \cdot p^x (1 - p)^{n - x}$
where:
n = the number of trials
p = the probability of success on any one trial
Counting Rule for Combinations

Another useful counting rule (esp. when n is large)
enables us to count the number of experimental
outcomes when x objects are to be selected from a
set of n objects.
•Number of Combinations of n Objects Taken x at a Time
$C^n_x = \frac{n!}{x!(n - x)!}$
where:
$n! = n(n - 1)(n - 2) \cdots (2)(1)$
$x! = x(x - 1)(x - 2) \cdots (2)(1)$
$0! = 1$
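To see the counting rule in action, here is a small Python sketch. The helper name `combinations` is my own; it applies the factorial formula directly and is compared against the standard library's `math.comb` (available in Python 3.8 and later):

```python
import math

def combinations(n, x):
    """Number of ways to choose x objects from n, via n! / (x! (n - x)!)."""
    return math.factorial(n) // (math.factorial(x) * math.factorial(n - x))

# Ways to get x heads in 3 tosses, for x = 0, 1, 2, 3
for x in range(4):
    print(x, combinations(3, x), math.comb(3, x))
```

For the coin game, there are 3 ways to get exactly 1 head in 3 tosses, which matches the three paths HTT, THT, and TTH in the tree diagram.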
Example: I am bored

Using the binomial distribution, the probability
of 1 head in 3 tosses is
$P(1 \text{ head}) = 3 \cdot (0.6)^1 (1 - 0.6)^{3-1}$
$= 3 \cdot (0.6)^1 (0.4)^2$
$= 0.288$
Example: I am bored

Suppose you won. But, knowing your
sibling, she or he now says that the bet was on
getting exactly 2 heads in 3 tosses. Since you are
bored, you have no choice but to keep playing:
$P(2 \text{ heads}) = 3 \cdot (0.6)^2 (1 - 0.6)^{3-2}$
$= 3 \cdot (0.6)^2 (0.4)^1$
$= 0.432$
Example: I am bored

She cheats again. She says that the bet was on
getting at least 2 heads in 3 tosses.

What does this mean? Getting 2 or more
heads: P(2 heads) + P(3 heads)
Example: I am bored
$P(2 \text{ heads}) = 3 \cdot (0.6)^2 (1 - 0.6)^{3-2}$
$= 3 \cdot (0.6)^2 (0.4)^1$
$= 0.432$
$P(3 \text{ heads}) = 1 \cdot (0.6)^3 (1 - 0.6)^{3-3}$
$= 1 \cdot (0.6)^3 (0.4)^0$
$= 0.216$
$P(2 \text{ heads}) + P(3 \text{ heads}) = 0.432 + 0.216 = 0.648$
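The three versions of the bet can be computed with one small function. This is a sketch in Python, with `binomial_pmf` being a helper name of my own that multiplies the number of ways by the probability of each way:

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 3, 0.6  # three tosses, probability of heads 0.6

p_exactly_1 = binomial_pmf(1, n, p)
p_exactly_2 = binomial_pmf(2, n, p)
# "At least 2 heads" means 2 heads or 3 heads
p_at_least_2 = sum(binomial_pmf(k, n, p) for k in range(2, n + 1))

print(round(p_exactly_1, 3), round(p_exactly_2, 3), round(p_at_least_2, 3))
```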
Binomial Distribution
Expected Value
$E(x) = \mu = np$
Variance
$\mathrm{Var}(x) = \sigma^2 = np(1 - p)$
Standard Deviation
$\sigma = \sqrt{np(1 - p)}$
Example: I am bored

Mean (or expected value)
$E(x) = \mu = np = 3(0.6) = 1.8$

Variance:
$\mathrm{Var}(x) = \sigma^2 = np(1 - p)$
$= 3(0.6)(1 - 0.6) = 0.72$

Standard Deviation
$\sigma = \sqrt{\mathrm{Var}(x)} = \sqrt{0.72} \approx 0.85$
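It may help to confirm that the shortcut formulas agree with the general definitions of expected value and variance. The following Python sketch (variable names are my own) computes both for the coin game with n = 3 and p = 0.6:

```python
from math import comb, sqrt

n, p = 3, 0.6

# Full binomial distribution for x = 0, 1, 2, 3 heads
pmf = {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

# General definitions: sum of x * p(x), and sum of p(x) * (x - mean)^2
mean_from_pmf = sum(x * px for x, px in pmf.items())
var_from_pmf = sum(px * (x - mean_from_pmf) ** 2 for x, px in pmf.items())

# Binomial shortcuts: n*p and n*p*(1 - p)
mean_shortcut = n * p
var_shortcut = n * p * (1 - p)
std_dev = sqrt(var_shortcut)

print(round(mean_from_pmf, 4), round(mean_shortcut, 4))
print(round(var_from_pmf, 4), round(var_shortcut, 4))
print(round(std_dev, 4))
```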
Chapter 6
Continuous Probability Distributions

Normal Probability Distribution
[Figure: normal curve, with p(x) on the vertical axis and x on the horizontal axis]
Normal Probability Distribution


The normal probability distribution is the
most important distribution for describing a
continuous random variable.
It is widely used in statistical inference.
Normal Probability Distribution

It has been used in a wide variety of applications:
Heights of people
Scientific measurements
Test scores
Amounts of rainfall
Normal Distributions

The probability of the random variable assuming a
value within some given interval from x1 to x2 is
defined to be the area under the curve between x1
and x2.
[Figure: normal curve f(x) over x, with x1 and x2 marked on the horizontal axis]
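To make the area idea concrete, here is a minimal Python sketch using `statistics.NormalDist` from the standard library (Python 3.8 and later). The mean of 0, standard deviation of 1, and the interval from -1 to 1 are illustrative values I picked, not numbers from the slides:

```python
from statistics import NormalDist

# Illustrative normal distribution (assumed values, not from the slides)
dist = NormalDist(mu=0.0, sigma=1.0)

x1, x2 = -1.0, 1.0

# Area under the curve between x1 and x2:
# (area to the left of x2) minus (area to the left of x1)
area = dist.cdf(x2) - dist.cdf(x1)

print(round(area, 4))
```

The subtraction works because `cdf(x)` gives the area under the curve to the left of x.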
Normal Probability Distribution

Characteristics
The distribution is symmetric; its skewness
measure is zero.
Normal Probability Distribution

Characteristics
The highest point on the normal curve is at the
mean, which is also the median and mode.
[Figure: normal curve with its highest point at Mean = $\mu$]
Normal Probability Distribution

Characteristics
The entire family of normal probability
distributions is defined by its mean $\mu$ and its
standard deviation $\sigma$.
[Figure: normal curve labeled with Mean $\mu$ and Standard Deviation $\sigma$]
Normal Probability Distribution

Characteristics
The mean can be any numerical value: negative,
zero, or positive. The following shows different normal
distributions with different means.
[Figure: normal curves with means -10, 0, and 20]
Normal Probability Distribution

Characteristics
The standard deviation determines the width of the
curve: larger values result in wider, flatter curves.
[Figure: two normal curves with the same mean, one with $\sigma = 15$ and one with $\sigma = 25$]
Normal Probability Distribution

Characteristics
Probabilities for the normal random variable are
given by areas under the curve. The total area
under the curve is 1 (.5 to the left of the mean and
.5 to the right).
[Figure: normal curve with area .5 on each side of the Mean $\mu$]
Standardizing the Normal Values or the
z-scores

Z-scores can be calculated as follows:
$z = \frac{x - \mu}{\sigma}$

•We can think of z as a measure of the number of
standard deviations x is from $\mu$.
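As a small illustration, the z-score calculation can be sketched in Python. The function name `z_score` and the numbers used (a mean of 70, a standard deviation of 10, and a value of 85) are hypothetical and chosen only for the example:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Hypothetical values, for illustration only
z_above = z_score(85, mean=70, std_dev=10)  # a value above the mean
z_below = z_score(55, mean=70, std_dev=10)  # a value below the mean

print(z_above, z_below)
```

A positive z means x is above the mean, and a negative z means it is below.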
Standard Normal Probability Distribution
A standard normal distribution is a normal distribution with a mean of
0 and a variance of 1. If x has a normal distribution with mean $\mu$ and
variance $\sigma^2$, then $z = (x - \mu)/\sigma$ is said to have a standard normal distribution.
[Figure: standard normal curve over z, centered at 0 with $\sigma = 1$]
Example: Air Quality



I collected this data on the air quality of various
cities as measured by particulate matter index
(PMI). A PMI of less than 50 is said to
represent good air quality.
The data is available on the class website.
Suppose the distribution of PMI is
approximately normal.
Example: Air Quality


The mean PMI is 41 and the standard
deviation is 20.5.
Suppose I want to find the probability that
air quality is good (PMI less than 50), or the
probability that PMI is greater than 50.
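One way to work this out is to standardize 50 and look up the area under the standard normal curve. The Python sketch below assumes, as stated above, that PMI is approximately normal with mean 41 and standard deviation 20.5. The helper `standard_normal_cdf` is my own name; it computes the area to the left of z using `math.erf` in place of a printed z-table:

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean_pmi = 41.0
sd_pmi = 20.5

# z-score of the cutoff PMI = 50
z = (50 - mean_pmi) / sd_pmi

p_below_50 = standard_normal_cdf(z)  # P(PMI < 50), i.e. good air quality
p_above_50 = 1.0 - p_below_50        # P(PMI > 50)

print(round(z, 2), round(p_below_50, 4), round(p_above_50, 4))
```

Because the total area under the curve is 1, the two probabilities must add up to 1.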