Transcript Document

Section 2.3
Suppose X is a discrete-type random variable with outcome space S
and p.m.f. f(x).
The mean of X is
$$\mu = E(X) = \sum_{x \in S} x f(x) .$$
The variance of X is
$$\begin{aligned}
\sigma^2 = \mathrm{Var}(X) &= \sum_{x \in S} (x - \mu)^2 f(x) = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2] \\
&= E(X^2) - 2\mu E(X) + E(\mu^2) = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - \mu^2 .
\end{aligned}$$
The standard deviation of X is $\sigma = \sqrt{\mathrm{Var}(X)}$ .
Suppose $x_1, x_2, \ldots, x_n$ represent a collection of n observed values of a
random variable X. Such a collection of observed values is called a
sample.
1. The random variable X in Class Exercise #4 of Section 1.1 was
found to have p.m.f.
$$f(x) = \begin{cases} 1/9 & \text{if } x = 0 \\ 3/9 = 1/3 & \text{if } x = 1 \\ 5/9 & \text{if } x = 2 \end{cases}$$
Find the mean, variance, and standard deviation of X.
$\mu = E(X) = (0)(1/9) + (1)(3/9) + (2)(5/9) = 13/9$
To find the variance, we first find
$E(X^2) = (0)^2(1/9) + (1)^2(3/9) + (2)^2(5/9) = 23/9$
$\sigma^2 = \mathrm{Var}(X) = E(X^2) - \mu^2 = 23/9 - (13/9)^2 = 23/9 - 169/81 = 38/81$
$\sigma = \sqrt{38}/9$
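The arithmetic in Exercise 1 can be checked with a few lines of Python. This is only a sketch: the helper name `pmf_summary` is made up for illustration, and exact fractions are used so that the results can be compared directly with 13/9 and 38/81.

```python
from fractions import Fraction
from math import sqrt

def pmf_summary(pmf):
    """Return (mean, variance, standard deviation) of a finite p.m.f.

    `pmf` maps each outcome x to its probability f(x).
    (Hypothetical helper, not part of the course materials.)
    """
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    mean = sum(x * p for x, p in pmf.items())
    second_moment = sum(x * x * p for x, p in pmf.items())
    variance = second_moment - mean ** 2      # E(X^2) - mu^2
    return mean, variance, sqrt(variance)

# p.m.f. from Exercise 1
f = {0: Fraction(1, 9), 1: Fraction(3, 9), 2: Fraction(5, 9)}
mu, var, sd = pmf_summary(f)
print(mu, var)   # 13/9 38/81
```

The same helper applies to any p.m.f. with finitely many outcomes, such as the one in Exercise 2.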
2. The random variable X in Class Exercise #1 of Section 2.2 was
found to have p.m.f.
$$f(x) = \begin{cases} 3/15 = 1/5 & \text{if } x = 2 \\ 6/15 = 2/5 & \text{if } x = 3 \\ 4/15 & \text{if } x = 4 \\ 2/15 & \text{if } x = 5 \end{cases}$$
Find the mean, variance, and standard deviation of X.
$$\mu = E(X) = (2)\frac{3}{15} + (3)\frac{6}{15} + (4)\frac{4}{15} + (5)\frac{2}{15} = \frac{50}{15} = \frac{10}{3}$$
This calculation was previously done in Class Exercise #1 of
Section 2.2. To find the variance, we first find
$$E(X^2) = (2^2)\frac{3}{15} + (3^2)\frac{6}{15} + (4^2)\frac{4}{15} + (5^2)\frac{2}{15} = \frac{180}{15} = 12$$
We now have
$$\sigma^2 = \mathrm{Var}(X) = E(X^2) - \mu^2 = 12 - (10/3)^2 = 8/9 , \qquad \sigma = \frac{2\sqrt{2}}{3}$$
3. Suppose k is a positive integer, and the random variable X has p.m.f.
f(x) = 1/k if x = 1, 2, …, k .
Find the mean and variance of X.
$$\mu_X = E(X) = (1)(1/k) + (2)(1/k) + (3)(1/k) + \cdots + (k)(1/k) = (1/k)(1 + 2 + 3 + \cdots + k) = \frac{1}{k} \cdot \frac{k(k+1)}{2} = \frac{k+1}{2}$$
$$\begin{aligned}
\sigma_X^2 = \mathrm{Var}(X) &= E(X^2) - [E(X)]^2 \\
&= (1)^2(1/k) + (2)^2(1/k) + (3)^2(1/k) + \cdots + (k)^2(1/k) - [(k+1)/2]^2 \\
&= (1/k)(1^2 + 2^2 + 3^2 + \cdots + k^2) - (k+1)^2/4 \\
&= \frac{(k+1)(2k+1)}{6} - \frac{(k+1)^2}{4} = \frac{k^2 - 1}{12}
\end{aligned}$$
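As a quick sanity check on the closed forms in Exercise 3, the mean and variance can be computed by direct summation for a few values of k and compared with (k + 1)/2 and (k² – 1)/12. The function name `uniform_mean_var` is an illustrative choice, not something from the course materials.

```python
from fractions import Fraction

def uniform_mean_var(k):
    """Mean and variance of X with f(x) = 1/k on {1, ..., k}, by direct summation."""
    p = Fraction(1, k)
    mean = sum(x * p for x in range(1, k + 1))
    var = sum(x * x * p for x in range(1, k + 1)) - mean ** 2
    return mean, var

# Compare the sums with the closed forms for several k.
for k in (1, 2, 6, 10):
    mean, var = uniform_mean_var(k)
    assert mean == Fraction(k + 1, 2)
    assert var == Fraction(k * k - 1, 12)
```

A check like this does not replace the algebra, but it catches slips such as a wrong constant in the sum-of-squares formula.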
Return to Section 2.1
Suppose a set contains N = N1 + N2 items, where N1 items are of one
type, N2 items are of another type, and n items are selected from the N
items at random and without replacement. If the random variable X is
defined to be the number of the n selected items that come from the N1 items,
then X is said to have a hypergeometric distribution.
The p.m.f. of X is
$$f(x) = \frac{\binom{N_1}{x}\binom{N_2}{n-x}}{\binom{N}{n}}$$
if x is a nonnegative integer such that $x \le \min\{n, N_1\}$ and $n - x \le N_2$ .
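The mean of X is
$$\begin{aligned}
E(X) &= \sum_x x \, \frac{\binom{N_1}{x}\binom{N_2}{n-x}}{\binom{N}{n}}
= \sum_x x \, \frac{\frac{N_1}{x}\binom{N_1-1}{x-1}\binom{N_2}{n-1-(x-1)}}{\frac{N}{n}\binom{N-1}{n-1}} \\
&= n\frac{N_1}{N} \sum_x \frac{\binom{N_1-1}{x-1}\binom{N_2}{n-1-(x-1)}}{\binom{N-1}{n-1}}
= n\frac{N_1}{N} .
\end{aligned}$$
The x = 0 term contributes nothing, and the remaining sum is the total probability of a hypergeometric distribution with N1 – 1, N2, and n – 1 in place of N1, N2, and n, so it equals 1.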
To find the variance of X (using Theorem 2.2-1), find
$$\begin{aligned}
E[X(X-1)] &= \sum_x x(x-1) \, \frac{\binom{N_1}{x}\binom{N_2}{n-x}}{\binom{N}{n}}
= \sum_x x(x-1) \, \frac{\frac{N_1}{x}\frac{N_1-1}{x-1}\binom{N_1-2}{x-2}\binom{N_2}{n-2-(x-2)}}{\frac{N}{n}\frac{N-1}{n-1}\binom{N-2}{n-2}} \\
&= n(n-1)\frac{N_1(N_1-1)}{N(N-1)} \sum_x \frac{\binom{N_1-2}{x-2}\binom{N_2}{n-2-(x-2)}}{\binom{N-2}{n-2}}
= n(n-1)\frac{N_1(N_1-1)}{N(N-1)} .
\end{aligned}$$
Since
$$E[X(X-1)] = E(X^2) - E(X) = n(n-1)\frac{N_1(N_1-1)}{N(N-1)} ,$$
then
$$E(X^2) = n(n-1)\frac{N_1(N_1-1)}{N(N-1)} + n\frac{N_1}{N} ,$$
and
$$\begin{aligned}
\mathrm{Var}(X) &= E(X^2) - [E(X)]^2
= n(n-1)\frac{N_1(N_1-1)}{N(N-1)} + n\frac{N_1}{N} - \left(n\frac{N_1}{N}\right)^2 \\
&= n\frac{N_1}{N}\left[(n-1)\frac{N_1-1}{N-1} + 1 - n\frac{N_1}{N}\right] \\
&= n\frac{N_1}{N} \cdot \frac{(n-1)(N_1-1)N + N(N-1) - nN_1(N-1)}{N(N-1)} \\
&= n\frac{N_1}{N} \cdot \frac{N^2 - N_1N - nN + nN_1}{N(N-1)} \\
&= n\frac{N_1}{N} \cdot \frac{NN_2 - nN_2}{N(N-1)}
= n\frac{N_1}{N} \cdot \frac{N_2}{N} \cdot \frac{N-n}{N-1} .
\end{aligned}$$
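The closed forms for the hypergeometric mean and variance can be compared against direct summation over the p.m.f. The sketch below assumes Python 3.8 or later (for `math.comb`); the function names and the parameter triples are arbitrary illustrations rather than values from the notes.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N1, N2, n):
    """P(X = x) when n items are drawn without replacement from N1 + N2 items."""
    return Fraction(comb(N1, x) * comb(N2, n - x), comb(N1 + N2, n))

def moments_by_summation(N1, N2, n):
    """Mean and variance obtained by summing over the support of X."""
    lo, hi = max(0, n - N2), min(n, N1)   # x <= min{n, N1} and n - x <= N2
    support = range(lo, hi + 1)
    mean = sum(x * hypergeom_pmf(x, N1, N2, n) for x in support)
    ex2 = sum(x * x * hypergeom_pmf(x, N1, N2, n) for x in support)
    return mean, ex2 - mean ** 2

def moments_by_formula(N1, N2, n):
    """Mean n(N1/N) and variance n(N1/N)(N2/N)(N - n)/(N - 1); needs N >= 2."""
    N = N1 + N2
    mean = Fraction(n * N1, N)
    var = mean * Fraction(N2, N) * Fraction(N - n, N - 1)
    return mean, var

# Arbitrary test cases chosen for illustration.
for N1, N2, n in [(5, 7, 4), (3, 2, 4), (6, 6, 6)]:
    assert moments_by_summation(N1, N2, n) == moments_by_formula(N1, N2, n)
```

Exact fractions are used so that the two computations must agree exactly, not just to within rounding.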
Return to Class Exercise #3 in Section 2.1
3. An urn contains 4 clear marbles and 10 colored marbles.
(c) Consider the random variable Q = “the number of clear marbles
when 3 marbles are selected at random without replacement” with
p.m.f. f(q). Find f(q), E(Q), and Var(Q).
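$$f(q) = \frac{\binom{4}{q}\binom{10}{3-q}}{\binom{14}{3}} \quad \text{if } q = 0, 1, 2, 3$$
$$E(Q) = \mu_Q = (3)\frac{4}{14} = \frac{6}{7}$$
$$\mathrm{Var}(Q) = \sigma_Q^2 = (3) \cdot \frac{4}{14} \cdot \frac{10}{14} \cdot \frac{14-3}{14-1} = \frac{330}{637}$$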
(d) Consider the random variable V = “the number of clear marbles
when 7 marbles are selected at random without replacement” with
p.m.f. g(v). Find g(v), E(V), and Var(V).
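$$g(v) = \frac{\binom{4}{v}\binom{10}{7-v}}{\binom{14}{7}} \quad \text{if } v = 0, 1, 2, 3, 4$$
$$E(V) = \mu_V = (7)\frac{4}{14} = 2$$
$$\mathrm{Var}(V) = \sigma_V^2 = (7) \cdot \frac{4}{14} \cdot \frac{10}{14} \cdot \frac{14-7}{14-1} = \frac{10}{13}$$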
(e) Consider the random variable W = “the number of colored marbles
when 7 marbles are selected at random without replacement” with
p.m.f. h(w). Find h(w), E(W), and Var(W). (Note that V + W = 7.)
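$$h(w) = \frac{\binom{10}{w}\binom{4}{7-w}}{\binom{14}{7}} \quad \text{if } w = 3, 4, 5, 6, 7$$
$$E(W) = \mu_W = (7)\frac{10}{14} = 5$$
$$\mathrm{Var}(W) = \sigma_W^2 = (7) \cdot \frac{10}{14} \cdot \frac{4}{14} \cdot \frac{14-7}{14-1} = \frac{10}{13}$$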
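The note that V + W = 7 explains why V and W have the same variance: W = 7 – V, so the p.m.f. of W is the p.m.f. of V read in reverse. A small sketch, with the hypothetical helper `draw_pmf` building the p.m.f. as a table, makes this concrete for the urn with 4 clear and 10 colored marbles.

```python
from fractions import Fraction
from math import comb

def draw_pmf(n_type, n_other, n_draw):
    """p.m.f. of the number of 'type' marbles in n_draw draws without replacement."""
    total = comb(n_type + n_other, n_draw)
    lo, hi = max(0, n_draw - n_other), min(n_draw, n_type)
    return {x: Fraction(comb(n_type, x) * comb(n_other, n_draw - x), total)
            for x in range(lo, hi + 1)}

g = draw_pmf(4, 10, 7)    # V: clear marbles among the 7 drawn
h = draw_pmf(10, 4, 7)    # W: colored marbles among the 7 drawn

# Because W = 7 - V, P(W = 7 - v) equals P(V = v) for every v in the space of V.
assert all(h[7 - v] == p for v, p in g.items())
```

The keys of `g` and `h` also reproduce the spaces stated in parts (d) and (e): 0 through 4 for V and 3 through 7 for W.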
Return to Section 2.3
4. Let X be a random variable with the space {u1 , u2 , ..., uk} and
p.m.f. f(x), that is, P(X = ui) = f(ui) for i = 1, 2, ..., k. Define the
random variable Y = aX + b, where a and b are constants.
(a) Find the space of the random variable Y.
outcome space for Y = {au1 + b, au2 + b, …, auk + b}
(b) If $\mu_X$ is the mean for the random variable X, use the p.m.f. of Y to
prove that the mean for the random variable Y is $\mu_Y = a\mu_X + b$ .
The p.m.f. for Y is $g(au_i + b) = f(u_i)$ for i = 1, 2, ..., k.
$$\begin{aligned}
\mu_Y &= \sum_{i=1}^{k} [(au_i + b) \, g(au_i + b)] = \sum_{i=1}^{k} [(au_i + b) \, f(u_i)] \\
&= \sum_{i=1}^{k} au_i f(u_i) + \sum_{i=1}^{k} b f(u_i) = a \sum_{i=1}^{k} u_i f(u_i) + b \sum_{i=1}^{k} f(u_i) = a\mu_X + b
\end{aligned}$$
(c) If $\mu_X$ is the mean for the random variable X, use Theorem 2.2-1 to
prove that the mean for the random variable Y is $\mu_Y = a\mu_X + b$ .
$\mu_Y = E(Y) = E(aX + b) = aE(X) + b = a\mu_X + b$
(d) If $\sigma_X^2$ is the variance for the random variable X, use the p.m.f. of Y
to prove that the variance for the random variable Y is $\sigma_Y^2 = a^2\sigma_X^2$ .
$$\begin{aligned}
\sigma_Y^2 &= \sum_{i=1}^{k} [(au_i + b - \mu_Y)^2 \, g(au_i + b)] = \sum_{i=1}^{k} [(au_i + b - (a\mu_X + b))^2 \, f(u_i)] \\
&= \sum_{i=1}^{k} [(au_i - a\mu_X)^2 \, f(u_i)] = a^2 \sum_{i=1}^{k} [(u_i - \mu_X)^2 \, f(u_i)] = a^2\sigma_X^2
\end{aligned}$$
(e) If $\sigma_X^2$ is the variance for the random variable X, use Theorem 2.2-1
to prove that the variance for the random variable Y is $\sigma_Y^2 = a^2\sigma_X^2$ .
$$\begin{aligned}
\sigma_Y^2 &= E[(Y - \mu_Y)^2] = E[(aX + b - (a\mu_X + b))^2] \\
&= E[(aX - a\mu_X)^2] = E[a^2(X - \mu_X)^2] = a^2E[(X - \mu_X)^2] = a^2\sigma_X^2
\end{aligned}$$
5. The random variable X has p.m.f.
$$f(x) = \begin{cases} 1/10 & \text{if } x = -2 \\ 1/10 & \text{if } x = -1 \\ 1/5 & \text{if } x = 0 \\ 1/5 & \text{if } x = 1 \\ 2/5 & \text{if } x = 2 \end{cases}$$
(a) Find the mean and variance of X.
$\mu_X = E(X) = (-2)(1/10) + (-1)(1/10) + (0)(2/10) + (1)(2/10) + (2)(4/10) = 7/10 = 0.7$
$E(X^2) = (-2)^2(1/10) + (-1)^2(1/10) + (0)^2(2/10) + (1)^2(2/10) + (2)^2(4/10) = 23/10$
$\sigma_X^2 = \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = 23/10 - [7/10]^2 = 23/10 - 49/100 = 181/100 = 1.81$
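(b) Find the p.m.f. g(y), the mean, and the variance of Y = X + 3 .
$$g(y) = \begin{cases} 1/10 & \text{if } y = 1 \\ 1/10 & \text{if } y = 2 \\ 1/5 & \text{if } y = 3 \\ 1/5 & \text{if } y = 4 \\ 2/5 & \text{if } y = 5 \end{cases}$$
$\mu_Y = E(Y) = E(X + 3) = (1)\mu_X + 3 = 7/10 + 3 = 37/10 = 3.7$
$\sigma_Y^2 = \mathrm{Var}(Y) = \mathrm{Var}(X + 3) = (1)^2\sigma_X^2 = 181/100 = 1.81$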
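(c) Find the p.m.f. h(v), the mean, and the variance of V = 5X .
$$h(v) = \begin{cases} 1/10 & \text{if } v = -10 \\ 1/10 & \text{if } v = -5 \\ 1/5 & \text{if } v = 0 \\ 1/5 & \text{if } v = 5 \\ 2/5 & \text{if } v = 10 \end{cases}$$
$\mu_V = E(V) = E(5X) = 5\mu_X = (5)(7/10) = 35/10 = 7/2 = 3.5$
$\sigma_V^2 = \mathrm{Var}(V) = \mathrm{Var}(5X) = 5^2\sigma_X^2 = (25)(181/100) = 181/4 = 45.25$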
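Parts (b) and (c) can be checked by building the p.m.f. of the transformed variable directly from f(x). The sketch below uses two hypothetical helpers, `transform_pmf` and `mean_var`; because `transform_pmf` adds the probabilities of all outcomes that map to the same value, it also handles functions of X that are not one-to-one.

```python
from collections import defaultdict
from fractions import Fraction

def transform_pmf(pmf, func):
    """p.m.f. of func(X): add up f(x) over all x that func sends to the same value."""
    out = defaultdict(Fraction)            # Fraction() is exactly 0
    for x, p in pmf.items():
        out[func(x)] += p
    return dict(out)

def mean_var(pmf):
    """Mean and variance of a finite p.m.f. using Var = E(X^2) - [E(X)]^2."""
    mean = sum(x * p for x, p in pmf.items())
    return mean, sum(x * x * p for x, p in pmf.items()) - mean ** 2

# p.m.f. of X from Exercise 5
f = {-2: Fraction(1, 10), -1: Fraction(1, 10), 0: Fraction(1, 5),
     1: Fraction(1, 5), 2: Fraction(2, 5)}

g = transform_pmf(f, lambda x: x + 3)      # Y = X + 3
h = transform_pmf(f, lambda x: 5 * x)      # V = 5X
print(mean_var(g))   # mean 37/10, variance 181/100
print(mean_var(h))   # mean 7/2, variance 181/4
```

Shifting by 3 moves the mean but leaves the variance alone, while scaling by 5 multiplies the variance by 25, in line with the result proved in Exercise 4.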
(d) Find the p.m.f. s(w), the mean, and the variance of W = X² .
$$s(w) = \begin{cases} 1/5 & \text{if } w = 0 \\ 3/10 & \text{if } w = 1 \\ 1/2 & \text{if } w = 4 \end{cases}$$
$\mu_W = E(W) = E(X^2) = 23/10 = 2.3$
(from calculations done in part (a))
To find the variance, we first find
$E(W^2) = (0)^2(1/5) + (1)^2(3/10) + (4)^2(1/2) = 83/10$
$\sigma_W^2 = \mathrm{Var}(W) = E(W^2) - \mu_W^2 = 83/10 - (23/10)^2 = 301/100 = 3.01$
6. Verify that the p.m.f. given in Text Exercise 2.3-15 is correct.
The 16 equally likely outcomes when the two dice are rolled:
(1,1) (1,2) (1,3) (1,4)
(2,1) (2,2) (2,3) (2,4)
(3,1) (3,2) (3,3) (3,4)
(4,1) (4,2) (4,3) (4,4)
P(Larger roll is 1) = 1/16
P(Larger roll is 2) = 3/16
P(Larger roll is 3) = 5/16
P(Larger roll is 4) = 7/16
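The counts behind these four probabilities can be reproduced by enumerating the outcomes. The sketch assumes two fair four-sided dice, which is what the list of 16 outcomes suggests; the statement of Text Exercise 2.3-15 itself is not reproduced in these notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: two fair four-sided dice, giving 16 equally likely ordered pairs.
counts = Counter(max(a, b) for a, b in product(range(1, 5), repeat=2))
pmf = {x: Fraction(c, 16) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(Larger roll is {x}) = {p}")
```

The counts 1, 3, 5, 7 follow the pattern 2x – 1, so each probability equals (2x – 1)/16 for x = 1, 2, 3, 4.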
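For a sample $x_1, x_2, \ldots, x_n$, the sample mean is
$$\bar{x} = \sum_{i=1}^{n} x_i (1/n) = \frac{\sum_{i=1}^{n} x_i}{n} .$$
The variance of the empirical distribution is
$$\begin{aligned}
v &= \sum_{i=1}^{n} (x_i - \bar{x})^2 (1/n) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} = \frac{\sum_{i=1}^{n} (x_i^2 - 2x_i\bar{x} + \bar{x}^2)}{n} \\
&= \frac{\sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar{x}^2}{n} = \frac{\sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2}{n} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n} .
\end{aligned}$$
The sample variance is
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1} .$$
The sample standard deviation is $s = \sqrt{s^2}$ .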
7. A fair, six-sided die is rolled 10 times, and the number of spots
facing up is recorded for each roll with the following results:
4 6 2 3 5 6 5 1 1 3.
Find the sample mean, the sample variance, and the sample standard
deviation.
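$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{4+6+2+3+5+6+5+1+1+3}{10} = \frac{36}{10} = 3.6$$
$$\begin{aligned}
s^2 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \\
&= \frac{(4-3.6)^2+(6-3.6)^2+(2-3.6)^2+(3-3.6)^2+(5-3.6)^2+(6-3.6)^2+(5-3.6)^2+(1-3.6)^2+(1-3.6)^2+(3-3.6)^2}{9} \\
&= \frac{32.4}{9} = 3.6
\end{aligned}$$
$s = \sqrt{3.6} \approx 1.897$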
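The results of Exercise 7 can be reproduced with Python's standard `statistics` module, where `variance` divides by n – 1 (the sample variance) and `pvariance` divides by n (the variance of the empirical distribution). The last line also evaluates the shortcut form of the sample variance; the variable names are illustrative.

```python
import statistics

rolls = [4, 6, 2, 3, 5, 6, 5, 1, 1, 3]
n = len(rolls)

x_bar = statistics.mean(rolls)       # sample mean
s2 = statistics.variance(rolls)      # sample variance, divides by n - 1
v = statistics.pvariance(rolls)      # variance of the empirical distribution, divides by n
s = statistics.stdev(rolls)          # sample standard deviation

# Shortcut form: (sum of squares - n * mean^2) / (n - 1)
s2_shortcut = (sum(x * x for x in rolls) - n * x_bar ** 2) / (n - 1)

print(round(x_bar, 3), round(s2, 3), round(v, 3), round(s, 3))
```

Here v = 32.4/10 = 3.24 is smaller than s² = 32.4/9 = 3.6, since the two differ only in the divisor.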
8. Suppose a sample of observations of the random variable X results
in the values $x_1, x_2, \ldots, x_n$ , with sample mean $\bar{x}$ and sample
variance $s_X^2$. Consider the values
$y_1 = ax_1 + b , \; y_2 = ax_2 + b , \; \ldots , \; y_n = ax_n + b$ ,
where a and b are constants.
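(a) Prove that the sample mean for values $y_1, y_2, \ldots, y_n$ is $\bar{y} = a\bar{x} + b$ .
$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{\sum_{i=1}^{n} (ax_i + b)}{n} = \frac{\sum_{i=1}^{n} ax_i + \sum_{i=1}^{n} b}{n} = \frac{a\sum_{i=1}^{n} x_i + nb}{n} = a\bar{x} + b$$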
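(b) Prove that the sample variance for values $y_1, y_2, \ldots, y_n$ is
$s_Y^2 = a^2 s_X^2$ .
$$\begin{aligned}
s_Y^2 &= \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1} = \frac{\sum_{i=1}^{n} (ax_i + b - (a\bar{x} + b))^2}{n-1} = \frac{\sum_{i=1}^{n} (ax_i - a\bar{x})^2}{n-1} \\
&= \frac{\sum_{i=1}^{n} a^2(x_i - \bar{x})^2}{n-1} = a^2 \, \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = a^2 s_X^2 .
\end{aligned}$$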
(c) The sample of 10 values $x_1, x_2, \ldots, x_{10}$ in Class Exercise #7 of this
section was found to have mean $\bar{x} = 3.6$ and variance $s_X^2 = 3.6$ .
Find the mean $\bar{y}$ and variance $s_Y^2$ for the values
$y_1 = 0.5x_1 - 4 , \; y_2 = 0.5x_2 - 4 , \; \ldots , \; y_{10} = 0.5x_{10} - 4$ .
$\bar{y} = 0.5\bar{x} - 4 = 0.5(3.6) - 4 = -2.2$
$s_Y^2 = (0.5)^2(3.6) = 0.9$
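Part (c) can also be checked by transforming the ten observed values and recomputing the statistics from scratch. This is a sketch using the data of Class Exercise #7 and the standard `statistics` module; it is a numerical check of one case, not a substitute for the proofs in parts (a) and (b).

```python
import statistics

x = [4, 6, 2, 3, 5, 6, 5, 1, 1, 3]    # data from Class Exercise #7
a, b = 0.5, -4

y = [a * xi + b for xi in x]           # y_i = 0.5 x_i - 4

y_bar = statistics.mean(y)
s2_y = statistics.variance(y)

# Values predicted by parts (a) and (b)
predicted_mean = a * statistics.mean(x) + b
predicted_var = a ** 2 * statistics.variance(x)

print(round(y_bar, 3), round(s2_y, 3))
```

The constant b shifts every value by the same amount, so it affects the mean but drops out of the variance.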
9. Add a sheet to the Excel file Describe_Data (created previously)
which displays the mean, variance, and standard deviation for a
sample.
(a) Edit the Excel file by performing the following steps:
(1) Insert a new worksheet named Summary Stats.
(2) In Summary Stats, select cells A1:A500, and color these cells with a light color such as yellow.
(3) With cells A1:A500 still selected, select from the main menu the Formulas tab, select the option Name Manager, type the range name Sample, click the OK button to return to the Name Manager dialog box, and click the Close button to close the dialog box.
(4) Enter the labels displayed in columns B through H, and right justify the labels in cells G3:G6.
(5) Enter the following formulas respectively in cells H3:H6:
=COUNT(Sample)
=IF(COUNT(Sample)>0,AVERAGE(Sample),"-")
=IF(COUNT(Sample)>1,VAR(Sample),"-")
=IF(COUNT(Sample)>1,STDEV(Sample),"-")
(6) Format cells H3:H6 so that the displays are centered.
(7) Save the file as Describe_Data (in your personal folder on the college network).
(b) Use the Excel file Describe_Data to obtain the sample mean,
sample variance, and sample standard deviation for the data in
Class Exercise #7 of this section.
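For readers without Excel, the logic of the four worksheet formulas can be sketched in Python. The function name `summary_stats` and the dictionary keys are hypothetical (the worksheet's actual labels are not shown in these notes); blank cells are represented here by `None`, and the "-" placeholder mirrors the IF guards for samples that are too small.

```python
import statistics

def summary_stats(cells):
    """Mimic the four worksheet formulas for a column that may contain blanks.

    Only numeric entries are kept, roughly as COUNT does for a range.
    The keys below are illustrative labels, not the ones in the worksheet.
    """
    sample = [c for c in cells
              if isinstance(c, (int, float)) and not isinstance(c, bool)]
    n = len(sample)
    return {
        "count": n,
        "mean": statistics.mean(sample) if n > 0 else "-",
        "variance": statistics.variance(sample) if n > 1 else "-",   # n - 1 divisor, like VAR
        "std dev": statistics.stdev(sample) if n > 1 else "-",       # like STDEV
    }

# Ten die rolls followed by blank cells, as in a partly filled range.
column = [4, 6, 2, 3, 5, 6, 5, 1, 1, 3] + [None] * 5
print(summary_stats(column))
```

The guards matter because a sample variance needs at least two observations, which is why the variance and standard deviation formulas test for a count greater than 1.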