Transcript Document

SKEMA Ph.D programme
2010-2011
Class 2
Statistical Inference
Lionel Nesta
Observatoire Français des Conjonctures Economiques
[email protected]
Hypothesis Testing
The Notion of Hypothesis in Statistics
 Expectation
 A hypothesis is a conjecture, an expected explanation of why a given phenomenon occurs
 Operationality
 A hypothesis must be precise, univocal and quantifiable
 Refutability
 The result of a given experiment must give rise to either the refutation or the corroboration of the tested hypothesis
 Replicability
 Exclude ad hoc, local arrangements from the experiment, and seek universality
Examples of Good and Bad Hypotheses

« The stocks of Peugeot and Citroën have the same variance »

« God exists! »

« In general, the closure of a given production site in Europe is
positively associated with the share price of a given company
on financial markets. »

« Knowledge has a positive impact on economic growth »
Hypothesis Testing
 In statistics, hypothesis testing aims at accepting or rejecting a
hypothesis
 The statistical hypothesis is called the “null hypothesis” H0
 The null hypothesis proposes something initially presumed true.
 It is rejected only when it becomes evidently false, that is, when the
researcher has a certain degree of confidence, usually 95% to 99%,
that the data do not support the null hypothesis.
 The alternative hypothesis (or research hypothesis) H1 is the
complement of H0.
Hypothesis Testing
 There are two kinds of hypothesis testing:
 Homogeneity test compares the means of two samples.
 H0 : Mean(x) = Mean(y) ; Mean(x) = 0
 H1 : Mean(x) ≠ Mean(y) ; Mean(x) ≠ 0
 Conformity test looks at whether the distribution of a given sample follows the properties of a distribution law (normal/Gaussian, Poisson, binomial).
 H0 : ℓ(x) = ℓ*(x)
 H1 : ℓ(x) ≠ ℓ*(x)
The Four Steps of Hypothesis Testing
1. Spelling out the null hypothesis H0 and the alternative hypothesis H1.
2. Computation of a statistic corresponding to the distance between two sample means (homogeneity test) or between the sample and the distribution law (conformity test).
3. Computation of the (critical) probability of observing what one observes.
4. Conclusion of the test according to an agreed threshold, around which one arbitrates between H0 and H1.
The Logic of Hypothesis Testing
 We need to say something about the reliability (or representativeness) of a sample mean
 The law of large numbers; the central limit theorem
 The notion of confidence interval
 Once done, we can ask whether two means are alike
 If so (if not), their confidence intervals are (are not) overlapping
Statistical Inference

In real life calculating parameters of populations is
prohibitive because populations are very large.

Rather than investigating the whole population, we take
a sample, calculate a statistic related to the parameter of
interest, and make an inference.

The sampling distribution of the statistic is the tool that
tells us how close the statistic is to the parameter.
Prerequisite 1
Standard Normal Distribution
The Standard Normal Distribution
The standard normal distribution, also called the z
distribution, is a probability density function with
mean μ = 0 and standard deviation σ = 1.
It is written N(0, 1).
The Standard Normal Distribution
(μ=0 and σ=1)
[Figure: the standard normal density, bell-shaped, peaking at about 0.40 at z = 0; horizontal axis from −5 to +5.]
Since the standard deviation is by definition 1, each unit on the horizontal
axis represents one standard deviation
The Standard Normal Distribution
Because of the shape of the Z distribution (symmetrical),
statisticians have computed the probability of occurrence of events
for given values of z.
f(z) = p(z) = (1/√(2π)) · e^(−z²/2),  z ∈ (−∞; +∞)
The Standard Normal Distribution
[Figure: the standard normal density with nested central regions: 68% of observations within ±1σ, 95% within ±2σ, and 99.7% within ±3σ.]
The Standard Normal Distribution
[Figure: the standard normal density; 95% of observations lie in the central region, with 2.5% in each tail.]
The Standard Normal Distribution (z scores)
[Figure: the standard normal density split at zero into P(Z < 0) on the left and P(Z ≥ 0) on the right.]
Probability of an event (z = 0.51)
[Figure: the standard normal density with the upper-tail area P(Z ≥ 0.51) shaded.]
Probability of an event (z = 0.51)

The z-score is used to compute the probability of
obtaining an observed score.

Example
 Let z = 0.51. What is the probability of observing
z=0.51?
 It is the probability of observing z ≥ 0.51: P(z ≥ 0.51)
= ??
Standard Normal Distribution Table
Entries give the upper-tail probability P(Z ≥ z), where z = row value + column value.

z     0.00  0.01  0.02  0.03  0.04  0.05  0.06  0.07  0.08  0.09
0.0   0.500 0.496 0.492 0.488 0.484 0.480 0.476 0.472 0.468 0.464
0.1   0.460 0.456 0.452 0.448 0.444 0.440 0.436 0.433 0.429 0.425
0.2   0.421 0.417 0.413 0.409 0.405 0.401 0.397 0.394 0.390 0.386
0.3   0.382 0.378 0.375 0.371 0.367 0.363 0.359 0.356 0.352 0.348
0.4   0.345 0.341 0.337 0.334 0.330 0.326 0.323 0.319 0.316 0.312
0.5   0.309 0.305 0.302 0.298 0.295 0.291 0.288 0.284 0.281 0.278
0.6   0.274 0.271 0.268 0.264 0.261 0.258 0.255 0.251 0.248 0.245
0.7   0.242 0.239 0.236 0.233 0.230 0.227 0.224 0.221 0.218 0.215
0.8   0.212 0.209 0.206 0.203 0.201 0.198 0.195 0.192 0.189 0.187
0.9   0.184 0.181 0.179 0.176 0.174 0.171 0.169 0.166 0.164 0.161
1.0   0.159 0.156 0.154 0.152 0.149 0.147 0.145 0.142 0.140 0.138
…
1.6   0.055 0.054 0.053 0.052 0.050 0.050 0.049 0.048 0.047 0.046
…
1.9   0.029 0.028 0.027 0.027 0.026 0.026 0.025 0.024 0.024 0.023
2.0   0.023 0.022 0.022 0.021 0.021 0.020 0.020 0.019 0.019 0.018
…
2.5   0.006 0.006 0.006 0.006 0.006 0.005 0.005 0.005 0.005 0.005
…
2.9   0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.001 0.001
Probability of an event (Z = 0.51)
 The z-score is used to compute the probability of obtaining an observed score.
 Example
 Let z = 0.51. What is the probability of observing z = 0.51?
 It is the probability of observing z ≥ 0.51: P(z ≥ 0.51)
 Reading the table at row 0.5, column 0.01: P(z ≥ 0.51) = 0.3050
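As a quick check, the same probability can be recovered in Stata with the built-in cumulative standard normal function normal() (a minimal sketch):

* upper-tail probability P(Z >= 0.51) under the standard normal
display 1 - normal(0.51)
* .30502573, i.e. the 0.3050 read from the z table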
Prerequisite 2
Normal Distribution
Normal Distributions
Normal distributions are just like standard normal
distributions (or z distributions) with different
values for the mean μ and standard deviation σ.
This law is written N(μ, σ²). The normal distribution is symmetrical.
The Normal Distribution
In probability, a random variable follows a normal distribution
law (also called Gaussian, Laplace-Gauss distribution law) of
mean μ and standard deviation σ if its probability density
function is such that
f(x) = (1/(σ√(2π))) · e^(−½((x−μ)/σ)²)

This law is written N(μ, σ²). The density function of a normal distribution is symmetrical.
Normal distributions for different values
of μ and σ
[Figure: normal densities for (μ=0; σ=1), (μ=0.5; σ=1.1) and (μ=−2; σ=0.5) on the same axes; the smaller the standard deviation, the taller and narrower the curve.]
Standardization of Normal Distributions
Still, it would be nice to be able to say something about
these distributions just like we did with the z distribution.
For example, textile companies (and clothes
manufacturers) may be very interested in the distribution
of heights of men and women, for a given country
(provided that we have all observations).
How could we compute the proportion of men taller than
1.80 meters?
Standardization of Normal Distributions
Assuming that the heights of men are normally distributed, is there any way we could express them in terms of a z distribution?
1. We must center the distribution around 0, expressing each value as a deviation from the mean: (X – μ)
2. We must express (or reduce) each deviation in terms of number of standard deviations σ: (X – μ) / σ
Standardization of Normal Distributions
Standardization of a normal distribution is the operation
of recovering a z distribution from any other distribution,
assuming the distribution is normal. It is achieved by
centering (around the mean) and reducing (in terms of
number of standard deviations) each observation.
The obtained z value expresses each observation by its
distance from the mean, in terms of number of standard
deviations.
z = (x − μ) / σ
Example
 Suppose that for a population of students of a famous business school in Sophia-Antipolis, grades are normally distributed with an average of 10 and a standard deviation of 3. What proportion of them
 Exceeds 12? Exceeds 15?
 Does not exceed 8? Does not exceed 12?
 Let the mean μ = 10 and standard deviation σ = 3:
z = (12 − 10)/3 = 0.66.  P(z ≥ 0.66) = 0.255 = 25.5%
z = (15 − 10)/3 = 1.66.  P(z ≥ 1.66) = 0.049 = 4.9%
z = (8 − 10)/3 = −0.66.  P(z ≤ −0.66) = P(z ≥ 0.66) = 0.255 = 25.5%
z = (12 − 10)/3 = 0.66.  P(z ≤ 0.66) = 1 − P(z ≥ 0.66) = 1 − 0.255 = 0.745 = 74.5%
Implication 1
Intervals of likely values
Inverting the way of thinking

Until now, we have thought in terms of observations x
and mean μ and standard deviation σ to produce the z
score.

Let us now imagine that we do not know x, we know μ
and σ. If we consider any interval, we can write:
−z ≤ (x − μ)/σ ≤ +z
−zσ ≤ x − μ ≤ +zσ
μ − zσ ≤ x ≤ μ + zσ
Inverting the way of thinking

If z∈[-2.55;+2.55] we know that 99% of z-scores will fall
within the range

If z∈[-1.64;+1.64] we know that 90% of z-scores will fall
within the range

Let us now consider an interval which comprises 95%
of observations. Looking at the z table, we know that
z=1.96
Pr(μ − 1.96σ ≤ x ≤ μ + 1.96σ) = 0.95
Example
 Take the population of students of this famous business
school in Sophia-Antipolis, with average of 10 and a
standard deviation of 3. What is the 99% interval ? 95%
interval? 90% interval?
99% interval: 10 − 2.55×3 ≤ x ≤ 10 + 2.55×3, i.e. 2.35 ≤ x ≤ 17.65
95% interval: 10 − 1.96×3 ≤ x ≤ 10 + 1.96×3, i.e. 4.12 ≤ x ≤ 15.88
90% interval: 10 − 1.64×3 ≤ x ≤ 10 + 1.64×3, i.e. 5.08 ≤ x ≤ 14.92
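The z multipliers themselves can be recovered in Stata with invnormal(), the inverse cumulative standard normal (a minimal sketch; note the exact 99% multiplier is 2.576, where the deck uses 2.55):

* two-sided multipliers for 90%, 95% and 99% intervals
display invnormal(0.95)    // about 1.645
display invnormal(0.975)   // about 1.960
display invnormal(0.995)   // about 2.576
* e.g. the 95% interval around a mean of 10 with sd 3
display 10 - 1.96*3        // 4.12
display 10 + 1.96*3        // 15.88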
Prerequisite 3
Sampling theory
Why worry about sampling theory?
The social scientist is not so much interested in the
characteristics of the sample itself.
Most of the time, the social scientist wants to say
something about the population itself looking at the
sample.
In other words, s/he wants to infer something about the
population from the sample.
On the use of random samples
The quality of the sample is key to statistical inference.
The most important thing is that the sample must be representative of the characteristics of the population.
The means by which representativeness can be achieved is by drawing random samples, where each individual observation has an equal probability of being drawn. Hence observations are mutually independent.
Because we would infer wrong conclusions from biased samples, the latter are worse than no sample at all.
Reliability of random samples
The ultimate objective with the use of random samples is to infer something about the underlying population.
Ideally, we want the sample mean X̄ to be as close as possible to the population mean μ. In other words, we are interested in the reliability of the sample.
There are two ways to deal with reliability:
1. Monte Carlo simulation (infinite number of samples)
2. Sampling theory (moments of a distribution)
Moment 1 – The Mean
Our goal is to estimate the population mean μ from the sample mean X̄.
How is the sample mean a good estimator of the population mean?
Reminder: the sample mean is computed as follows.

X̄ = (1/n)(X₁ + X₂ + … + Xₙ)

The trick is to consider each observation as a random variable, in line with the idea of a random sample.
Moment 1 – The Mean

X̄ = (1/n)(X₁ + X₂ + … + Xₙ)

E(X̄) = (1/n)[E(X₁) + E(X₂) + … + E(Xₙ)]

What is the expected value of Xᵢ, i.e. E(Xᵢ), if I draw it an infinite number of times? Obviously, if samples are random, the expected value of each Xᵢ is μ.

E(X̄) = (1/n)(μ + μ + … + μ) = (1/n)·nμ = μ

On average, the sample mean will be on target, that is, equal to the population mean.
Moment 2 – The Variance
Doing just the same with the variance, we simply need to know that if two variables are independent, then the following holds:

var(aX + bY) = a²·var(X) + b²·var(Y)

var(X̄) = var[(1/n)X₁ + (1/n)X₂ + … + (1/n)Xₙ]
var(X̄) = (1/n²)[var(X₁) + var(X₂) + … + var(Xₙ)]
var(X̄) = (1/n²)(σ² + σ² + … + σ²) = nσ²/n² = σ²/n

σ(X̄) = σ/√n = standard error of X̄

The standard deviation of the sample means represents the estimation error made when approximating the population mean by the sample mean; it is therefore called the standard error.
Forms of sampling distributions
 With random samples, sample means X̄ vary around the population mean μ with a standard deviation of σ/√n (the standard error).
 The law of large numbers tells us that the sample mean will converge to the population (true) mean as the sample size increases.
 But what about the shape of the distribution, essential if we want to use z-scores? The distribution of sample means will be normal, regardless of the form of the underlying distribution of the population, provided that the sample size is large enough.
 The central limit theorem tells us that for many samples of the same, sufficiently large size, the histogram of these sample means will appear to be a normal distribution (see the simulation sketch below).
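This can be illustrated by simulation. A minimal Stata sketch (the program name drawmean and all numbers are ours): draw 1,000 samples of size n = 25 from a normal population with μ = 10 and σ = 3, then inspect the 1,000 sample means; their standard deviation should be close to σ/√n = 3/5 = 0.6, and their histogram close to a normal curve.

capture program drop drawmean
program define drawmean, rclass
    drop _all
    set obs 25
    generate x = 10 + 3*rnormal()   // one random sample of size 25
    summarize x
    return scalar xbar = r(mean)    // keep this sample's mean
end

set seed 12345
simulate xbar = r(xbar), reps(1000) nodots: drawmean
summarize xbar                      // mean close to 10, sd close to 0.6
histogram xbar, normal              // roughly bell-shaped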
The Dice Experiment
Each face value x = 1, 2, …, 6 occurs with probability P(X = x) = 1/6.

E(X) = μ_X = (1/6) Σ x = 21/6 = 3.5   (sum over x = 1, …, 6)

[Figure: the uniform distribution of a single die, six bars of height 1/6 ≈ 0.167 at x = 1, …, 6.]
The Dice Experiment (n = 2)
With n = 2 dice there are 36 equally likely samples, each with its sample mean:

Sample  Draw  Mean    Sample  Draw  Mean    Sample  Draw  Mean
1       1,1   1.0     13      3,1   2.0     25      5,1   3.0
2       1,2   1.5     14      3,2   2.5     26      5,2   3.5
3       1,3   2.0     15      3,3   3.0     27      5,3   4.0
4       1,4   2.5     16      3,4   3.5     28      5,4   4.5
5       1,5   3.0     17      3,5   4.0     29      5,5   5.0
6       1,6   3.5     18      3,6   4.5     30      5,6   5.5
7       2,1   1.5     19      4,1   2.5     31      6,1   3.5
8       2,2   2.0     20      4,2   3.0     32      6,2   4.0
9       2,3   2.5     21      4,3   3.5     33      6,3   4.5
10      2,4   3.0     22      4,4   4.0     34      6,4   5.0
11      2,5   3.5     23      4,5   4.5     35      6,5   5.5
12      2,6   4.0     24      4,6   5.0     36      6,6   6.0

E(X̄) = μ_X̄ = (1/36)X̄₁ + (1/36)X̄₂ + … + (1/36)X̄₃₆ = 3.5 = μ_X
The resulting sampling distribution of the mean is triangular:

x̄          1     1.5   2     2.5   3     3.5   4     4.5   5     5.5   6
P(X̄ = x̄)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

[Figure: histogram of the 36 sample means, rising from 1/36 at x̄ = 1 to a peak of 6/36 at x̄ = 3.5 and falling back to 1/36 at x̄ = 6.]
From SKEMA sample grade distribution… to SKEMA sample mean distribution

[Figure: two histograms. The distribution of individual SKEMA grades is spread over the 0 to 20 range, while the distribution of SKEMA sample means is far more concentrated around the mean. Note the change in horizontal axis!!]
Implication 2
Confidence Interval
Confidence Interval

In statistics, a confidence interval is an interval within
which the value of a parameter is likely to be (the
unknown population mean). Instead of estimating the
parameter by a single value, an interval of likely
estimates is given.

Confidence intervals are used to indicate the reliability
of an estimate.
 Reminder 1. The sample mean is a random variable following a normal distribution.
 Reminder 2. The sample values X̄ and s can be used to approximate the population mean μ and the population standard deviation σ.
Remember intervals!

μ − zσ ≤ x ≤ μ + zσ

Here the mean and the standard deviation are fully known; it is the observation x that is the unknown value.
Confidence Interval
X̄ − z·σ/√N ≤ μ ≤ X̄ + z·σ/√N

Now the unknown value is the population mean μ: the sample mean X̄ is used as a guess for the population mean, and the standard error σ/√N as a guess for the standard deviation of the errors.
Confidence Interval
X̄ − z_pc·σ/√N ≤ μ ≤ X̄ + z_pc·σ/√N   (general definition)
X̄ − 1.96·σ/√N ≤ μ ≤ X̄ + 1.96·σ/√N   (definition for 95% CI)
X̄ − 1.64·σ/√N ≤ μ ≤ X̄ + 1.64·σ/√N   (definition for 90% CI)
Standard Normal Distribution and CI
[Figure: the standard normal density with nested central regions covering 90%, 95% and 99.7% of observations.]
Application of Confidence Interval
 Let us draw a sample of 25 students from SKEMA (n = 25), with X̄ = 10 and σ = 3. What can we say about the likely values of the population mean μ? Let us build the 95% CI.

10 − 1.96 × 3/√25 ≤ μ ≤ 10 + 1.96 × 3/√25, i.e. 8.8 ≤ μ ≤ 11.2

[Figure: SKEMA sample mean distribution, centred on 10; there is a 95% chance that the population mean is indeed located within the interval [8.8; 11.2].]
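The interval can be reproduced in Stata, either by hand or with the immediate one-sample command ttesti, which takes n, the sample mean, the sample standard deviation and a reference value (a minimal sketch; with n = 25, ttesti uses a t rather than a z multiplier, so its bounds are slightly wider):

* by hand, with the z multiplier
display 10 - 1.96*3/sqrt(25)   // 8.824
display 10 + 1.96*3/sqrt(25)   // 11.176
* from summary statistics; the reference value 10 is arbitrary, only the CI matters here
ttesti 25 10 3 10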
Application of Confidence Interval
 Let us draw a sample of 25 students from SKEMA (n = 25), with X̄ = 10 and σ = 3. What can we say about the likely values of the population mean μ? Let us build the 95% CI.

10 − 1.96 × 3/√25 ≤ μ ≤ 10 + 1.96 × 3/√25, i.e. 8.8 ≤ μ ≤ 11.2

 Let us now draw a sample of 30 students from HEC (n = 30), with X̄ = 11.5 and σ = 4.7. What can we say about the likely values of the population mean μ? Let us build the 95% CI.

11.5 − 1.96 × 4.7/√30 ≤ μ ≤ 11.5 + 1.96 × 4.7/√30, i.e. 9.8 ≤ μ ≤ 13.2

[Figure: HEC sample mean distribution, centred on 11.5; there is a 95% chance that the population mean is indeed located within the interval [9.8; 13.2].]
Hypothesis Testing
 Hypothesis 1: Students from SKEMA have an average grade which is not significantly different from 11 at the 95% confidence level.
 H0 : Mean(SKEMA) = 11
 H1 : Mean(SKEMA) ≠ 11
I accept H0 and reject H1 because 11 is within the confidence interval [8.8; 11.2].
 Hypothesis 2: Students from HEC have an average grade which is not significantly different from 11 at the 95% confidence level.
 H0 : Mean(HEC) = 11
 H1 : Mean(HEC) ≠ 11
I accept H0 and reject H1 because 11 is within the confidence interval [9.8; 13.2].
Implication 3
Critical probability
Example
We have concluded that the mean grade of the population of students from SKEMA is not significantly different from 11. To do so, we had to agree beforehand that 95% was the relevant confidence level. But it is clear that if we had chosen another confidence level (90%, 80%), our conclusion would have been different.

95% CI: 10 − 1.96 × 3/√25 ≤ μ ≤ 10 + 1.96 × 3/√25, i.e. 8.8 ≤ μ ≤ 11.2
90% CI: 10 − 1.64 × 3/√25 ≤ μ ≤ 10 + 1.64 × 3/√25, i.e. 9.016 ≤ μ ≤ 10.984
Critical probability
 The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief about a parameter.
 There are two hypotheses:
 H0, the null hypothesis (against your intuition)
 H1, the alternative hypothesis (what you want to prove)
Critical probability
The confidence interval is designed in such a way that, for each z statistic chosen, we define the share of observations which the CI comprises.
 When z = 1.96, we have a 95% CI
 When z = 2.55, we have a 99% CI
If the tested value is within the confidence interval, we accept H0.
If the tested value is outside the confidence interval, we accept Ha.
Critical probability
An alternative method for deciding between H0 and Ha is to compute the critical probability, or p-value. What is the threshold value of z for which one concludes in favor of Ha against H0?
One can compute the z value directly from the sample as follows:
z = (X̄ − μ) / (s/√n)

where μ is the target (or tested) value for the population mean.
Critical probability
 The p-value provides information about the amount of statistical evidence that supports the null hypothesis.
 The p-value of a test is the probability of observing a test statistic at least as extreme as the one computed, given that the null hypothesis is true.
Computing the critical probability
 Let us draw a sample of 25 students from SKEMA (n = 25), with X̄ = 10 and σ = 3. What can we say about the likely values of the population mean μ? Let us compute the z value for the tested value 11.

z = (10 − 11) / (3/√25) = −1.6667

 Looking at the z table, it is now straightforward to recover the critical probability, for which we are indifferent between accepting or rejecting H0.

Pr(z ≤ −1.6667) + Pr(z ≥ 1.6667) = 0.049 + 0.049 = 9.8%

[Figure: SKEMA average grades; 90.2% chance that the mean is indeed located within the interval, with 4.9% in each tail.]
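The same numbers can be computed directly in Stata (a minimal sketch; normal() uses the exact z, so the two-sided probability comes out slightly below the 9.8% read from the rounded table):

* z statistic for the tested value 11
display (10 - 11)/(3/sqrt(25))   // -1.6667
* two-sided critical probability
display 2*(1 - normal(1.6667))   // about .0955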
Interpreting the critical probability
The probability of observing a test statistic at least as extreme as the one computed, given that the null hypothesis is true, is 9.8%.
We can conclude that the smaller the p-value, the more statistical evidence exists to support the alternative hypothesis.
But is 9.8% low enough to reject H0 and to accept Ha?
Interpreting the critical probability
 The practice is to reject H0 only when the
critical probability is lower than 0.1, or 10%
 Some are even more cautious and prefer to
reject H0 at a critical probability level of 0.05,
or 5%.
 In any case, the philosophy of the statistician
is to be conservative.
Interpreting the critical probability
If the p-value is less than 1%, there is overwhelming evidence supporting the alternative hypothesis.
If the p-value is between 1% and 5%, there is strong evidence supporting the alternative hypothesis.
If the p-value is between 5% and 10%, there is weak evidence supporting the alternative hypothesis.
If the p-value exceeds 10%, there is no evidence supporting the alternative hypothesis.
Decisions Using the Critical probability
 The p-value can be used when making decisions based on rejection region methods as follows:
1. Define the hypotheses to test, and the required significance level α.
2. Perform the sampling procedure, calculate the test statistic and the p-value associated with it.
3. Compare the p-value to α. Reject the null hypothesis only if p < α; otherwise, do not reject the null hypothesis.
Decisions Using the Critical probability
 If we reject the null hypothesis, we conclude that there is enough evidence to infer that the alternative hypothesis is true.
 If we do not reject the null hypothesis, we conclude that there is not enough statistical evidence to infer that the alternative hypothesis is true.
 The alternative hypothesis is the more important one. It represents what we are investigating.
Prerequisite 4
Student T test
The Student Test
 Thus far, we have assumed that we know both the standard
deviation of the population. But in fact, we do not know it: σ
is unknown.
 When the sample is small, we should be imprecise. To take
account of sample size, we use the t distribution, not z.
 The Student t statistics is then preferred to the z statistics.
Its distribution is similar (identical to z as n → +∞). The CI
becomes
s
  X  tcpdf 
N
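The critical t values come from the t distribution with n − 1 degrees of freedom; in Stata they are returned by invttail() (a minimal sketch):

* critical value for a 95% CI with n = 25 (2.5% in each tail, df = 24)
display invttail(24, 0.025)     // about 2.064 (2.06 in the application below)
* it approaches the z value 1.96 as n grows
display invttail(1000, 0.025)   // about 1.962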
Application of Student t to CI’s
 Let us draw a sample of 25 students from SKEMA (n = 25), with X̄ = 10 and s = 3. Let us build the 95% CI, using t(df = 24) = 2.06 at 2.5% in each tail.

10 − 2.06 × 3/√25 ≤ μ ≤ 10 + 2.06 × 3/√25, i.e. 8.76 ≤ μ ≤ 11.24

 Let us draw a sample of 30 students from HEC (n = 30), with X̄ = 11.5 and s = 4.7. Let us build the 95% CI.

11.5 − 2.06 × 4.7/√30 ≤ μ ≤ 11.5 + 2.06 × 4.7/√30, i.e. 9.73 ≤ μ ≤ 13.27
STATA Application: Student t
 Import SKEMA_LMC into Stata
 Produce descriptive statistics for sales, labour, and R&D expenses
 A newspaper writes that, by and large, LMCs have 95,000 employees.
 Test statistically whether this is true at the 1% level
 Test statistically whether this is true at the 5% level
 Test statistically whether this is true at the 10% and 20% levels
 Write out H0 and H1
Results (at 1% level)
(X̄ − 95000) − t^0.01 × √(s²/N) ≤ (μ − 95000) ≤ (X̄ − 95000) + t^0.01 × √(s²/N)

(X̄ − 95000) − 2.573 × 96400/√1634 ≤ (μ − 95000) ≤ (X̄ − 95000) + 2.573 × 96400/√1634

−9851.20 ≤ (μ − 95000) ≤ 2448.94

85148.8 ≤ μ ≤ 97448.94

Pr(85148.8 ≤ μ ≤ 97448.94) = 0.99
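The same 99% interval comes straight out of ttesti with the level() option (a minimal sketch, fed with the sample statistics from the output below):

ttesti 1634 91298.87 96400.96 95000, level(99)
* reports the 99% CI [85148.8; 97448.9] around the sample mean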
STATA Application: Student t
Two ways of computing confidence intervals:
mean var1
ttest var1==specific value, option
For example:
mean lnassets
ttest lnassets == 11
Or even manually (for a sample of more than 100 observations):
sum lnassets
display r(mean)-1.96*r(sd)/r(N)^(1/2)
display r(mean)+1.96*r(sd)/r(N)^(1/2)
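For smaller samples, the 1.96 can be replaced by the exact t critical value via invttail(), reusing the r() results left behind by sum (a minimal sketch):

sum lnassets
display r(mean) - invttail(r(N)-1, 0.025)*r(sd)/sqrt(r(N))
display r(mean) + invttail(r(N)-1, 0.025)*r(sd)/sqrt(r(N))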
STATA Application: Student t
Stata instruction: . ttest labour==95000

One-sample t test

Variable |  Obs |     Mean | Std. Err. | Std. Dev. | [95% Conf. Interval]
labour   | 1634 | 91298.87 |  2384.818 |  96400.96 |  86621.25   95976.5

mean = mean(labour)                          t = -1.5520
Ho: mean = 95000            degrees of freedom = 1633

Ha: mean < 95000       Ha: mean != 95000        Ha: mean > 95000
Pr(T < t) = 0.0604     Pr(|T| > |t|) = 0.1209   Pr(T > t) = 0.9396

Since the two-sided critical probability 0.1209 exceeds the usual thresholds, I accept H0.
Critical probability
 With |t| = 1.552, I can conclude the following:
 12% probability that μ belongs to the distribution where the population mean = 95,000
 I have a 12% chance of wrongly rejecting H0
 88% probability that μ belongs to another distribution where the population mean ≠ 95,000
 I have an 88% chance of rightly rejecting H0
Shall I accept or reject H0?

[Figure: the t distribution with 88.0% in the centre and 6.1% in each tail.]
I accept H0 !!!
SPSS Application: Student t
 Import SKEMA_LMC into SPSS
 Produce descriptive statistics for sales, labour, and R&D expenses
 Analyze → Descriptive Statistics → Descriptives
 Options: choose the statistics you may wish
 A newspaper writes that, by and large, LMCs have 95,000 employees.
 Test statistically whether this is true at the 1% level
 Test statistically whether this is true at the 5% level
 Test statistically whether this is true at the 10% and 20% levels
 Write out H0 and H1
 Analyze → Compare Means → One-Sample T Test
 Options: 99%, 95%, 90%
SPSS Application: t test at 99% level
One-Sample Statistics

       |    N |     Mean | Std. Deviation | Std. Error Mean
labour | 1634 | 91298.87 |      96400.957 |        2384.818

One-Sample Test (Test Value = 95000)

       |      t |   df | Sig. (2-tailed) | Mean Difference | 99% CI of the Difference (Lower; Upper)
labour | -1.552 | 1633 |            .121 |       -3701.130 | -9851.20; 2448.94
SPSS Application: t test at 95% level
One-Sample Statistics

       |    N |     Mean | Std. Deviation | Std. Error Mean
labour | 1634 | 91298.87 |      96400.957 |        2384.818

One-Sample Test (Test Value = 95000)

       |      t |   df | Sig. (2-tailed) | Mean Difference | 95% CI of the Difference (Lower; Upper)
labour | -1.552 | 1633 |            .121 |       -3701.130 | -8378.75; 976.50
SPSS Application: t test at 80% level
One-Sample Statistics

       |    N |     Mean | Std. Deviation | Std. Error Mean
labour | 1634 | 91298.87 |      96400.957 |        2384.818

One-Sample Test (Test Value = 95000)

       |      t |   df | Sig. (2-tailed) | Mean Difference | 80% CI of the Difference (Lower; Upper)
labour | -1.552 | 1633 |            .121 |       -3701.130 | -6758.63; -643.63
Implication 4
Comparison of means
Comparison of means
 Sometimes, the social scientist is interested
in comparing means across two population.
 Mean wage across regions
 Mean R&D investments across industries
 Mean satisfaction level across social classes
 Instead of comparing a sample mean with a
target value, we will compare the two sample
means directly
Comparing the Means Using CI’s
 The simplest way to do so is to compute the
confidence intervals of the two population
means.
 Confidence interval for population 1
 Confidence interval for population 2
Comparing the Means Using CI’s
 If the two confidence intervals overlap, we will conclude that the two sample means come from the same population. We do not reject the null hypothesis H0 that µ1 = µ2.
 If the two confidence intervals do not overlap, we will conclude that the two sample means come from different populations. We reject the null hypothesis and accept the alternative hypothesis Ha that µ1 ≠ µ2.
Example
 Competition across business schools is fierce. Imagine you want to compare the performance of students between SKEMA and HEC.
 Hypothesis 1: Students from SKEMA have similar grades as students from HEC
 H0 : µSKEMA = µHEC
 H1 : µSKEMA ≠ µHEC

[Figure: SKEMA sample mean distribution with 95% CI [8.8; 11.2], and HEC sample mean distribution with 95% CI [9.8; 13.2].]
Comparison of sample mean Distributions
Since the two confidence intervals overlap, we conclude that the two sample means come from the same population. We do not reject the null hypothesis H0 that µSKEMA = µHEC.

[Figure: the two confidence intervals on a common axis: CI SKEMA from 8.8 to 11.2, CI HEC from 9.8 to 13.2; they overlap between 9.8 and 11.2.]
A Direct Comparison of Means Using Student t
 Another way to compare two sample means is to calculate the CI of the mean difference. If 0 does not belong to the CI, then the two samples have significantly different means.

(μ₁ − μ₂) = (X̄₁ − X̄₂) ± t_pc × s_p × √(1/n₁ + 1/n₂)

s_p² = [Σ(X₁ − X̄₁)² + Σ(X₂ − X̄₂)²] / [(n₁ − 1) + (n₂ − 1)]

where s_p² is the pooled variance, and s_p × √(1/n₁ + 1/n₂) the standard error of the difference in means.
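When only group summary statistics are at hand, the immediate command ttesti runs exactly this pooled-variance test (a minimal sketch using the labour figures from the Stata output below):

ttesti 1006 87234.9 84403.47 628 97808.99 112765.1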
Stata Application: t test comparing means
 Another newspaper argues that American (US + Canada) companies are much larger than those from the rest of the world. Is this true?
 Produce descriptive statistics for labour comparing the two groups
 Produce a group variable which equals 1 for US firms, 0 otherwise
 This is called a dummy variable
 Write out H0 and H1
 Run the Student t test
 What do you conclude at the 5% level?
 What do you conclude at the 1% level?
STATA Application: Student t
 We again use the same command as before. But since we compare means, we need to mention the two groups we are comparing.
The command for comparing two group means is:
ttest var1, by(catvar)
For example:
ttest labour, by(usgroup)
STATA Application: Student t
Stata instruction: . ttest labour, by(american)

Two-sample t test with equal variances

Group    |  Obs |      Mean | Std. Err. | Std. Dev. | [95% Conf. Interval]
0        | 1006 |   87234.9 |  2661.101 |  84403.47 |  82012.95   92456.85
1        |  628 |  97808.99 |  4499.817 |  112765.1 |  88972.45   106645.5
combined | 1634 |  91298.87 |  2384.818 |  96400.96 |  86621.25   95976.5
diff     |      | -10574.08 |  4897.135 |           | -20179.42  -968.7514

diff = mean(0) - mean(1)                     t = -2.1592
Ho: diff = 0                degrees of freedom = 1632

Ha: diff < 0           Ha: diff != 0            Ha: diff > 0
Pr(T < t) = 0.0155     Pr(|T| > |t|) = 0.0310   Pr(T > t) = 0.9845
SPSS Application: t test comparing means
Group Statistics

labour | AM |    N |     Mean | Std. Deviation | Std. Error Mean
       |  1 |  628 | 97808.99 |       112765.1 |        4499.817
       |  0 | 1006 | 87234.90 |      84403.469 |        2661.101

Independent Samples Test (labour)
Levene's test for equality of variances: F = .024, Sig. = .877

t-test for equality of means     |     t |       df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference (Lower; Upper)
Equal variances assumed          | 2.159 |     1632 |            .031 |       10574.084 |              4897.135 | 968.751; 20179.417
Equal variances not assumed      | 2.023 | 1061.268 |            .043 |       10574.084 |              5227.792 | 316.102; 20832.067
SPSS Application: t test comparing means
Group Statistics

labour | AM |    N |     Mean | Std. Deviation | Std. Error Mean
       |  1 |  628 | 97808.99 |       112765.1 |        4499.817
       |  0 | 1006 | 87234.90 |      84403.469 |        2661.101

Independent Samples Test (labour)
Levene's test for equality of variances: F = .024, Sig. = .877

t-test for equality of means     |     t |       df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 99% CI of the Difference (Lower; Upper)
Equal variances assumed          | 2.159 |     1632 |            .031 |       10574.084 |              4897.135 | -2054.870; 23203.038
Equal variances not assumed      | 2.023 | 1061.268 |            .043 |       10574.084 |              5227.792 | -2916.075; 24064.243
Implication 5
Bi- or Uni- Lateral Tests?
Bilateral versus Unilateral tests
 Up to now, we have always thought in terms of whether two means are equal. The alternative hypothesis is that the two means are different.
 There are many instances for which one may be willing to test inequalities between means:
 Biotech companies have a higher R&D intensity than big pharmas (large pharmaceutical companies)
 Biotech (pharma) companies publish / patent / innovate more (less)
Unilateral tests
 To answer this question, we need to rewrite H0 and
Ha as follows.
 H0 stands for the hypothesis which contradicts your intuition
 Ha stands for the hypothesis in favour of your intuition
 In the case of R&D intensity, our intuition is that
biotech companies are more R&D intensive. Hence
 H0 : µbiotech ≤ µpharma
 Ha : µbiotech > µpharma
The Bilateral Tests
 Reminder on the method of the bilateral test.
H0 : µbiotech = µpharma
Ha : µbiotech ≠ µpharma
Reject H0 if |z| ≥ |za|

[Figure: rejection regions in both tails, beyond −za and +za.]
Superior Unilateral Tests
 The trick is simply to put the confidence interval on one side of the distribution.
H0 : µbiotech ≤ µpharma
Ha : µbiotech > µpharma
Reject H0 if z ≥ za

[Figure: rejection region in the upper tail only, beyond za.]
Inferior Unilateral Tests
 The trick is simply to put the confidence interval on one side of the distribution.
H0 : µbiotech ≥ µpharma
Ha : µbiotech < µpharma
Reject H0 if z ≤ −za

[Figure: rejection region in the lower tail only, below −za.]
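Stata's ttest output already reports all three tests: Pr(|T| > |t|) is the bilateral p-value, while Pr(T < t) and Pr(T > t) are the inferior and superior unilateral ones. They can also be recovered by hand from the t statistic with ttail(), the upper-tail probability of the t distribution (a minimal sketch using the figures from the output below):

* inferior unilateral p-value Pr(T < t) for t = -2.1592, df = 1632
display 1 - ttail(1632, -2.1592)   // about .0155
* superior unilateral p-value Pr(T > t)
display ttail(1632, -2.1592)       // about .9845
* bilateral p-value, 2 x Pr(T > |t|)
display 2*ttail(1632, 2.1592)      // about .0310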
STATA Application: Student t
Stata instruction: . ttest labour, by(american)

Two-sample t test with equal variances

Group    |  Obs |      Mean | Std. Err. | Std. Dev. | [95% Conf. Interval]
0        | 1006 |   87234.9 |  2661.101 |  84403.47 |  82012.95   92456.85
1        |  628 |  97808.99 |  4499.817 |  112765.1 |  88972.45   106645.5
combined | 1634 |  91298.87 |  2384.818 |  96400.96 |  86621.25   95976.5
diff     |      | -10574.08 |  4897.135 |           | -20179.42  -968.7514

diff = mean(0) - mean(1)                     t = -2.1592
Ho: diff = 0                degrees of freedom = 1632

Ha: diff < 0           Ha: diff != 0            Ha: diff > 0
Pr(T < t) = 0.0155     Pr(|T| > |t|) = 0.0310   Pr(T > t) = 0.9845