summary - vscht.cz
SUMMARY
• Z-distribution
• Central limit theorem
Statistical inference
If we can't conduct a census, we
collect data from a sample of the
population.
Goal: draw conclusions about
that population
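Confidence interval
for $n \ge 30$: $\bar{x} \pm Z \times \frac{s}{\sqrt{n}}$
for $n < 30$: $\bar{x} \pm t_{n-1} \times \frac{s}{\sqrt{n}}$
$Z$ (or $t_{n-1}$): critical value (Czech: kritická hodnota)
$Z \times \frac{s}{\sqrt{n}}$ (or $t_{n-1} \times \frac{s}{\sqrt{n}}$): margin of error (Czech: možná odchylka)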
What does a confidence interval tell us?
• When we say "we are 95% confident that the true value of
the parameter is in our confidence interval", we mean
that 95% of the confidence intervals built this way from
repeated samples will contain the true value of the parameter.
• After a sample is taken, the population parameter is either
in the resulting interval or not. It is not a matter of chance!
• A confidence interval does not say that the true value
of the parameter has a particular probability of being in
the interval computed from the data actually obtained.
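The first point above can be checked with a small simulation. This is only a sketch under stated assumptions: a normal population with known standard deviation, and illustrative values for the population mean, the number of trials and the seed. The function name `coverage` is made up for this example and does not come from the slides.

```python
import random
from statistics import NormalDist, mean


def coverage(mu=0.0, sigma=1.0, n=30, trials=2000, level=0.95, seed=0):
    """Fraction of simulated confidence intervals that contain the true mean mu."""
    rng = random.Random(seed)                    # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + level / 2)    # two-sided critical value, about 1.96 for 95%
    margin = z * sigma / n ** 0.5                # sigma is assumed to be known here
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        # each individual interval either contains mu or it does not
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


print(coverage())  # expected to be close to 0.95
```

Each simulated interval either holds the true mean or misses it; the 95% describes the long-run share of intervals that hold it.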
Just to summarize, the margin of error depends on
1. the confidence level (commonly 95%)
2. the sample size $n$
   • as the sample size increases, the margin of error decreases
   • with a bigger sample we get a narrower interval in which
     we're pretty sure the true population value lies
3. the variability of the data (i.e. on $\sigma$)
   • more variability increases the margin of error
• The margin of error does not measure anything other
than chance variation.
• It doesn't measure any bias or errors that happen
during the process.
• It does not tell you anything about the correctness of your
data!!!
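The three dependencies listed above can be sketched numerically. The snippet assumes a known $\sigma$ and a normal critical value; the function name `margin_of_error` and the chosen sample sizes are illustrative assumptions.

```python
from statistics import NormalDist


def margin_of_error(sigma, n, level=0.95):
    """Margin of error Z * sigma / sqrt(n) for a two-sided interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # critical value for the chosen confidence level
    return z * sigma / n ** 0.5


# Larger samples shrink the margin; quadrupling n halves it.
for n in (30, 120, 480):
    print(n, round(margin_of_error(0.107, n), 4))

# More variability or a higher confidence level widens it.
print(margin_of_error(0.2, 30) > margin_of_error(0.107, 30))
print(margin_of_error(0.107, 30, level=0.99) > margin_of_error(0.107, 30, level=0.95))
```

Nothing in this calculation knows about bias or measurement errors, which is why the margin of error says nothing about them.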
HYPOTHESIS TESTING
Aim of hypothesis testing
• decision making
Engagement ratio
• Hopefully, you like this course so far.
• How to measure this?
$\text{Engagement ratio} = \frac{\text{number of minutes awake}}{\text{total minutes available}}$
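Engagement distribution
[Figure: distribution of the engagement ratio on an axis from 0.0 to 1.0, annotated with $N = 100$, $\mu = 0.077$, $\sigma = 0.107$ and the sample values $n = 30$, $\bar{x} = 0.13$]
$N = 100$, $\mu = 0.077$, $\sigma = 0.107$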
• If all students attended my lesson with singing, what is our point estimate for the
engagement ratio?
  • 0.13, i.e. number of minutes awake = 100 × 0.13 = 13 min
• Interval estimate
  • What is the standard error of the mean that we would use to compare this sample
  mean (0.13) with the means of other samples of the same size?
  • $SE = \frac{0.107}{\sqrt{30}} \approx 0.0195$
[Figure: 95% intervals, with $\mu = 0.077$ and the sample mean $M = 0.13$ marked]
$n = 30$, $\bar{x} = 0.13$, $\sigma = 0.107$
Confidence interval
$\bar{x} \pm Z \times \frac{s}{\sqrt{n}}$
• But in our case we know the population standard
deviation $\sigma = 0.107$. So instead of the sample $s$ we can
use the population $\sigma$.
• What is a 95% confidence interval in our case?
$0.13 \pm 1.96 \times \frac{0.107}{\sqrt{30}} \implies 0.092 \ldots 0.168$ (roughly 0.09 to 0.17)
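The interval above can be reproduced with a few lines of Python. This is a sketch that uses the population $\sigma$ as the slide does; the function name `z_confidence_interval` is an assumption made for illustration.

```python
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, level=0.95):
    """Two-sided Z interval x_bar +/- Z * sigma / sqrt(n), with sigma known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    se = sigma / n ** 0.5                      # standard error of the mean
    return x_bar - z * se, x_bar + z * se


low, high = z_confidence_interval(x_bar=0.13, sigma=0.107, n=30)
print(f"{low:.3f} ... {high:.3f}")
```

Multiplying both bounds by the 100 minutes of a lesson converts the ratio into minutes awake.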
Confidence interval
• Our 95% interval estimate has a lower bound of about 0.09
and an upper bound of about 0.17.
• Remember what these numbers mean. This is the ratio of the
"minutes awake" during a lesson to the "total minutes available
in a lesson" (which is 100 minutes).
$\text{Engagement ratio } ER = \frac{\text{number of minutes awake}}{100}$
• So we predict that if I incorporate this musical lesson, the
entire population of 100 students will stay awake for between 9
and 17 minutes on average.
• Without my song, the population of 100 students was awake for
7.7 minutes on average.
• Thus we're pretty sure the singing works: it will keep you awake
for at least 9 min, and possibly up to 17 min.
Self-assessment
• The engagement ratio may not be perfect; it may have
a few flaws.
• For example, the fact that you're not sleeping does not
necessarily mean you're engaged.
• Another option: you can self-report how engaged you
think you are (on a scale from 1 to 10).
• And you can also self-assess how much you think you
learnt (on a scale from 1 to 10).
Results
Measure of "Engagement"
• $\mu_E = 7.5$, $\sigma_E = 0.64$
• $\bar{x}_E = 8.94$
Measure of "Learning"
• $\mu_L = 8.2$, $\sigma_L = 0.73$
• $\bar{x}_L = 8.35$
Quiz
• Ultimately we want to know if incorporating a song about
the concepts in the lesson will lead to higher engagement
and learning. What statistics should we calculate to
determine this?
1. Note if the sample means are less than or greater than the
population mean.
2. Calculate the actual difference between each sample mean and
the population mean.
3. Find where each sample mean falls on the distribution of sample
means for their respective populations.
4. Find how many $\sigma$s each sample mean is from the population
mean.
Measure of "Engagement"
• $\mu_E = 7.5$, $\sigma_E = 0.64$
• $n = 30$
• mean of the sampling distribution: 7.5
• $SE_E = \frac{0.64}{\sqrt{30}} = 0.12$
• $Z_E = \frac{8.94 - 7.5}{0.64 / \sqrt{30}} = 12.32$
Measure of "Learning"
• $\mu_L = 8.2$, $\sigma_L = 0.73$
• $n = 30$
• mean of the sampling distribution: 8.2
• $SE_L = \frac{0.73}{\sqrt{30}} = 0.13$
• $Z_L = \frac{8.35 - 8.2}{0.73 / \sqrt{30}} \approx 1.12$
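The standard errors and Z-scores for the two measures can be recomputed as follows. This is a sketch using the slide's numbers; the helper names `standard_error` and `z_score` are assumptions made for illustration. Because the code keeps full precision, the last digit may differ slightly from hand-rounded slide values.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean."""
    return sigma / sqrt(n)


def z_score(x_bar, mu, sigma, n):
    """How many standard errors the sample mean lies from the population mean."""
    return (x_bar - mu) / standard_error(sigma, n)


# Engagement: mu = 7.5, sigma = 0.64, sample mean 8.94, n = 30
z_e = z_score(8.94, 7.5, 0.64, 30)
# Learning: mu = 8.2, sigma = 0.73, sample mean 8.35, n = 30
z_l = z_score(8.35, 8.2, 0.73, 30)

print(round(standard_error(0.64, 30), 2), round(z_e, 2))
print(round(standard_error(0.73, 30), 2), round(z_l, 2))
```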
Probability of getting a given mean
• $Z_E = 12.32$, $Z_L = 1.12$
• What is the probability of randomly selecting a sample of
size 30 and getting a mean of at least 8.94 for
engagement and 8.35 for learning?
• For $Z_E$ the probability is really low.
• For $Z_L$ the probability is $1.0 - 0.8686 \approx 0.13$.
• So what does this mean, what can we conclude? Check
all that apply.
Conclusions?
1. The song seems to have had an effect on learning, but
not engagement.
2. The song seems to have had an effect on engagement,
but not learning.
3. The song caused an increase in both engagement and
learning.
4. The song caused an increase in engagement, but not in
learning.
Summary of our findings
Dependent variable      Sample mean $\bar{x}$   Probability     Likely or unlikely?
(scale from 1 to 10)    (n = 30)
engagement              8.94                    $p \ll 0.01$
learning                8.35                    $p \approx 0.13$
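The probabilities in this table are upper-tail areas of the standard normal distribution. A minimal sketch of how they could be computed from the Z-scores 12.32 and 1.12, assuming the sampling distribution is normal; the function name `upper_tail` is hypothetical.

```python
from statistics import NormalDist


def upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 1.0 - NormalDist().cdf(z)


p_engagement = upper_tail(12.32)  # so small that it may print as 0.0 in floating point
p_learning = upper_tail(1.12)     # the Z-table lookup gives 1.0 - 0.8686

print(p_engagement)
print(round(p_learning, 2))
```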
Summary of our findings
Dependent variable      Sample mean $\bar{x}$   Probability     Likely or unlikely?
(scale from 1 to 10)    (n = 30)
engagement              ???                     $p = 0.05$
learning                ???                     $p = 0.10$
Levels of likelihood - $\alpha$ levels
0.05 (5%)
0.01 (1%)
0.001 (0.1%)
• Three conventional levels of (un)likelihood
• If the probability of getting a sample mean is less than
0.05, 0.01 or 0.001, then it is usually considered unlikely.
• These are called the $\alpha$ levels, or significance levels
(Czech: hladiny významnosti).
• The $\alpha$ level is our criterion for deciding if something is likely or
unlikely.
Quiz
• Focus on $\alpha = 0.05$
[Figure: sampling distribution with the tail beyond Z* shaded orange]
• Which of the following are true?
1. If the probability of getting a particular sample mean is less than
$\alpha$, it is unlikely to occur.
2. If a sample mean has a Z-score greater than Z*, it is "unlikely" to
occur.
3. If the probability of getting a particular sample mean is "unlikely",
the sample mean is in the orange region.
4. The alpha level corresponds to the orange region.
Z-critical value
If the probability of obtaining a
particular sample mean is less than
the alpha level, then it will fall in the tail
beyond Z*, which is called the critical region.
Z-critical value
If the Z-score of the sample mean is greater than the Z-critical
value, we have evidence that this mean is
different from the regular population (the population that
had not watched the musical lesson).
Critical regions
• What is the Z-critical value for $\alpha = 0.05$?
  • Using a Z-table, find the Z-value for a probability of 0.95, which is 1.65.
• What is the Z-critical value for $\alpha = 0.01$?
  • 2.33
• What is the Z-critical value for $\alpha = 0.001$?
  • 3.09
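Instead of a Z-table, the one-tailed critical values can be obtained from the inverse of the standard normal CDF. This is a sketch; the function name `z_critical` is an assumption, and the results agree with table lookups only up to rounding.

```python
from statistics import NormalDist


def z_critical(alpha):
    """One-tailed Z-critical value: the point with area alpha in the upper tail."""
    return NormalDist().inv_cdf(1.0 - alpha)


for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(z_critical(alpha), 3))
```

A smaller $\alpha$ pushes the critical value further into the tail, so the critical region shrinks.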
• We take a sample mean from a sample of size $n$.
• Then we calculate its Z-score
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• And we get a Z-score of 1.82.
• We say that this is significant at $p < 0.05$.
• 1.82 is somewhere in the red region in the previous picture. Its
probability is less than 0.05, but not less than 0.01.
• It means that the probability of obtaining this sample mean is less
than 5%, but not less than 1%.
• And remember, 0.05 is the alpha level.
Significance quiz
Z-score     Significant at
3.14        p <
2.07        p <
2.57        p <
14.31       p <

$\alpha$ level     Z-critical value
0.05        1.65
0.01        2.33
0.001       3.09
Significance quiz
Z-score     Significant at
3.14        p < 0.001
2.07        p < 0.05
2.57        p < 0.01
14.31       p < 0.001

$\alpha$ level     Z-critical value
0.05        1.65
0.01        2.33
0.001       3.09
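The quiz logic can be written as a short function: report the strictest conventional $\alpha$ level whose one-tailed critical value the Z-score exceeds. This is a sketch; the names `ALPHA_LEVELS` and `significance` are hypothetical, and the critical values are computed rather than read from a table.

```python
from statistics import NormalDist

# Conventional alpha levels, strictest first
ALPHA_LEVELS = (0.001, 0.01, 0.05)


def significance(z):
    """Return the strictest level at which a one-tailed Z-score is significant."""
    for alpha in ALPHA_LEVELS:
        z_star = NormalDist().inv_cdf(1.0 - alpha)  # one-tailed critical value
        if z > z_star:
            return f"p < {alpha}"
    return "not significant"


for z in (3.14, 2.07, 2.57, 14.31):
    print(z, significance(z))
```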