Statistics 400 - Lecture 2

Statistics 270 - Lecture 20
• Last day: completed 5.1
• Today: parts of Sections 5.3 and 5.4
Sampling
• In chapter 1, we concerned ourselves with numerical/graphical
summaries of samples (x1, x2, …, xn) from some population
• Can view each of the Xi’s as random variables
• We will be concerned with random samples
• The Xi’s are independent
• The Xi’s have the same probability distribution
• Often called an iid (independent and identically distributed) sample
Definitions
• A parameter is a numerical feature of a distribution or population
• A statistic is a function of sample data (e.g., sample mean, sample
median…)
• We will be using statistics to estimate parameters (point estimates)
• Suppose you draw a sample and compute the value of a statistic
• Suppose you draw another sample of the same size and compute
the value of the statistic
• Would the 2 statistics be equal?
• Use statistics to estimate parameters
• Will the statistics be exactly equal to the parameter?
• Observed value of the statistic depends on the sample
• There will be variability in the values of the statistic over repeated
sampling
• The statistic has a distribution of its own
• Probability distribution of a statistic is called the sampling
distribution (or distribution of the statistic)
• Is the distribution of values for the statistic based on all possible
samples of the same size from the population?
• Based on repeated random samples of the same size from the
population
Example
• Large population is described by the probability distribution
x         0     3     12
P(X=x)    0.2   0.3   0.5
• If a random sample of size 2 is taken, what is the sampling
distribution for the sample mean?
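One way to answer this is brute-force enumeration. The sketch below (Python, standard library only) assumes the two observations are independent draws from the distribution in the table, lists every ordered pair of draws, and adds up the probability attached to each possible sample mean. The function name `sampling_distribution_of_mean` is illustrative and not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Population distribution from the example: value -> probability
population = {0: Fraction(2, 10), 3: Fraction(3, 10), 12: Fraction(5, 10)}

def sampling_distribution_of_mean(pmf, n):
    """Exact sampling distribution of the sample mean for n iid draws."""
    dist = {}
    for draws in product(pmf, repeat=n):      # every ordered sample of size n
        prob = Fraction(1)
        for x in draws:                       # independence: multiply probabilities
            prob *= pmf[x]
        mean = Fraction(sum(draws), n)
        dist[mean] = dist.get(mean, Fraction(0)) + prob
    return dict(sorted(dist.items()))

dist = sampling_distribution_of_mean(population, 2)
for mean, prob in dist.items():
    print(f"x-bar = {float(mean):>4}: P = {float(prob):.2f}")
```

For n = 2 the possible sample means are 0, 1.5, 3, 6, 7.5 and 12, with probabilities 0.04, 0.12, 0.09, 0.20, 0.30 and 0.25, which sum to 1.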
Sampling Distribution of the Sample Mean
• Have a random sample of size n
• The sample mean is
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
• What is it estimating?
Properties of the Sample Mean
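• Expected value: $E(\bar{X}) = \mu$, the population mean
• Variance: $\operatorname{Var}(\bar{X}) = \sigma^2 / n$, where $\sigma^2$ is the population variance
• Standard Deviation: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$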
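A short derivation may help here. Assuming the $X_i$ are iid with population mean $\mu$ and population variance $\sigma^2$ (standard notation, assumed here), linearity of expectation gives the mean, and independence lets the variances add:

```latex
\begin{align*}
E(\bar{X}) &= \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\, n\mu = \mu \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
  = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} \\
\sigma_{\bar{X}} &= \sqrt{\operatorname{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}
\end{align*}
```

The variance step relies on independence. The expected-value step does not.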
Properties of the Sample Mean
• Observations
Sampling from a Normal Distribution
• Suppose we have a random sample of size n from a $N(\mu, \sigma^2)$ distribution
• What is the distribution of the sample mean?
Example
• Distribution of moisture content per pound of a dehydrated protein
concentrate is normally distributed with mean 3.5 and standard
deviation of 0.6.
• Random sample of 36 specimens of this concentrate is taken
• Distribution of sample mean?
• What is probability that the sample mean is less than 3.5?
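A minimal sketch of this calculation, using Python's standard-library `statistics.NormalDist`. It assumes the standard result that the mean of n iid normal observations is itself normal with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 3.5, 0.6, 36   # population mean, population SD, sample size

# For a normal population, the sample mean is exactly normal
# with mean mu and standard deviation sigma / sqrt(n).
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

print(f"SD of sample mean: {xbar_dist.stdev:.2f}")
print(f"P(x-bar < 3.5) = {xbar_dist.cdf(3.5):.2f}")
```

Here the standard deviation of the sample mean is 0.6 / 6 = 0.1, and because 3.5 is the centre of a symmetric distribution the probability is 0.5.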
Central Limit Theorem
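• In a random sample (iid sample) from any population with mean $\mu$
and standard deviation $\sigma$, when n is large, the distribution of the
sample mean $\bar{X}$ is approximately normal.
• That is, $\bar{X}$ is approximately $N(\mu, \sigma^2 / n)$
Central Limit Theorem
• For the sample total, $T = \sum_{i=1}^{n} X_i$ is approximately $N(n\mu, n\sigma^2)$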
Implications
• So, for random samples, if we have enough data, the sample mean is
approximately normally distributed, even if the data are not normally
distributed
• If we have enough data, we can use the normal distribution to make
probability statements about $\bar{x}$
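A small simulation can illustrate this. The sketch below is not from the lecture: it assumes an exponential population with rate 1 (mean 1, standard deviation 1, strongly right-skewed), a sample size of n = 50, and an arbitrary cutoff of 1.2. It compares the simulated probability for the sample mean with the normal approximation.

```python
import random
from statistics import NormalDist, fmean

random.seed(0)   # fixed seed so the run is repeatable

n = 50           # size of each sample (assumed "large enough")
reps = 5000      # number of repeated samples

# Exponential population with rate 1: mean 1, SD 1, clearly not normal
sample_means = [
    fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

# Normal approximation for the sample mean: N(mu, sigma^2 / n)
approx = NormalDist(mu=1.0, sigma=1.0 / n ** 0.5)

cutoff = 1.2
simulated = sum(m < cutoff for m in sample_means) / reps
print(f"simulated     P(x-bar < {cutoff}) = {simulated:.3f}")
print(f"normal approx P(x-bar < {cutoff}) = {approx.cdf(cutoff):.3f}")
```

The two probabilities should be close but not identical. The gap tends to shrink as n grows and is larger for more heavily skewed populations.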
Example
• A busy intersection has an average of 2.2 accidents per week with a
standard deviation of 1.4 accidents
• Suppose you monitor this intersection over a given year, recording the
number of accidents per week.
• Data take on integer values (0, 1, 2, ...), thus the distribution of the
number of accidents is not normal.
• What is the distribution of the mean number of accidents per week
based on a sample of 52 weeks of data?
Example
• What is the approximate probability that $\bar{X}$ is less than 2?
• What is the approximate probability that there are fewer than 100
accidents in a given year?
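Both questions can be sketched with the normal approximation. The code below assumes the 52 weekly counts are iid with mean 2.2 and standard deviation 1.4, so the sample mean is approximately normal with standard deviation 1.4 / sqrt(52) and the yearly total is approximately normal with mean 52 × 2.2 and standard deviation 1.4 × sqrt(52). No continuity correction is applied, which is a simplifying choice of this sketch.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.2, 1.4, 52   # weekly mean, weekly SD, number of weeks

# Normal approximations (assumes the 52 weekly counts are iid)
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))        # sample mean
total_dist = NormalDist(mu=n * mu, sigma=sigma * sqrt(n))   # yearly total

p_mean = xbar_dist.cdf(2)       # P(x-bar < 2)
p_total = total_dist.cdf(100)   # P(total < 100), no continuity correction

print(f"SD of sample mean: {xbar_dist.stdev:.4f}")
print(f"P(x-bar < 2)   is about {p_mean:.3f}")
print(f"P(total < 100) is about {p_total:.3f}")
```

Under these assumptions the first probability is about 0.15 and the second about 0.08. Since the total is a count, using 99.5 instead of 100 as the cutoff would give a slightly different answer.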