CHEMISTRY 59-320
ANALYTICAL CHEMISTRY
Fall - 2010
Lecture 6
Applications
• Example: The daily level of an impurity in a reactor has a mean $\mu = 4.0$ and a standard deviation $\sigma = 0.3$. What is the probability that the impurity level on a randomly chosen day will exceed 4.4?
$$z = \frac{4.4 - 4.0}{0.3} = 1.333$$
Tail = 0.0918, or ~9%
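The tail area can also be checked numerically. The sketch below assumes the impurity level follows a Gaussian distribution and uses only Python's standard library; the function name `upper_tail` is illustrative and not part of the lecture. With the unrounded z = 1.333 it gives roughly 0.091, while the tabulated 0.0918 corresponds to reading a table at z = 1.33.

```python
import math

def upper_tail(x, mean, sigma):
    """Probability that a Gaussian variable exceeds x."""
    z = (x - mean) / sigma
    # 1 - Phi(z), written with the complementary error function
    return 0.5 * math.erfc(z / math.sqrt(2))

p = upper_tail(4.4, mean=4.0, sigma=0.3)
print(f"P(level > 4.4) = {p:.3f}")
```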
• The more times you measure a quantity, the more confident you can be that the average of your measurements is close to the true population mean.
• Uncertainty decreases in proportion to $1/\sqrt{n}$, where $n$ is the number of measurements.
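As a small illustration of the $1/\sqrt{n}$ scaling, the sketch below computes the uncertainty of the mean for a fixed standard deviation. The helper name `standard_error` and the value s = 1.0 are assumptions chosen for the example.

```python
import math

def standard_error(s, n):
    """Uncertainty of the mean of n measurements with standard deviation s."""
    return s / math.sqrt(n)

# Quadrupling the number of measurements halves the uncertainty.
for n in (4, 16, 64):
    print(n, standard_error(1.0, n))
```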
4-2 Confidence intervals
Confidence interval: Interval within which the true
value almost certainly lies!
Confidence level: how sure are you that the true value lies within the interval?
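• In equation 4-6, the confidence interval is $\bar{x} \pm ts/\sqrt{n}$, where $\bar{x}$ is the mean and $s$ is the standard deviation.
• $t$ is a statistical factor that depends on the confidence level and on the number of degrees of freedom (degrees of freedom = $n - 1$).
• $n$ is the number of measurements.
• Values of $t$ at different confidence levels and degrees of freedom are listed in Table 4-2.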
• Exercise 4A: For the numbers 116.0, 97.9, 114.2, 106.8, and 108.3, find the mean, standard deviation, and 90% confidence interval for the mean.
• Solution:
the mean = (116.0 + 97.9 + 114.2 + 106.8 + 108.3)/5 = 108.64
the standard deviation s = …
the t value from Table 4-2 (90% confidence, 4 degrees of freedom) is 2.132
use equation 4-6 to calculate the confidence interval:
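The arithmetic of Exercise 4A can be sketched in a few lines of Python. The sketch assumes that equation 4-6 has the usual form $\bar{x} \pm ts/\sqrt{n}$ and takes the t value 2.132 directly from the solution above; the variable names are illustrative.

```python
import math
import statistics

values = [116.0, 97.9, 114.2, 106.8, 108.3]
t_90 = 2.132  # Table 4-2 value quoted in the solution (4 degrees of freedom)

n = len(values)
mean = statistics.mean(values)
s = statistics.stdev(values)           # sample standard deviation (n - 1 in the denominator)
half_width = t_90 * s / math.sqrt(n)   # assumed form of equation 4-6: t*s/sqrt(n)

print(f"mean = {mean:.2f}, s = {s:.2f}, 90% CI = {mean:.1f} ± {half_width:.1f}")
```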
The meaning of a confidence level
• Standard deviation is frequently used as the estimated uncertainty.
• It is good practice to report the number of measurements so that a confidence interval can be calculated.
4-3 Comparison of Means with Student’s t
• Confidence limits and the t test assume that data
follow a Gaussian distribution. If they do not,
different formulas would be required.
• The t test can be used to check whether two sets of measurements are “the same”, i.e. whether the observed difference between the two means arises from purely random measurement error.
• We customarily accept a conclusion if there is a 95% chance that it is correct.
Case 1: Comparing a measured result with a “known” value
• Compute the 95% confidence interval for your answer and check whether that range includes the “known” answer.
• If the known answer is not within the 95%
confidence interval, the results do not
agree.
• A reliable assay shows that the ATP (adenosine
triphosphate) content of a certain cell type is 111
μmol/100 mL. You developed a new assay, which
gave the following values for replicate analyses: 117,
119, 111, 115, 120 μmol/100 mL (average = 116.4).
Can you be 95% confident that your result differs from
the “known” value?
The 95% confidence interval does not include the
accepted value of 111 μmol/100 mL, so the
difference is significant.
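The ATP comparison can be reproduced with a short sketch. It assumes the confidence interval has the form $\bar{x} \pm ts/\sqrt{n}$ and uses t = 2.776 for 95% confidence and 4 degrees of freedom; that t value is the standard tabulated one and is not quoted in the slide text.

```python
import math
import statistics

atp = [117, 119, 111, 115, 120]  # replicate results, umol/100 mL
known = 111                       # accepted value, umol/100 mL
t_95 = 2.776                      # assumed Student's t: 95% confidence, 4 degrees of freedom

mean = statistics.mean(atp)
s = statistics.stdev(atp)
half_width = t_95 * s / math.sqrt(len(atp))
low, high = mean - half_width, mean + half_width

# The results disagree if the known value falls outside the interval.
differs = not (low <= known <= high)
print(f"95% CI: {low:.1f} to {high:.1f}; differs from known value: {differs}")
```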
Case 2: Comparing replicate measurements
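• Lord Rayleigh’s experiments: the discovery of argon.
• For two sets of data consisting of $n_1$ and $n_2$ measurements with averages $\bar{x}_1$ and $\bar{x}_2$, calculate a value of $t$ with the formula
$$t = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{\text{pooled}}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}, \qquad s_{\text{pooled}} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$$
• Find $t$ in Table 4-2 for $n_1 + n_2 - 2$ degrees of freedom. If $t_{\text{calculated}} > t_{\text{table}}$ (95%), the difference is significant.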
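A minimal sketch of this comparison, assuming the pooled-standard-deviation form of Student's t (the form consistent with $n_1 + n_2 - 2$ degrees of freedom). The function name `pooled_t` and the two data sets are hypothetical and serve only to illustrate the calculation; they are not Rayleigh's measurements.

```python
import math
import statistics

def pooled_t(data1, data2):
    """Return (t, degrees of freedom) for the pooled two-sample t test."""
    n1, n2 = len(data1), len(data2)
    var1 = statistics.variance(data1)  # s1 squared
    var2 = statistics.variance(data2)  # s2 squared
    dof = n1 + n2 - 2
    s_pooled = math.sqrt((var1 * (n1 - 1) + var2 * (n2 - 1)) / dof)
    diff = abs(statistics.mean(data1) - statistics.mean(data2))
    t = diff / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))
    return t, dof

# Hypothetical data, for illustration only
set1 = [10.0, 10.2, 10.4]
set2 = [10.5, 10.7, 10.9]
t_calc, dof = pooled_t(set1, set2)
t_table = 2.776  # standard tabulated t for 95% confidence, 4 degrees of freedom
print(f"t = {t_calc:.2f}, dof = {dof}, significant: {t_calc > t_table}")
```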
Case 3: Paired t test for comparing individual differences
• Situation: two methods are each used to make single measurements on several different samples, i.e. no measurement has been duplicated.
• To see whether there is a significant difference between the methods, one uses the paired t test.
$t_{\text{table}}$ = 2.228 at the 95% confidence level (10 degrees of freedom)
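The paired t test works on the per-sample differences $d_i$ between the two methods: $t = |\bar{d}|\,\sqrt{n}/s_d$ with $n - 1$ degrees of freedom, where $s_d$ is the standard deviation of the differences. The sketch below assumes that standard form; the function name `paired_t` and the data are hypothetical and chosen only to show the mechanics.

```python
import math
import statistics

def paired_t(method_a, method_b):
    """Return (t, degrees of freedom) for the paired t test."""
    if len(method_a) != len(method_b):
        raise ValueError("each sample needs one result from each method")
    d = [a - b for a, b in zip(method_a, method_b)]  # per-sample differences
    n = len(d)
    t = abs(statistics.mean(d)) / statistics.stdev(d) * math.sqrt(n)
    return t, n - 1

# Hypothetical results for four samples, for illustration only
method_a = [1.0, 2.0, 3.0, 4.0]
method_b = [1.5, 2.1, 3.4, 4.2]
t_calc, dof = paired_t(method_a, method_b)
print(f"t = {t_calc:.2f} with {dof} degrees of freedom")
```

The calculated t is then compared with the tabulated value for the same number of degrees of freedom, exactly as in Case 2.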
Related problems: 4.1 to 4.4, 4.7, 4.17, and 4.19 to 4.22.
4-4 Comparison of standard deviations with the F test
• If the standard deviations of the two data sets are significantly different, then a different form of the t equation is needed for the t test.
• The F test tells us whether two standard
deviations are significantly different from each
other.
• $F = s_1^2 / s_2^2$
• Use the degrees of freedom of $s_1$ and $s_2$ ($n_1 - 1$ and $n_2 - 1$) to find an F value in Table 4-4.
• If the calculated F value exceeds a
tabulated F value at a selected
confidence level (95%), then there is a
significant difference between the
variances of the two methods.
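A short sketch of the F calculation. It assumes the common convention of putting the larger standard deviation in the numerator so that F ≥ 1; the function name `f_statistic` and the example standard deviations are hypothetical. The tabulated F value still has to be looked up in Table 4-4.

```python
def f_statistic(s1, s2):
    """F = (larger s)^2 / (smaller s)^2, so that F >= 1."""
    if s1 <= 0 or s2 <= 0:
        raise ValueError("standard deviations must be positive")
    larger, smaller = max(s1, s2), min(s1, s2)
    return (larger / smaller) ** 2

# Hypothetical standard deviations, for illustration only
f_calc = f_statistic(0.30, 0.15)
print(f"F = {f_calc:.2f}")
```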
• Problem 4-17. If you measure a quantity 4 times and the standard deviation is 1.0% of the average, can you be 90% confident that the true value is within 1.2% of the measured average?
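One way to approach Problem 4-17 is to compute the half-width of the 90% confidence interval, $ts/\sqrt{n}$, as a percentage of the average and compare it with 1.2%. The sketch assumes t = 2.353 for 90% confidence and 3 degrees of freedom, the standard tabulated value, which is not quoted in the slide text.

```python
import math

n = 4
s_percent = 1.0   # standard deviation, as % of the average
t_90 = 2.353      # assumed Student's t: 90% confidence, 3 degrees of freedom

half_width = t_90 * s_percent / math.sqrt(n)  # in % of the average
within = half_width <= 1.2

print(f"90% confidence interval: ±{half_width:.2f}% of the average; within 1.2%: {within}")
```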
4-6: Rejection of a Result: The Q Test
• The Q test is used to determine if an
“outlier” is due to a determinate error. If it
is not, then it falls within the expected
random error and should be retained.
• Q = gap/w where gap = difference
between “outlier” and nearest result and
w = range of results.
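• If $Q_{\text{calculated}} > Q_{\text{table}}$, the questionable point should be discarded.
• Example: $Q_{\text{calculated}} = 0.55 < Q_{\text{table}} = 0.64$, so the questionable point is retained.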
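The Q calculation can be sketched as follows. The data set is illustrative, chosen so that Q comes out to 0.55, and is not taken from the slide; treating 0.64 as the tabulated value for five observations is likewise an assumption. The function name `q_statistic` is hypothetical.

```python
def q_statistic(values):
    """Return (Q, suspect) for the most isolated extreme value."""
    if len(values) < 3:
        raise ValueError("need at least three results")
    data = sorted(values)
    spread = data[-1] - data[0]        # w: range of results
    if spread == 0:
        raise ValueError("all results are identical")
    gap_low = data[1] - data[0]        # gap if the lowest value is the outlier
    gap_high = data[-1] - data[-2]     # gap if the highest value is the outlier
    if gap_high >= gap_low:
        return gap_high / spread, data[-1]
    return gap_low / spread, data[0]

results = [12.47, 12.48, 12.53, 12.56, 12.67]  # illustrative data only
q_calc, suspect = q_statistic(results)
q_table = 0.64  # assumed tabulated Q for five observations
discard = q_calc > q_table
print(f"suspect = {suspect}, Q = {q_calc:.2f}, discard: {discard}")
```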
4-7: The method of least squares (Regression Analysis)
The straight line model
$$y = mx + b$$
Starting point: line through the origin, $y = mx$
Experience suggests that there is an error in the response, therefore,
$$y_{\mathrm{obs},i} = m x_i + \varepsilon_i$$
where $\varepsilon_i$ represents the error and $y_{\mathrm{obs},i}$ is the observed value.
The method of least squares selects the best-fitting model by minimizing the quantity
$$S = S(\beta) = \sum_{i=1}^{n} (y_{\mathrm{obs},i} - \beta x_i)^2$$
A plot of $S$ as a function of $\beta$ shows a minimum; the value of $\beta$ at that minimum is the least-squares estimate of the slope, $m$.
After $m$ is known, you have all the calculated values
$$y_i = m x_i$$
The difference between the observed and calculated values is the residual, and the sum of the squares of the residuals is also a minimum value.
$$S_R = \sum_{i=1}^{n} (y_{\mathrm{obs},i} - y_i)^2$$
Estimate of the experimental error variance, $s^2$:
$$S_R = \sum_{i=1}^{n} (y_{\mathrm{obs},i} - y_i)^2, \qquad s^2 = \frac{S_R}{n - 1}$$
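For the line through the origin, setting $dS/d\beta = 0$ gives the closed-form slope $m = \sum x_i y_{\mathrm{obs},i} / \sum x_i^2$. The sketch below uses that result to compute $m$, the residual sum of squares $S_R$, and $s^2 = S_R/(n - 1)$. The function name `fit_through_origin` and the data are hypothetical, for illustration only.

```python
def fit_through_origin(x, y_obs):
    """Least-squares slope m for y = m*x, plus S_R and s^2 = S_R/(n - 1)."""
    if len(x) != len(y_obs) or len(x) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    # Slope that minimizes S(beta) = sum (y_obs - beta*x)^2
    m = sum(xi * yi for xi, yi in zip(x, y_obs)) / sum(xi * xi for xi in x)
    y_calc = [m * xi for xi in x]
    s_r = sum((yo - yc) ** 2 for yo, yc in zip(y_obs, y_calc))
    variance = s_r / (len(x) - 1)  # one fitted parameter, so n - 1 degrees of freedom
    return m, s_r, variance

# Hypothetical calibration data, for illustration only
x = [1.0, 2.0, 3.0, 4.0]
y_obs = [2.1, 3.9, 6.2, 7.8]
m, s_r, s2 = fit_through_origin(x, y_obs)
print(f"m = {m:.3f}, S_R = {s_r:.4f}, s^2 = {s2:.4f}")
```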
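The coefficient of determination $R^2$ is the proportion of variability in a data set that is accounted for by a statistical model.
The version most common in statistics texts is based on an analysis of variance decomposition as follows:
$$R^2 = \frac{SS_R}{SS_T}$$
$$SS_R = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
$$SS_T = \sum_{i=1}^{n} (y_{\mathrm{obs},i} - \bar{y})^2$$
where $y_i$ are the calculated values and $\bar{y}$ is the mean of the observed values.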
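A sketch of the $R^2$ calculation for the straight-line model $y = mx + b$. It assumes an ordinary least-squares fit with an intercept, the case in which the ratio $SS_R/SS_T$ is a proportion between 0 and 1. The function name `r_squared` and the data are hypothetical.

```python
def r_squared(x, y_obs):
    """R^2 = SS_R / SS_T for a least-squares line y = m*x + b."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y_obs) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y_obs))
    m = sxy / sxx               # least-squares slope
    b = y_mean - m * x_mean     # least-squares intercept
    y_calc = [m * xi + b for xi in x]
    ss_r = sum((yc - y_mean) ** 2 for yc in y_calc)  # explained by the model
    ss_t = sum((yo - y_mean) ** 2 for yo in y_obs)   # total variability
    return ss_r / ss_t

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0]
y_obs = [2.1, 3.9, 6.2, 7.8]
r2 = r_squared(x, y_obs)
print(f"R^2 = {r2:.4f}")
```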
4-8 Calibration curves