Chapter 4 - Errors in experimental measurements.


Errors in Experimental
Measurements




Sources of errors
Accuracy, precision, resolution
A mathematical model of errors
Confidence intervals
  For means
  For proportions
How many measurements are needed for a desired error?
Copyright 2004 David J. Lilja
Why do we need statistics?
1. Noise, noise, noise, noise, noise!
OK – not really this type of noise
Why do we need statistics?
445 446 397 226
388 3445 188 1002
47762 432 54 12
98 345 2245 8839
77492 472 565 999
1 34 882 545 4022
827 572 597 364
2. Aggregate data into meaningful information.
x̄ = ...
What is a statistic?

“A quantity that is computed from a sample
[of data].”
Merriam-Webster
→ A single number used to summarize a larger
collection of values.
What are statistics?
“A branch of mathematics dealing with the
collection, analysis, interpretation, and
presentation of masses of numerical data.”
Merriam-Webster
→ We are most interested in analysis and
interpretation here.

“Lies, damn lies, and statistics!”

Goals
Provide intuitive conceptual background for
some standard statistical tools.

• Draw meaningful conclusions in the presence of noisy measurements.
• Allow you to correctly and intelligently apply techniques in new situations.
→ Don’t simply plug and crank from a formula.
Goals
Present techniques for aggregating large
quantities of data.

• Obtain a big-picture view of your results.
• Obtain new insights from complex measurement and simulation results.
→ E.g. How does a new feature impact the overall system?
Sources of Experimental Errors

Accuracy, precision, resolution
Experimental errors


Errors → noise in measured values
Systematic errors
  Result of an experimental “mistake”
  Typically produce constant or slowly varying bias
  Controlled through skill of experimenter
  Examples
    Temperature change causes clock drift
    Forget to clear cache before timing run
Experimental errors

Random errors
  Unpredictable, non-deterministic
  Unbiased → equal probability of increasing or decreasing measured value
  Result of
    Limitations of measuring tool
    Observer reading output of tool
    Random processes within system
  Typically cannot be controlled
    Use statistical tools to characterize and quantify
Example: Quantization
→ Random error
Quantization error


Timer resolution → quantization error
Repeated measurements: X ± Δ
Completely unpredictable
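The quantization effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: the event length of 3.4 ticks, the tick size of 1.0, and the function name `quantized_measurement` are all hypothetical and not taken from the slides. The event starts at a random phase within a timer tick, so the timer reports either 3 or 4 ticks, i.e. X ± Δ.

```python
import random

def quantized_measurement(true_duration, resolution, rng):
    """Measure an interval with a timer that only advances every `resolution` units."""
    # Assumption: the event begins at a uniformly random phase within a tick.
    start = rng.uniform(0.0, resolution)
    end = start + true_duration
    # The timer reports the number of tick boundaries crossed during the event.
    ticks = int(end // resolution) - int(start // resolution)
    return ticks * resolution

rng = random.Random(1)          # fixed seed so the run is repeatable
true_duration = 3.4             # hypothetical event length, in ticks of size 1.0
readings = [quantized_measurement(true_duration, 1.0, rng) for _ in range(10000)]

print(sorted(set(readings)))            # the distinct values the timer can report
print(sum(readings) / len(readings))    # the average approaches the true duration
```

Each individual reading is off by a fraction of a tick in an unpredictable direction, but the mean of many readings approaches the true duration.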
A Model of Errors
Error   Measured value   Probability
-E      x - E            ½
+E      x + E            ½
A Model of Errors
Error 1   Error 2   Measured value   Probability
-E        -E        x - 2E           ¼
-E        +E        x                ¼
+E        -E        x                ¼
+E        +E        x + 2E           ¼
A Model of Errors
[Figure: probability (0 to 0.6) versus measured value (x-E, x, x+E)]
Probability of Obtaining a
Specific Measured Value
A Model of Errors



Pr(X=xi) = Pr(measure xi)
  ∝ number of paths from real value to xi
Pr(X=xi) ~ binomial distribution
As number of error sources becomes large
  n → ∞,
  Binomial → Gaussian (Normal)
Thus, the bell curve
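The path-counting argument can be sketched in a few lines. The function name `error_model` is an assumption for illustration. It supposes n independent error sources, each contributing +E or -E with probability ½, and counts the paths to each measured value with the binomial coefficient.

```python
from math import comb

def error_model(n_sources):
    """Return {net error in units of E: probability} for n independent ±E errors."""
    total_paths = 2 ** n_sources             # every path is equally likely
    dist = {}
    for k in range(n_sources + 1):           # k = number of +E errors on the path
        offset = 2 * k - n_sources           # net error: k times +E, (n - k) times -E
        dist[offset] = comb(n_sources, k) / total_paths
    return dist

print(error_model(1))   # {-1: 0.5, 1: 0.5}
print(error_model(2))   # {-2: 0.25, 0: 0.5, 2: 0.25}

# With more sources the probabilities trace out a bell-shaped curve.
for offset, prob in error_model(10).items():
    print(f"{offset:+3d}E  {'#' * round(prob * 100)}")
```

The one- and two-source cases reproduce the two tables above, with the two paths that lead to x combined into one entry of probability ½.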
Frequency of Measuring Specific
Values
[Figure labels: Accuracy, Precision, Resolution, Mean of measured values, True value]
Accuracy, Precision,
Resolution

Systematic errors → accuracy
  How close mean of measured values is to true value
Random errors → precision
  Repeatability of measurements
Characteristics of tools → resolution
  Smallest increment between measured values
Quantifying Accuracy,
Precision, Resolution

Accuracy
  Hard to determine true accuracy
  Relative to a predefined standard
    E.g. definition of a “second”
Resolution
  Dependent on tools
Precision
  Quantify amount of imprecision using statistical tools
Confidence Interval for the
Mean
[Figure: distribution with area 1-α between c1 and c2, and area α/2 in each tail]
Normalize x
\[ z = \frac{\bar{x} - x}{s/\sqrt{n}} \]
\[ n = \text{number of measurements} \]
\[ \bar{x} = \text{mean} = \frac{1}{n}\sum_{i=1}^{n} x_i \]
\[ s = \text{standard deviation} = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} \]
Confidence Interval for the
Mean

Normalized z follows a Student’s t distribution
  (n-1) degrees of freedom
  Area left of c2 = 1 - α/2
  Tabulated values for t
[Figure: distribution with area 1-α between c1 and c2, and area α/2 in each tail]
Confidence Interval for the
Mean

As n → ∞, normalized distribution becomes Gaussian (normal)
[Figure: distribution with area 1-α between c1 and c2, and area α/2 in each tail]
Confidence Interval for the
Mean
\[ c_1 = \bar{x} - t_{1-\alpha/2;\,n-1}\,\frac{s}{\sqrt{n}} \]
\[ c_2 = \bar{x} + t_{1-\alpha/2;\,n-1}\,\frac{s}{\sqrt{n}} \]
Then,
\[ \Pr(c_1 \le x \le c_2) = 1 - \alpha \]
An Example
Experiment   Measured value
1            8.0 s
2            7.0 s
3            5.0 s
4            9.0 s
5            9.5 s
6            11.3 s
7            5.2 s
8            8.5 s
An Example (cont.)

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 7.94 \]
\[ s = \text{sample standard deviation} = 2.14 \]
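The two summary values can be checked directly from the eight measurements. This is a minimal sketch; the helper names `sample_mean` and `sample_std` are illustrative and not from the slides.

```python
import math

# The eight measured values from the example, in seconds.
measurements = [8.0, 7.0, 5.0, 9.0, 9.5, 11.3, 5.2, 8.5]

def sample_mean(xs):
    """Arithmetic mean of the measurements."""
    return sum(xs) / len(xs)

def sample_std(xs):
    """Sample standard deviation, using the n - 1 divisor."""
    m = sample_mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

print(round(sample_mean(measurements), 2))  # 7.94
print(round(sample_std(measurements), 2))   # 2.14
```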
An Example (cont.)


90% CI → 90% chance actual value in interval
90% CI → α = 0.10
  1 - α/2 = 0.95
  n = 8 → 7 degrees of freedom
[Figure: distribution with area 1-α between c1 and c2, and area α/2 in each tail]
90% Confidence Interval
\[ a = 1 - \alpha/2 = 1 - 0.10/2 = 0.95 \]
\[ t_{a;\,n-1} = t_{0.95;\,7} = 1.895 \]
\[ c_1 = 7.94 - \frac{1.895(2.14)}{\sqrt{8}} = 6.5 \]
\[ c_2 = 7.94 + \frac{1.895(2.14)}{\sqrt{8}} = 9.4 \]

t table (n = degrees of freedom):

n     a = 0.90   a = 0.95   a = 0.975
…     …          …          …
5     1.476      2.015      2.571
6     1.440      1.943      2.447
7     1.415      1.895      2.365
…     …          …          …
∞     1.282      1.645      1.960
95% Confidence Interval
\[ a = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]
\[ t_{a;\,n-1} = t_{0.975;\,7} = 2.365 \]
\[ c_1 = 7.94 - \frac{2.365(2.14)}{\sqrt{8}} = 6.1 \]
\[ c_2 = 7.94 + \frac{2.365(2.14)}{\sqrt{8}} = 9.7 \]

t table (n = degrees of freedom):

n     a = 0.90   a = 0.95   a = 0.975
…     …          …          …
5     1.476      2.015      2.571
6     1.440      1.943      2.447
7     1.415      1.895      2.365
…     …          …          …
∞     1.282      1.645      1.960
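Both intervals can be reproduced with a short script. This is a sketch, not a general-purpose routine: the t values are copied by hand from the row for 7 degrees of freedom in the table, and the names `T_TABLE_7_DOF` and `confidence_interval` are assumptions for illustration.

```python
import math

# t_{1-α/2; 7} copied from the t table on the slides, keyed by confidence level.
T_TABLE_7_DOF = {0.90: 1.895, 0.95: 2.365}

def confidence_interval(xs, t_value):
    """Return (c1, c2) = mean -/+ t * s / sqrt(n) for the measurements xs."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    half_width = t_value * s / math.sqrt(n)
    return mean - half_width, mean + half_width

measurements = [8.0, 7.0, 5.0, 9.0, 9.5, 11.3, 5.2, 8.5]
for level, t_value in T_TABLE_7_DOF.items():
    c1, c2 = confidence_interval(measurements, t_value)
    print(f"{level:.0%} CI: [{c1:.1f}, {c2:.1f}]")
# 90% CI: [6.5, 9.4]
# 95% CI: [6.1, 9.7]
```

For a different number of measurements, the t value has to be looked up again for the matching degrees of freedom.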
What does it mean?

90% CI = [6.5, 9.4]
  90% chance real value is between 6.5 and 9.4
95% CI = [6.1, 9.7]
  95% chance real value is between 6.1 and 9.7
Why is the interval wider when we are more confident?
Higher Confidence → Wider
Interval?
[Figure: the 90% interval (6.5 to 9.4) shown inside the wider 95% interval (6.1 to 9.7)]
Key Assumption


Measurement errors are Normally distributed.
Is this true for most measurements on real computer systems?
[Figure: distribution with area 1-α between c1 and c2, and area α/2 in each tail]
Key Assumption

Saved by the Central Limit Theorem
  Sum of a “large number” of independent values from any distribution (with finite variance) will be approximately Normally (Gaussian) distributed.
What is a “large number?”
  Typically assumed to be >≈ 6 or 7.
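The claim can be explored with a quick simulation. This is a sketch under assumptions that are not in the slides: the values are drawn from a uniform distribution on [0, 1], and the number of repeated sums (20000) is arbitrary. For a Normal distribution, about 68% of values fall within one standard deviation of the mean, so the printed fraction should land near that figure.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
N_VALUES = 7              # the "large number" suggested on the slide
N_SUMS = 20000            # hypothetical number of repeated sums

# Each entry is the sum of N_VALUES draws from a decidedly non-Normal distribution.
sums = [sum(rng.uniform(0.0, 1.0) for _ in range(N_VALUES)) for _ in range(N_SUMS)]

mean = statistics.fmean(sums)
std = statistics.stdev(sums)
within_one_std = sum(abs(v - mean) <= std for v in sums) / N_SUMS

print(f"mean of sums: {mean:.2f}")
print(f"fraction within one standard deviation: {within_one_std:.3f}")
```

Swapping in a more skewed source distribution is a simple way to see how many values it takes before the sums start to look Normal.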
How many measurements?



Width of interval inversely proportional to √n
Want to minimize number of measurements
Find confidence interval for mean, such that:
  Pr(actual mean in interval) = (1 - α)
\[ (c_1, c_2) = \left((1 - e)\bar{x},\; (1 + e)\bar{x}\right) \]
How many measurements?
\[ (c_1, c_2) = (1 \mp e)\bar{x} = \bar{x} \mp z_{1-\alpha/2}\,\frac{s}{\sqrt{n}} \]
\[ z_{1-\alpha/2}\,\frac{s}{\sqrt{n}} = \bar{x}e \]
\[ n = \left(\frac{z_{1-\alpha/2}\, s}{\bar{x}e}\right)^2 \]
How many measurements?



But n depends on knowing mean and standard deviation!
Estimate s with small number of measurements
Use this s to find n needed for desired interval width
How many measurements?







Mean = 7.94 s
Standard deviation = 2.14 s
Want 90% confidence that the measured mean is within a 7% interval around the actual mean.
  α = 0.10
  (1 - α/2) = 0.95
  Error = ± 3.5%
  e = 0.035
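How many measurements?
\[ n = \left(\frac{z_{1-\alpha/2}\, s}{\bar{x}e}\right)^2 = \left(\frac{1.895(2.14)}{0.035(7.94)}\right)^2 = 212.9 \]
(The value 1.895 is t_{0.95;7} from the 8 measurements in the earlier example, used here in place of z_{1-α/2}.)
213 measurements
→ 90% chance true mean is within ± 3.5% interval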
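The calculation is easy to repeat for other targets. A minimal sketch, where the function name `measurements_needed` is an assumption for illustration; the result is rounded up because a fractional measurement is not possible.

```python
import math

def measurements_needed(z, s, mean, e):
    """Smallest n for which z * s / sqrt(n) is at most e * mean."""
    return math.ceil((z * s / (e * mean)) ** 2)

# Values from the example: 1.895, s = 2.14 s, mean = 7.94 s, e = 0.035.
print(measurements_needed(1.895, 2.14, 7.94, 0.035))   # 213
```

Because n grows with 1/e², halving the allowed error roughly quadruples the number of measurements required.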
Proportions


p = Pr(success) in n trials of binomial experiment
Estimate proportion: p = m/n
  m = number of successes
  n = total number of trials
Proportions
\[ c_1 = p - z_{1-\alpha/2}\sqrt{\frac{p(1-p)}{n}} \]
\[ c_2 = p + z_{1-\alpha/2}\sqrt{\frac{p(1-p)}{n}} \]
Proportions



How much time does processor spend in OS?
Interrupt every 10 ms
Increment counters
  n = number of interrupts
  m = number of interrupts when PC within OS
Run for 1 minute
  n = 6000
  m = 658
Proportions
\[ (c_1, c_2) = p \mp z_{1-\alpha/2}\sqrt{\frac{p(1-p)}{n}} = 0.1097 \mp 1.96\sqrt{\frac{0.1097(1-0.1097)}{6000}} = (0.1018,\; 0.1176) \]
95% confidence interval for proportion
So 95% certain processor spends 10.2-11.8% of its time in OS
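The interval for the OS-time example can be recomputed from the raw counters. A minimal sketch, with `proportion_ci` as a hypothetical helper name; z = 1.96 is the value used on the slide for 95% confidence.

```python
import math

def proportion_ci(m, n, z):
    """Confidence interval for a proportion estimated as p = m / n."""
    p = m / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# 6000 interrupts in one minute, 658 of them with the PC inside the OS.
c1, c2 = proportion_ci(658, 6000, 1.96)
print(f"({c1:.4f}, {c2:.4f})")   # (0.1018, 0.1176)
```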
Number of measurements for
proportions
\[ (1 - e)p = p - z_{1-\alpha/2}\sqrt{\frac{p(1-p)}{n}} \]
\[ ep = z_{1-\alpha/2}\sqrt{\frac{p(1-p)}{n}} \]
\[ n = \frac{z_{1-\alpha/2}^2\, p(1-p)}{(ep)^2} \]
Number of measurements for
proportions





How long to run OS experiment?
Want 95% confidence
± 0.5%
  e = 0.005
  p = 0.1097
Number of measurements for
proportions
\[ n = \frac{z_{1-\alpha/2}^2\, p(1-p)}{(ep)^2} = \frac{(1.960)^2 (0.1097)(1 - 0.1097)}{\left[0.005(0.1097)\right]^2} = 1{,}247{,}102 \]
10 ms interrupts
→ 3.46 hours
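The same arithmetic, including the conversion from interrupt count to run time, can be sketched as follows. The function name `trials_needed` is an assumption for illustration; the 10 ms interrupt period comes from the example.

```python
import math

def trials_needed(p, e, z):
    """Smallest n for which z * sqrt(p * (1 - p) / n) is at most e * p."""
    return math.ceil(z ** 2 * p * (1 - p) / (e * p) ** 2)

n = trials_needed(0.1097, 0.005, 1.960)
hours = n * 0.010 / 3600          # one sample every 10 ms, converted to hours
print(n, round(hours, 2))         # 1247102 3.46
```

The required run time is sensitive to p itself: a smaller proportion needs more samples to pin down to the same relative error.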
Important Points

Use statistics to
  Deal with noisy measurements
  Aggregate large amounts of data
Errors in measurements are due to:
  Accuracy, precision, resolution of tools
  Other sources of noise
→ Systematic, random errors
Important Points: Model errors
with bell curve
[Figure labels: Accuracy, Precision, Resolution, Mean of measured values, True value]
Important Points


Use confidence intervals to quantify precision
Confidence intervals for
  Mean of n samples
  Proportions
Confidence level
  Pr(actual mean within computed interval)
Compute number of measurements needed for desired interval width