
Boxplot
[Figure: Boxplots of Ithaca Temperature, January 1987, for Tmax and Tmin; y-axis: Temperature (°F), -10 to 50.]
IQR = 33 – 26 = 7 °F
One step = 1.5 × IQR = 1.5 × 7 = 10.5 °F
Lower inner fence = 26 – 10.5 = 15.5 °F
Upper inner fence = 33 + 10.5 = 43.5 °F
The whiskers are drawn to the most extreme temperatures inside the inner fences, 17 and 37 °F
(the “adjacent values”). Each whisker therefore extends only to the last observation within
one step beyond its end of the box.
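The fence arithmetic above can be sketched in a few lines of Python. The quartiles 26 and 33 come from the slide. The sample list is hypothetical (it is not the Ithaca record) and serves only to show how adjacent values and outside values would be picked out. The function names are illustrative.

```python
def inner_fences(q1, q3):
    """Return (lower, upper) inner fences: one step = 1.5 * IQR beyond the box."""
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


def adjacent_values(data, lower, upper):
    """Return the most extreme observations that lie inside the inner fences."""
    inside = [x for x in data if lower <= x <= upper]
    return min(inside), max(inside)


# Quartiles taken from the slide (Tmax, Ithaca, January 1987).
lower, upper = inner_fences(26, 33)

# Hypothetical sample, NOT the actual Ithaca data.
sample = [9, 17, 25, 26, 28, 30, 33, 34, 37, 45, 53]
low_adj, high_adj = adjacent_values(sample, lower, upper)

# Observations beyond the inner fences would be plotted individually.
outside = [x for x in sample if x < lower or x > upper]
```

With this made-up sample, the whiskers would end at `low_adj` and `high_adj`, and the values in `outside` would be drawn as separate points.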
Boxplot
Boxplots provide visual summaries of:
1. The center of the data (the median – the center line of the
box).
2. The variation or spread of the data (interquartile range –
the box height).
3. The skewness of the data (the relative size of the box
halves).
4. Presence or absence of unusual values (“outside” and
“far outside” values).
Boxplots are typically placed side by side to visually compare
and contrast groups of data.
Quantile Plots
[Figure: Quantile plot of Ithaca Maximum Temperature, January 1987; x-axis: Maximum Temperature (°F), 0 to 60; y-axis: Cumulative Frequency, 0.0 to 1.0.]
Quantile plots visually portray the quantiles, or percentiles (which equal the quantiles
times 100), of the distribution of sample data.
Advantages:
1. All of the data are displayed, unlike a boxplot.
2. Every point has a distinct position, without overlap.
3. Arbitrary categories are not required, as with histograms.
Quantile Plots
[Figure: Two quantile plots of Ithaca Maximum Temperature, January 1987; x-axis: Maximum Temperature (°F); y-axis: Cumulative Frequency. Annotation on the plot: “Limitation of i/N weighting!”]
Quantile plots are sample approximations of the cumulative distribution function (cdf) of a
continuous random variable; more specifically, they show the empirical cdf (ecdf).
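A minimal sketch of an empirical cdf using i/N weighting, assuming the observations are ranked from smallest (i = 1) to largest (i = N). The input values are hypothetical and the function name is illustrative. Note that the largest observation receives a cumulative frequency of exactly 1, which is presumably the limitation the slide annotation points to: it would imply that no larger value can ever occur.

```python
def ecdf_points(data):
    """Return (sorted values, cumulative frequencies i/N) for an empirical cdf."""
    xs = sorted(data)
    n = len(xs)
    return xs, [i / n for i in range(1, n + 1)]


# Hypothetical temperatures, not the Ithaca record.
xs, p = ecdf_points([30, 17, 37, 26])
# The last entry of p is 1.0: the sample maximum is treated as the largest possible value.
```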
Plotting Positions
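General formula:
p = (i – a)/(N + 1 – 2a)
where i is the rank of an observation in the sorted sample (i = 1 for the smallest) and N is
the sample size.
• Weibull (a = 0): p = i/(N + 1)
• Blom (a = 0.375): p = (i – 0.375)/(N + 0.25)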
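The general plotting-position formula can be sketched as a small function, assuming ranks run from 1 (smallest) to N (largest). The function name and the sample size of 4 are illustrative choices, not part of the slides.

```python
def plotting_positions(n, a=0.0):
    """Return p = (i - a) / (n + 1 - 2a) for ranks i = 1..n."""
    return [(i - a) / (n + 1 - 2 * a) for i in range(1, n + 1)]


weibull = plotting_positions(4, a=0.0)    # i / (N + 1)
blom = plotting_positions(4, a=0.375)     # (i - 0.375) / (N + 0.25)
```

For any value of `a` below 1, the positions stay strictly between 0 and 1 and are symmetric about 0.5, so neither the smallest nor the largest observation is assigned a cumulative frequency of exactly 0 or 1.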
Applications of quantile plots:
1. To compare two or more data distributions (a Q-Q plot).
2. To compare data to a normal distribution (a probability plot).
3. To calculate frequencies of exceedance (e.g., a flow-duration curve).
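As an illustration of the third application, exceedance frequencies can be sketched by ranking the data from largest to smallest and applying the same plotting-position formula to the descending rank, which is equivalent to one minus the non-exceedance position. The function name and the flow values are hypothetical.

```python
def exceedance_curve(values, a=0.0):
    """Pair each value (largest first) with its estimated exceedance frequency.

    Uses (m - a) / (n + 1 - 2a), where m is the rank counted from the largest
    value; this equals 1 minus the plotting position of the ascending rank.
    """
    ordered = sorted(values, reverse=True)
    n = len(ordered)
    return [(x, (m - a) / (n + 1 - 2 * a)) for m, x in enumerate(ordered, start=1)]


# Hypothetical flows, for illustration only.
curve = exceedance_curve([12.0, 3.5, 7.1, 5.0])
```

Plotting the first element of each pair against the second would give a curve of the flow-duration type.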
Histogram
[Figure: Two panels, “Histogram of tmax” (x-axis: tmax, 0 to 60) and “Histogram of tmin” (x-axis: tmin, -10 to 30); y-axes: Frequency.]
Histogram of Ithaca temperature, January 1987.
Histogram and Probability Density Function
[Figure: Density histogram of maximum temperature; x-axis: Maximum temperature, 0 to 60; y-axis: Density, 0.00 to 0.05.]
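A minimal sketch of how a frequency histogram can be rescaled to the density scale, so that it is comparable to a probability density function: each bar height is the bin count divided by N times the bin width, which makes the bar areas sum to 1. The sample and bin edges are hypothetical, the function name is illustrative, and all data are assumed to fall within the bin edges.

```python
def density_histogram(data, edges):
    """Return bar heights such that the bar areas sum to 1.

    Bins are [edges[k], edges[k+1]), with the last bin closed on the right.
    Assumes every observation lies within [edges[0], edges[-1]].
    """
    n = len(data)
    last = len(edges) - 2
    heights = []
    for k in range(len(edges) - 1):
        lo, hi = edges[k], edges[k + 1]
        count = sum(1 for x in data if lo <= x < hi or (k == last and x == hi))
        heights.append(count / (n * (hi - lo)))
    return heights


# Hypothetical sample and bin edges, not the Ithaca record.
sample = [9, 17, 25, 26, 28, 30, 33, 34, 37, 45]
edges = [0, 10, 20, 30, 40, 50, 60]
heights = density_histogram(sample, edges)

# Total area under the bars; should be 1 up to rounding error.
area = sum(h * (edges[k + 1] - edges[k]) for k, h in enumerate(heights))
```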
Mathematics of Probability