Boxplot

[Figure: boxplots of Ithaca temperature, January 1987, for Tmax and Tmin. Vertical axis: Temperature (°F).]

IQR = 33 – 26 = 7 °F
One step = 1.5 × IQR = 1.5 × 7 = 10.5 °F
Lower inner fence = 26 – 10.5 = 15.5 °F
Upper inner fence = 33 + 10.5 = 43.5 °F

The whiskers are drawn to the most extreme temperatures inside the inner fences, 37 and 17 °F. The whiskers are therefore shortened to extend only to the last observation within one step beyond either end of the box (the "adjacent values").

Boxplot

Boxplots provide visual summaries of:
1. The center of the data (the median, shown as the center line of the box).
2. The variation or spread of the data (the interquartile range, shown as the box height).
3. The skewness of the data (the relative size of the box halves).
4. The presence or absence of unusual values ("outside" and "far outside" values).

Boxplots are typically placed side by side to visually compare and contrast groups of data.

Quantile Plots

[Figure: quantile plot of Ithaca maximum temperature, January 1987. Horizontal axis: Maximum Temperature (°F). Vertical axis: Cumulative Frequency.]

Quantile plots visually portray the quantiles, or percentiles (which equal the quantiles times 100), of the distribution of sample data.

Advantages:
1. All of the data are displayed, unlike a boxplot.
2. Every point has a distinct position, without overlap.
3. Unlike histograms, no arbitrary categories are required.

Quantile Plots

[Figure: two quantile plots of Ithaca maximum temperature, January 1987, annotated "Limitation of i/N weighting!". Horizontal axis: Maximum Temperature (°F). Vertical axis: Cumulative Frequency.]

Quantile plots are sample approximations of the cumulative distribution function (cdf) of a continuous random variable. More specifically, they show the empirical cdf (ecdf).
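As a minimal sketch of the i/N weighting mentioned above (not necessarily how the slide's figures were produced), the ecdf can be computed by sorting the sample and assigning the i-th smallest value a cumulative frequency of i/N. The function name `ecdf_positions` and the sample values are hypothetical, chosen only for illustration; they are not the Ithaca data.

```python
def ecdf_positions(values):
    """Return the sorted values and their i/N cumulative frequencies."""
    ordered = sorted(values)
    n = len(ordered)
    # Rank i runs from 1 (smallest value) to n (largest value).
    freqs = [i / n for i in range(1, n + 1)]
    return ordered, freqs


# Hypothetical temperatures in degrees F (illustration only).
sample = [28, 33, 17, 37, 30]
xs, ps = ecdf_positions(sample)
for x, p in zip(xs, ps):
    print(x, p)
```

With this weighting the largest observation always receives a cumulative frequency of exactly 1.0, which would imply that no larger value could ever occur. This is presumably the limitation the slide flags.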
Plotting Positions

General formula: p = (i – a)/(N + 1 – 2a), where i is the rank of the observation (in ascending order) and N is the sample size.
• Weibull (a = 0)
• Blom (a = 0.375)

Applications of quantile plots:
1. To compare two or more data distributions (a Q-Q plot).
2. To compare data to a normal distribution (a probability plot).
3. To calculate frequencies of exceedance (e.g., a flow-duration curve).

Histogram

[Figure: "Histogram of tmax" and "Histogram of tmin". Horizontal axes: tmax, tmin. Vertical axis: Frequency.]

Histogram of Ithaca temperature, January 1987.

Histogram and Probability Density Function

[Figure: histogram with probability density function. Horizontal axis: Maximum temperature. Vertical axis: Density.]

Mathematics of Probability