Statistics Chapter 2 Exploring Distributions



Section 2.1

Visualizing Distributions: Shape, Center and Spread

▪ To learn the basic shapes of distributions of data: uniform, normal, and skewed.
▪ To describe the characteristics of the shape of a distribution: symmetry, skewness, modes, outliers, gaps, and clusters.
▪ To describe a uniform distribution using range and frequency.
▪ To estimate graphically the mean and standard deviation of a normal distribution and use them to describe the distribution.
▪ To estimate graphically the median and quartiles and use them to describe a skewed distribution.

A distribution is a graph that shows:
▪ The spread of the data
▪ How many times each value in the data occurs

How have we used a distribution?
▪ To see where data from a simulation lie.
▪ To explore probabilities of a random selection.

Uniform (Rectangular)
▪ All values occur equally often.
▪ Selecting the last digit of the numbers in a phone book
▪ Selecting the last digit of Social Security numbers or your student ID numbers
▪ randInt(start,end,n), e.g. randInt(0,9,100) stored in list L1

Why?
▪ All digits 0-9 would be used, and there would be no reason for any one of them to be used more than the others.
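The calculator command above can be imitated on a computer. The following is a minimal sketch in Python, not the calculator's own implementation; the helper name `rand_int` and the fixed seed are assumptions made for illustration. It draws 100 random digits and prints a rough dot plot so the roughly rectangular shape can be seen.

```python
import random
from collections import Counter

def rand_int(start, end, n, seed=None):
    """Hypothetical stand-in for randInt(start,end,n): n random integers, endpoints included."""
    rng = random.Random(seed)
    return [rng.randint(start, end) for _ in range(n)]

# Draw 100 random digits from 0 to 9 (the seed only makes the run repeatable).
digits = rand_int(0, 9, 100, seed=1)
counts = Counter(digits)

# Text dot plot: one star per occurrence of each digit.
for digit in range(10):
    print(digit, "*" * counts[digit])
```

With only 100 draws the rows will not be exactly equal in length; the rows tend to even out as n grows.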

Normal Distributions (bell-shaped)
▪ Very common in our world; they will be used throughout the year.

Measure a ball.
▪ Measure the diameter to the nearest mm and record your result.
▪ As a class, create a dot plot that shows the distribution of our measurements.
▪ What do you notice? Why do you think that occurred?

Normal Distribution Video #1 Normal Distribution Video #2

Characteristics of a Normal Distribution:
▪ Symmetric: the mean (average value) of the data is the center point. If the distribution is truly normal, the mode and median of the data are also at the center. These are called measures of center.
▪ Standard deviation (SD) is a measure of the spread of a normal distribution. The SD is the horizontal distance from the center out to the inflection point on the curve, where the curve changes from bending downward to bending upward.
▪ One SD out from the center in both directions gives the boundaries for an area of about 68% of the total under the curve.
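The 68% figure can be checked with a quick simulation. This is a sketch under stated assumptions: the ball diameters are simulated rather than measured, and the center of 50 mm and SD of 2 mm are hypothetical values chosen only for illustration.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the run is repeatable

# Hypothetical class data: 10,000 simulated diameters (mm), normal with center 50 and SD 2.
diameters = [rng.gauss(50, 2) for _ in range(10_000)]

mean = statistics.mean(diameters)   # measure of center
sd = statistics.stdev(diameters)    # measure of spread

# Count the measurements that fall within one SD of the mean.
within = sum(1 for d in diameters if mean - sd <= d <= mean + sd)
fraction_within_one_sd = within / len(diameters)

print(f"mean = {mean:.2f} mm, SD = {sd:.2f} mm")
print(f"fraction within one SD of the mean: {fraction_within_one_sd:.3f}")
```

The printed fraction should land close to 0.68, though the exact value depends on the random draw.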

Skewed distribution (a longer tail on one side)
▪ Skewed right: the tail stretches to the right.
▪ There is no line of symmetry.
▪ The median is typically used as the measure of center, since there is no line of symmetry.
  ▪ The median divides the plot so that an equal number of data points lies on each side.
▪ Quartiles are the measure of spread used with the median.
  ▪ The lower quartile divides the lower half of the data in half.
  ▪ The upper quartile divides the upper half of the data in half.
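The median and quartiles described above can be worked out for a small data set. The sketch below assumes one common classroom convention, which the slides do not spell out: when the number of values is odd, the median itself is left out of both halves. The function name `quartile_summary` and the data are made up for illustration.

```python
import statistics

def quartile_summary(values):
    """Return (lower quartile, median, upper quartile) using the median-of-halves method.

    Assumption: with an odd number of values, the median is excluded from both halves.
    Needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]               # values below the median position
    upper_half = data[len(data) - half:]   # values above the median position
    return (statistics.median(lower_half),
            statistics.median(data),
            statistics.median(upper_half))

# Made-up, right-skewed data: most values are small, with a long tail to the right.
values = [1, 2, 2, 3, 3, 4, 5, 7, 12]
q1, med, q3 = quartile_summary(values)

print("lower quartile:", q1)
print("median:", med)
print("upper quartile:", q3)
print("mean:", round(statistics.mean(values), 2))
```

For this data set the mean comes out larger than the median because the long right tail pulls it upward, which is one reason the median is preferred as the measure of center for skewed data.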

Bimodal Distributions (two peaks)
▪ The cases often represent two groups when this occurs: male/female, majority/minority…

Outliers: data values that stand apart from the bulk of the data.
▪ These deserve special attention.
▪ Sometimes they are mistakes.
▪ Sometimes they reflect unusual circumstances that can be important to great discoveries.

Gaps: stretches where no data values lie.
▪ The areas where the bulk of the data lie can also be called clusters.

When describing a distribution, you must include the following:
▪ Shape (as we have just described)
▪ Measure of center (or centers if bimodal): mean, mode, median
▪ Measure of spread
▪ Locations of gaps or clusters

▪ Discussion: D4
▪ Practice: P1-3, 5
▪ Page 39: E1, 3, 5, 8, 11
▪ AP only: also E4 and 6