AP Stats - Joule 2.0

AP Stats Review day 1 April 2, 2013

Basics

• Two parts (90 minutes each)
  – Part I: 40 multiple-choice questions
    • Content questions (10-15)
    • Calculation questions (25-30)
  – Part II: 6 free-response questions
    • Part A: 5 questions (12 minutes), 75% of Part II score
    • Part B: 1 question (25 minutes), 25% of Part II score

Constructing and Interpreting Graphical Displays of Univariate Data

• Two types of data
  – Categorical (qualitative), for example colors, gender
  – Quantitative (numerical), for example age, scores
• Types of graphs: bar, pie, dotplot, histogram, stem and leaf
• Describing or comparing distributions
  – Shape: overall pattern of the data (unimodal, bimodal, uniform, symmetric, skewed)
  – Spread: range of values
  – Center: point that divides the data roughly in half
  – Outliers: 1.5 × IQR rule
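The 1.5 × IQR rule can be sketched in a few lines of Python. This is only an illustration: the scores are made up, and the quartiles are taken as the medians of the lower and upper halves of the sorted data (one common classroom convention; software packages may use other quartile methods and give slightly different fences).

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    Assumption: when n is odd, the middle value is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[-half:]
    return median(lower), median(upper)

def outliers(values):
    """Flag values more than 1.5 * IQR below Q1 or above Q3."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical data, not taken from the review book.
scores = [2, 4, 5, 5, 6, 7, 8, 9, 30]
print(quartiles(scores))
print(outliers(scores))
```

With these made-up scores the fences fall at -1.5 and 14.5, so only the value 30 is flagged.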

Review Questions P. 59 (4 minutes)

• 1. Which of the following are true statements?
• 2. Which of the following is inappropriate for displaying quantitative data?
• 3. A graphical display of data that shows the cumulative counts across each of the possible data values or ranges of data values is a
• 4. The height of Mrs. Clark’s tomato plant is what type of data?

C E C B

Summarizing Distributions of Univariate Data

• Compared by measures of center: mean and median
  – The mean is pulled toward the skew (toward the long tail)
  – Median = mean when the distribution is symmetric
• Measures of spread: range, IQR, variance, and standard deviation
• Measures of position: quartiles, percentiles, and z-scores
• Boxplot: Min, Q1, Med, Q3, Max
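A quick numerical check of two of these ideas, using Python's standard `statistics` module. Both data sets are invented for illustration. The first shows the mean being pulled above the median by a long right tail; the second shows that adding a constant to every value shifts the mean but leaves the standard deviation unchanged.

```python
from statistics import mean, median, pstdev

# Right-skewed, made-up data: one large value drags the mean upward.
skewed = [1, 2, 2, 3, 12]
print(mean(skewed), median(skewed))

# Hypothetical assessed values in thousands of dollars.
values = [120, 150, 158, 170, 192]
shifted = [v + 5 for v in values]  # add the same constant to every value

# The mean moves by exactly the constant; the spread does not change.
print(mean(values), mean(shifted))
print(pstdev(values), pstdev(shifted))
```

Multiplying every value by a constant would behave differently: both the mean and the standard deviation would be scaled.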

Review Questions P. 74 & 75

• Multiple Choice Questions
  – 1. The mean assessed value of homes in Southern County is $158,000 with a standard deviation of $32,000. If the county supervisors decided to increase everyone’s assessment by $5,000, the new mean and standard deviation would be
  – 2. What was the average
  – 3. A distribution is skewed right if
  – 4. When a constant is added to every data value
• Free Response Question

1. C 2. B 3. C 4. D

Comparing Distributions

• P. 84 & 85 Review questions
  – 1. Which of the following is true?
  – 2. Consider the following back-to-back
  – 3. Which of the following could not be used to compare these data graphically?
  – 4. Which of the following statements cannot be justified?
• Free Response Question: The following data represent the hours of continuous use for two brands of batteries.

1. E 2. A 3. C 4. A

Free Response

• From my boxplot comparison, Brand A has a symmetric distribution, with a larger mean (67.2) and median (67) and a range of 10. Brand B is skewed right, with a smaller mean (66.1) and median (65) and a range of 6.

Exploring Bivariate Data

• Scatterplots
• Linear correlation coefficient (r)
• Least Squares Regression Line
• Residual Plot
• Transforming Data to achieve linearity
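As a rough sketch of how r, the least squares line, and the residuals fit together, the snippet below computes all three by hand for a small invented data set. The function name `least_squares` and the data are assumptions made for this illustration only.

```python
from math import sqrt

def least_squares(xs, ys):
    """Return (intercept, slope, r) for the line y-hat = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (x-bar, y-bar)
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope, r = least_squares(xs, ys)

# Residual = observed y minus predicted y; these are what a residual plot shows.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(intercept, slope, r)
print(residuals)
```

For a least squares line the residuals always sum to zero (up to rounding), so a residual plot is read for pattern rather than for overall level: a curved pattern suggests the data may need a transformation to achieve linearity.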

Review Questions P. 101

• 1. A scatterplot is obtained by
• 2. A perfect positive correlation means
• 3. In a linear regression model, the slope
• 4. In the regression equation y = 12 + 6x, 12
• Free Response Question P. 102

C D A D

Exploring Categorical Data

• 1. Categorical data is exhibited using frequency tables and bar charts.
• 2. Joint frequencies occur where a row category meets a column category; frequencies that appear in the margins of the table are called marginal frequencies.
• 3. Conditional relative frequencies are expressed in percentages.
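To make the three kinds of frequency concrete, here is a small sketch built on an invented two-way table (gender by party). The counts and the helper names `row_total` and `col_total` are hypothetical and exist only for this example.

```python
# Hypothetical two-way table: (row category, column category) -> count.
counts = {
    ("male", "Republican"): 20,
    ("male", "Democrat"): 30,
    ("female", "Republican"): 25,
    ("female", "Democrat"): 25,
}
total = sum(counts.values())

def row_total(row):
    """Marginal frequency for a row category."""
    return sum(n for (r, _), n in counts.items() if r == row)

def col_total(col):
    """Marginal frequency for a column category."""
    return sum(n for (_, c), n in counts.items() if c == col)

# Joint frequency: one cell, where a row meets a column.
joint = counts[("male", "Republican")]

# Joint relative frequency: that cell divided by the grand total.
joint_relative = joint / total

# Conditional relative frequency: the cell divided by its row total,
# i.e. the share of males who are Republican.
conditional = joint / row_total("male")

print(joint, row_total("male"), col_total("Republican"))
print(f"{joint_relative:.0%} of everyone, {conditional:.0%} of males")
```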

P. 111 Review Questions

• 1. Relative frequency
• 2. Frequency of a category
• 3. What is the joint relative conditional frequency for male Republicans if the marginal row totals are fixed?
• 4. Which of the following is a joint frequency?

1. B 2. A 3. B 4. A

Methods of Data Collection

• For all types of studies, the primary goal of the investigator is to eliminate bias.
• Bias is any systematic tendency to favor certain outcomes at the expense of others.
• Experiment vs. observational study
  – Observational study
    • Census (population/parameter)
    • Sample survey (sample/statistic)

Experiments

• Experiments
  – Deliberately impose a treatment on a set of individuals (called subjects if they are people).
  – The goal is to determine whether the treatment imposed causes the effects measured in the subjects.
  – These measured effects are the response variables in an experiment.
  – Explanatory variables are often referred to as factors, each of which can take multiple values called levels.

Example

• Suppose you are investigating the effects of various amounts of sunlight and moisture on the growth of tulip plants. The plants would be your experimental units, and you could measure the height of the tulips as your response variable. You could subject samples to direct sunlight, partial shade, and full shade. You could also water them either every day or every other day. What are the factors and levels? How many treatments?

Two factors: sunlight and moisture. Three levels for sunlight, two levels for moisture. 3 × 2 = 6 treatments.
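The count of six comes from pairing every sunlight level with every moisture level. A minimal sketch using `itertools.product` lists the combinations explicitly; the variable names are chosen for this illustration.

```python
from itertools import product

# Levels of each factor, as described in the tulip example.
sunlight = ["direct sunlight", "partial shade", "full shade"]
moisture = ["every day", "every other day"]

# A treatment is one combination of a level from each factor.
treatments = list(product(sunlight, moisture))

for number, (sun, water) in enumerate(treatments, start=1):
    print(f"Treatment {number}: {sun}, watered {water}")

print(len(treatments))  # 3 levels x 2 levels
```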

Review Questions P. 119

• 1. What kind of study is this?
• 2. Which of the following statements about observational studies is true?
• 3. Which describes the factors, levels, and/or treatments in this experiment?
• If time permits, look at the free response question on page 120.

1. D 2. E 3. A

Sources of Bias

• Sample selection
  – Undercoverage
  – Nonresponse
  – Convenience sampling
  – Voluntary response bias
• Incorrect measurement
  – Poorly worded questions
  – Response bias
  – Deliberate lying

Sampling

• Simple Random Sample (SRS)
• Stratified Random Sampling
• Systematic Sampling
• Cluster Sampling
• Multistage sampling: any random sampling scheme with at least two steps that incorporates elements of stratified sampling, cluster sampling, or SRS
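The differences between these sampling methods are easier to see side by side. The sketch below applies each one to a hypothetical population of 100 ID numbers using Python's `random` module; the strata, cluster size, and sample sizes are arbitrary choices for illustration, and the seed is fixed only so the run is repeatable.

```python
import random

rng = random.Random(2013)          # seeded only to make the sketch repeatable
population = list(range(1, 101))   # hypothetical ID numbers 1..100

# Simple random sample: every set of 10 IDs is equally likely to be chosen.
srs = rng.sample(population, 10)

# Stratified: divide into strata first, then take an SRS within each stratum.
strata = {"low IDs": population[:50], "high IDs": population[50:]}
stratified = {name: rng.sample(members, 5) for name, members in strata.items()}

# Systematic: pick a random starting point, then take every 10th individual.
step = 10
start = rng.randrange(step)
systematic = population[start::step]

# Cluster: divide into clusters, choose whole clusters at random,
# and include every member of the chosen clusters.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [member for cluster in chosen_clusters for member in cluster]

print(srs)
print(stratified)
print(systematic)
print(cluster_sample)
```

A multistage scheme would chain these steps, for example choosing clusters at random and then taking an SRS inside each chosen cluster.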

Review Questions P. 129

• 1. What type of sampling?
• 2. Which of the following is not a property of a large table of random digits?
• 3. Most significant sources of bias

1. A 2. B 3. D

Planning and Conducting Experiments

• Control: compare multiple treatments.
• Randomization: subjects are randomly assigned to treatment groups.
• Replication: use multiple subjects, or repeat the experiment, to validate results.

Poorly designed experiments

• Confounding variables
• Lurking variables
• Placebo effect

Experimental Design

• Completely Randomized
• Randomized Block
• Matched Pairs
• Blinding/Double Blind
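Random assignment in the first two designs can be sketched as follows. The function names, subject labels, and blocks are all hypothetical. In a completely randomized design every subject is shuffled into the treatment groups together; in a randomized block design the same shuffle is done separately inside each block.

```python
import random

def completely_randomized(subjects, treatments, rng):
    """Shuffle all subjects, then deal them out to the treatments in turn."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for index, subject in enumerate(shuffled):
        groups[treatments[index % len(treatments)]].append(subject)
    return groups

def randomized_block(blocks, treatments, rng):
    """Randomize separately within each block of similar subjects."""
    return {
        name: completely_randomized(members, treatments, rng)
        for name, members in blocks.items()
    }

rng = random.Random(1)  # seeded only so the sketch is repeatable
treatments = ["treatment", "placebo"]

# Hypothetical subjects S1..S12.
subjects = [f"S{i}" for i in range(1, 13)]
crd = completely_randomized(subjects, treatments, rng)

# Hypothetical blocks, formed on a variable thought to affect the response.
blocks = {"younger": subjects[:6], "older": subjects[6:]}
rbd = randomized_block(blocks, treatments, rng)

print(crd)
print(rbd)
```

A matched pairs design is the special case where each block holds just two similar subjects (or one subject measured twice), with a coin flip deciding which gets which treatment.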

Review Questions P. 139

• 1. Which is the most serious problem with this design?
• 2. Which of the following is the least important way in which the designer of an experiment can guard against confounding?
• 3. Which of the following is the most appropriate design for this experiment?

1. A 2. A 3. E

Probability

• Events are independent if the outcome of one event does not influence the outcome of any other event.
• Events are mutually exclusive if they cannot occur together.
• P(A or B) = P(A) + P(B) - P(A and B)
• If A and B are independent: P(A and B) = P(A) × P(B)
• P(B|A) = P(A and B) / P(A)
• The most common way to check for independence is simply to check that P(B) = P(B|A).
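These rules can be checked on a simple example. The sketch below uses one roll of a fair six-sided die with two events chosen for illustration (A: the roll is even; B: the roll is 2 or less) and exact fractions, so no rounding is involved.

```python
from fractions import Fraction

outcomes = list(range(1, 7))  # one roll of a fair die, all faces equally likely

def prob(event):
    """Probability of an event given as a true/false test on an outcome."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

def is_even(outcome):      # event A
    return outcome % 2 == 0

def at_most_two(outcome):  # event B
    return outcome <= 2

p_a = prob(is_even)
p_b = prob(at_most_two)
p_a_and_b = prob(lambda o: is_even(o) and at_most_two(o))

# General addition rule.
p_a_or_b = p_a + p_b - p_a_and_b

# Conditional probability and the independence check P(B) = P(B|A).
p_b_given_a = p_a_and_b / p_a
independent = (p_b_given_a == p_b)

print(p_a, p_b, p_a_and_b, p_a_or_b, p_b_given_a, independent)
```

Here A and B turn out to be independent, and P(A and B) equals P(A) × P(B) as the multiplication rule says. They are not mutually exclusive, since a roll of 2 belongs to both events.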

Review Questions

• 1. What is the probability?
• 2. What is the probability?
• 3. The two events are
• 4. What is the probability?
• 5. What is the probability that exactly one is defective?

1. C 2. B 3. D 4. A 5. D

Random Variables

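• $\mu_{a+bX} = a + b\mu_X$
• $\sigma^2_{a+bX} = b^2\sigma^2_X$
• $\mu_{X+Y} = \mu_X + \mu_Y$
• $\sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y$ (X and Y independent)
• $\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y$ (X and Y independent)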
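The rules for means and variances of transformed and combined random variables can be verified by brute force on a small example. The sketch below takes X and Y to be two independent rolls of a fair die (an assumption made for this illustration), enumerates every equally likely outcome, and compares the results with the rules. Note that variances add for both the sum and the difference of independent variables.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # a fair die; X and Y are two independent rolls

def mean(values):
    """Mean of a list of equally likely values."""
    values = list(values)
    return Fraction(sum(values), len(values))

def variance(values):
    """Variance of a list of equally likely values."""
    values = list(values)
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

mu_x, var_x = mean(faces), variance(faces)

# Linear transformation a + bX with a = 3, b = 2.
a, b = 3, 2
linear = [a + b * x for x in faces]

# All 36 equally likely (x, y) pairs give the distributions of X + Y and X - Y.
pairs = list(product(faces, repeat=2))
sums = [x + y for x, y in pairs]
diffs = [x - y for x, y in pairs]

print(mean(linear), a + b * mu_x)         # means agree
print(variance(linear), b ** 2 * var_x)   # variance scales by b squared
print(mean(sums), mu_x + mu_x)            # means add
print(variance(sums), var_x + var_x)      # variances add
print(variance(diffs), var_x + var_x)     # variances still add for X - Y
```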

Binomial Distribution

• Each observation is either a success or failure.
• The number of observations is the fixed number n.
• The n observations are all independent.
• The probability of success, p, is the same for each observation.
• Mean: $np$
• Standard deviation: $\sqrt{np(1-p)}$
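As a sanity check on the binomial mean np and standard deviation √(np(1 − p)), the sketch below computes both from the formulas and again directly from the probabilities P(X = k) = C(n, k) p^k (1 − p)^(n − k). The values n = 10 and p = 0.5 are arbitrary choices for illustration.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k): exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.5  # arbitrary illustrative values

# Shortcut formulas.
mean = n * p
sd = sqrt(n * p * (1 - p))

# The same quantities computed from the full distribution.
mean_from_pmf = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
var_from_pmf = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))

print(mean, mean_from_pmf)
print(sd, sqrt(var_from_pmf))
print(binomial_pmf(5, n, p))  # probability of exactly 5 successes
```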

Geometric Distribution

• Each observation is either a success or failure.
• The variable of interest is the number of trials required to obtain the first success.
• The observations are all independent.
• The probability of success, p, is the same for each observation.
• Mean: $1/p$
• Standard deviation: $\frac{\sqrt{1-p}}{p}$
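The geometric mean 1/p and standard deviation √(1 − p)/p can be checked numerically in the same spirit. The sketch uses P(X = k) = (1 − p)^(k − 1) p for the first success on trial k, with p = 0.25 as an arbitrary illustrative value. Because a geometric variable has no upper limit, the sums are cut off at 500 trials, which is an approximation; the probability left out is negligible for this p.

```python
from math import sqrt

def geometric_pmf(k, p):
    """P(first success occurs on trial k): k - 1 failures, then a success."""
    return (1 - p) ** (k - 1) * p

p = 0.25  # arbitrary illustrative value

# Shortcut formulas.
mean = 1 / p
sd = sqrt(1 - p) / p

# Approximate the same quantities by summing over the first 500 trials.
trials = range(1, 501)
approx_mean = sum(k * geometric_pmf(k, p) for k in trials)
approx_var = sum((k - mean) ** 2 * geometric_pmf(k, p) for k in trials)

print(mean, approx_mean)
print(sd, sqrt(approx_var))
```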

Review Questions P. 182

The Normal Distribution