Chapter_5_Experimental_Design

Daniel S. Yates

The Practice of Statistics

Third Edition

Chapter 5 – Producing Data

• Sampling – a technique used to study a part or sample of a larger group in order to gain information about the entire group. The sample must be chosen carefully.

• Experiment – involves more than observations or questions of individuals. A condition is imposed on the individuals in order to observe a response. Experiments must be designed carefully.

• Confounding variables can disguise the effects of explanatory variables on response variables. Experiments must be designed to control these variables.

Section 5.1 – Designing Samples

• Sample design is the method used to select a sample from a population. Poor sample design leads to bias and misleading conclusions.

• Simple random sample (SRS) – a sampling design that attempts to eliminate bias by giving every possible sample of the same size the same chance of being chosen.

• The easiest way to construct an SRS is to place names, numbers, etc. in a “hat” and draw.

• A random number table can also be used to generate a random sample:

1. Assign a numerical label to every individual in the population.

2. Use Table B to select labels at random.

• Don’t scramble labels as you assign them. The table will randomize.

• All labels must have the same number of digits. Ex. When choosing 5 individuals out of 30, assign 01, 02, 03, 04, …, 30, not 1, 2, 3, 4, …, 30.

• You can read Table B in any order and start anywhere. Standard practice is to read across rows.
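The table procedure above can be sketched in Python. This is only an illustration: the digit string is generated by a seeded random number generator as a stand-in for a row of a random digit table (it is not copied from Table B), and the function name `srs_from_digits` is hypothetical.

```python
import random

def srs_from_digits(digits, population_size, sample_size):
    """Pick labels 1..population_size by reading a digit string in fixed-width groups."""
    width = len(str(population_size))   # every label uses the same number of digits
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        # Skip groups that are not valid labels, and labels already chosen.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == sample_size:
                return chosen
    raise ValueError("ran out of digits before the sample was complete")

# Stand-in for a stretch of a random digit table (an assumption, not Table B itself).
rng = random.Random(2024)
digit_row = "".join(str(rng.randrange(10)) for _ in range(400))

sample = srs_from_digits(digit_row, population_size=30, sample_size=5)
print([f"{label:02d}" for label in sample])   # two-digit labels such as 07, 12, 30
```

Reading two digits at a time is why the labels must be 01, 02, …, 30: a one-digit label could never be matched to a two-digit group.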

Other sampling designs

• Multistage Sample – the sample at each stage is selected by an SRS.

• Ex. The goal is to personally interview 60,000 people in the U.S.

Stage 1 – Take a SRS of the 3000 counties in the U.S.

Stage 2 – Take a SRS of the towns within each chosen county.

Stage 3 – Select a SRS of streets within each chosen town.

Stage 4 – Select a SRS of households on each street.
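A minimal sketch of the four stages, assuming a small made-up sampling frame (10 counties, 4 towns each, 5 streets each, 6 households each). The frame, the names, and the stage sizes are all hypothetical and are chosen only to keep the example small.

```python
import random

# Hypothetical sampling frame: county -> town -> street -> list of household ids.
frame = {
    f"county{c}": {
        f"town{c}.{t}": {
            f"street{c}.{t}.{s}": [f"house{c}.{t}.{s}.{h}" for h in range(6)]
            for s in range(5)
        }
        for t in range(4)
    }
    for c in range(10)
}

def multistage_sample(frame, n_counties, n_towns, n_streets, n_households, rng):
    """Take an SRS at each stage, working down from counties to households."""
    households = []
    for county in rng.sample(sorted(frame), n_counties):                  # stage 1
        towns = frame[county]
        for town in rng.sample(sorted(towns), n_towns):                   # stage 2
            streets = towns[town]
            for street in rng.sample(sorted(streets), n_streets):         # stage 3
                households += rng.sample(streets[street], n_households)   # stage 4
    return households

rng = random.Random(7)
chosen = multistage_sample(frame, n_counties=3, n_towns=2, n_streets=2,
                           n_households=3, rng=rng)
print(len(chosen))   # 3 * 2 * 2 * 3 households
```

Only the chosen counties, towns, and streets need a complete list of what lies inside them, which is the practical appeal of sampling in stages.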

Cautions about sample surveys

• Sample bias may be introduced by the following:

• Response Bias – Respondents may lie or be influenced by the race, sex, attitude, or questioning techniques of the interviewer. The wording of the question can also introduce bias.

• Even if great care is taken to design and carry out a sample survey, it is highly unlikely that the sample reflects the population exactly.

• However, because the sample is random, the results do obey the laws of probability, so we can determine the margin of error. Using the sample to draw conclusions about the population in this way is called statistical inference.

• Large samples tend to give more accurate results than smaller samples.
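The effect of sample size can be seen in a short simulation. The sketch below assumes a population in which 60% of individuals have some characteristic (the value 0.6 is made up) and measures how much the sample proportion varies from one random sample to the next.

```python
import random
import statistics

def spread_of_sample_proportions(true_p, sample_size, repetitions, rng):
    """Standard deviation of the sample proportion across repeated random samples."""
    proportions = []
    for _ in range(repetitions):
        hits = sum(rng.random() < true_p for _ in range(sample_size))
        proportions.append(hits / sample_size)
    return statistics.pstdev(proportions)

rng = random.Random(1)
for n in (25, 100, 400, 1600):
    spread = spread_of_sample_proportions(true_p=0.6, sample_size=n,
                                          repetitions=500, rng=rng)
    print(n, round(spread, 3))
```

The printed spread should shrink as the sample size grows, which is the sense in which larger random samples are more accurate. A larger sample does not repair a biased design.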

• Observational Study – observing and measuring specific characteristics without attempting to modify the subjects being studied.

• Experiment – applying some treatment and then observing its effects on the subjects or experimental units.

• Simple Random Sample – subjects are selected in such a way that every possible sample of the same size has the same chance of being chosen.

• Probability Sample – members of a population are selected in such a way that each member has a known (but not necessarily the same) chance of being selected.

• Systematic Sampling – select some starting point and then select every kth element in the population.

• Convenience Sampling – use results that are easy to get.

• Stratified Sampling – subdivide the population into at least two different subgroups (strata) whose members share the same characteristics, then draw a sample from each subgroup.

• Cluster Sampling – divide the population area into sections (clusters), randomly select some of those sections, then choose all members from the selected sections.

• Multistage Sampling – collect data by using some combination of the basic sampling methods. Pollsters select a sample in different stages, and each stage might use different methods of sampling.
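Three of these designs can be compared on one made-up population. In the sketch below the population of 120 students, the grade and homeroom fields, and the function names are all hypothetical. Choosing the systematic starting point at random is also an assumption, since the definition only says "some starting point".

```python
import random

# Hypothetical population: 120 students, each with a grade level and a homeroom.
population = [
    {"id": i, "grade": 9 + i % 4, "homeroom": i // 10}
    for i in range(120)
]

def systematic_sample(units, k, rng):
    start = rng.randrange(k)      # starting point among the first k units (assumed random)
    return units[start::k]        # then every kth unit

def stratified_sample(units, key, per_stratum, rng):
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample += rng.sample(members, per_stratum)    # SRS inside each stratum
    return sample

def cluster_sample(units, key, n_clusters, rng):
    clusters = sorted({unit[key] for unit in units})
    picked = set(rng.sample(clusters, n_clusters))    # randomly choose whole clusters
    return [unit for unit in units if unit[key] in picked]   # keep every member

rng = random.Random(11)
print(len(systematic_sample(population, k=10, rng=rng)))          # every 10th student
print(len(stratified_sample(population, "grade", 5, rng)))        # 5 from each grade
print(len(cluster_sample(population, "homeroom", 3, rng)))        # all of 3 homerooms
```

The contrast to notice: stratified sampling takes some members from every subgroup, while cluster sampling takes every member from some subgroups.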



• Randomization – subjects are assigned to different groups through a process of random selection. The logic is to use chance as a way to create two groups that are similar.

• Replication – the repetition of an experiment on more than one subject. Samples should be large enough so that the erratic behavior that is characteristic of very small samples will not disguise the true effects of different treatments.

• Blinding – a technique in which the subject doesn’t know whether he or she is receiving a treatment or a placebo. Blinding allows us to determine whether the treatment effect is significantly different from a placebo effect, which occurs when an untreated subject reports improvement in symptoms.

• Double-Blind – blinding occurs at two levels: (1) the subject doesn’t know whether he or she is receiving the treatment or a placebo; (2) the experimenter does not know whether he or she is administering the treatment or a placebo.

• Confounding – occurs in an experiment when the experimenter is not able to distinguish between the effects of different factors.



Controlling Effects of Variables

• Completely Randomized Experimental Design – assign subjects to different treatment groups through a process of random selection.

• Randomized Block Design – a block is a group of subjects that are similar, but blocks differ in ways that might affect the outcome of the experiment. Subjects are randomly assigned to treatments separately within each block.

• Rigorously Controlled Design – carefully assign subjects to different treatment groups, so that those given each treatment are similar in ways that are important to the experiment.

• Matched Pairs Design – compare exactly two treatment groups using subjects matched in pairs that are somehow related or have similar characteristics.
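The difference between a completely randomized design and a randomized block design can be sketched as follows. The 20 subjects, the use of sex as the blocking variable, and the two treatment names are assumptions made only for illustration.

```python
import random

def completely_randomized(subjects, treatments, rng):
    """Shuffle all subjects, then deal them out to the treatments in turn."""
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for position, subject in enumerate(shuffled):
        groups[treatments[position % len(treatments)]].append(subject)
    return groups

def randomized_block(subjects, block_of, treatments, rng):
    """Randomize separately inside each block, then pool the assignments."""
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)
    groups = {t: [] for t in treatments}
    for members in blocks.values():
        within = completely_randomized(members, treatments, rng)
        for treatment, assigned in within.items():
            groups[treatment] += assigned
    return groups

rng = random.Random(5)
# Hypothetical subjects: (name, sex); sex is the blocking variable in this sketch.
subjects = [(f"S{i:02d}", "F" if i < 10 else "M") for i in range(20)]

crd = completely_randomized(subjects, ["treatment", "placebo"], rng)
rbd = randomized_block(subjects, lambda s: s[1], ["treatment", "placebo"], rng)
for name, group in rbd.items():
    print(name, sum(1 for s in group if s[1] == "F"), "F,",
          sum(1 for s in group if s[1] == "M"), "M")
```

With blocking, each treatment group is guaranteed the same mix of the blocking variable. The completely randomized design only balances it on average.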

Summary

Three very important considerations in the design of experiments are the following:

1. Use randomization to assign subjects to different groups.

2. Use replication by repeating the experiment on enough subjects so that the effects of treatments or other factors can be clearly seen.

3. Control the effects of variables by using such techniques as blinding and a completely randomized experimental design.

Section 5.3 – Simulating Experiments

• Simulation – The imitation of chance behavior, based on a model that accurately reflects the experiment under consideration.

– Ex. Flipping a coin to simulate the birth of a baby: heads → boy, tails → girl.

• Basic simulation procedure

1. State the problem or describe the experiment.

• Ex. What is the likelihood of a run of 3 consecutive heads or 3 consecutive tails when a coin is tossed 10 times?

2. State the assumptions.

• A head and a tail are equally likely to occur on each toss.

• Tosses are independent of each other.

3. Assign digits to represent outcomes.

• Use a random number table or calculator.

• One digit represents one toss of the coin.

• Odd digits represent heads; even digits represent tails.

4. Simulate many repetitions.

5. State the conclusion.
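The five steps can be carried out in a few lines of Python. This sketch replaces the random number table with a seeded pseudo-random generator, which is an assumption. The function names and the choice of 10,000 repetitions are also made up for the example.

```python
import random

def has_run(tosses, length=3):
    """True if the sequence contains `length` or more identical outcomes in a row."""
    streak = 1
    for previous, current in zip(tosses, tosses[1:]):
        streak = streak + 1 if current == previous else 1
        if streak >= length:
            return True
    return False

def estimate_run_probability(repetitions, rng, n_tosses=10):
    hits = 0
    for _ in range(repetitions):
        digits = [rng.randrange(10) for _ in range(n_tosses)]   # one digit per toss
        tosses = ["H" if d % 2 == 1 else "T" for d in digits]   # odd = heads, even = tails
        hits += has_run(tosses)
    return hits / repetitions

rng = random.Random(42)
print(estimate_run_probability(10_000, rng))
```

Under the stated assumptions, direct counting gives 846/1024, or about 0.83, so a long simulation should land near that value. A run of three is far more common than most people expect.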