Chapter 13: Experiments and
Observational Studies
AP Statistics
Observational Studies
Observational Studies: Researchers observe. They
don’t assign choices or manipulate anything
(unlike experiments).
We simply use an existing situation (or existing data),
choosing neither the subjects nor the treatments.
Example: A recent study showed that men who
have had a heart attack have a greater chance of
having a second heart attack if a certain protein is
present in their blood.
Observational Studies
• They are not based on random samples, nor do
they randomly impose treatments. The results
cannot be generalized, nor can they show cause-and-effect.
• They are not, however, worthless.
• They can show us trends and possible
relationships, even if we can’t show cause-and-effect.
• They can show us variables related to certain
outcomes.
Types of Observational Studies
Retrospective: Subjects are selected, and then
their previous conditions and behaviors are
determined.
Restricted to a small part of the population.
Prone to errors because they look at historical data.
Usually focus on estimating differences between
groups or associations between variables.
Types of Observational Studies
Prospective: A study in which subjects are followed
to observe future outcomes.
Focus is on estimating differences among groups
that might appear as the groups are followed.
Because no treatment is applied, it is NOT an
experiment.
Randomized, Comparative
Experiments
• The only method by which we can prove cause-and-effect.
• Suppose we want to see if learning math on a computer is
better than learning it in a traditional classroom. We would
randomly assign half of a group of students to a classroom
where the content was taught only on a computer and the
other half to a classroom where the content was never
taught on a computer, then compare the results.
Randomized, Comparative
Experiments
Comparative just means we are
comparing the results at the end of
the experiment.
Randomized, Comparative
Experiments
The explanatory variable is also called a
“factor.”
Each factor has levels: the values that
the experimenter chooses for the
factor.
Randomized, Comparative
Experiments
An experiment is designed to test the claim that
people who sleep less than 8 hours a night
have a decreased ability to remember
information. The experimenter has obtained 50
subjects and has randomly placed them in two
groups. All subjects will be given a memory test
as a baseline. One group will be required to sleep
at least 8 hours for one night, and the other
group will be prevented from sleeping 8 hours
that night. The next day, each group will be given a
memory test, and differences in the test results will be
recorded.
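Below is a minimal sketch (not part of the original example; the subject IDs are hypothetical) of how the 50 subjects could be randomly split into the two treatment groups.

import random

subjects = list(range(1, 51))             # 50 hypothetical subject IDs
random.shuffle(subjects)                  # randomize the order of the subjects

sleep_group = sorted(subjects[:25])       # required to sleep at least 8 hours
restricted_group = sorted(subjects[25:])  # prevented from sleeping 8 hours

print("8+ hours group:  ", sleep_group)
print("Restricted group:", restricted_group)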
Randomized, Comparative
Experiments
Important Concepts
• The experimenter actively and deliberately
manipulates the factors to control the details
of the possible treatments.
• The subjects are assigned to the treatments
randomly.
Four Principles of Experimental Design
• Control
• Randomize
• Replicate
• Block
Control
• We want to control sources of variation other
than the factors we are testing by making
conditions as similar as possible for all treatment
groups.
– We control a factor by assigning subjects to different
factor levels because we want to see how the
response will change at those different levels.
– We control other sources of variation to prevent them
from changing and affecting the response variable.
Control
• Controlling extraneous sources of variation
reduces the variability of the responses,
making it easier to detect differences among
the treatment groups.
• Making generalizations from the experiment
to other levels of the controlled factor can be
risky.
Randomize
• Allows us to equalize the effects of unknown
or uncontrollable sources of variation
– Doesn’t eliminate the effects of these sources,
but it spreads them out across all treatment
levels, so that they “even out” and can be set
aside.
– If not randomized, you will not be able to draw
conclusions from the experiment
– “control what you can, randomize the rest”
Replicate
• 1st type: We need to repeat the experiment,
applying each treatment to a number of
subjects. If we can’t assess the variation, the
experiment is not complete. The outcome of an
experiment on a single subject is an anecdote,
not an experiment.
Replicate
• 2nd type: Occurs when our experimental units
(subjects) are not representative of the
population of interest. We will need to repeat
the experiment with different experimental
units. Replication of an entire experiment with
the controlled sources of variation at different
levels is an essential step in science. If your
subjects are all from an Intro to Psychology class,
you can’t generalize the results, so you will need
to replicate the experiment with different subjects.
Block
• Sometimes random assignment of our subjects
to treatments is not the way to go.
• Sometimes we need to block. This is when we
group experimental subjects that are known
before the experiment to be similar in some
way that is expected to affect the response to
the treatments.
• The randomization comes within the blocks—
where we assign treatments in each block
Logic to Experimental Design
• Randomization produces groups of subjects
that should be similar in all respects before we
apply treatments
• Comparative design ensures that influences
other than the experimental treatment
operate equally on all groups.
• Therefore, differences in the response variable
must be due to the effects of the treatments
Experimental Diagram
Diagram of a randomized comparative experiment: an experiment designed to test the
effectiveness of the drug hydroxyurea for treating sickle cell anemia. There were 299
adult patients who had at least three episodes of pain from sickle cell anemia in the
past year.
Experimental Design
Randomized Comparative Experiment
An experiment to see what treatment may reduce the number
of repeat offenders.
Randomized Block Design
• Blocking is used instead of completely randomizing
subjects to treatments.
• Blocking is used if it is believed that there will be
differences in how certain groups of subjects will
respond to the explanatory variable(s).
• Blocking may be used if we suspect that some
issue we cannot control may introduce variability
in the response. (Maybe gender will produce
variability in the response; therefore, we will
block by gender.)
Randomized Block Design
• Randomization is introduced when we randomly
assign treatments within each block.
• By blocking, we isolate the variability attributable to
the differences between the blocks, so that we can see
the differences caused by the treatments more clearly.
• We block to reduce variability so we can see the
effects of the factors
• When we block, we are not usually interested in
studying the effects of the blocks themselves (no need
to compare the results between/among the blocks)
Randomized Block Design
Example: Suppose that the drug we are testing
works effectively. That should show up as a
difference in response between the
experimental group and control group.
However, if both groups are mixed gender and
men and women respond differently to the
drug, then the variability between the genders
can drown out the true effect of the drug in
each gender. We won’t see that the drug is
effective.
Randomized Block Design
Example (cont.): We cannot cope with this
variability problem with randomization (can’t
randomize by gender). Instead, we block by
gender to reduce this variability.
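As a rough illustration (the subject labels below are hypothetical, not from the slides), randomizing within blocks might look like this: treatments are shuffled separately inside the men’s block and the women’s block.

import random

blocks = {
    "men":   ["M1", "M2", "M3", "M4", "M5", "M6"],
    "women": ["W1", "W2", "W3", "W4", "W5", "W6"],
}

assignment = {}
for block, members in blocks.items():
    shuffled = members[:]
    random.shuffle(shuffled)              # randomize within this block only
    half = len(shuffled) // 2
    for subject in shuffled[:half]:
        assignment[subject] = "drug"      # treatment group within the block
    for subject in shuffled[half:]:
        assignment[subject] = "control"   # control group within the block

print(assignment)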
Matched Pairs Design
• Subjects are sometimes paired because they are similar
in ways not under study.
• When we match subjects in this way, we can reduce
variability in much the same way as blocking does.
• If we have a study that is trying to determine if playing
sports increases mathematical achievement, we might
want to pair a subject who has a high IQ and plays a
sport with a subject who has a high IQ and does not
play a sport (this would be an observational study).
• The matching would reduce the variation due to IQ
differences.
Matched Pairs Design
• When we have a matched pairs design that is an
experiment, we need to introduce randomization
• Suppose we use a matched pairs design in an
experiment that looks at whether or not children
can determine the difference between the facial
expressions of fear and anger. In this situation,
we could match subjects and then randomly
assign the order in which the pictures of facial
expression are shown. One member of each pair will see
anger and then fear; the other member of the pair will
see fear and then anger.
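A minimal sketch (with hypothetical pair labels, not from the slides) of randomizing the presentation order within each matched pair:

import random

# Each tuple is one matched pair of children (hypothetical labels).
pairs = [("child_A1", "child_A2"), ("child_B1", "child_B2"), ("child_C1", "child_C2")]

for first_child, second_child in pairs:
    orders = ["anger then fear", "fear then anger"]
    random.shuffle(orders)          # randomize the order within the pair
    print(first_child, "->", orders[0])
    print(second_child, "->", orders[1])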
Confounding and Lurking Variables
• Lurking variables are most common in
observational studies
• Confounding variables are most common in
experiments
Statistically Significant
How do we know if the results of an experiment
really show that there is a difference?
Suppose we tested a medication to see if it reduced
blood pressure better than an older medication.
In our results, we calculated that, on average, the new
medication reduced blood pressure by about 10%
more than the older medication. Is this evidence
that the new medication worked better than the
old? What if it only reduced it by 3%? Or by 20%?
Statistically Significant
In order to determine whether the difference is
large enough, we need to ask whether the differences
are statistically significant.
This means that the differences we observed are
too big to be explained by chance differences.
If they are too big to be explained by chance,
then we can attribute the differences to the
experiment.
Statistically Significant
Suppose we flip a coin 100 times and it comes up heads
47 times. Is the coin fair, or is it weighted?
We know it should come up heads 50 times
(theoretically), but this difference between what we
got and what we should get is not large and can be
explained by chance differences (it is not unlikely to get
those results if the coin was fair). However, if we got
30 heads, we may determine that a difference that big
is too large for us to say that it is due to chance
differences (it is very unlikely to get those results if the
coin was fair). This last result shows that the
differences are statistically significant and that the coin
is probably not fair.
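A rough simulation (a sketch, not part of the slides) of why 47 heads is easy to explain by chance while 30 heads is not: it estimates how often a fair coin gives a result at least that extreme in 100 flips.

import random

def prob_at_most(k, trials=10_000):
    """Estimate P(at most k heads) in 100 flips of a fair coin."""
    count = 0
    for _ in range(trials):
        heads = sum(random.random() < 0.5 for _ in range(100))
        if heads <= k:
            count += 1
    return count / trials

print("P(47 or fewer heads):", prob_at_most(47))  # fairly common: explainable by chance
print("P(30 or fewer heads):", prob_at_most(30))  # essentially never: statistically significant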
Statistically Significant
We will learn how to determine that in later
chapters.
Always be skeptical of studies and experiments
that discuss how much better (or worse) one
thing is than another without stating that the
results are statistically significant. That is the
only way to determine if there really is a
difference between two (or more) things.
Experiments and Samples
(differences, similarities and other info)
Similarities
• Both use randomization to get unbiased data
Experiments and Samples
(differences, similarities and other info)
Differences
• Samples try to estimate the population
parameters, so randomizing is an attempt at
having the sample be representative of the
population
• Experiments try to assess the effectiveness of
treatments, so randomization is used to assign
treatments in order to balance out unwanted
sources of variation. Experiments rarely draw their
subjects from random samples of the
population.
Experiments and Samples
(differences, similarities and other info)
Differences
• If our objective is to learn something about a
population
Sample Survey
• If our objective is to see if there is a difference in
the effects of two treatments
Experiment
• If our objective is just to use an existing situation
to look for trends and/or contributing factors
Observational study
Experiments and Samples
(differences, similarities and other info)
Other
• If the subjects in an experiment are not random
samples from the population, be cautious about
generalizing the results of an experiment until the
experiment has been replicated using different
subjects, environments, etc.
• Experiments typically draw stronger conclusions than
surveys (even if experimental subjects are not random
samples).
– Because, by looking only at the differences across
treatment groups, experiments cancel out many of the
sources of bias.
Control Treatments
If we are attempting to see if a medication
effectively reduces anxiety, we don’t just want
to give all our subjects the medication and
record if their anxiety decreased or not.
Instead, we want to compare how much their
anxiety decreased compared to people who
did not take the medication. This comparison
group is the control group, and the treatment it
receives is called the control treatment.
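A minimal sketch (with made-up numbers, purely for illustration) of the comparison this slide describes: average anxiety reduction in the medication group versus the control group.

# Hypothetical anxiety-reduction scores; these numbers are invented for illustration.
medication_group = [12, 9, 15, 11, 10, 14]   # received the medication
control_group    = [4, 6, 3, 7, 5, 4]        # received the control treatment

mean_medication = sum(medication_group) / len(medication_group)
mean_control    = sum(control_group) / len(control_group)

# The quantity of interest is the difference between the two groups.
print("Medication mean reduction:", mean_medication)
print("Control mean reduction:   ", mean_control)
print("Observed difference:      ", mean_medication - mean_control)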
Blinding
Blinding is when we create an environment where
subjects and/or researchers (physicians,
technicians, psychologists, etc.) do not know who
gets the treatment and who gets the placebo.
Who can affect the outcome of an experiment?
Those who could influence the results (the
subjects, treatment administrators, or
technicians, etc)
Those who evaluate the results (judges, treating
physicians, etc)
Blinding
When the individuals in one of those two groups are
blinded, we say that the experiment is single-blind.
When the individuals in both of those groups are
blinded, we say that the experiment is double-blind.
Blinding
It is important to blind because it is easy for a
person’s knowledge of which treatment is
given to which people to influence their
actions and beliefs. Therefore, blinding
eliminates this form of bias.
Placebo
A placebo is a “fake” treatment that looks just
like the treatments being tested.
It is often used as the control group’s
treatment.
It is the best way to blind subjects.
Placebo
Sometimes the group treated with the placebo
(the control group) will show an improvement. This is not
uncommon.
This effect, in which the control group shows an
improvement when treated with the placebo, is called the
placebo effect.
It is not uncommon for 20% or more of subjects who are
given a placebo to report things such as reduced pain,
decreased depression, and improved health.
Characteristics of Good Experiments
• Randomized
• Comparative
• Double-Blind
• Placebo-controlled