MATH408: PROBABILITY & STATISTICS
MATH408: Probability & Statistics
Summer 1999
WEEKS 10 & 11
Dr. Srinivas R. Chakravarthy
Professor of Mathematics and Statistics
Kettering University
(GMI Engineering & Management Institute)
Flint, MI 48504-4898
Phone: 810.762.7906
Email: [email protected]
Homepage: www.kettering.edu/~schakrav
DESIGN OF EXPERIMENTS
• Earlier we talked about the quality of a product
and how statistics is used to continuously
improve it.
• We saw a number of statistical methods to
analyze the data and make interpretations.
DESIGN OF EXPERIMENTS (cont’d)
• One of the important tools of statistics, widely
used for evaluating the quality of a product,
identifying the sources that affect the quality, and
setting the values of the parameters that will
optimize the response variable, is the Design
of Experiments.
• Designing an experiment is like designing a
product. The purpose should be clearly defined to
begin with.
• The experiment should be set up to answer a
specific question or a set of questions.
WHAT IS A DESIGNED EXPERIMENT?
• Enables us to observe the behavior of a particular
aspect of reality.
• Experimental design is an organized approach to
the collection of information.
• In most practical problems, many variables
influence the outcome of an experiment.
• Usually these interact in very complex ways.
• A good design allows for estimation and
interpretation of these interactions.
DESIGNED EXPERIMENT (cont’d)
• An experimenter chooses certain factors and in a
controlled environment varies these factors so as
to observe the effects.
• No statistical tool can rescue data obtained from
experiments conducted haphazardly.
OBJECTIVES
• Maximize the amount of information
• Identify factors that
– (a) affect the average response;
– (b) affect the variability;
– (c) do not contribute significantly.
• Identify the mathematical model relating the
response to the factors.
• Identify “optimum” settings for the factors.
• CONFIRM the settings.
STARTING POINT OF DOE
• Consider the following scenario
• A process engineer in the manufacture of
reinforced PET moldings using an injection-molding
process asks the following question:
– We are manufacturing two different parts using two-cavity injection molds.
– One part, the shaft, is molded in a 55% glass fiber
reinforced PET polyester, while the other part, the tube,
is produced from a 45% fiber reinforced PET.
– Both parts are end gated, and we also know where the
areas of failure are during physical testing of these two
parts.
STARTING POINT OF DOE (cont’d)
• We want to find the optimum molding process.
That is, what should be the levels of the factors:
melt temperature, mold pressure, hold time,
injection speed, and hold pressure that will
optimize the strength of the reinforced PET
moldings?
• Almost all DOEs in practice start with such a
statement.
MAJOR STEPS IN DOE
• Design of experiments (DOE) is an iterative decision-making
process. Like any area of applied science, the steps
involved in DOE can be grouped into three stages:
analysis, synthesis, and evaluation. These stages are
characterized as:
• Analysis: (a) recognizing the problem; (b) formulating
the experimental problem; (c) analyzing the experiment.
• Synthesis: (a) designing the experimental model; (b)
designing the analytical model.
• Evaluation: (a) conducting the experiment; (b) deriving
solution(s) from the model; (c) making appropriate
conclusions and recommendations.
Basic concepts in DOE
• Factor, level, treatment, effect, response, test run,
interaction, blocking, confounding, experimental unit,
replication, randomization, and covariate. Some of these
were seen in our lecture on ANOVA.
• Block: A factor that has influence on the variability of the
response variable.
• Randomization: This refers to assigning the experimental
units randomly to treatments.
• Replication: This refers to the repetition of an experiment.
This should be practiced in all experimental work in order
to increase the precision.
Concepts in DOE (cont’d)
• Block: A group of homogeneous experimental units.
• Confounding: Occurs when one or more effects cannot be
unambiguously attributed to a single factor or
interaction.
• Covariate: An uncontrollable variable that influences the
response but is unaffected by any other experimental
factors. Covariates are not additional responses and hence
their values are not affected by the factors in the
experiment.
• Test run: Single combination of factor levels that yields an
observation on the response.
SELECTION OF VARIABLES
AND FACTORS
• Usually there will be only one response variable and the
objective of the experiment will indicate the response
variable. The response variable can be qualitative or
quantitative.
• The selection of factors is critical and involves a
detailed plan. At first, all possible factors, irrespective
of whether they are practical to measure, should be
considered for the experiment.
• A common approach is to use a cause-and-effect diagram
(refer to Lecture 1 notes for details on this) listing all the
factors.
ILLUSTRATIVE EXAMPLE
• A new brand of printing paper is being considered by a
leading photographic company.
• The study will be focusing on the effects of various factors
on the development time.
• So, the response variable for this is the development time.
• The experiment will consist of the following steps:
– (i) a test negative will be placed on the glass top of a contact
printer;
– (ii) a sample of printing paper will be placed on top of the
negative;
– (iii) the light on the contact printer will be turned on for a specific
amount of time; and
– (iv) the printing paper will be placed on a developing tray until an
image appears.
EXAMPLE (cont’d)
– The following factors are considered to play a role:
(1) exposure time; (2) density of test negative; (3)
temperature of the laboratory where the developing
is done; (4) intensity of exposing light; (5) types of
developer; (6) amount of developer; (7) grade of
printing paper; (8) condition of printing paper; (9)
voltage fluctuations during the experiment; (10)
humidity; (11) number of times the developer will
be used; (12) size of printing paper; and (13)
operator. After careful study, the company decided
to use three factors in the experiment: exposure
time, type of developer, and grade of printing
paper. The remaining factors are either controlled
or treated as part of the experimental error.
DOE STRUCTURE
• The design of experiments refers to the structure
of the experiment with reference to
– the set of treatments included
– the set of experimental units
– the rules by which the treatments are
assigned to the units
– the measurements taken
DOE STRUCTURE (cont’d)
• For example, suppose a teacher wishes to compare the
relative merits of four teaching aids: textbook
only; textbook and class notes; textbook and lab
manual; textbook, lab manual, and class notes.
• Treatments: four teaching aids
• Experimental units: participating students (or
classes)
• Rules: Once the treatments and the experimental
units are selected, rules are required for
assigning the treatments to the experimental units.
RANDOMIZATION
(Sir R. A. Fisher)
• Assigning the units randomly to treatments.
This tends to eliminate the influence of
external factors (or noise factors) not under
the direct control of the experimenter and
avoids selection bias. Variation from these
noise factors can otherwise bias the estimated
effects. To minimize this source of bias,
randomization should be adopted in all
experimental work.
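• A minimal Python sketch of random assignment, assuming equal group sizes. The function name randomize, the unit labels 1 to 12, and the treatment labels A to D are hypothetical and only serve the illustration.

```python
import random

def randomize(units, treatments, seed=None):
    """Randomly assign units to treatments in equal-sized groups."""
    units = list(units)
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)      # a seed makes the plan reproducible
    rng.shuffle(units)             # random order, so no unit is hand-picked
    size = len(units) // len(treatments)
    return {
        t: units[i * size:(i + 1) * size]
        for i, t in enumerate(treatments)
    }

# Hypothetical example: 12 units, 4 treatments
assignment = randomize(range(1, 13), ["A", "B", "C", "D"], seed=1)
for treatment, group in assignment.items():
    print(treatment, sorted(group))
```

Every unit has the same chance of landing in any treatment group, which is the property the randomization principle asks for.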
REPLICATION
(Sir R. A. Fisher)
• Repetition of an experiment.
– For example, if we have 3 treatments and 6 units, the
assignment of 3 units at random to the 3 treatments
constitutes one replication, and the assignment of the
remaining 3 units to the 3 treatments constitutes another
replication of the experiment.
• Replication should be practiced in all DOE work.
• Also replication is used to assess the error mean
square as well as to increase the precision.
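• A small sketch of how replicates yield an error mean square, assuming the pooled within-treatment variance is used as the estimate. The response values are made up for the 3-treatment, 2-replication case; error_mean_square is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean

def error_mean_square(groups):
    """Pooled within-treatment variance and its degrees of freedom."""
    ss = sum((y - mean(g)) ** 2 for g in groups.values() for y in g)
    df = sum(len(g) - 1 for g in groups.values())
    if df == 0:
        raise ValueError("no replication: the error mean square cannot be estimated")
    return ss / df, df

# Hypothetical responses: 3 treatments, 2 replications each (6 units)
data = {"T1": [10.0, 12.0], "T2": [15.0, 17.0], "T3": [20.0, 24.0]}
mse, df = error_mean_square(data)

# Standard error of a treatment mean shrinks as the number of replicates grows
se_two_reps = sqrt(mse / 2)
se_four_reps = sqrt(mse / 4)   # what it would be with 4 replicates and the same MSE
print(mse, df, se_two_reps, se_four_reps)
```

With a single observation per treatment the function raises an error, which mirrors the point that replication is what makes the error estimate possible.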
SOME COMMON PROBLEMS IN DOE
• (a) experimental variation hides true factor effects;
• (b) uncontrolled factors compromise experimental
conclusions;
• (c) one-factor-at-a-time designs will not give a
true picture of many-factor experiments.
COMMONLY USED DESIGNS
• Completely Randomized Designs (CRD)
• Randomized Block Designs (RBD)
• Latin Square Designs (LSD)
• 2^n Factorial Designs
• Fractional Factorial Designs (including
Taguchi’s orthogonal designs)
Completely Randomized Design (CRD)
• This is the basic design.
• All other randomized designs stem from it by
imposing restrictions upon the allocation of the
treatments to the units.
• The units are assigned to treatments at random.
• Thus every unit chosen for the study has an equal
chance of being assigned to any treatment.
• This is useful when the units are homogeneous.
• Most useful in laboratory techniques.
Advantages and Disadvantages of a CRD
• Advantages:
– (1) it is flexible
– (2) its MSE has larger degrees of freedom
– (3) it allows for missing observations
– (4) it has fewer assumptions
• Disadvantages: not suitable when the units are
heterogeneous or the number of treatments is large
ANALYSIS OF A CRD
• The analysis of single-factor studies that we
discussed in ANOVA is applicable and there
is no need to repeat the analysis here.
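• For reference, a minimal sketch of the single-factor ANOVA computation for a CRD. The data are hypothetical, and the unequal group sizes are chosen to show that a missing observation does not break the analysis.

```python
from statistics import mean

def one_way_anova(groups):
    """Single-factor ANOVA for a CRD; groups is a list of response lists."""
    all_obs = [y for g in groups for y in g]
    grand = mean(all_obs)
    ss_trt = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_err = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_trt = len(groups) - 1
    df_err = len(all_obs) - len(groups)
    f_stat = (ss_trt / df_trt) / (ss_err / df_err)
    return {"ss_trt": ss_trt, "ss_err": ss_err,
            "df_trt": df_trt, "df_err": df_err, "F": f_stat}

# Hypothetical data: 3 treatments, one observation missing in the second
result = one_way_anova([[10.0, 12.0, 11.0], [15.0, 17.0], [20.0, 24.0, 22.0]])
print(result)
```

The F value would then be compared with the F distribution on (df_trt, df_err) degrees of freedom, as in the ANOVA lecture.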
Randomized Block Design (RBD)
• When experimental units are heterogeneous, we can
reduce experimental error variability by
sorting the units into homogeneous groups called
blocks.
• The treatments are then randomly assigned within
blocks.
• That is, randomization is restricted.
• This procedure is called BLOCKING.
• Since its development in 1925, the RBD
has become one of the most popular designs.
RBD (cont’d)
• As an example of this design, suppose that a company is
considering buying one of 5 word processors for use in its
offices.
• To study the average time its employees need to
learn the word processors, we could use a CRD if all
employees had the same ability.
• However, this will not be the case. Instead, we can sort the
employees into blocks of 5 with similar ability and randomly
assign the 5 word processors within each block.
• If we had used a CRD any effect that should have been
attributed to blocks would end up in the error term.
• By blocking we remove a source of variation from the
error term.
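• The restricted randomization in the word processor example can be sketched as follows. The employee IDs, block names, and processor labels are hypothetical; the only assumption is that each block holds exactly as many employees as there are word processors.

```python
import random

def randomize_within_blocks(blocks, treatments, seed=None):
    """Assign every treatment exactly once inside each block, in random order."""
    rng = random.Random(seed)
    plan = {}
    for name, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError(f"{name}: block size must equal the number of treatments")
        order = list(treatments)
        rng.shuffle(order)                  # a fresh random order for each block
        plan[name] = dict(zip(units, order))
    return plan

processors = ["WP1", "WP2", "WP3", "WP4", "WP5"]
# Hypothetical blocks: employees grouped by similar ability
blocks = {
    "block 1": ["E01", "E02", "E03", "E04", "E05"],
    "block 2": ["E06", "E07", "E08", "E09", "E10"],
}
plan = randomize_within_blocks(blocks, processors, seed=7)
for name, mapping in plan.items():
    print(name, mapping)
```

Randomization happens only inside a block, so differences between blocks cannot be mistaken for differences between word processors.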
Advantages and Disadvantages of a RBD
• Advantages:
– (1) provides precise results with proper blocking
– (2) no need to have equal sample sizes
– (3) the analysis is simple
– (4) one can bring in more variability among the
experimental units, which usually is the case in
practice.
• Disadvantages: (1) missing observations; (2) DF are not as large
as with a CRD; (3) needs more assumptions.
ANALYSIS OF A RBD
• The analysis of multi-factor studies that we
discussed in ANOVA is applicable and there is no
need to repeat the analysis here.
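• As a reminder of that analysis, here is a minimal sketch for an RBD with one observation per block-treatment cell, assuming no block-by-treatment interaction. The data table is hypothetical, and the last statistic shows what happens when the block variation is left in the error term, as in a CRD-style analysis.

```python
from statistics import mean

def rbd_anova(table):
    """table[i][j] is the response for block i under treatment j."""
    b, t = len(table), len(table[0])
    grand = mean(y for row in table for y in row)
    block_means = [mean(row) for row in table]
    trt_means = [mean(row[j] for row in table) for j in range(t)]

    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_trt = b * sum((m - grand) ** 2 for m in trt_means)
    ss_error = ss_total - ss_block - ss_trt

    df_trt, df_block = t - 1, b - 1
    df_error = df_trt * df_block
    ms_trt = ss_trt / df_trt
    f_blocked = ms_trt / (ss_error / df_error)
    # Same data analyzed without blocks: block variation stays in the error term
    f_unblocked = ms_trt / ((ss_error + ss_block) / (df_error + df_block))
    return {"ss_block": ss_block, "ss_trt": ss_trt, "ss_error": ss_error,
            "df_error": df_error, "F_blocked": f_blocked, "F_unblocked": f_unblocked}

# Hypothetical data: 3 blocks (rows) by 3 treatments (columns)
table = [[10.0, 12.0, 14.0],
         [13.0, 15.0, 18.0],
         [16.0, 19.0, 21.0]]
result = rbd_anova(table)
print(result)
```

For these made-up numbers the treatment F ratio is far larger with blocking than without, which illustrates how blocking removes a source of variation from the error term.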
Effect of a factor
- Change in response produced by a
change in the level of that factor
averaged over the levels of the other
factor(s).
- Magnitude and direction of factor
effects are to be examined to see which
are likely to be important.
INTERACTION
• Exists if the difference in response between the
levels of one factor is not the same at all levels of
the other factor(s).
• Calculated as the average difference between the
effect of A at the high level of B and the effect of A at
the low level of B.
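• A short sketch of these definitions for two factors at two levels each. It assumes the usual two-level convention that "average difference" means half the difference, and it uses made-up cell means with levels coded -1 (low) and +1 (high).

```python
def effects_2x2(y):
    """y[(a, b)]: mean response with A at level a and B at level b (-1 low, +1 high)."""
    a_at_b_low = y[(1, -1)] - y[(-1, -1)]     # effect of A when B is low
    a_at_b_high = y[(1, 1)] - y[(-1, 1)]      # effect of A when B is high
    b_at_a_low = y[(-1, 1)] - y[(-1, -1)]     # effect of B when A is low
    b_at_a_high = y[(1, 1)] - y[(1, -1)]      # effect of B when A is high

    main_a = (a_at_b_high + a_at_b_low) / 2   # averaged over the levels of B
    main_b = (b_at_a_high + b_at_a_low) / 2   # averaged over the levels of A
    interaction_ab = (a_at_b_high - a_at_b_low) / 2
    return main_a, main_b, interaction_ab

# Hypothetical cell means
y = {(-1, -1): 20.0, (1, -1): 30.0, (-1, 1): 40.0, (1, 1): 52.0}
main_a, main_b, interaction_ab = effects_2x2(y)
print(main_a, main_b, interaction_ab)
```

If the effect of A were identical at both levels of B, interaction_ab would be zero.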
FACTORIAL DESIGNS
• In a 2^k design:
– All factor effects will have 1 d.f.
– If there are n replicates, SSE will have (n-1)2^k
d.f.
– Replicates are very important in testing for lack
of fit.
– If n = 1, we have no estimate for error [Why?]
– Use higher-order interactions to get an estimate.
– Plot the estimates on normal probability paper. All
effects that are insignificant will fall on a line.
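• The bookkeeping above can be sketched in a few lines of Python. The function names are hypothetical, the effect estimates are made up, and the plotting position (i - 0.5)/m is one common choice among several, assumed here for illustration.

```python
from itertools import product
from statistics import NormalDist

def full_factorial(k):
    """All 2^k runs, with each factor coded -1 (low) or +1 (high)."""
    return list(product((-1, 1), repeat=k))

def degrees_of_freedom(k, n):
    """d.f. split for a 2^k design with n replicates per run."""
    runs = 2 ** k
    return {"effects": runs - 1,          # one d.f. per main effect or interaction
            "error": (n - 1) * runs,
            "total": n * runs - 1}

def normal_plot_points(effects):
    """Pair ordered effect estimates with standard normal quantiles."""
    ordered = sorted(effects.items(), key=lambda item: item[1])
    m = len(ordered)
    return [(name, est, NormalDist().inv_cdf((i - 0.5) / m))
            for i, (name, est) in enumerate(ordered, start=1)]

design = full_factorial(3)
df_two_reps = degrees_of_freedom(3, 2)
df_one_rep = degrees_of_freedom(3, 1)     # error d.f. is 0: nothing left to estimate error

# Hypothetical effect estimates from an unreplicated 2^3 experiment
estimates = {"A": 11.2, "B": -0.4, "C": 0.7, "AB": 0.3,
             "AC": -0.6, "BC": 0.1, "ABC": -0.2}
points = normal_plot_points(estimates)
for name, est, quantile in points:
    print(f"{name:>3} {est:6.1f} {quantile:6.2f}")
```

With n = 1 all 2^k - 1 degrees of freedom are used by the effects, which answers the [Why?] above. In the listing, estimates that follow the straight-line pattern are treated as noise, while one that sits far from it (A in this made-up case) stands out as likely important.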