Statistical Analysis of Experimental Data
• Table 6.1 shows the results of a set of
60 measurements of air temperature in
• These temperature data are observed
values of a random variable.
• A typical problem associated with data
such as these would be to determine
whether it is likely that the temperature
might exceed certain limits.
• Although these data
show no temperatures less
than 1089°C or greater than
1115°C, we might, for
example, ask if there is a
significant chance that the
temperature will ever exceed
1117°C or be less than 1085°C
(either of which might
in some applications).
• This example illustrates a
random variable that can vary
continuously and can take any
real value in a certain domain.
Such a variable is called a
continuous random variable.
• Some experiments produce
discrete (noncontinuous) results,
which are considered to be
values of a discrete random variable.
• Examples of discrete random
variables are the outcome of
tossing a die (which has the only
possible values of 1, 2, 3, 4, 5, or 6)
and fail/no-fail products in quality control.
• To apply statistical analysis to experimental data, the data are
usually characterized by determining parameters that specify the
central tendency and the dispersion of the data.
• The next step is to select a theoretical distribution function that
is most suitable for explaining the behavior of the data. The
theoretical function can then be used to make predictions about
various properties of the data.
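As a minimal sketch of the first step, the snippet below computes common measures of central tendency and dispersion with Python's standard `statistics` module. The readings are hypothetical placeholder values, not the data of Table 6.1, and the helper name `summarize` is introduced only for illustration.

```python
import statistics

# Hypothetical temperature readings (deg C); NOT the values from Table 6.1.
readings = [1101, 1098, 1105, 1101, 1095, 1110, 1101, 1099]

def summarize(data):
    """Return central-tendency and dispersion measures for a sample."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "stdev": statistics.stdev(data),  # sample standard deviation (n - 1)
    }

summary = summarize(readings)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The sample standard deviation (divisor n - 1) is used here because the readings are treated as a sample drawn from a larger population.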
GENERAL CONCEPTS AND DEFINITIONS
The population comprises the entire collection of objects,
measurements, observations, and so on whose properties are under
consideration and about which some generalizations are to be made.
Examples of populations are the entire set of 60-W electric bulbs that
have been produced in a production batch and values of wind speed at
a certain point over a defined period of time.
The mode is the value of the variable that corresponds to the peak value
of the probability of occurrence of the event.
The median is a value or quantity lying at the midpoint of a frequency
distribution of observed values or quantities, such that there is an equal
probability of falling above or below it.
Probability is a numerical value expressing the likelihood of
occurrence of an event relative to all possibilities in a sample space.
The probability of occurrence of an event A is defined as
the number of successful occurrences (m) divided by the
total number of possible outcomes (n) in a sample space,
evaluated for n >> 1:
$P(A) = m/n$
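The role of the condition n >> 1 can be illustrated with a small simulation. The sketch below estimates the probability of rolling a six with a fair die as the ratio m/n for increasing n. The die event, the seed, and the function name `relative_frequency` are assumptions chosen for illustration.

```python
import random

def relative_frequency(event, trials, rng):
    """Estimate P(event) as m/n from n simulated rolls of a fair die."""
    successes = sum(1 for _ in range(trials) if event(rng.randint(1, 6)))
    return successes / trials

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
for n in (10, 1_000, 100_000):
    estimate = relative_frequency(lambda face: face == 6, n, rng)
    print(f"n = {n:>6}: m/n = {estimate:.4f}  (exact 1/6 = {1/6:.4f})")
```

For small n the ratio can be far from 1/6; it settles near the exact value only as n grows large.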
For particular situations, experience has shown that the distribution of the random
variable follows certain mathematical functions. Sample data are used to compute
parameters in these mathematical functions, and then we use the mathematical functions
to predict properties of the parent population. For discrete random variables,
these functions are called probability mass functions. For continuous random variables,
the functions are called probability density functions.
(a) Calculate the expected life of the bearings.
(b) If we pick a bearing at random from this batch,
what is the probability that its life (x) will
be less than 20 h, greater than 20 h, and finally,
exactly 20 h?
Using the cumulative distribution function,
we find that the probability that the lifetime is
less than 15 h is 0.55.
6.3.2 Some Probability Distribution Functions with Engineering Applications
Binomial Distribution The binomial distribution describes
discrete random variables that can have only two possible outcomes: "success" and
"failure."
This distribution has application in production quality control, when the quality
of a product is either acceptable or unacceptable. The following conditions need to be
satisfied for the binomial distribution to be applicable to a certain experiment:
1. Each trial in the experiment can have only the two possible outcomes of success
or failure.
2. The probability of success remains constant throughout the experiment. This
probability is denoted by p and is usually known or estimated for a given population.
3. The experiment consists of n independent trials.
The expected number of successes in n trials for the binomial distribution is
$\mu = np$
The standard deviation of the binomial distribution is
$\sigma = \sqrt{np(1-p)}$
For the data of Example 6.4, calculate the probability of finding up to
and including two defective light bulbs in the sample of four.
Solution: We use Eq. (6.21) for this purpose:
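A cumulative binomial probability of this kind can be sketched as a sum of individual binomial terms. The data of Example 6.4 are not reproduced in this section, so the defect probability `p = 0.10` below is an assumed placeholder, and whether this sum has exactly the form of Eq. (6.21) is also an assumption. Here "success" means finding a defective bulb.

```python
from math import comb, sqrt

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def binomial_cdf(r_max, n, p):
    """Probability of at most r_max successes in n trials."""
    return sum(binomial_pmf(r, n, p) for r in range(r_max + 1))

n = 4      # sample size stated in the problem
p = 0.10   # ASSUMED defect probability; substitute the value from Example 6.4

print(f"P(r <= 2) = {binomial_cdf(2, n, p):.4f}")
print(f"expected defects = {n * p:.2f}")
print(f"standard deviation = {sqrt(n * p * (1 - p)):.3f}")
```

With the real value of p from Example 6.4 substituted, the first printed line gives the requested probability of up to and including two defective bulbs.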
The Poisson distribution is used to estimate the number of random
occurrences of an event in a specified interval of time or space if the
average number of occurrences is already known.
The following two assumptions underlie the Poisson distribution:
1. The probability of occurrence of an event is the same for any two
intervals of the same length.
2. The probability of occurrence of an event is independent of the
occurrence of other events.
The probability of occurrence of x events is given by
$P(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$
where $\lambda$ is the expected or mean number of occurrences during the interval
of interest. The expected value of x for the Poisson distribution, the same as
the mean, is given by
$E(x) = \mu = \lambda$
It has been found in welds joining pipes that there is an average of
five defects per 10 linear meters of weld (0.5 defects per meter).
What is the probability that there will be
(a) a single defect in a weld that is 0.5 m long, or
(b) more than one defect in a weld that is 0.5 m long?
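One way to work this problem is to scale the defect rate to the 0.5 m length and evaluate the Poisson probabilities directly. The sketch below does this with the standard library; the function name `poisson_pmf` and the variable names are illustrative choices.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean count is lam."""
    return exp(-lam) * lam**x / factorial(x)

defects_per_meter = 5 / 10      # five defects per 10 m of weld
lam = defects_per_meter * 0.5   # expected defects in a 0.5 m weld

p_one = poisson_pmf(1, lam)
p_more_than_one = 1 - poisson_pmf(0, lam) - p_one

print(f"lambda = {lam}")
print(f"(a) P(x = 1) = {p_one:.4f}")
print(f"(b) P(x > 1) = {p_more_than_one:.4f}")
```

Part (b) uses the complement: the probability of more than one defect is one minus the probabilities of zero defects and of exactly one defect.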
A normal distribution is a very important statistical data distribution
pattern occurring in many natural phenomena, such as height,
blood pressure, lengths of objects produced by machines, etc.
Certain data, when graphed as a histogram (data values on the horizontal
axis, frequency on the vertical axis), create a bell-shaped
curve known as a normal curve, or normal distribution.
Normal distributions are symmetrical with a single central peak at
the mean (average) of the data. The shape of the curve is described
as bell-shaped with the graph falling off evenly on either side of the
mean. Fifty percent of the distribution lies to the left of the mean and
fifty percent lies to the right of the mean.
The spread of a normal distribution is controlled by the standard
deviation, σ. The smaller the standard deviation, the more concentrated
the data.
The mean and the median are the same in a normal distribution.
The lifetime of a battery is normally distributed with a mean life of 40
hours and a standard deviation of 1.2 hours. Find the probability that a
randomly selected battery lasts longer than 42 hours. Answer: 4.779%
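The stated answer can be checked by converting 42 hours to a z-score and evaluating the upper-tail area. The sketch below writes the normal cumulative distribution in terms of the error function from Python's `math` module; the helper name `normal_cdf` is an illustrative choice.

```python
from math import erf, sqrt

def normal_cdf(x, mean, std):
    """P(X <= x) for a normal distribution, via the error function."""
    z = (x - mean) / std
    return 0.5 * (1 + erf(z / sqrt(2)))

mean_life, std_life = 40.0, 1.2   # hours, from the problem statement
p_longer = 1 - normal_cdf(42.0, mean_life, std_life)

print(f"z = {(42.0 - mean_life) / std_life:.3f}")
print(f"P(life > 42 h) = {p_longer:.3%}")
```

The z-score here is about 1.667, and the tail area beyond it agrees with the quoted 4.779%.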
The results of a test that follows a normal distribution have a mean
value of 10.0 and a standard deviation of 1. Find the probability that a
single reading is
(a) between 9 and 12.
(b) between 8 and 9.55.
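Interval probabilities such as these are differences of two cumulative probabilities. A minimal sketch using `statistics.NormalDist` (Python 3.8 or later) is given below; the helper name `prob_between` is introduced only for illustration.

```python
from statistics import NormalDist

# Mean and standard deviation taken from the problem statement.
readings = NormalDist(mu=10.0, sigma=1.0)

def prob_between(dist, low, high):
    """P(low < X < high) as a difference of cumulative probabilities."""
    return dist.cdf(high) - dist.cdf(low)

p_a = prob_between(readings, 9.0, 12.0)
p_b = prob_between(readings, 8.0, 9.55)

print(f"(a) P(9 < x < 12)   = {p_a:.4f}")
print(f"(b) P(8 < x < 9.55) = {p_b:.4f}")
```

In terms of z-scores, part (a) spans z = -1 to z = 2 and part (b) spans z = -2 to z = -0.45, so the same values can be read from a standard normal table.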
Reading from the chart, we see that approximately 19.1% of
normally distributed data is located between the mean (the peak)
and 0.5 standard deviations to the right (or left) of the mean. (The
percentages are represented by the area under the curve.)
• 50% of the distribution lies within 0.67448 standard deviations of the mean.
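Both figures quoted above can be reproduced numerically rather than read from a chart. The sketch below uses the standard normal distribution from `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area between the mean and 0.5 standard deviations to one side.
area_half_sigma = z.cdf(0.5) - z.cdf(0.0)

# z-value with 75% of the area below it, so the central band
# from -z to +z holds 50% of the distribution.
z_quartile = z.inv_cdf(0.75)

print(f"area between mean and 0.5 sigma: {area_half_sigma:.3%}")
print(f"half-width holding 50% of the area: {z_quartile:.5f} sigma")
```

By symmetry, the 75th percentile is used because 25% of the area lies on each side of the mean inside the central 50% band.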
The central limit theorem states that