#### Transcript Slide 1

```3
Chapter 4
Describing the
Relation between
Two Variables
Section 4.1 Scatter Diagrams and Correlation
© 2010 Pearson Prentice Hall. All rights
reserved
EXAMPLE
Drawing and Interpreting a Scatter Diagram
The data shown to the right are based
on a study for drilling rock. The
researchers wanted to determine
whether the time it takes to dry drill a
distance of 5 feet in rock increases with
the depth at which the drilling begins.
So, depth at which drilling begins is the
explanatory variable, x, and time (in
minutes) to drill five feet is the response
variable, y. Draw a scatter diagram of
the data.
Source: Penner, R., and Watts, D.G. “Mining Information.” The American
Statistician, Vol. 45, No. 1, Feb. 1991, p. 6.
Various Types of Relations in a Scatter Diagram
EXAMPLE Determining the Linear Correlation Coefficient
Determine the linear correlation
coefficient of the drilling data.
$r = \frac{\sum \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)}{n - 1} = \frac{8.501037}{12 - 1} = 0.773$
EXAMPLE
Does a Linear Relation Exist?
Determine whether a linear relation exists between time to drill five feet and depth at
which drilling begins. Comment on the type of relation that appears to exist between
time to drill five feet and depth at which drilling begins.
The correlation between drilling depth and
time to drill is 0.773. The critical value for
n = 12 observations is 0.576. Since 0.773
> 0.576, there is a positive linear relation
between time to drill five feet and depth
at which drilling begins.
According to data obtained from the Statistical Abstract of the United States,
the correlation between the percentage of the female population with a
bachelor’s degree and the percentage of births to unmarried mothers since
1990 is 0.940.
Does this mean that a higher percentage of females with bachelor’s degrees
causes a higher percentage of births to unmarried mothers?
Certainly not! The correlation exists only because both percentages have
been increasing since 1990. It is this relation that causes the high correlation.
In general, time series data (data collected over time) will have high
correlations because each variable is moving in a specific direction over time
(both going up or down over time; one increasing, while the other is
decreasing over time).
When data are observational, we cannot claim a causal relation exists
between two variables. We can only claim causality when the data are
collected through a designed experiment.
Another way that two variables can be related even though there is not a causal
relation is through a lurking variable.
A lurking variable is related to both the explanatory and response variables.
For example, ice cream sales and crime rates have a very high correlation. Does
this mean that local governments should shut down all ice cream shops? No! The
lurking variable is temperature. As air temperatures rise, both ice cream sales and
crime rates rise.
This study is a prospective cohort study, which is an observational study.
Therefore, the researchers cannot claim that increased cola consumption causes a
decrease in bone mineral density.
Some lurking variables in the study that could confound the results are:
• body mass index
• height
• smoking
• alcohol consumption
• calcium intake
• physical activity
Section 4.2 Least-squares Regression
Using the following sample data:
(a) Find a linear equation that relates x (the explanatory variable) and y (the
response variable) by selecting two points and finding the equation of the line
containing the points.
Using (2, 5.7) and (6, 1.9):
$m = \frac{5.7 - 1.9}{2 - 6} = -0.95$
$y - y_1 = m(x - x_1)$
$y - 5.7 = -0.95(x - 2)$
$y - 5.7 = -0.95x + 1.9$
$y = -0.95x + 7.6$
(b) Graph the equation on the scatter diagram.
[Figure: scatter diagram of the sample data with the line y = -0.95x + 7.6; both axes run from 0 to 7]
(c) Use the equation to predict y if x = 3.
$y = -0.95x + 7.6 = -0.95(3) + 7.6 = 4.75$
The difference between the observed value of y and the predicted value of y is the
error, or residual.
Using the line from the last example, and the predicted value at x = 3:
residual = observed y – predicted y
= 5.2 – 4.75
= 0.45
[Figure: scatter diagram marking the observed point (3, 5.2) and its residual, 5.2 - 4.75 = 0.45, relative to the line]
EXAMPLE Finding the Least-squares Regression Line
Using the drilling data
(a) Find the least-squares regression line.
(b) Predict the drilling time if drilling starts at
130 feet.
(c) Is the observed drilling time at 130 feet
above or below average?
(d) Draw the least-squares regression line on
the scatter diagram of the data.
(a) We agree to round the estimates of the slope and intercept to four
decimal places.
$\hat{y} = 0.0116x + 5.5273$
(b) $\hat{y} = 0.0116(130) + 5.5273 = 7.035$
(c) The observed drilling time is 6.93 minutes. The predicted drilling
time is 7.035 minutes. The observed drilling time of 6.93 minutes is below
average.
(d) [Figure: scatter diagram of Time to Drill 5 Feet versus Depth Drilling Begins, with the least-squares regression line drawn]
Interpretation of Slope: The slope of the regression line is 0.0116. For each
additional foot of depth at which drilling begins, the time to drill five feet increases by 0.0116
minutes, on average.
Interpretation of the y-Intercept: The y-intercept of the regression line is 5.5273. To
interpret the y-intercept, we must first ask two questions:
1. Is 0 a reasonable value for the explanatory variable?
2. Do any observations near x = 0 exist in the data set?
A value of 0 is reasonable for the drilling data (this indicates that drilling begins at the
surface of Earth). The smallest observation in the data set is x = 35 feet, which is
reasonably close to 0. So, interpretation of the y-intercept is reasonable.
The predicted time to drill five feet when we begin drilling at the surface of Earth is 5.5273
minutes.
If the least-squares regression line is used to make predictions based on values of
the explanatory variable that are much larger or much smaller than the observed
values, we say the researcher is working outside the scope of the model. Never
use a least-squares regression line to make predictions outside the scope of the
model because we can’t be sure the linear relation continues to exist.
To illustrate the fact that the sum of squared residuals for a least-squares
regression line is less than the sum of squared residuals for any other line, use
the “regression by eye” applet.
Section 4.3 Diagnostics on the Least-squares
Regression Line
The coefficient of determination, $R^2$, measures the
proportion of total variation in the response variable that is
explained by the least-squares regression line.
The coefficient of determination is a number between 0 and 1,
inclusive. That is, $0 \le R^2 \le 1$.
If $R^2 = 0$, the line has no explanatory value.
If $R^2 = 1$, the line explains 100% of the variation in
the response variable.
The data to the right are based
on a study for drilling rock. The
researchers wanted to
determine whether the time it
takes to dry drill a distance of 5
feet in rock increases with the
depth at which the drilling
begins. So, depth at which
drilling begins is the predictor
variable, x, and time (in
minutes) to drill five feet is the
response variable, y.
Source: Penner, R., and Watts, D.G. “Mining
Information.” The American Statistician, Vol. 45, No. 1,
Feb. 1991, p. 6.
Sample Statistics
         Mean     Standard Deviation
Depth    126.2    52.2
Time     6.99     0.781
Correlation Between Depth and Time: 0.773
Regression Analysis
The regression equation is
Time = 5.53 + 0.0116 Depth
Suppose we were asked to predict the time to
drill an additional 5 feet, but we did not know
the current depth of the drill. What would be
our best “guess”?
The mean time to drill an additional 5 feet: 6.99 minutes
Now suppose that we are asked to predict the
time to drill an additional 5 feet if the current
depth of the drill is 160 feet.
Our “guess” increased from 6.99 minutes to 7.39
minutes based on the knowledge that drill depth
is positively associated with drill time.
The difference between the observed value
of the response variable and the mean
value of the response variable is called the
total deviation and is equal to $y - \bar{y}$.
The difference between the predicted value
of the response variable and the mean value
of the response variable is called the
explained deviation and is equal to $\hat{y} - \bar{y}$.
The difference between the observed value
of the response variable and the predicted
value of the response variable is called the
unexplained deviation and is equal to $y - \hat{y}$.
Total Variation = Unexplained Variation + Explained Variation
$1 = \frac{\text{Unexplained Variation}}{\text{Total Variation}} + \frac{\text{Explained Variation}}{\text{Total Variation}}$
$\frac{\text{Explained Variation}}{\text{Total Variation}} = 1 - \frac{\text{Unexplained Variation}}{\text{Total Variation}}$
To determine $R^2$ for the linear regression model, simply
square the value of the linear correlation coefficient.
EXAMPLE
Determining the Coefficient of Determination
Find and interpret the coefficient of determination
for the drilling data.
Because the linear correlation coefficient, r, is
0.773, we have that
$R^2 = 0.773^2 = 0.5975 = 59.75\%$.
So, 59.75% of the variability in drilling time is
explained by the least-squares regression line.
Draw a scatter diagram for each of these data sets.
For each data set, the variance of y is 17.49.
[Figure: scatter diagrams of Data Set A, Data Set B, and Data Set C]
Residuals play an important role in determining the
adequacy of the linear model. In fact, residuals
can be used for the following purposes:
• To determine whether a linear model is
appropriate to describe the relation between the
predictor and response variables.
• To determine whether the variance of the
residuals is constant.
• To check for outliers.
If a plot of the residuals against the predictor
variable shows a discernible pattern, such as
a curve, then the response and predictor
variables may not be linearly related.
A chemist has a 1000-gram sample of a
radioactive material. She records the amount of
radioactive material remaining in the sample
every day for a week and obtains the following
data.
Day    Weight (in grams)
0      1000.0
1      897.1
2      802.5
3      719.8
4      651.1
5      583.4
6      521.7
7      468.3
Linear correlation coefficient: -0.994
Linear model not appropriate
If a plot of the residuals against the
explanatory variable shows the spread of
the residuals increasing or decreasing as
the explanatory variable increases, then a
strict requirement of the linear model is
violated.
This requirement is called constant error
variance. The statistical term for constant
error variance is homoscedasticity.
A plot of residuals against the explanatory
variable may also reveal outliers. These
values will be easy to identify because the
residual will lie far from the rest of the plot.
EXAMPLE Residual Analysis
Draw a residual plot of the drilling time data.
Comment on the appropriateness of the linear
least-squares regression model.
Boxplot of Residuals for the Drilling Data
An influential observation is one that has a
disproportionate effect on the value of the
slope and y-intercept in the least-squares
regression equation.
[Figure: scatter diagram; horizontal axis: Explanatory, x]
Influential observations typically exist when the point
is large relative to its X value. So, Case 3 is likely
influential.
EXAMPLE
Influential Observations
Suppose an additional data point is added to the
drilling data. At a depth of 300 feet, it took 12.49
minutes to drill 5 feet. Is this point influential?
[Figure: regression lines fitted with and without the influential observation]
As with outliers, influential observations should be
removed only if there is justification to do so.
When an influential observation occurs in a data
set and its removal is not warranted, there are two
courses of action:
(1) Collect more data so that additional points
near the influential observation are obtained, or
(2) Use techniques that reduce the influence of
the influential observation (such as a
transformation or a different method of estimation, e.g., minimizing absolute deviations).
Section 4.4 Contingency Tables and Association
A professor at a community college in New Mexico conducted a study to assess
the effectiveness of delivering an introductory statistics course via a traditional
lecture-based method, online delivery (no classroom instruction), and hybrid
instruction (online course with weekly meetings). The grades students
received in each of the courses were tallied.
The table is referred to as a contingency table, or two-way table, because it
relates two categories of data. The row variable is grade, because each row in the
table describes the grade received for each group. The column variable is
delivery method. Each box inside the table is referred to as a cell.
A marginal distribution of a variable is a frequency or
relative frequency distribution of either the row or
column variable in the contingency table.
EXAMPLE
Determining Frequency Marginal Distributions
A professor at a community college in New Mexico conducted a study to assess
the effectiveness of delivering an introductory statistics course via a traditional
lecture-based method, online delivery (no classroom instruction), and hybrid
instruction (online course with weekly meetings). The grades students
received in each of the courses were tallied. Find the frequency marginal
distributions for course grade and delivery method.
EXAMPLE
Determining Relative Frequency Marginal Distributions
Determine the relative frequency marginal distribution for course grade and delivery
method.
A conditional distribution lists the relative
frequency of each category of a variable given a
specific value of the other variable in the
contingency table.
EXAMPLE
Determining a Conditional Distribution
Construct a conditional distribution
of course grade by method of
delivery. Comment on any type of
association that may exist between
course grade and delivery method.
It appears that students in the hybrid course are
more likely to pass (A, B, or C) than students taught by the other two
methods.
EXAMPLE
Drawing a Bar Graph of a Conditional Distribution
Using the results of the previous example, draw a bar graph that represents the
conditional distribution of grade earned by method of delivery.
The following contingency table shows the survival status and demographics of
passengers on the ill-fated Titanic. Draw a conditional bar graph of survival status by
demographic characteristic.
[Figure: bar graph titled "Survival Status on the Titanic"; vertical axis: Relative Frequency, from 0 to 0.9]
            Men           Women          Boys        Girls
Survived    0.19716647    0.753554502    0.453125    0.6
Died        0.80283353    0.246445498    0.546875    0.4
EXAMPLE
Insulin-dependent (or Type 1) diabetes is a disease that results in the
permanent destruction of insulin-producing beta cells of the pancreas. Type 1
diabetes is lethal unless treatment with insulin injections replaces the missing
hormone. Individuals with insulin-independent (or Type 2) diabetes can
produce insulin internally. The data shown in the table below represent the
survival status of 902 patients with diabetes by type over a 5-year period.
            Type 1    Type 2    Total
Survived    253       326       579
Died        105       218       323
Total       358       544       902
From the table, the proportion of patients with Type 1 diabetes who died
was 105/358 = 0.29; the proportion of patients with Type 2 diabetes who
died was 218/544 = 0.40. Based on this, we might conclude that Type 2
diabetes is more lethal than Type 1 diabetes.
However, Type 2 diabetes is usually contracted after the age of 40. If we account for
the variable age and divide our patients into two groups (those 40 or younger and
those over 40), we obtain the data in the table below.
            Type 1            Type 2            Total
            ≤ 40     > 40     ≤ 40     > 40
Survived    129      124      15       311      579
Died        1        104      0        218      323
Total       130      228      15       529      902
Of the diabetics 40 years of age or younger, the proportion of those with Type 1
diabetes who died is 1/130 = 0.008; the proportion of those with Type 2 diabetes
who died is 0/15 = 0.
Of the diabetics over 40 years of age, the proportion of those with Type 1 diabetes
who died is 104/228 = 0.456; the proportion of those with Type 2 diabetes who
died is 218/529 = 0.412.
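Within each age group, the death rate for Type 2 diabetes is no higher than
the death rate for Type 1 diabetes. The lurking variable age led us to believe
that Type 2 diabetes is the more dangerous type of diabetes.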
Simpson’s Paradox represents a situation in which an association between
two variables inverts or goes away when a third variable is introduced to the
analysis.
Section 4.5 Nonlinear Regression
Exponential Model:
$y = ab^x$
Power Model:
$y = ax^b$
EXAMPLE Using the Definition of a Logarithm
Rewrite the logarithmic expression as an
equivalent expression involving an exponent.
Rewrite the exponential expression as an
equivalent logarithmic expression.
(a) $\log_3 15 = a$
(b) $4^5 = z$
In the following properties, M, N, and a are
positive real numbers, with $a \neq 1$, and r is
any real number.
$\log_a(MN) = \log_a M + \log_a N$
$\log_a M^r = r \log_a M$
EXAMPLE
Simplifying Logarithms
Write the following logarithms as the sum of
logarithms. Express exponents as factors.
(a) $\log_2 x^4$
(b) $\log_5(a^4 b)$
If a = 10 in the expression $y = \log_a x$, the
resulting logarithm, $y = \log_{10} x$, is called the
common logarithm. It is common
practice to omit the base, a, when it is
equal to 10 and write the common
logarithm as $y = \log x$.
EXAMPLE
Evaluating Exponential and Logarithmic Expressions
Evaluate the following expressions. Round your
answers to three decimal places.
(a) $\log 23$
(b) $10^{2.6}$
Exponential Model: $y = ab^x$
Take the common logarithm of both sides:
$\log y = \log(ab^x)$
$\log y = \log a + \log b^x$
$\log y = \log a + x \log b$
$Y = A + Bx$
where $a = 10^A$ and $b = 10^B$
EXAMPLE 4
Finding the Curve of Best Fit to
an Exponential Model
A chemist has a 1000-gram sample of a
radioactive material. She records the amount of
radioactive material remaining in the sample
every day for a week and
obtains the following
data.
Day    Weight (in grams)
0      1000.0
1      897.1
2      802.5
3      719.8
4      651.1
5      583.4
6      521.7
7      468.3
(a) Draw a scatter diagram of the data treating the
day, x, as the predictor variable.
(b) Determine Y = log y and draw a scatter diagram
treating the day, x, as the predictor variable and
Y = log y as the response variable. Comment on the
shape of the scatter diagram.
(c) Find the least-squares regression line of the
transformed data.
(d) Determine the exponential equation of best fit and
graph it on the scatter diagram obtained in part (a).
(e) Use the exponential equation of best fit to predict
the amount of radioactive material left after 8 days.
Power Model: $y = ax^b$
Take the common logarithm of both sides:
$\log y = \log(ax^b)$
$\log y = \log a + \log x^b$
$\log y = \log a + b \log x$
$Y = A + bX$
where $a = 10^A$
EXAMPLE Finding the Curve of Best Fit to a
Power Model
Cathy wishes to measure the relation between a light bulb’s intensity and
the distance from some light source. She measures a 40-watt light
bulb’s intensity 1 meter from the bulb and at 0.1-meter intervals up to 2
meters from the bulb and obtains the following data.
Distance (meters)    Intensity
1.0                  0.0972
1.1                  0.0804
1.2                  0.0674
1.3                  0.0572
1.4                  0.0495
1.5                  0.0433
1.6                  0.0384
1.7                  0.0339
1.8                  0.0294
1.9                  0.0268
2.0                  0.0224
(a) Draw a scatter diagram of the data treating the
distance, x, as the predictor variable.
(b) Determine X = log x and Y = log y and draw a scatter
diagram treating X = log x as the predictor
variable and Y = log y as the response variable.
Comment on the shape of the scatter diagram.
(c) Find the least-squares regression line of the
transformed data.
(d) Determine the power equation of best fit and graph it
on the scatter diagram obtained in part (a).
(e) Use the power equation of best fit to predict the
intensity of the light if you stand 2.3 meters away from
the bulb.
Modeling is not only a science but also an art
form. Selecting an appropriate model requires
experience and skill in the field in which you are
modeling. For example, knowledge of
economics is imperative when trying to determine
a model to predict unemployment. The main
reason for this is that there are theories in the
field that can help the modeler to select
appropriate relations and variables.
```
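
The log-transform approach from Section 4.5 can be sketched in code. The block below is a minimal illustration, not the method used to produce the slides: it regresses Y = log y on x for the exponential model, and Y = log y on X = log x for the power model, then converts the fitted coefficients back with a = 10^A and b = 10^B. The function names `least_squares`, `fit_exponential`, and `fit_power` are hypothetical helpers introduced here. The data are the radioactive-decay and light-intensity tables from the transcript.

```python
import math


def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fit_exponential(xs, ys):
    """Fit y = a * b**x by regressing Y = log10(y) on x (hypothetical helper)."""
    B, A = least_squares(xs, [math.log10(y) for y in ys])
    return 10 ** A, 10 ** B  # a = 10^A, b = 10^B


def fit_power(xs, ys):
    """Fit y = a * x**b by regressing Y = log10(y) on X = log10(x) (hypothetical helper)."""
    b, A = least_squares([math.log10(x) for x in xs],
                         [math.log10(y) for y in ys])
    return 10 ** A, b  # a = 10^A; the slope is the exponent b


# Radioactive-decay data from the transcript: day, grams remaining.
days = [0, 1, 2, 3, 4, 5, 6, 7]
weights = [1000.0, 897.1, 802.5, 719.8, 651.1, 583.4, 521.7, 468.3]
a_exp, b_exp = fit_exponential(days, weights)
pred_day8 = a_exp * b_exp ** 8  # part (e): predicted amount after 8 days

# Light-intensity data from the transcript: distance in meters, intensity.
distances = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
intensities = [0.0972, 0.0804, 0.0674, 0.0572, 0.0495, 0.0433,
               0.0384, 0.0339, 0.0294, 0.0268, 0.0224]
a_pow, b_pow = fit_power(distances, intensities)
pred_2_3 = a_pow * 2.3 ** b_pow  # part (e): predicted intensity at 2.3 meters

print(f"Exponential model: y = {a_exp:.1f} * ({b_exp:.4f})^x")
print(f"Predicted weight after 8 days: {pred_day8:.1f} grams")
print(f"Power model: y = {a_pow:.4f} * x^({b_pow:.4f})")
print(f"Predicted intensity at 2.3 meters: {pred_2_3:.4f}")
```

Both predictions fall outside the observed range of the explanatory variable (day 8 and 2.3 meters), so the caution about working outside the scope of the model applies to them as well.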