Transcript Chapter 1

CHAPTER 1
THE LINEAR REGRESSION MODEL:
AN OVERVIEW
Damodar Gujarati
Econometrics by Example
THE LINEAR REGRESSION MODEL (LRM)
• The general form of the LRM is:
Yi = B1 + B2X2i + B3X3i + … + BkXki + ui
• Or, as written in short form:
Yi = BX + ui
• Y is the regressand, X is a vector of regressors,
and u is an error term.
POPULATION (TRUE) MODEL
Yi = B1 + B2X2i + B3X3i + … + BkXki + ui
This equation is known as the population or
true model.
It consists of two components:
(1) A deterministic component, BX (the conditional
mean of Y, or E(Y|X)).
(2) A nonsystematic, or random component, ui.
REGRESSION COEFFICIENTS
• B1 is the intercept
• B2 to Bk are the slope coefficients
• Collectively, they are the regression coefficients
or regression parameters
• Each slope coefficient measures the (partial) rate
of change in the mean value of Y for a unit change
in the value of a regressor, ceteris paribus
SAMPLE REGRESSION FUNCTION
The sample counterpart is:
Yi = b1 + b2X2i + b3X3i + … + bkXki + ei
Or, as written in short form:
Yi = bX + ei
where e is a residual.
The deterministic component is written as:
Ŷi = b1 + b2X2i + b3X3i + … + bkXki = bX
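To make the split between the fitted part and the residual concrete, here is a minimal Python sketch for the two-variable case (k = 2). The data and the values of b1 and b2 are made up for illustration; they are not taken from the text and are not OLS estimates.

```python
# Hypothetical sample with a single regressor; values are illustrative only.
X = [1.0, 2.0, 3.0, 4.0]
Y = [3.0, 5.5, 6.5, 9.0]

# Assumed sample coefficients (chosen by hand, not estimated).
b1, b2 = 1.0, 2.0

# Deterministic (fitted) component: Y-hat_i = b1 + b2 * X_i
Y_hat = [b1 + b2 * x for x in X]

# Residual: e_i = Y_i - Y-hat_i, so that Y_i = Y-hat_i + e_i
e = [y - y_hat for y, y_hat in zip(Y, Y_hat)]

print(Y_hat)  # [3.0, 5.0, 7.0, 9.0]
print(e)      # [0.0, 0.5, -0.5, 0.0]
```

Each observed Yi is recovered exactly by adding the residual back to the fitted value.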
THE NATURE OF THE Y VARIABLE
• Ratio Scale:
  - The ratio of two variables, the distance between two variables, and
    the ordering of variables are all meaningful
• Interval Scale:
  - The distance between two variables and their ordering are meaningful, but
    not their ratio
• Ordinal Scale:
  - Only the ordering of two variables is meaningful (not ratio or distance)
• Nominal Scale:
  - Categorical or dummy variables, qualitative in nature
THE NATURE OF DATA
• Time Series Data
A set of observations that a variable takes at different
times, such as daily (e.g., stock prices), weekly (e.g.,
money supply), monthly (e.g., the unemployment
rate), quarterly (e.g., GDP), annually (e.g.,
government budgets), quinquennially or every five
years (e.g., the census of manufactures), or
decennially or every ten years (e.g., the census of
population).
THE NATURE OF DATA
• Cross-Section Data
Data on one or more variables collected at the same
point in time.
Examples are the census of population conducted by
the Census Bureau every 10 years, opinion polls
conducted by various polling organizations, and
temperature at a given time in several places.
THE NATURE OF DATA
• Panel, Longitudinal or Micro-panel Data
Combines features of both cross-section and time
series data.
The same cross-sectional units are followed over time.
Panel data are a special type of pooled data. Pooled
data simply combine time series and cross-sectional
observations, and the same cross-sectional units are
not necessarily followed over time.
METHOD OF ORDINARY LEAST SQUARES
• The method of Ordinary Least Squares (OLS) does not
minimize the sum of the error terms, but minimizes
the sum of the squared error terms (the error sum of squares):
Σui² = Σ(Yi − B1 − B2X2i − B3X3i − … − BkXki)²
• To obtain values of the regression coefficients,
partial derivatives of this sum are taken with respect to
each regression coefficient and set equal to zero.
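For the two-variable case (one intercept, one slope), setting the two partial derivatives to zero has a well-known closed-form solution. The sketch below applies it in plain Python. The function names and the data are hypothetical, chosen only to illustrate the minimization.

```python
def ols_two_variable(X, Y):
    """Return (b1, b2) that minimize sum((Y_i - b1 - b2 * X_i) ** 2)."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    # Solution of the two first-order conditions (normal equations):
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    s_xx = sum((x - x_bar) ** 2 for x in X)
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    return b1, b2


def sum_of_squares(X, Y, b1, b2):
    """Sum of squared deviations of Y from the line b1 + b2 * X."""
    return sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))


# Illustrative data, not from the text.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

b1, b2 = ols_two_variable(X, Y)
print(round(b1, 4), round(b2, 4))  # 2.2 0.6

# Moving either coefficient away from the OLS values raises the sum of squares.
best = sum_of_squares(X, Y, b1, b2)
print(best < sum_of_squares(X, Y, b1 + 0.1, b2))  # True
print(best < sum_of_squares(X, Y, b1, b2 - 0.1))  # True
```

With more than one regressor the same idea leads to a system of k linear equations, which is normally solved with a matrix library rather than by hand.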
CLASSICAL LINEAR REGRESSION MODEL
Assumptions of the Classical Linear
Regression Model (CLRM):
A-1: Model is linear in the parameters.
A-2: Regressors are fixed or nonstochastic.
A-3: Given X, the expected value of the error
term is zero, or E(ui |X) = 0.
CLASSICAL LINEAR REGRESSION MODEL
Assumptions of the Classical Linear Regression Model
(CLRM):
A-4: Homoscedastic, or constant, variance of ui, or
var(ui|X) = σ².
A-5: No autocorrelation, or cov(ui,uj|X) = 0, i ≠ j.
A-6: No multicollinearity, or no perfect linear
relationships among the X variables.
A-7: No specification bias.
GAUSS-MARKOV THEOREM
• On the basis of assumptions A-1 to A-7, the OLS method
gives best linear unbiased estimators (BLUE):
(1) Estimators are linear functions of the dependent
variable Y.
(2) The estimators are unbiased; in repeated applications
of the method, the estimators equal their true values on average.
(3) In the class of linear unbiased estimators, OLS estimators have
minimum variance; i.e., they are efficient, or the “best”
estimators.
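Unbiasedness is a statement about repeated sampling, which a small simulation can illustrate. The sketch below is only an illustration under assumed values: the true parameters B1 = 1 and B2 = 2, the normal errors with unit variance, the fixed regressor values, and the number of replications are all hypothetical choices, not part of the text.

```python
import random


def ols_slope(X, Y):
    """OLS slope for the two-variable model Y = b1 + b2 * X + e."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    s_xx = sum((x - x_bar) ** 2 for x in X)
    return s_xy / s_xx


random.seed(0)                        # fixed seed so the run is repeatable
B1, B2 = 1.0, 2.0                     # assumed "true" population parameters
X = [float(i) for i in range(1, 21)]  # regressors held fixed across samples (A-2)

slopes = []
for _ in range(2000):
    # Draw a fresh error term with mean zero and constant variance (A-3, A-4).
    Y = [B1 + B2 * x + random.gauss(0.0, 1.0) for x in X]
    slopes.append(ols_slope(X, Y))

mean_slope = sum(slopes) / len(slopes)
print(mean_slope)                # close to the assumed B2 = 2.0
print(min(slopes), max(slopes))  # individual estimates still vary
```

Any single estimate can miss B2, but the average over many samples settles near it, which is what unbiasedness claims.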
HYPOTHESIS TESTING: t TEST
• To test the following hypothesis:
H0: Bk = 0
H1: Bk ≠ 0
we calculate the following t statistic and use the t table to obtain the
critical t value with n − k degrees of freedom for a given level of
significance (α, commonly 10%, 5%, or 1%):
t = bk / se(bk)
If the absolute value of t is greater than the critical t value, we can reject H0.
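A worked two-variable example may help here. The sketch below uses made-up data and the standard CLRM formula for the slope's standard error, se(b2) = sqrt(σ̂² / Σ(Xi − X̄)²) with σ̂² = RSS/(n − k). That formula is not stated in the text above, so treat it as an assumption of this example. The critical value 3.182 is the usual t-table entry for a two-sided 5% test with 3 degrees of freedom.

```python
import math

# Illustrative data, not from the text.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]
n, k = len(X), 2  # k counts the intercept and one slope

# OLS estimates for the two-variable model.
x_bar = sum(X) / n
y_bar = sum(Y) / n
s_xx = sum((x - x_bar) ** 2 for x in X)
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / s_xx
b1 = y_bar - b2 * x_bar

# Residual sum of squares and the estimated error variance.
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
sigma2_hat = rss / (n - k)

# Standard error of the slope and the t statistic for H0: B2 = 0.
se_b2 = math.sqrt(sigma2_hat / s_xx)
t_stat = b2 / se_b2

t_crit = 3.182  # two-sided 5% critical value, n - k = 3 df (from a t table)
reject_h0 = abs(t_stat) > t_crit
print(round(t_stat, 3), reject_h0)  # 2.121 False
```

Here |t| is below the critical value, so H0 is not rejected at the 5% level for this small sample.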
HYPOTHESIS TESTING: t TEST
• An alternative method is to see whether zero lies within the
(1 − α) confidence interval:
Pr[bk − tα/2 se(bk) ≤ Bk ≤ bk + tα/2 se(bk)] = 1 − α
• If zero lies in this interval, we cannot reject H0.
• The p-value gives the exact level of significance, or the lowest
level of significance at which we can reject H0.
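The interval check is simple arithmetic once bk, se(bk), and the critical t value are known. In the sketch below all three inputs are hypothetical numbers supplied by hand, and the function name is just for illustration.

```python
def confidence_interval(b_k, se_b_k, t_crit):
    """Return (lower, upper) for b_k plus or minus t_crit * se(b_k)."""
    half_width = t_crit * se_b_k
    return b_k - half_width, b_k + half_width


# Hypothetical inputs: a slope estimate, its standard error, and the
# two-sided 5% critical t value for 3 degrees of freedom (from a t table).
lower, upper = confidence_interval(0.6, 0.2828, 3.182)

contains_zero = lower <= 0.0 <= upper
print(round(lower, 3), round(upper, 3), contains_zero)  # -0.3 1.5 True
```

Because zero falls inside the 95% interval, H0: Bk = 0 cannot be rejected at the 5% level, which matches the conclusion a t test on the same numbers would give.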
GOODNESS OF FIT, R²
• R², the coefficient of determination, is an overall measure of
goodness of fit of the estimated regression line.
• It gives the proportion (or percentage) of the total variation in the
dependent variable that is explained by the regressors.
• It is a value between 0 (no fit) and 1 (perfect fit).
• Let:
Explained Sum of Squares (ESS) = Σ(Ŷi − Ȳ)²
Residual Sum of Squares (RSS) = Σei²
Total Sum of Squares (TSS) = Σ(Yi − Ȳ)²
• Then:
R² = ESS/TSS = 1 − RSS/TSS
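The decomposition can be checked numerically. The sketch below fits a two-variable OLS line to made-up data and computes the three sums of squares. The equality ESS + RSS = TSS relies on the fit being OLS with an intercept, which is assumed here.

```python
# Illustrative data, not from the text.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(X)

# Two-variable OLS fit with an intercept.
x_bar = sum(X) / n
y_bar = sum(Y) / n
s_xx = sum((x - x_bar) ** 2 for x in X)
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / s_xx
b1 = y_bar - b2 * x_bar
Y_hat = [b1 + b2 * x for x in X]

# The three sums of squares.
ess = sum((y_hat - y_bar) ** 2 for y_hat in Y_hat)        # explained
rss = sum((y - y_hat) ** 2 for y, y_hat in zip(Y, Y_hat))  # residual
tss = sum((y - y_bar) ** 2 for y in Y)                     # total

r2_from_ess = ess / tss
r2_from_rss = 1.0 - rss / tss
print(round(r2_from_ess, 4), round(r2_from_rss, 4))  # 0.6 0.6
```

Both routes give the same R², so in this example 60% of the variation in Y around its mean is accounted for by the fitted line.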
HYPOTHESIS TESTING: F TEST
• Testing the following hypothesis is equivalent to testing the
hypothesis that all the slope coefficients are 0:
H0: R² = 0
H1: R² ≠ 0
• Calculate the following and use the F table to obtain the critical
F value with k − 1 degrees of freedom in the numerator and n − k
degrees of freedom in the denominator for a given level of
significance:
F = (ESS/df) / (RSS/df) = [R²/(k − 1)] / [(1 − R²)/(n − k)]
If this value is greater than the critical F value, reject H0.
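The R² form of the statistic needs only R², n, and k. The sketch below plugs in hypothetical values (R² = 0.6, n = 5, k = 2). The critical value 10.13 is the usual F-table entry at the 5% level for (1, 3) degrees of freedom, and the function name is illustrative.

```python
def f_statistic(r2, n, k):
    """F = [R^2 / (k - 1)] / [(1 - R^2) / (n - k)]."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


# Hypothetical inputs: R^2 from a fit with n observations and k parameters.
F = f_statistic(0.6, 5, 2)

f_crit = 10.13  # 5% critical value for (k - 1, n - k) = (1, 3) df, from an F table
reject_h0 = F > f_crit
print(round(F, 3), reject_h0)  # 4.5 False
```

With a single slope coefficient the F statistic is the square of that slope's t statistic, so the F test and the two-sided t test lead to the same decision in the two-variable model.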