Decision and Risk Analysis

DRA/K V
Decision and Risk Analysis
Business Forecasting and
Regression Analysis
Kiriakos Vlahos
Spring 2000
Session overview
• Why do we need forecasting?
• Overview of forecasting techniques
• The components of time series
– Trend
– Seasonality
– Cycles
– Randomness
• Trend curves
• Causal forecasting and regression
analysis
• Judgemental forecasting
• Scenario planning
All forecasts are wrong
Those who claim to forecast the future
are all lying even if, by chance,
they are later proved right
Forecasting is ...
Forecasting is like trying to drive a car
blindfolded and following directions
given by a person who is looking
out of the back window
Forecasting in business
Forecasting in business is like sex in
society, we have to have it, we cannot
get along without it, everyone is doing
it, one way or another, but nobody is
sure he is doing it the best way.
G W Plossl
Last Frontiers for Profits
Forecasting in
organisations
• Marketing
– Sales, prices, social and economic
trends
• Production
– Demand, costs, employment and
machinery requirements
• Finance
– Costs, sales, capital expenditure,
economic climate
• R&D
– Technological developments, new
products
• Top management
– Total sales, costs, pricing, economic
trends, competitors’ positioning
Formal vs. informal
forecasting
• Forecasting is a very common activity
• The majority of forecasting is informal
• Why do we need formal forecasting?
– Coping with complexity
– Coping with growth
– Coping with change
– Need for auditability and justification
Formal forecasting provides a vehicle for
communication about the forecast and a
basis for systematic improvement.
Characteristics of
forecasting problems
• Time horizon
– short-term
– long-term
• Data patterns
– Seasonality
– Trend
– Cycles
– Randomness
• Cost
• Complexity
• Accuracy
Data patterns - Trend
Medium to long term movements
Upward or downward
e.g.
[Figure: time series showing a trend, 1978 to 1987]
Data patterns - cycles
Long-term irregular movements,
e.g.
Government debt since the
American revolution
Data patterns - Seasonality
Regular periodic oscillations.
They can be monthly, quarterly, etc.
Additive or multiplicative
e.g.
[Figure: Turnover (£m), Jan-84 to Jan-88]
Data patterns - Random oscillations
Unsystematic oscillations around a
constant mean.
No trend, cycle or seasonality
[Figure: random series fluctuating between about 85 and 115]
Classification of
forecasting methods
Forecasting methods
• Quantitative
– Time Series: Trend Curves, Seasonal Decomposition, Exponential Smoothing
– Causal Forecasting: Regression, Econometric forecasting
– Tracking signals
• Judgemental
– Individual
– Group: Committees, Delphi
– Scenario planning
– Market surveys
– Role playing
Regression overview
• Why understanding relationships is
important
• Visual tools for analysing relationships
• Correlation
– Interpretation
– Pitfalls
• Regression
– Building models
– Interpreting and evaluating models
– Assessing model validity
– Data transformations
– Use of dummy variables
Why analysing
relationships is important
• Development of theory in the social
sciences and empirical testing
• Finance e.g.
– How are stock prices affected by
market movements?
– What is the impact of mergers on
stockholder value?
• Marketing e.g.
– How effective are different types of
advertising?
– Do promotions simply shift sales
without affecting overall volume?
• Economics e.g.
– How do interest rates affect
consumer behaviour?
– How do exchange rates influence
imports and exports?
Sales vs. advertising
[Figure: scatter plot of sales (units) against advertising (£000)]
Estimating betas
The slope of this line is called the beta of
the stock and is an estimate of its market
risk.
Scatter plots
• What are they?
A graphical tool for examining the
relationship between variables
• What are they good for?
For determining
• Whether variables are related
• the direction of the relationship
• the type of relationship
• the strength of the relationship
Correlation
• What is it?
A measure of the strength of linear
relationships between variables
• How to calculate?
a) Calculate standard deviations $s_x$, $s_y$
b) Calculate the correlation using the
formula
$r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y}$
• Possible values
From -1 to 1
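The two calculation steps can be sketched in Python as follows. This is a minimal illustration, not part of the course material: the function name `correlation` and the sample values are made up for the example, and the standard deviations use the N - 1 divisor so that they match the formula.

```python
import math

def correlation(x, y):
    """Sample correlation: sum of cross-deviations / ((N - 1) * s_x * s_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Step a) sample standard deviations (divisor N - 1)
    s_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    # Step b) plug into the correlation formula
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross / ((n - 1) * s_x * s_y)

# Made-up illustrative values: a perfectly increasing and a perfectly
# decreasing linear relationship sit at the two ends of the range.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 6.0, 8.0, 10.0]
y_down = [10.0, 8.0, 6.0, 4.0, 2.0]
print(correlation(x, y_up))
print(correlation(x, y_down))
```

Excel's CORREL function should give the same figure for the same pair of columns.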
Interpreting the
correlation
Correlation Pitfalls
• Correlation measures only linear
relationships
• Existence of a relationship does not
imply causality
• Even if there exists a causal
relationship, the direction may not be
obvious
Correlation and
Causality
Many nations see improving communications as vital
to boost overall economy. A 1% increment in
telephone density yields an increment of about 0.1%
in per-capita GNP, according to a 1983 OECD-ITU
study.
AT&T advertisement in Fortune Dec 97
Ferric Processing
What are the factors influencing
production costs?
• Plant age
• Capacity
• Plant location
• Other plant features
[Diagram: each factor points to Production costs with a question mark]
Predicting production cost is important
for the negotiation of 5-year contracts
with steel companies
Visual inspection
a) Construct scatter plot
[Figure: scatter plot of cost/ton ($) against capacity (000 tons/month)]
b) Calculate correlation
(Excel function CORREL)
The correlation between cost
and capacity is -0.84
c) Candidate model
Cost = a + b Capacity
Simple Linear
Regression
Simple regression estimates a linear
equation which corresponds to a
straight line that passes through the
data
[Figure: scatter plot of cost/ton ($) against capacity (000 tons/month) with the fitted regression line]
Regression model
Cost = 25.2 - 4.4 Capacity
• Cost: dependent variable
• 25.2: constant or intercept
• -4.4: coefficient or slope
• Capacity: independent or explanatory variable
Least squares
[Figure: scatter plot of cost/ton ($) against capacity (000 tons/month) with the regression line; the residuals are marked]
• Residuals are the vertical distances of
the points from the regression line
• In least squares regression
– The sum of squared residuals is
minimised
– The mean of residuals is zero
– residuals are assumed to be
randomly distributed around the
mean according to the normal
distribution
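The least squares properties listed above can be checked with a short Python sketch. The capacity and cost values below are hypothetical numbers chosen only for illustration (they are not the Ferric data), and `fit_line` is an assumed helper name that uses the standard closed-form least squares solution for one explanatory variable.

```python
def fit_line(x, y):
    """Closed-form least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data, for illustration only
capacity = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
cost = [27.0, 20.0, 17.5, 15.0, 15.5, 14.0]

a, b = fit_line(capacity, cost)
# Residuals: vertical distances of the points from the fitted line
residuals = [c - (a + b * q) for q, c in zip(capacity, cost)]
sse = sum(r ** 2 for r in residuals)

print("intercept:", a, "slope:", b)
print("mean of residuals:", sum(residuals) / len(residuals))
print("sum of squared residuals:", sse)
```

The mean of the residuals comes out as zero up to rounding error, and any other choice of intercept or slope gives a larger sum of squared residuals.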
Excel output
Regression Statistics
Multiple R            0.84
R Square              0.70
Adjusted R Square     0.66   (observe adjusted R2)
Standard Error        2.33   (s)
Observations          10

ANOVA
              df    SS        MS        F        Significance F
Regression    1     100.65    100.65    18.47    0.00
Residual      8     43.59     5.45
Total         9     144.23

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept     25.19          1.86             13.55    0.00      20.91       29.48
Capacity      -4.40          1.02             -4.30    0.00      -6.77       -2.04

Annotations: read the equation from the Coefficients column; sb is the Standard Error of the Capacity coefficient; observe the t Stat and P-value statistics.
The standard error s is simply the st. deviation of the
residuals (a measure of variability)
R2 is the most widely used measure of goodness of fit.
$R^2 = 1 - \frac{s^2}{s_y^2} = 1 - \frac{\text{residual variance}}{\text{dependent variable variance}}$
It can be interpreted as the proportion of the
variance of the dependent variable explained by the
model. Use the adjusted R2, which accounts for the
no. of observations and explanatory variables.
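As a rough check, the goodness-of-fit figures can be recomputed from the ANOVA sums of squares reported in the Excel output for the first Ferric model. The variable names are illustrative. Note the assumption spelled out in the comments: when the two variances are computed with their degrees-of-freedom divisors, the variance ratio reproduces the Adjusted R Square, while the plain ratio of sums of squares reproduces R Square.

```python
import math

# Values read from the ANOVA table of the first Ferric model
ss_residual = 43.59
ss_total = 144.23
n_obs = 10
n_explanatory = 1

# R Square: proportion of the total sum of squares explained by the model
r_square = 1 - ss_residual / ss_total

# Residual variance s^2 and standard error s (divisor: n - k - 1)
residual_variance = ss_residual / (n_obs - n_explanatory - 1)
standard_error = math.sqrt(residual_variance)

# Variance of the dependent variable (divisor: n - 1)
y_variance = ss_total / (n_obs - 1)

# Ratio of the two degrees-of-freedom-corrected variances
adjusted_r_square = 1 - residual_variance / y_variance

print(round(r_square, 2), round(adjusted_r_square, 2), round(standard_error, 2))
```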
Hypothesis testing
Does a relationship between capacity
and cost really exist? If we draw a
different sample, would we still see the
same relationship?
Or in stats jargon
Is the slope significantly different from
zero?
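[Figure: horizontal line in the x-y plane, illustrating b=0]
b=0 implies no relationship between x
and y
Hypothesis testing: test whether b=0
t-values and p-values
[Figure: distribution of the estimate of the slope if b=0, centred at 0; the observed b lies at a distance of t-value * sb from 0 and the p-value is the tail area]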
sb is the st. deviation of the slope estimate b
t-value = b/sb
p-value is the probability of getting an estimate
of the slope at least as far from zero as b if the true slope were zero.
Equivalent tests (5% significance level)
|t-value| > 2 (approximately)
p-value < 0.05
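The test can be sketched in Python using only the standard library. The slope and its standard error are the rounded Capacity figures from the Excel output, so the t-value differs slightly from the -4.30 printed there. The helper names `t_pdf` and `two_sided_p_value` are illustrative, and the p-value is obtained by numerically integrating the Student t density with n - 2 degrees of freedom, which is an assumption about how one might compute it without a statistics package.

```python
import math

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p_value(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t

b = -4.40    # slope estimate (Capacity coefficient, rounded)
sb = 1.02    # standard deviation of the slope estimate
n_obs = 10

t_value = b / sb
p_value = two_sided_p_value(t_value, df=n_obs - 2)

print("t-value:", round(t_value, 2))
print("reject b=0 at 5%:", abs(t_value) > 2 and p_value < 0.05)
```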
Checking residuals
Residuals should be random. Any
systematic pattern indicates that our
model is incomplete.
Problematic patterns
Heteroscedasticity
Autocorrelated residuals
Ferric - Residuals
[Figure: line fit plot of actual and predicted cost/ton against capacity]
[Figure: residual plot of residuals against capacity]
Are residuals random?
Can you see any pattern?
Combining theory and
judgement
The relationship appears to be non-linear.
We can fit non-linear relationships by introducing
suitable transformations, e.g.
$y = a e^{bx}$ becomes $\ln(y) = \ln(a) + bx$
[Figure: exponential curve of y against x, and the corresponding straight line of Ln(y) against x]
What transformation is appropriate for
the Ferric data?
Use judgement e.g.
Total Cost (TC) = Fixed Cost (FC) + Variable Cost
TC = FC + Unit Cost (UC) * Quantity (Q)
TC/Q = FC/Q + UC, i.e.
Average Cost = a + b/Q
This suggests that average cost is a linear function of the
inverse of capacity (1/Capacity)
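The cost argument above can be illustrated with a small Python sketch. All numbers are hypothetical: a fixed cost of 8 and a unit cost of 12 are assumed, total cost is generated from TC = FC + UC * Q without noise, and a straight line is then fitted to average cost against 1/Q. The helper `fit_line` is an illustrative name for the usual closed-form least squares fit.

```python
def fit_line(x, y):
    """Closed-form least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical cost structure, for illustration only
FIXED_COST = 8.0
UNIT_COST = 12.0
quantity = [0.5, 1.0, 2.0, 2.5, 4.0]

total_cost = [FIXED_COST + UNIT_COST * q for q in quantity]
average_cost = [tc / q for tc, q in zip(total_cost, quantity)]

# Transform the explanatory variable, then fit a straight line
inverse_quantity = [1.0 / q for q in quantity]
a, b = fit_line(inverse_quantity, average_cost)

print("intercept a (unit cost):", a)
print("slope b (fixed cost):", b)
```

In the transformed regression the intercept recovers the unit cost and the slope on 1/Q recovers the fixed cost, which is why a model of the form Cost = a + b (1/Capacity) is a sensible candidate.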
Transforming the data
Regression Statistics
Multiple R            0.97
R Square              0.95
Adjusted R Square     0.94
Standard Error        0.98
Observations          10

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept     11.75          0.60             19.53    0.00      10.36       13.13
1/Capacity    7.93           0.67             11.88    0.00      6.39        9.46

[Figure: line fit plot of actual and predicted cost/ton against 1/Capacity]
[Figure: scatter plot of cost/ton ($) against capacity (000 tons/month)]
Model comparison
• High adjusted R2
• All coefficients significant
– t-values or p-values
• Low standard error
• No pattern in residuals
• Is model supported by theory?
• Does the model make sense?

Criteria                        First model   Transformed model
High adjusted R2                66%           94%
All coefficients significant    Yes           Yes
Low residual st. dev. (s)       2.33          0.98
No pattern in residuals         No            Yes
Equation makes sense            Yes (?)       Yes
The transformed model is better:
Cost = 11.75 + 7.93 * (1/Capacity)
Forecasting &
confidence intervals
• If capacity is 2 what is the forecast for
cost?
– Cost = 11.75 + 7.93 (1/2) = 15.71
• Approximate 95% confidence interval:
15.71 ± 2 * s
where s=0.98 is the standard error
• The greater the number of
observations the better the
approximation
• More accurate intervals can be
calculated using statistical packages
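The forecast and its approximate interval can be reproduced with a few lines of Python. The function names are illustrative; the coefficients and the standard error are the ones quoted for the transformed model, and the multiplier of 2 is the rule of thumb from the slide rather than an exact t quantile.

```python
def forecast_cost(capacity, intercept=11.75, slope=7.93):
    """Point forecast from the transformed model Cost = a + b * (1/Capacity)."""
    return intercept + slope * (1.0 / capacity)

def approximate_interval(point, standard_error, multiplier=2.0):
    """Rule-of-thumb 95% interval: point forecast +/- 2 standard errors."""
    return point - multiplier * standard_error, point + multiplier * standard_error

point = forecast_cost(2.0)
low, high = approximate_interval(point, standard_error=0.98)

print("forecast:", point)
print("approximate 95% interval:", low, "to", high)
```

With these inputs the interval runs from roughly 13.8 to 17.7.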
Confidence intervals
[Figure: plot of fitted model, COST against 1/CAPACITY, with the fitted line and two sets of bands]
Statgraphics gives two sets of intervals.
• Outer bands are prediction intervals
for an individual plant
• Inner bands are confidence intervals
for the average cost from all plants.
They can be viewed as the confidence
intervals for the regression line.
Is plant age important?
Multiple regression
Cost = a + b(1/Capacity) + cYear + e
Correlation matrix
              Cost/ton    Year        1/Capacity
Cost/ton      1
Year          -0.74237    1
1/Capacity    0.9728      -0.67071    1

Regression analysis
Regression Statistics
Multiple R            0.98
R Square              0.96
Adjusted R Square     0.95
Standard Error        0.90
Observations          10

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept     542.01         326.41           1.66     0.14      -229.83     1313.84
Year          -0.27          0.16             -1.62    0.15      -0.66       0.12
1/Capacity    7.03           0.82             8.58     0.00      5.09        8.97

Is this a good model?
Multicollinearity
Multicollinearity appears when explanatory variables
are highly correlated.
Effects:
• Including Year adds little information, hence fit
does not improve much
• Parameter estimates become unreliable
Remedial action:
• Remove one of the correlated variables
Moral:
• Check for correlations between explanatory
variables
[Figure: scatter plot of cost/ton ($) against capacity (000 tons/month), each point labelled with the plant's year (81 to 87)]
Other inappropriate
models
Influential observations and outliers
Clustering of data
Dummy variables
Bond purchases and national income
Year    B       Y       W
1933    2.6     2.4     0
1934    3.0     2.8     0
1935    3.6     3.1     0
1936    3.7     3.4     0
1937    3.8     3.9     0
1938    4.1     4.0     0
1939    4.4     4.2     0
1940    7.1     5.1     1
1941    8.0     6.3     1
1942    8.9     8.1     1
1943    9.7     8.8     1
1944    10.2    9.6     1
1945    10.1    9.7     1
1946    7.9     9.6     0
1947    8.7     10.4    0
1948    9.1     12.0    0
1949    10.1    12.9    0
War years (1940 to 1945): W = 1
Regression equation: B = 1.29+.68Y+2.3W
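A minimal Python sketch of this dummy-variable regression, using the bond purchase (B), national income (Y) and war dummy (W) values from the table. The helpers `solve` and `ols` are illustrative names; they solve the normal equations directly, which is adequate for a small example but is not how a statistics package would necessarily do it.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """Least squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

bonds = [2.6, 3.0, 3.6, 3.7, 3.8, 4.1, 4.4, 7.1, 8.0, 8.9, 9.7, 10.2, 10.1,
         7.9, 8.7, 9.1, 10.1]
income = [2.4, 2.8, 3.1, 3.4, 3.9, 4.0, 4.2, 5.1, 6.3, 8.1, 8.8, 9.6, 9.7,
          9.6, 10.4, 12.0, 12.9]
war = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

# Each row: constant term, national income Y, war dummy W
rows = [[1.0, y, float(w)] for y, w in zip(income, war)]
intercept, income_coef, war_coef = ols(rows, bonds)

print(round(intercept, 2), round(income_coef, 2), round(war_coef, 2))
```

The coefficient on W is the estimated shift in bond purchases during the war years at any given level of national income; the fitted coefficients should come out close to the equation quoted above.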
Regression checklist
• Visually inspect the data (scatter plots)
• Calculate correlations
• Develop and fit sensible model(s)
• Assess and compare the model(s)
– Significance of variables (t-values,
p-values)
– adjusted R2
– standard error (s)
– residual plots
  • autocorrelation
  • heteroscedasticity
  • Normality
  • Outliers, influential observations
– Does the model make sense?
• If you are satisfied use the model for
– developing business insights
– forecasting
Trend curves
• Also known as growth/decay curves
• Most common curves
– Linear
– Quadratic
– Exponential
– Logarithmic
– S-curves
Fitting trend curves
Transform the original data so that a
linear equation of the form y=a+bx
arises. Then apply regression
analysis.
Example:
$Y_t = a b^t \Rightarrow \log(Y_t) = \log(a) + \log(b)\, t$
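The transform-then-regress recipe can be sketched in Python for the exponential curve. The series below is synthetic: it is generated without noise from assumed parameters a = 2 and b = 1.2 purely to show that the procedure recovers them. `fit_line` and `fit_exponential_trend` are illustrative names.

```python
import math

def fit_line(x, y):
    """Closed-form least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_exponential_trend(t, y):
    """Fit Y_t = a * b**t by regressing log(Y_t) on t."""
    log_y = [math.log(value) for value in y]
    log_a, log_b = fit_line(t, log_y)
    # Undo the transformation to get back to the original parameters
    return math.exp(log_a), math.exp(log_b)

# Synthetic series, for illustration only
t = list(range(1, 9))
y = [2.0 * 1.2 ** ti for ti in t]

a, b = fit_exponential_trend(t, y)
forecast = a * b ** 9   # project the fitted curve one period ahead

print("a:", a, "b:", b, "forecast for t=9:", forecast)
```

With real data the fit minimises squared errors in log(Y), not in Y itself, so the back-transformed curve is not the least squares curve on the original scale.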
Credit card turnover
[Figure: Visa turnover (£bn), 1978 to 1987, actual values and values predicted by an exponential growth curve]
How would you use such a curve for forecasting?
What role does judgement play in trend
projection?
Other trend curves
(S curves)
Simple modified exponential: $Y_t = c + a b^t$, $b > 0$
Logarithmic parabola: $Y_t = a e^{bt + ct^2}$, $c < 0$
Gompertz curve: $Y_t = a e^{b e^{ct}}$, $b < 0$, $c < 0$
Logistic curve: $Y_t = \frac{1}{1 + b e^{ct}}$, $c < 0$
Trend and seasonality
Quarterly data
Time    Sales    q1    q2    q3    q4
1       37.2     0     0     0     1
2       15.7     1     0     0     0
3       11       0     1     0     0
4       26.6     0     0     1     0
5       28.9     0     0     0     1
6       12       1     0     0     0
7       6.6      0     1     0     0
8       20.9     0     0     1     0
9       23.5     0     0     0     1
[Figure: Sales ($m) plotted against Quarters, 0 to 40]
Regression with seasonal dummy variables
Sales = a + b Time + c q2 + d q3 + e q4
Include q1 in the model?
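A small Python sketch of how the seasonal dummies in the table can be built, and why q1 is left out of the model. The function name `seasonal_dummies` and the `first_quarter` parameter are illustrative; the default of 4 reflects the table, where Time 1 has q4 = 1.

```python
def seasonal_dummies(time_index, first_quarter=4):
    """Return (q1, q2, q3, q4) for a 1-based time index."""
    quarter = (first_quarter - 1 + time_index - 1) % 4 + 1
    return tuple(1 if quarter == q else 0 for q in (1, 2, 3, 4))

# Design rows for Sales = a + b Time + c q2 + d q3 + e q4
design = []
for time_index in range(1, 10):
    q1, q2, q3, q4 = seasonal_dummies(time_index)
    design.append([1, time_index, q2, q3, q4])   # q1 is deliberately omitted

for row in design:
    print(row)

# The four dummies always add up to 1, which is exactly the constant column.
# Keeping all four alongside the intercept would make the explanatory
# variables perfectly collinear, so one quarter serves as the baseline.
dummy_sums = [sum(seasonal_dummies(t)) for t in range(1, 10)]
print(dummy_sums)
```

With q1 as the baseline, the coefficients on q2, q3 and q4 measure each quarter's difference from the first quarter.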
Multiple regression with
seasonal dummies
Regression Statistics
Multiple R            0.95
R Square              0.90
Adjusted R Square     0.88
Standard Error        2.96
Observations          36

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept     14.78          1.31             11.30    0.00      12.11       17.44
q2            -3.75          1.40             -2.69    0.01      -6.60       -0.91
q3            8.53           1.40             6.10     0.00      5.68        11.38
q4            15.66          1.40             11.23    0.00      12.82       18.51
Time          -0.25          0.05             -5.17    0.00      -0.34       -0.15

Equation: ?
Interpretation: ?
[Figure: Time line fit plot, Sales and Predicted Sales against Time, 0 to 40]
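One way to read the output is to plug the reported coefficients into the model Sales = a + b Time + c q2 + d q3 + e q4 and evaluate it for a future period. The sketch below does this in Python; the dictionary and function names are illustrative, and the assumption that Time 37 falls in the fourth quarter follows from the data table, where Time 1 has q4 = 1.

```python
# Coefficients as reported in the regression output (q1 is the baseline quarter)
COEFFICIENTS = {
    "intercept": 14.78,
    "time": -0.25,
    "q2": -3.75,
    "q3": 8.53,
    "q4": 15.66,
}

def predict_sales(time_index, quarter):
    """Evaluate the fitted equation for a given time index and quarter (1 to 4)."""
    value = COEFFICIENTS["intercept"] + COEFFICIENTS["time"] * time_index
    if quarter in (2, 3, 4):
        value += COEFFICIENTS["q" + str(quarter)]
    return value

# First period beyond the 36 observations; assumed to be a fourth quarter
next_forecast = predict_sales(37, quarter=4)
# The following period is then a first quarter (baseline, no dummy added)
following_forecast = predict_sales(38, quarter=1)

print(round(next_forecast, 2), round(following_forecast, 2))
```

Each dummy coefficient shifts the trend line up or down for its quarter relative to the first quarter, while the Time coefficient gives the change in sales per quarter.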
Econometric modelling
Regression Analysis
Sales = f(GNP, price, advertising)
Econometric Modelling
Sales = f(GNP, price, advertising)
advertising = f(sales_{t-1})
production cost = f(sales, labour cost,
materials cost)
price = f(production cost, price of
substitutes)
exogenous - endogenous variables
Simultaneous parameter estimation
The CEF model
• The CEF model of the UK economy
– Agents
• Individuals
• Banks
• Other financial institutions
• Government
• Overseas agents
– Markets
• Market for goods and services
• Market for labour
• Market for capital goods
• Agents interact in each market
influencing supply and demand, which
in turn determine price and quantities.
– 500 equations (!)
Judgemental
forecasting
• Individual
– Subjective probability assessment
• Group forecasting
– Sales force method
– Executive committees
– Expert panels
– Delphi method
• (Feedback, reassessment)
• Problems in judgemental forecasting
– Bias
– Anchoring
– Conservatism/Optimism
– Overconfidence
• Combining forecasts
Forecasting change
Crude oil price
forecasts
The dangers of straight line forecasts
Energy forecasting in
West Germany
Energy Consumption - forecasts vs. actual data
From Diefenbacher and Johnson
“The politics of Energy Forecasting”
Persistence of mental models!
Airline industry
forecasts
Forecasting & Planning
• Traditional view of forecasting
– The past explains the future
– Passive or adaptive attitude towards
the future
• Modern view
– Active and creative approaches to
forecasting
– Making things happen
Scenario planning
“It is impossible to forecast the future
and it may be dangerous to do so”
Use of scenarios in planning
Develop a small number of internally
consistent and credible views of how
the world will look in the future, that
present testing conditions for the
business.
The future will of course be different
from all of these views/scenarios, but if
the company is prepared to cope with
any of them, it will be able to cope with
the real world.
Scenarios in Shell
Oil shock scenario:
Scenario
design
Strategic
Plan
Event
Result
Shell analyse the impact
of a $15/bbl price on cash
flows and investment plans
Early
1985
Re-evaluation of up-stream
plans and cash-flow position
of the operating companies
Oil price falls
from $28/bbl to $10/bbl
Early
1986
Advantages of
Scenario Planning
• Challenge preconceived ideas and
single point forecasts
• Explore a wide range of uncertainties
• Encourage an active and creative
attitude to the future
• Provide a background for specific
project evaluation
• Provide a vehicle for communication
between the different parts of the
organisation
Forecasting - Summary
• All forecasts are wrong!
• Never trust single point forecasts
• Data patterns
– Trend, seasonality, cyclicality,
randomness
• Time-series forecasting
– Trend curves
• Causal forecasting
– Regression
• Judgemental forecasting
• Scenario planning
Preparation for
Regression workshop
• Read the note on Regression Analysis
• Work on the “Tutorial on Regression
Analysis using Excel”
• Practice on creating descriptive
statistics and histograms in Excel
(ExcelStats.xls)
• Select your workshop partner
• In preparation for the exam work on
regression exercises