Statistics for Marketing and Consumer Research


Discriminant analysis
Chapter 11
Statistics for Marketing & Consumer Research
Copyright © 2008 - Mario Mazzocchi
Two-group discriminant analysis
• Discriminant analysis is a statistical procedure
which allows us to classify cases into the separate
categories to which they belong, on the basis of a
set of independent variables called
predictors or discriminant variables
• The target variable (the one determining allocation
into groups) is a qualitative (nominal or ordinal)
one, while the characteristics are measured by
quantitative variables.
• DA looks at the discrimination between two groups
• Multiple discriminant analysis (MDA) allows for
classification into three or more groups.
Applications of DA
• DA is especially useful for understanding the
differences and factors that lead consumers to make
different choices, allowing marketers to develop
strategies which take proper
account of the role of the predictors.
• Examples
• Determinants of customer loyalty
• Shopper profiling and segmentation
• Determinants of purchase and non-purchase
Example on the Trust data-set
• Purchasers of chicken at the butcher’s shop (recorded in
question q8d)
• Respondents may belong to one of two groups
• those who purchase chicken at the butcher’s shop
• those who do not
• Discrimination between these groups through a set of
consumer characteristics
• expenditure on chicken in a standard week (q5)
• age of the respondent (q51)
• whether respondents agree (on a seven-point ranking scale) that butchers
sell safe chicken (q21d)
• trust (on a seven-point ranking scale) towards supermarkets (q43b)
• Does a linear combination of these four characteristics
allow one to discriminate between those who buy chicken
at the butcher’s and those who do not?
Discriminant analysis (DA)
• Two groups only, thus a single discriminating value
(discriminating score)
• For each respondent a score is computed using the
estimated linear combination of the predictors (the
discriminant function)
• Respondents with a score above the discriminating value
are expected to belong to one group, those below to the
other group.
• When the discriminant score is standardized to have zero
mean and unit variance, it is called the Z score
• DA also provides information about the discriminating
power of each of the original predictors
Multiple discriminant analysis (MDA) (1)
• Discriminant analysis may involve more than two
groups, in which case it is termed multiple
discriminant analysis (MDA).
• Example from the Trust data-set
• Dependent variable: Type of chicken purchased ‘in a
typical week’, choosing among four categories: value
(good value for money), standard, organic and luxury
• Predictors: age (q50), stated relevance of taste (q24a),
value for money (q24b) and animal welfare (q24k), plus
an indicator of income (q60)
Multiple discriminant analysis (MDA) (2)
• In this case there will be more than one
discriminant function.
• The number of discriminant functions is the
smaller of (g-1), where g is the number of
categories in the classification, and k, the number of
independent variables
• Trust example: four groups and five explanatory
variables, the number of discriminant functions is
three (that is g-1 which is smaller than k=5).
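The counting rule above can be written as a one-line function. This is only a minimal sketch; the function name and argument names are illustrative and do not come from the slides.

```python
def number_of_discriminant_functions(n_groups, n_predictors):
    """Return min(g - 1, k), the number of discriminant functions."""
    if n_groups < 2 or n_predictors < 1:
        raise ValueError("need at least two groups and one predictor")
    return min(n_groups - 1, n_predictors)


# Trust example from the slides: four groups, five predictors
print(number_of_discriminant_functions(4, 5))
# Two-group case with four predictors: a single function
print(number_of_discriminant_functions(2, 4))
```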
The output of MDA
• Similarities with factor (principal component) analysis
• the first discriminant function is the most relevant for
discriminating across groups, the second is the second most
relevant, etc.
• the discriminant functions are also independent, which means that
the resulting scores are non-correlated.
• Once the coefficients of the discriminant functions are estimated
and standardized, they are interpreted in a similar fashion to the
factor loadings.
• The larger the standardised coefficients (in absolute terms), the
more relevant the respective variables to discriminating between
groups
• There is no single discriminant score in MDA
• group means are computed (centroids) for each of the discriminant
functions to have a clearer view of the classification rule
Running discriminant analysis (two groups)
Discriminant function (target variable: purchasers of chicken at the butcher’s shop)
$z = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_4$
Discriminant score: $z$
The $\alpha$ discriminant coefficients need to be estimated
Predictors ($x_1, \dots, x_4$)
• weekly expenditure on chicken
• age
• safety of butcher’s chicken
• trust in supermarkets
Fisher’s linear discriminant analysis
• The discriminant function is the starting point
• Two key assumptions behind linear DA
(a) the predictors are normally distributed;
(b) the covariance matrices for the predictors within each of the groups
are equal.
• Departure from condition (a) should suggest the use of
alternative methods (logistic regression, see lecture 16)
• Departure from condition (b) requires the use of different
discriminant techniques (usually quadratic discriminant
functions).
• In most empirical cases, the use of linear DA is appropriate
Estimation
• The first step is the estimation of the $\alpha$
coefficients, also termed discriminant
coefficients or weights
• Estimation is similar to factor analysis or PCA, as
the coefficients are those which maximize the
variability between groups (relative to the variability within groups)
• In MDA the first discriminating function is the one
with the highest between-group variability, the
second discriminating function is independent from
the first and maximizes the remaining between-group variability, and so on
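The estimation idea can be illustrated on a small made-up data set with two predictors and two groups. The sketch below uses Fisher's classical solution, in which the weights are proportional to the inverse of the pooled within-group scatter matrix times the difference between the group means. The data are hypothetical, and the weights are not scaled the way SPSS scales its canonical coefficients, so only their direction is comparable.

```python
# Hypothetical toy data: (predictor 1, predictor 2) for each case
group_no = [(2.0, 3.0), (3.0, 3.5), (2.5, 2.0), (3.5, 4.0)]
group_yes = [(5.0, 4.0), (6.0, 5.5), (5.5, 4.5), (6.5, 6.0)]


def mean_vector(rows):
    """Mean of each of the two predictors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter_matrix(rows, mean):
    """Sums of squares and cross-products around the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


m_no = mean_vector(group_no)
m_yes = mean_vector(group_yes)
s_no = scatter_matrix(group_no, m_no)
s_yes = scatter_matrix(group_yes, m_yes)

# Pooled within-group scatter (relies on the equal-covariance assumption)
w = [[s_no[i][j] + s_yes[i][j] for j in range(2)] for i in range(2)]

# Closed-form inverse of the 2x2 matrix
det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
w_inv = [[w[1][1] / det, -w[0][1] / det],
         [-w[1][0] / det, w[0][0] / det]]

# Fisher weights: W^-1 (mean_yes - mean_no)
diff = [m_yes[0] - m_no[0], m_yes[1] - m_no[1]]
weights = [w_inv[i][0] * diff[0] + w_inv[i][1] * diff[1] for i in range(2)]


def score(row):
    """Linear combination of the predictors (no constant term)."""
    return weights[0] * row[0] + weights[1] * row[1]


scores_no = [score(r) for r in group_no]
scores_yes = [score(r) for r in group_yes]
print("weights:", weights)
print("scores, group 'no': ", scores_no)
print("scores, group 'yes':", scores_yes)
```

On this toy data the two groups of scores do not overlap, which is what a discriminant function with good discriminating power looks like.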
SPSS – two groups case
1. Choose the target variable
2. Define the range of the dependent variable
3. Select the predictors
Coefficient estimates
Additional statistics and diagnostics
Fisher’s and standardized estimates of the discriminant function
coefficients need to be requested explicitly
Classification options
Decide whether prior probabilities are equal across groups
or group sizes reflect different allocation probabilities
These are diagnostic indicators to evaluate how well the
discriminant function predicts the groups
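The two ways of setting prior probabilities can be compared with a short sketch. The group sizes are those reported in the Trust example output (277 non-purchasers and 143 purchasers at the butcher's shop); the function names are illustrative only.

```python
# Cases used in the Trust two-group analysis, as reported in the output
group_sizes = {"no": 277, "yes": 143}


def equal_priors(sizes):
    """Every group gets the same prior probability."""
    return {group: 1 / len(sizes) for group in sizes}


def size_based_priors(sizes):
    """Priors proportional to the observed group sizes."""
    total = sum(sizes.values())
    return {group: n / total for group, n in sizes.items()}


priors_equal = equal_priors(group_sizes)
priors_sized = size_based_priors(group_sizes)
print("equal priors:     ", priors_equal)
print("size-based priors:", {g: round(p, 3) for g, p in priors_sized.items()})
```

The size-based priors match, to three decimals, the .660 and .340 shown in the "Prior Probabilities for Groups" table of the example.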
Save classification
Create new variables in the data-set, containing the predicted group
membership and/or the discriminant score for each case and each function
Output – coefficient estimates
Canonical Discriminant Function Coefficients (unstandardized) and
Standardized Canonical Discriminant Function Coefficients, Function 1

Predictor                                   Unstandardized   Standardized
In a typical week how much do you spend
on fresh or frozen chicken (Euro)?                    .095           .378
From the butcher                                      .454           .748
Supermarkets                                         -.297          -.453
Age                                                   .025           .394
(Constant)                                          -2.515

• Unstandardized coefficients depend on the measurement unit
• Standardized coefficients do not depend on the measurement unit
• Most important predictor: "From the butcher" (largest standardized coefficient, .748)
• Trust in supermarkets has a – sign (thus it reduces the discriminant score)
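As an illustration, the unstandardized coefficients reported in this output can be used to compute the discriminant score of a single respondent. The respondent below is hypothetical, and the short variable names are labels chosen for this sketch rather than names from the data-set.

```python
# Unstandardized coefficients from the SPSS output (Function 1)
COEFFICIENTS = {
    "expenditure": 0.095,          # weekly spend on chicken (Euro)
    "butcher_safe": 0.454,         # agreement that butchers sell safe chicken (1-7)
    "trust_supermarkets": -0.297,  # trust in supermarkets (1-7)
    "age": 0.025,                  # age in years
}
CONSTANT = -2.515


def discriminant_score(respondent):
    """Constant plus the weighted sum of the predictor values."""
    return CONSTANT + sum(
        COEFFICIENTS[name] * respondent[name] for name in COEFFICIENTS
    )


# Hypothetical respondent (values invented for illustration)
respondent = {"expenditure": 10.0, "butcher_safe": 6,
              "trust_supermarkets": 3, "age": 45}
z = discriminant_score(respondent)
print(round(z, 3))

# One extra point of trust in supermarkets lowers the score by 0.297
more_trusting = dict(respondent, trust_supermarkets=4)
print(round(discriminant_score(more_trusting) - z, 3))
```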
Centroids
Functions at Group Centroids

Butcher   Function 1
no             -.307
yes             .594

Unstandardized canonical discriminant functions evaluated at group means

• These are the means of the discriminant score for each of the two groups
• Thus, the group of those not purchasing chicken at the butcher’s shop has a negative centroid
• With two groups, the discriminating score (the cut-off value) is zero
• This can be computed by weighting the centroids with the initial (prior) probabilities

Prior Probabilities for Groups

                    Cases Used in Analysis
Butcher   Prior    Unweighted    Weighted
no         .660           277     277.000
yes        .340           143     143.000
Total     1.000           420     420.000

From these prior probabilities it follows that the discriminating score
is -0.307 x 0.66 + 0.594 x 0.34 ≈ 0 (exactly -0.00066 with the rounded figures)
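The cut-off calculation and the resulting allocation rule can be sketched as follows. The centroids and priors are the rounded values from the output; the `classify` helper is an illustrative name and simply allocates scores above the cut-off to the group with the positive centroid.

```python
# Group centroids and prior probabilities from the SPSS output
centroids = {"no": -0.307, "yes": 0.594}
priors = {"no": 0.66, "yes": 0.34}

# Prior-weighted average of the centroids: the discriminating value
cutoff = sum(centroids[g] * priors[g] for g in centroids)
print(round(cutoff, 5))  # close to zero; not exactly zero because inputs are rounded


def classify(z, cutoff=0.0):
    """Allocate a discriminant score to a group using the cut-off."""
    return "yes" if z > cutoff else "no"


print(classify(1.393))   # a clearly positive score
print(classify(-0.5))    # a negative score
```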
Output – classification success
Classification Results (a)

                                   Predicted Group Membership
Original   Butcher                 no       yes       Total
Count      no                     244        33         277
           yes                     88        55         143
           Ungrouped cases          1         1           2
%          no                    88.1      11.9       100.0
           yes                   61.5      38.5       100.0
           Ungrouped cases       50.0      50.0       100.0

a. 71.2% of original grouped cases correctly classified.

Using the discriminant function, it is possible to correctly classify 71.2% of
original cases: (244 no-no + 55 yes-yes)/420
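The hit rate can be reproduced directly from the counts in the classification table. The sketch below also computes the share of correct allocations within each group, which shows that purchasers at the butcher's shop are classified much less accurately than non-purchasers. The dictionary layout is just one convenient way of holding the table.

```python
# Rows: actual group; columns: predicted group (counts from the output)
confusion = {
    "no": {"no": 244, "yes": 33},
    "yes": {"no": 88, "yes": 55},
}

total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[g][g] for g in confusion)
accuracy = correct / total

# Share of each actual group that is allocated correctly
hit_rate = {g: confusion[g][g] / sum(confusion[g].values()) for g in confusion}

print(total, correct, round(100 * accuracy, 1))
print({g: round(100 * r, 1) for g, r in hit_rate.items()})
```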
Diagnostics (1)
•Box’s M test. This tests whether covariances are equal
across groups
•Wilks’ Lambda (or U statistic) tests discrimination
between groups. It is related to analysis of variance.
• Individual Wilks’ Lambda for each of the predictors in a discriminant
function; univariate ANOVA (are there significant differences in the
predictor’s means between the groups?), p-value from the F distribution.
• Wilks’ Lambda for the function as a whole. Are there significant
differences in the group means for the discriminant function? The p-value
is from the Chi-square distribution.
•The overall Wilks’ Lambda is especially helpful in multiple
discriminant analysis as it allows one to discard those
functions which do not contribute towards explaining
differences between groups.
Diagnostics (2)
• DA returns one eigenvalue (or more eigenvalues for
MDA) of the discriminant function.
• These can be interpreted as in principal component
analysis
• In MDA (more than one discriminant function)
eigenvalues are used to compute how much each
function contributes to explaining variability
• The canonical correlation measures the intensity of
the relationship between the groups and the single
discriminant function
Trust example: diagnostics

Statistic                    Value    P-value
Box's M statistic             37.3      0.000
Overall Wilks' Lambda         0.85      0.000
Wilks' Lambda for
  Expenditure                 0.98      0.002
  Age                         0.97      0.001
  Safer for Butcher           0.91      0.000
  Trust in Supermarket        0.98      0.002
Eigenvalue                    0.18
Canonical correlation         0.39
% of correct predictions     71.2%

• Box's M: covariance matrices are not equal
• Overall Wilks' Lambda: the overall discriminating power of the DF is good
• Individual Wilks' Lambdas: all of the predictors are relevant to discriminating between the two groups
• Eigenvalue: the ratio between the variance between groups and the variance within groups (the larger the better)
• Canonical correlation: square root of the ratio between the between-group variability and the total variability
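The eigenvalue, the canonical correlation and the overall Wilks' Lambda in this table are linked. If the eigenvalue is the between-to-within ratio, then between-to-total variability is eigenvalue / (1 + eigenvalue), and with a single function the Wilks' Lambda (within-to-total variability) is 1 / (1 + eigenvalue). The following sketch checks these relations against the reported figures; it starts from the rounded eigenvalue, so only two-decimal agreement should be expected.

```python
import math

eigenvalue = 0.18  # between-group / within-group variability (from the table)

# Between / total = eigenvalue / (1 + eigenvalue); take the square root
canonical_correlation = math.sqrt(eigenvalue / (1 + eigenvalue))

# Within / total, valid in this form when there is a single function
wilks_lambda = 1 / (1 + eigenvalue)

print(round(canonical_correlation, 2))
print(round(wilks_lambda, 2))
```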
MDA
To run MDA in SPSS the only difference is that the range
has more than two categories
Predictors
Tests of Equality of Group Means

Predictor                            Wilks' Lambda        F    df1    df2    Sig.
Age                                           .981    1.798      3    282    .148
Tasty food                                    .971    2.761      3    282    .042
Value for money                               .960    3.878      3    282    .010
Animal welfare                                .982    1.679      3    282    .172
Please indicate your gross annual
household income range                        .919    8.272      3    282    .000

Three predictors only (tasty food, value for money and income, at the 5% level)
appear to be relevant in discriminating among preferred types of chicken

Test Results
Box's M              65.212
F    Approx.          1.382
     df1                 45
     df2          53286.386
     Sig.              .045
Tests null hypothesis of equal population covariance matrices.

Null rejected at 95% c.l., but not at 99% c.l.
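For a single predictor, the Wilks' Lambda is the within-group share of the total sum of squares, so the one-way ANOVA F statistic can be recovered as F = ((1 - Lambda) / Lambda) x (df2 / df1). The sketch below applies this to the table values. Because the Lambdas are rounded to three decimals, the recomputed F values only approximate the reported ones. The short predictor labels are chosen for this sketch.

```python
def f_from_wilks(wilks_lambda, df1, df2):
    """One-way ANOVA F implied by a univariate Wilks' Lambda."""
    return (1 - wilks_lambda) / wilks_lambda * df2 / df1


# (Wilks' Lambda, reported F) from the Tests of Equality of Group Means
table = {
    "Age": (0.981, 1.798),
    "Tasty food": (0.971, 2.761),
    "Value for money": (0.960, 3.878),
    "Animal welfare": (0.982, 1.679),
    "Income": (0.919, 8.272),
}

recomputed = {name: f_from_wilks(lam, 3, 282) for name, (lam, _) in table.items()}
for name, (lam, reported) in table.items():
    print(f"{name:16s} reported F = {reported:6.3f}  recomputed F = {recomputed[name]:6.3f}")
```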
Discriminant functions
Three discriminant functions (four groups minus one) can be estimated

Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1            .102 (a)            61.0           61.0                    .304
2            .051 (a)            30.8           91.8                    .221
3            .014 (a)             8.2          100.0                    .116
a. First 3 canonical discriminant functions were used in the analysis.

Wilks' Lambda
Test of Function(s)   Wilks' Lambda   Chi-square    df    Sig.
1 through 3                    .851       45.098    15    .000
2 through 3                    .938       17.904     8    .022
3                              .986        3.818     3    .282

The first two discriminant functions have a significant discriminating power.
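Most of the figures in these two tables follow from the three eigenvalues. The sketch below recomputes the percentage of variance, the Wilks' Lambda for each set of functions (the product of 1 / (1 + eigenvalue) over the functions being tested), the degrees of freedom, and Bartlett's chi-square approximation. The sample size of 286 is an assumption inferred from df2 = 282 with four groups, and results differ slightly from the output because the eigenvalues are rounded.

```python
import math

eigenvalues = [0.102, 0.051, 0.014]  # from the Eigenvalues table
n_cases = 286       # assumption: df2 (282) plus the number of groups (4)
n_groups = 4
n_predictors = 5

total = sum(eigenvalues)
pct_variance = [100 * ev / total for ev in eigenvalues]


def wilks_from(first):
    """Wilks' Lambda for functions first..last (0-based index)."""
    lam = 1.0
    for ev in eigenvalues[first:]:
        lam /= 1 + ev
    return lam


def bartlett_chi_square(lam):
    """Bartlett's chi-square approximation for a Wilks' Lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2) * math.log(lam)


def degrees_of_freedom(first):
    """Degrees of freedom after removing the first `first` functions."""
    return (n_predictors - first) * (n_groups - first - 1)


for first in range(len(eigenvalues)):
    lam = wilks_from(first)
    print(f"functions {first + 1} through 3: "
          f"Lambda = {lam:.3f}, chi-square = {bartlett_chi_square(lam):.3f}, "
          f"df = {degrees_of_freedom(first)}")
print([round(p, 1) for p in pct_variance])
```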
Coefficients
Discriminant functions’ coefficients

                                          Unstandardized            Standardized
Predictor                             Function 1   Function 2   Function 1   Function 2
Value for money                            -.043         .603        -.053         .746
Age                                        -.009        -.013        -.148        -.208
Tasty food                                  .169         .416         .152         .374
Animal welfare                              .186        -.132         .313        -.222
Please indicate your gross annual
household income range                      .652        -.033         .870        -.044
(Constant)                                -2.298       -4.868

• Value for money is very relevant for the second function
• Income is very relevant for the first function
Structure matrix
Structure Matrix

                                               Function
Predictor                                  1         2         3
Please indicate your gross annual
household income range                  .929*    -.021      .078
Animal welfare                          .390*    -.206      .125
Value for money                        -.010      .891*     .168
Tasty food                              .241      .660*     .273
Age                                    -.217     -.204      .944*

Pooled within-groups correlations between discriminating
variables and standardized canonical discriminant functions
Variables ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and
any discriminant function

Labels on the slide: function 1 = Income; function 2 = Value and taste; function 3 = Age

• The values in the structure matrix are the correlations between the individual
predictors and the scores computed on the discriminant functions.
• For example, the income variable has a strong correlation with the scores of
the first function
• The structure matrix helps in interpreting the functions
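The asterisks in the structure matrix mark, for each predictor, the function with which it has the largest absolute correlation. A short sketch of that rule, using the correlations reported in the table (the shortened predictor labels are chosen here for readability):

```python
# Correlations with functions 1, 2 and 3 (from the structure matrix)
structure = {
    "Income": (0.929, -0.021, 0.078),
    "Animal welfare": (0.390, -0.206, 0.125),
    "Value for money": (-0.010, 0.891, 0.168),
    "Tasty food": (0.241, 0.660, 0.273),
    "Age": (-0.217, -0.204, 0.944),
}


def dominant_function(correlations):
    """1-based index of the function with the largest absolute correlation."""
    best = max(range(len(correlations)), key=lambda i: abs(correlations[i]))
    return best + 1


dominant = {name: dominant_function(c) for name, c in structure.items()}
for name, function in dominant.items():
    print(f"{name:16s} -> function {function}")
```

Grouping the predictors by their dominant function gives the labels used on the slide: income (with animal welfare) for the first function, value and taste for the second, age for the third.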
Centroids
Functions at Group Centroids

In a typical week, what type of fresh or
frozen chicken do you buy for your                  Function
household's home consumption?                1         2         3
'Value' chicken                           -.673     -.262     -.040
'Standard' chicken                         .058      .156     -.065
'Organic' chicken                          .525     -.470     -.030
'Luxury' chicken                           .003      .052      .242

Unstandardized canonical discriminant functions evaluated at
group means

• The first function discriminates well between value and organic (income
matters to organic buyers)
• The second allows some discrimination standard-organic,
value-standard, organic-luxury (taste and value matter)
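One simple way to read the centroids is to allocate a case to the group whose centroid is closest to its three discriminant scores. The sketch below does this with plain Euclidean distance. This is a simplification assumed here for illustration: it ignores prior probabilities, which the SPSS classification takes into account, so its allocations can differ from those in the SPSS output. The example scores are invented.

```python
import math

# Group centroids on functions 1-3 (from the output)
centroids = {
    "Value": (-0.673, -0.262, -0.040),
    "Standard": (0.058, 0.156, -0.065),
    "Organic": (0.525, -0.470, -0.030),
    "Luxury": (0.003, 0.052, 0.242),
}


def nearest_centroid(scores):
    """Group whose centroid is closest (Euclidean) to the given scores."""
    return min(centroids, key=lambda g: math.dist(scores, centroids[g]))


# Hypothetical discriminant scores for three cases
print(nearest_centroid((0.6, -0.5, 0.0)))
print(nearest_centroid((-0.7, -0.3, 0.0)))
print(nearest_centroid((0.0, 0.1, 0.3)))

# Separation between value and organic on the first function
print(round(centroids["Organic"][0] - centroids["Value"][0], 3))
```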
Plot of two functions
Tick ‘separate-groups’ to show graphs of the first two functions
for each individual group
The ‘territorial map’ shows the scores for the first two
functions considering all groups
Plots: individual groups
Example: organic chicken
Most cases tend to be
relatively high on function 1
(income)
Plots – all groups
Prediction results
Classification Results (a)

Grouping variable: In a typical week, what type of fresh or frozen chicken
do you buy for your household's home consumption?

                                        Predicted Group Membership
                                  'Value'   'Standard'   'Organic'   'Luxury'
Original                          chicken      chicken     chicken    chicken    Total
Count   'Value' chicken                 3           38           0          0       41
        'Standard' chicken              2          154           1          0      157
        'Organic' chicken               1           30           4          0       35
        'Luxury' chicken                1           51           1          0       53
        Ungrouped cases                 0           51           3          0       54
%       'Value' chicken               7.3         92.7          .0         .0    100.0
        'Standard' chicken            1.3         98.1          .6         .0    100.0
        'Organic' chicken             2.9         85.7        11.4         .0    100.0
        'Luxury' chicken              1.9         96.2         1.9         .0    100.0
        Ungrouped cases                .0         94.4         5.6         .0    100.0

a. 56.3% of original grouped cases correctly classified.

The functions do not predict well;
most units are allocated to standard
chicken – overall only 56.3% of
the cases are allocated correctly
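The overall hit rate can be recomputed from the counts, and it helps to compare it with a naive benchmark, namely allocating every case to the largest group. The sketch below uses the grouped cases only (ungrouped cases are excluded, as in the footnote of the table). The benchmark is an addition for illustration and does not appear in the SPSS output.

```python
groups = ["Value", "Standard", "Organic", "Luxury"]

# Rows: actual group; columns: predicted group, in the order of `groups`
counts = {
    "Value": [3, 38, 0, 0],
    "Standard": [2, 154, 1, 0],
    "Organic": [1, 30, 4, 0],
    "Luxury": [1, 51, 1, 0],
}

total = sum(sum(row) for row in counts.values())
correct = sum(counts[g][i] for i, g in enumerate(groups))
accuracy = correct / total

# Share of cases that the functions allocate to 'Standard'
predicted_standard_share = sum(row[1] for row in counts.values()) / total

# Benchmark: hit rate obtained by allocating everyone to the largest group
largest_group_share = max(sum(row) for row in counts.values()) / total

print(total, correct, round(100 * accuracy, 1))
print(round(100 * predicted_standard_share, 1))
print(round(100 * largest_group_share, 1))
```

With these counts the discriminant functions do only marginally better than the benchmark, which supports the slide's conclusion that they do not predict well.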
Stepwise discriminant analysis
• As for linear regression it is possible to
decide whether all predictors should appear
in the equation regardless of their role in
discriminating (the Enter option) or a subset of predictors is chosen on the basis of
their contribution to discriminating between
groups (the Stepwise method)
The step-wise method
1. A one-way ANOVA test is run on each of the predictors, where the target
grouping variable determines the treatment levels. The ANOVA test provides
a criterion value and a test statistic (usually the Wilks Lambda). According to
the criterion value, it is possible to identify the predictor which is most
relevant in discriminating between the groups
2. The predictor with the lowest Wilks Lambda (or which meets an alternative
optimality criterion) enters the discriminating function, provided the p-value
is below the set threshold (for example 5%).
3. An ANCOVA test is run on the remaining predictors, where the covariates are
the target grouping variables and the predictors that have already entered
the model. The Wilks Lambda is computed for each of the ANCOVA options.
4. Again, the criteria and the p-value determine which variable (if any) enters
the discriminating function (and possibly whether some of the entered
variables should leave the model).
5. The procedure goes back to step 3 and continues until none of the excluded
variables has a p-value below the threshold and none of the entered
variables has a p-value above the threshold (the stopping rule is met).
Alternative criteria
• Unexplained variance
• Smallest F ratio
• Mahalanobis distance
• Rao’s V
In SPSS
The step-wise method allows selection of relevant predictors
Output of the step-wise method
Variables in the Analysis

Step   Variable                               Tolerance   F to Remove   Wilks' Lambda
1      Please indicate your gross annual
       household income range                     1.000         8.272
2      Please indicate your gross annual
       household income range                     1.000         8.241            .960
       Value for money                            1.000         3.863            .919

Only two predictors are kept in the model

Variables Not in the Analysis

Step   Variable                               Tolerance   Min. Tolerance   F to Enter   Wilks' Lambda
0      Age                                        1.000            1.000        1.798            .981
       Tasty food                                 1.000            1.000        2.761            .971
       Value for money                            1.000            1.000        3.878            .960
       Animal welfare                             1.000            1.000        1.679            .982
       Please indicate your gross annual
       household income range                     1.000            1.000        8.272            .919
1      Age                                         .988             .988        1.507            .905
       Tasty food                                  .991             .991        2.437            .896
       Value for money                            1.000            1.000        3.863            .883
       Animal welfare                              .992             .992        1.052            .909
2      Age                                         .987             .987        1.549            .868
       Tasty food                                  .821             .821         .793            .875
       Animal welfare                              .992             .992        1.057            .873
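The selection path in this output can be replayed from the "F to Enter" column. At each step the candidate with the highest F to enter (which here is also the one with the lowest Wilks' Lambda) enters, provided its F exceeds an entry threshold. The threshold of 3.84 used below is an assumption made for this sketch, since the slides describe the rule in terms of a p-value threshold instead. The sketch only replays reported values; it does not re-estimate anything and ignores removal of variables.

```python
# "F to Enter" for the variables not yet in the analysis, by step (from the output)
F_TO_ENTER = {
    0: {"Age": 1.798, "Tasty food": 2.761, "Value for money": 3.878,
        "Animal welfare": 1.679, "Income": 8.272},
    1: {"Age": 1.507, "Tasty food": 2.437, "Value for money": 3.863,
        "Animal welfare": 1.052},
    2: {"Age": 1.549, "Tasty food": 0.793, "Animal welfare": 1.057},
}

ENTRY_THRESHOLD = 3.84  # assumed F-to-enter cut-off, not stated in the slides


def replay_forward_selection(f_to_enter_by_step, threshold):
    """Replay the entry decisions implied by the reported F-to-enter values."""
    selected = []
    for step in sorted(f_to_enter_by_step):
        candidates = f_to_enter_by_step[step]
        best = max(candidates, key=candidates.get)
        if candidates[best] < threshold:
            break  # stopping rule: no remaining candidate qualifies
        selected.append(best)
    return selected


selected = replay_forward_selection(F_TO_ENTER, ENTRY_THRESHOLD)
print(selected)
```

Under this assumed threshold the replay stops after two entries, in line with the note that only two predictors are kept in the model.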