Quick & Simple Introduction to Multidimensional Scaling Professor Tony Coxon


Quick & Simple Introduction to
Multidimensional Scaling

Professor Tony Coxon

Hon. Professorial Research Fellow, University of Edinburgh
See www.tonycoxon.com for information on me
See www.newmdsx.com for an information resource on MDS and NewMDSX programs/documentation
See: “The User’s Guide to MDS” and “Key Texts in MDS” (readings), Heinemann 1982
Available as PDF at £15 from newmdsx
What is Multidimensional Scaling?
A student’s definition:
“If you are interested in how certain objects relate to each other … and if you would like to present these relationships in the form of a map then MDS is the technique you need” (Mr Gawels, KUB)
A good start!
MDS is a family of models structured by D-T-M:
(DATA) the empirical information on inter-relationships between a set of “objects”/variables is given in a set of dis/similarity data
(TRANSFORMATION) which are then re-scaled (according to the permissible transformations for the data / level of measurement), in terms of
(MODEL) the assumptions of the model chosen to represent the data
MDS Solution
… to produce a SOLUTION, consisting of:
1. a CONFIGURATION, which is a
i. pattern of points representing the “objects”
ii. located in a space of a small number of dimensions (hence SSA – “Smallest-Space Analysis”)
iii. where the distances between the points represent the dis/similarities between the data-points
iv. as perfectly as possible (the imperfection/badness of fit is measured by Stress)
“Low stress is desirable; No stress is perfection”
Distances & Maps

Given a map, it’s easy to calculate the (Euclidean) distances between the points:
$$ d_{jk} = \sqrt{\sum_a (x_{ja} - x_{ka})^2} $$
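The map-to-distances direction can be sketched in a few lines of Python. This is only an illustration: the function name `euclidean_distances` and the three example points are assumptions, not taken from the slides.

```python
import math

def euclidean_distances(config):
    """Return the full matrix of Euclidean distances between the rows of config.

    config is a list of points, each a sequence of coordinates x[j][a]
    (point j, dimension a).
    """
    n = len(config)
    return [
        [math.sqrt(sum((xa - ya) ** 2 for xa, ya in zip(config[j], config[k])))
         for k in range(n)]
        for j in range(n)
    ]

# Hypothetical 2-dimensional "map" of three points (illustrative values only).
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = euclidean_distances(points)
```

The result is a symmetric matrix with zeros on the diagonal, which is the shape of data that MDS starts from.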


MDS operates the other way round:
Given the “distances” [data], find the map [configuration] which generated them
… and MDS can do so when all but ordinal information has been jettisoned (fruit of the “non-metric revolution”), even when there are missing data and in the presence of considerable “noise”/error (MDS is robust).
MDS thus provides at least
[exploratory] a useful and easily-assimilable graphic visualization of a complex data set (Tukey: “A picture is worth a thousand words”)
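The distances-to-map direction can be illustrated with a minimal sketch. Note the assumption: it uses classical (Torgerson) metric scaling, which needs full, error-free metric distances; it is a stand-in for illustration, not the non-metric procedure the slides describe. The function names and the rectangle of four points are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def classical_scaling(D, ndim=2):
    """Classical (Torgerson) scaling: recover coordinates from a distance matrix."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centred scalar products
    evals, evecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:ndim]   # indices of the ndim largest
    lam = np.clip(evals[order], 0.0, None)   # guard against tiny negative values
    return evecs[:, order] * np.sqrt(lam)

def distances(X):
    """Euclidean distances between all pairs of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical "true" map: the corners of a 4 x 3 rectangle.
true_map = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0]])
D = distances(true_map)

X = classical_scaling(D, ndim=2)   # recovered configuration
recovered = distances(X)           # distances in the recovered map
```

The recovered configuration matches the original only up to translation, rotation and reflection, so the comparison is made on the distances rather than on the coordinates.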
What is like MDS?
Related and Special-case Models:
o Metric Scalar Products Models:
  • *PRINCIPAL COMPONENTS ANALYSIS
  • FACTOR ANALYSIS (+ communalities)
o Metric and Non-Metric Ultrametric Distance, Discrete models:
  • *Hierarchical Clustering
  • *Partition Clustering (CONPAR)
  • Additive Clustering (2 and 3-way)
o Metric Chi-squared Distance Model for 2W2M and 3W data / Tables:
  • *Simple (2W2M) and Multiple (3W) Correspondence Analysis
BECAUSE OF NON-METRIC (MONOTONE) REGRESSION, MDS ALSO OFFERS ORDINAL EQUIVALENTS OF:
  • *ANOVA
  • other simple composition models … *UNICON
(All models with asterisk * exist as programs within NewMDSX)
How does MDS differ from other
Multivariate Methods?
Compared to other multivariate methods, MDS models usually:
o are distribution-free (though MLE models do exist – Ramsay’s MULTISCALE)
o make conservative (non-metric) demands on the structure of the data,
o are relatively unaffected by non-systematic missing data,
o can be used with a very wide variety of types of data:
  • direct data (pair comparisons, ratings, rankings, triads, sortings)
  • derived data (profiles, co-occurrence matrices, textual data, aggregated data)
  • measures of association/correlation etc. derived from simpler data, and
  • tables of data.
o allow a range of transformations:
  • monotonic (ordinal), linear/metric (interval), but also log-interval, power, “smoothness” – even “maximum variance non-dimensional scaling” (Shepard)
How does MDS differ from other
Multivariate Methods (2)?
Compared to other multivariate methods, MDS models also offer:
o a range of models: chiefly distance (Euclidean, but also City-block), factor/vector (scalar-products), simple composition (additive).
o Also there are hierarchies of models:
  • Similarity models: 2W1M METRIC – 3W2M INDSCAL – IDIOSCAL (honest!)
  • Preference models: Vector – distance – weighted distance – rotated, weighted (PREFMAP)
  • Procrustes rotation for putting configurations into maximum conformity, and then increasingly complex transformations: PINDIS
o the solutions are visually assimilable & readily interpretable
o the structure is not limited to dimensional information – also other simple structures (“horseshoes”, radex/circumplex, clusters, directions).
Weaknesses in MDS
There ARE any??!
o Relative ignorance of the sampling properties of stress
o prone-ness to local minima solutions (but less so, and interactive programs like PERMAP allow thousands of runs to check)
o a few forms of data/models are prone to degeneracies (especially MD Unfolding – but see new PREFSCAL in SPSS)
o difficulty in representing the asymmetry of causal models
  • though external analysis is very akin to dependent-independent modelling,
  • there are convergences with GLM in hybrid models such as CLASCAL (INDSCAL with parameterization of latent classes)
CHARACTERIZATION OF BASIC MDS
& TERMINOLOGY
Structure of MDS specifiable in terms of D-T-M
DATA (specifies input data shape and content)
DATA MATRIX INPUT:
WAY: ‘dimensionality’ of array (2, 3, 4 ...)
MODALITY: No. of distinct sets (to be represented) (1, 2, 3 …)
NB: Modality < or = Way
Common examples:
2W1M   basic models (LTM, UTM, FSM)
2W2M   rectangular, joint (conditional) mapping
3W2M   (“stack” of 2W1M) Individual Differences Scaling
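As a rough illustration of the way/mode shapes, here is a small Python sketch. It rests on an assumption the slides do not spell out: that LTM means a lower-triangular matrix (entered without its diagonal) and FSM a full symmetric matrix. The function name `ltm_to_full` and all data values are hypothetical.

```python
def ltm_to_full(ltm):
    """Expand a lower-triangular matrix (no diagonal) into a full symmetric one.

    ltm[0] holds the single entry for the pair (object 1, object 0),
    ltm[1] the entries for (2, 0) and (2, 1), and so on.
    """
    n = len(ltm) + 1
    full = [[0.0] * n for _ in range(n)]
    for j, row in enumerate(ltm, start=1):
        for k, value in enumerate(row):
            full[j][k] = value
            full[k][j] = value   # symmetry: one mode, so (j,k) equals (k,j)
    return full

# 2W1M: two ways (rows x columns), one mode (the same 4 objects on both ways).
ltm = [[2.0], [5.0, 3.0], [9.0, 7.0, 4.0]]
two_way_one_mode = ltm_to_full(ltm)

# 2W2M: two ways, two modes (e.g. 3 subjects x 4 stimuli), so rectangular.
two_way_two_mode = [[1, 2, 3, 4], [2, 1, 4, 3], [4, 3, 2, 1]]

# 3W2M: a "stack" of 2W1M matrices, one per individual (here 2 individuals).
three_way_two_mode = [two_way_one_mode, ltm_to_full([[1.0], [6.0, 2.0], [8.0, 5.0, 3.0]])]
```

In each case the number of modes cannot exceed the number of ways, as noted on the slide.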
CHARACTERIZATION OF BASIC MDS (2)
TRANSFORMATION (form or type of rescaling performed on data)
o Non-Metric / Ordinal: $\delta = M(d)$
  • Monotonic Increasing (sims) or Decreasing (dissims)
  • Order/inequality
    o Strong / Guttman: $\delta(j,k) > \delta(l,m) \rightarrow d(j,k) > d(l,m)$
    o Weak / Kruskal: $\delta(j,k) > \delta(l,m) \rightarrow d(j,k) \geq d(l,m)$
  • Equality / ties
    o Primary: $\delta(j,k) = \delta(l,m) \rightarrow d(j,k) = \text{OR} \neq d(l,m)$
    o Secondary: $\delta(j,k) = \delta(l,m) \rightarrow d(j,k) = d(l,m)$
o Metric / Linear
  • Linear: $\delta = L(d)$
  • $\delta = a + b(d)$
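The weak (Kruskal) monotone transformation can be sketched as a least-squares monotone regression using the pool-adjacent-violators idea. This is a minimal illustration under stated assumptions: the dissimilarity data contain no ties, the distances are already listed in order of increasing dissimilarity, and the function name and example values are hypothetical.

```python
def monotone_regression(values):
    """Least-squares non-decreasing fit to a sequence (pool adjacent violators).

    values are the configuration distances, listed in the rank order of the
    corresponding dissimilarity data (smallest dissimilarity first).
    """
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)   # every member of a block gets the block mean
    return fitted

# Hypothetical distances, in the order given by the dissimilarity data.
dists_in_data_order = [1.0, 3.0, 2.0, 4.0, 6.0, 5.0]
fitted = monotone_regression(dists_in_data_order)
```

Pairs that violate the data order are averaged into equal fitted values, which is exactly what the weak form permits and the strong form forbids.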
CHARACTERIZATION OF BASIC MDS (3)
MODEL: Euclidean Distance
$$ d_{jk} = \sqrt{\sum_a (x_{ja} - x_{ka})^2} $$
where $x_{ia}$ is the co-ordinate of point i on dimension a in the solution configuration X of low dimension
The basic model is Euclidean distance, but other Minkowski metrics are available, including:
City Block Model
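Both models named here are members of the Minkowski family, where the exponent p selects the metric: p = 2 gives Euclidean distance and p = 1 gives City Block. A minimal sketch follows; the function name and the example points are illustrative assumptions.

```python
def minkowski_distance(x, y, p=2.0):
    """Minkowski distance between two points; p=2 is Euclidean, p=1 City Block."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

# Hypothetical pair of points in two dimensions.
x, y = (0.0, 0.0), (3.0, 4.0)
euclid = minkowski_distance(x, y, p=2.0)   # straight-line distance
city = minkowski_distance(x, y, p=1.0)     # sum of absolute coordinate differences
```

For the same pair of points the City Block distance is never smaller than the Euclidean one.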
(Badness of) FIT: Stress
DEFINITIONS OF STRESS
Raw stress (sum of squared residuals from monotone regression):
$$ \text{Raw Stress} = \sum_{j,k} (d_{jk} - d^{o}_{jk})^2 $$
Normalising Factors:
$$ NF_1 = \sum_{j,k} d_{jk}^2 $$ (sum of squared distances)
$$ NF_2 = \sum_{j,k} (d_{jk} - \bar{d})^2 $$ (sum of squared deviations from mean distance)
STRESS - FORMULAE
$$ S_1 = \sqrt{\frac{\text{Raw Stress}}{NF_1}} \qquad S_2 = \sqrt{\frac{\text{Raw Stress}}{NF_2}} $$
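A small worked sketch of these quantities in Python. It assumes S1 and S2 are the square roots of raw stress divided by NF1 and NF2 respectively (Kruskal's stress formulae 1 and 2); the function name and the six example values are hypothetical.

```python
import math

def stress(distances, fitted):
    """Return (raw stress, S1, S2) for distances and their monotone-regression fit."""
    raw = sum((d - f) ** 2 for d, f in zip(distances, fitted))
    nf1 = sum(d ** 2 for d in distances)                # sum of squared distances
    mean_d = sum(distances) / len(distances)
    nf2 = sum((d - mean_d) ** 2 for d in distances)     # squared deviations from mean
    return raw, math.sqrt(raw / nf1), math.sqrt(raw / nf2)

# Hypothetical distances and fitted values (one value per pair of points).
d = [1.0, 3.0, 2.0, 4.0, 6.0, 5.0]
d_fit = [1.0, 2.5, 2.5, 4.0, 5.5, 5.5]
raw, s1, s2 = stress(d, d_fit)
```

Because NF2 can never exceed NF1, S2 is at least as large as S1 for the same solution, so the two values should not be compared with each other.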
Types of Analysis
INTERNAL:
If the analysis depends solely on the input data, it is termed “internal”.
EXTERNAL:
If, in addition to the input data / solution, the analysis uses information relating to the same points (but from another source), it is termed “external”.
