Information Value: the Value of Evidence

Information value:
the value of evidence
Dr David J Marsay, C.Math FIMA
A presentation to 19 ISMOR
29.08.2002
Contents
1 Introduction
2 Examples
3 Theory
4 World-view
5 Implications
6 Conclusions
Section 1
Introduction
• Information is the life-blood of military C4ISR.
• Any time we prefer one set of information to another we
implicitly ‘value’ it.
• We think we could do better:
– lessons identified.
– studies.
• Specifically needed to support UK MOD’s ARP 14
‘Battlefield picture compilation’.
Introduction
• We use P(A|B) to denote ‘the probability of A given B’
– P(A) is used to denote the [prior] probability.
• For hypotheses {H} and evidence E:
– Shannon’s ‘entropy’ is calculated from the ‘final’
probabilities, {P(H|E)}.
– Jack Good’s ‘weight of evidence’ is calculated from the
likelihoods, {P(E|H)}.
– According to Bayes’ rule, the probabilities can be
calculated from the likelihoods and ‘the priors’.
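The relationship between priors, likelihoods and 'final' probabilities can be sketched numerically. The following is a minimal illustration only: the hypothesis names and all numbers are assumptions made up for the example, not values from the presentation.

```python
import math

def posterior(priors, likelihoods):
    """Bayes' rule: P(H|E) is proportional to P(E|H) * P(H)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())  # this is P(E)
    return {h: joint[h] / total for h in joint}

def entropy_bits(probs):
    """Shannon entropy (in bits) of a distribution over hypotheses."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

# Hypothetical likelihoods P(E|H) for two hypotheses (illustrative only).
likelihoods = {"H1": 0.8, "H2": 0.2}

# Two recipients with different (hypothetical) priors see the same evidence.
priors_even = {"H1": 0.5, "H2": 0.5}
priors_sceptic = {"H1": 0.1, "H2": 0.9}

post_even = posterior(priors_even, likelihoods)
post_sceptic = posterior(priors_sceptic, likelihoods)

# Entropy is computed from the 'final' probabilities, so it differs per recipient.
entropy_even = entropy_bits(post_even)
entropy_sceptic = entropy_bits(post_sceptic)
```

The likelihoods are shared by both recipients, while the posteriors, and hence the entropies, depend on each recipient's priors.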
Section 2
Examples
Control: Suppose that a source sends accurate data to a
deterministic machine.
• Shannon’s concept does not apply. Nor does the notion of
‘priors’.
• The value of the data can be determined by valuing the
function of the machine - no fancy method needed.
• The likelihoods make sense. They are 0 or 1.
Examples
Command - ‘soft’ aspects:
• For an information artefact (e.g., an INTSUM) to represent
the same information implies that all recipients had the
same priors. Thus everyone receives everything in the
same order.
– Is this realistic?
• Alternatively, one could define some privileged ‘central’
viewpoint for which the information is defined.
– Does this fit doctrine?
– Is it helpful?
Examples
Command - ‘soft’ aspects:
• The likelihoods {P(E|H)} are a rating of the source of E.
They are thus relatively ‘objective’, ‘knowable’ and
‘shareable’.
• Likelihoods relate to current practice (reliability, accuracy).
Examples
Compilation:
• The work being reported on has looked at the relatively
‘hard’ problem of compilation, particularly ‘Battlefield picture
compilation’ under ARP 14.
• Weights of evidence can be used. (See accompanying
paper.)
• When is this reliable?
Section 3
Theory
Jack Good’s evidence:
• Likelihoods are often straightforward.
E.g., P(‘Heads’|‘Fair Coin’) = 0.5 by definition.
• Lab and field testing traditionally establish, in effect,
likelihoods.
• Surprise = -log(likelihood).
• Weight of evidence (woe) is surprise, normalised by the
prior expected surprise for the same evidence. (So that only
‘relevant detail’ counts.)
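A small sketch of these quantities follows. It assumes Good's usual log-likelihood-ratio form of weight of evidence, that is, the surprise of the evidence under the alternative minus its surprise under the hypothesis. The slide's 'normalised by prior expected surprise' wording is not reproduced here, and the biased-coin figure is a hypothetical value.

```python
import math

def surprise(likelihood):
    """Surprise = -log(likelihood), here measured in bits."""
    return -math.log2(likelihood)

def weight_of_evidence(p_e_given_h, p_e_given_alt):
    """Weight of evidence for H against the alternative, in bits.

    Assumed form: log2(P(E|H) / P(E|alt)), written as a difference of surprises.
    """
    return surprise(p_e_given_alt) - surprise(p_e_given_h)

# P('Heads'|'Fair Coin') = 0.5, so one 'Heads' carries 1 bit of surprise.
s_fair = surprise(0.5)

# Hypothetical biased coin with P('Heads') = 0.8, compared against the fair coin.
woe_biased_vs_fair = weight_of_evidence(0.8, 0.5)
```

A positive value favours the hypothesis, a negative value favours the alternative, and no priors are needed to compute either.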
Theory
Evidence is more fundamental than Shannon’s information
• Shannon’s entropy is expected surprise.
• The more useful cross-entropy is likely surprise.
• Woe supports alternative decision methods, such as
sequential testing, hypothesis testing.
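As an illustration of sequential testing, the sketch below accumulates weight of evidence one observation at a time and stops when a threshold is crossed, in the spirit of Wald's sequential probability ratio test. The coin models, the observation string and the thresholds of ±3 bits are all assumptions chosen for the example.

```python
import math

def sequential_test(observations, p_h1, p_h0, upper, lower):
    """Accumulate weight of evidence (bits) for H1 against H0.

    p_h1 and p_h0 map each possible observation to its likelihood.
    Returns (decision, total_woe, observations_used).
    """
    total = 0.0
    for n, obs in enumerate(observations, start=1):
        total += math.log2(p_h1[obs] / p_h0[obs])
        if total >= upper:
            return "accept H1", total, n
        if total <= lower:
            return "accept H0", total, n
    return "undecided", total, len(observations)

# Hypothetical likelihood tables for a fair and a biased coin.
fair = {"H": 0.5, "T": 0.5}
biased = {"H": 0.8, "T": 0.2}

decision, total, used = sequential_test(
    "HHTHHHHH", biased, fair, upper=3.0, lower=-3.0
)
```

Only likelihoods enter the running total. Independent observations are assumed, so that their weights of evidence simply add.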
Some questionable assumptions
• Shannon assumes that systems of interest are ‘Markov’.
• Shannon noted that ‘state-determined systems’ are
‘Markov’ with probability 1.
• But Smuts (e.g.) noted that evolution drives dynamical
systems to adopt synergistic ‘emergent’ structures.
• These had a priori probability 0.
• So for social systems, international relations, military
conflict ... we cannot rely on Shannon’s ‘information’.
Some questionable assumptions
• But can likelihoods be used?
• If we abandon Markov models, how are we to judge if a
given algebra of likelihoods is valid?
• We need a ‘world meta-model’ to replace Markov.
Section 4
World-view
SMUTS
(synthetic modelling of uncertain temporal systems)
• Allows one to investigate emergence within complex systems.
• Evidence of piece-wise Markov behaviour.
• Developed under the MOD CRP TGs 0, 5, 10.
[Figure: ‘Delayed Double Viper’ (BUBs2D5.5), shown at t = 0.0502 and t = 0.2005.]
Alternative ideas
• I postulate a model in which systems of interest to the
military are Markov in space-time ‘zones’, with more
interesting transitions at their boundaries.
• Thus Markov locally, but not globally.
• In essence emergence only happens when an over-adaptation is exploited. (E.g. Ashby, Piaget.)
• Thus, as long as we can learn at least as quickly, we should
be able to recognise these situations too.
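One way to picture 'Markov locally, but not globally' is to fit a first-order Markov model inside one zone and watch the surprise rise once the data cross into another. The sketch below is a toy illustration, not the SMUTS model: the two symbol sequences, the two-letter alphabet and the add-one smoothing are all assumptions.

```python
import math
from collections import Counter

def fit_transitions(seq, alphabet, smoothing=1.0):
    """Estimate first-order transition probabilities with additive smoothing."""
    counts = Counter(zip(seq, seq[1:]))
    probs = {}
    for a in alphabet:
        row_total = sum(counts[(a, b)] for b in alphabet) + smoothing * len(alphabet)
        for b in alphabet:
            probs[(a, b)] = (counts[(a, b)] + smoothing) / row_total
    return probs

def mean_surprise(seq, probs):
    """Average surprise (bits) per transition of seq under the fitted model."""
    steps = list(zip(seq, seq[1:]))
    return sum(-math.log2(probs[s]) for s in steps) / len(steps)

# Hypothetical data: a 'sticky' zone followed by an alternating zone.
zone_1 = "AAAABBBBAAAABBBBAAAABBBB"
zone_2 = "ABABABABABABABABABABABAB"

model = fit_transitions(zone_1, "AB")
inside = mean_surprise(zone_1, model)   # model matches its own zone
outside = mean_surprise(zone_2, model)  # same model applied across the boundary
```

In this toy case the model fitted to the first zone is noticeably more surprised by the second, which is the kind of signal that could mark a zone boundary.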
Supporting evidence
Applications to, and
experiences of:
• warfare
• economics
• international relations.
(My subjective view)
Reuters data for the Balkans, the 90s
[Figure: Entropy / Value aggregated monthly phase plot, Balkans April 1989 - March 1999. KEDS data from www.ukans.edu/~keds. Vertical axis: Transactional Entropy, after Shannon (0 to 1.8). Horizontal axis: RSS Value (0 to 100). Series by period: 4/89-4/91, 4/91-11/91, 11/91-10/95, 10/95-4/97, 4/97-9/97, 9/97-1/99, 1/99-3/99.]
Evidence of locally Markov behaviour
Section 5
Implications
Implications for ‘information’
Technical differences:
• The difference between the expected weight of evidence
(woe) and Shannon’s entropy is not a constant.
• Systems of interest tend to have ‘long range’ sources of
uncertainty, in addition to the ‘local’ entropy.
• We need to allow for this and ‘expect the unexpected’ to
achieve robustness.
Implications for ‘information’
Some cases where Shannon might not be appropriate
• Poor ‘local’ information.
• The ‘situation’ cannot necessarily be recognised.
• The ‘target’ is adaptable (particularly if adapting against us).
Implications for ‘information’
Typical symptoms that Shannon is inadequate:
• Mistakes often reflect a need to validate assumptions.
• Ossification, atrophy and vulnerability (Ashby / Piaget)
Implications for ‘information’
Notes:
• We can’t expect to have considered all possible hypotheses
in advance.
• However, we do know when the truth is ‘something else’
because the weights of evidence are poor for the assumed
hypotheses.
• Thus we can detect deception and ‘fixation’ (a form of self-deception).
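A rough sketch of how 'something else' might be flagged: compare the surprise actually observed under each assumed hypothesis with the surprise that hypothesis itself expects (its entropy), and raise a flag when every hypothesis is surprised well beyond its expectation. This is only one possible reading of 'the weights of evidence are poor'. The models, the data and the 0.5-bit margin are assumptions, and the check uses surprise directly rather than a likelihood ratio.

```python
import math

def mean_surprise(observations, model):
    """Average observed surprise (bits) of the data under one hypothesis."""
    return sum(-math.log2(model[o]) for o in observations) / len(observations)

def expected_surprise(model):
    """Surprise per observation that the hypothesis itself expects (its entropy)."""
    return -sum(p * math.log2(p) for p in model.values() if p > 0)

def something_else(observations, hypotheses, margin=0.5):
    """True if every assumed hypothesis is surprised well beyond its own expectation."""
    return all(
        mean_surprise(observations, m) - expected_surprise(m) > margin
        for m in hypotheses.values()
    )

# Hypothetical hypotheses: each treats symbol 'c' as very unlikely.
hypotheses = {
    "mostly_a": {"a": 0.90, "b": 0.09, "c": 0.01},
    "mostly_b": {"a": 0.09, "b": 0.90, "c": 0.01},
}

flag_unexpected = something_else("ccacbccc", hypotheses)   # dominated by 'c'
flag_expected = something_else("aaaaaaabaa", hypotheses)   # fits 'mostly_a'
```

When the data fit one of the assumed hypotheses no flag is raised. When none fits, the flag prompts a search for a hypothesis not yet considered.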
Section 6
Conclusions
• The common concepts of information assume that systems
are globally ‘simple’.
• Our systems of interest are not simple, but may be piece-wise ‘simple’.
• Jack Good’s ‘weight of evidence’ can be used to ‘bridge’
‘islands of simplicity’.
• Using ‘weight of evidence’ gives significant ‘added value’ over
using just Shannon information.