Eusprig 2011

Living with spreadsheets
Dean Buckner
Financial Services Authority
JULY 2011
AGENDA
• Recap on last year’s talk
– Why we won’t get rid of spreadsheets
• But how can we live with them?
Why we won’t get rid of spreadsheets
• The tower of Babel
• Early views on machine translation
(and why they failed)
• The computer Babel
The tower of Babel
• “And the whole earth was of one language, and of one
speech.
• “And they said, Go to, let us build us a city and a tower,
whose top may reach unto heaven; and let us make us a
name, lest we be scattered abroad upon the face of the whole
earth.
• “And the Lord said, Behold, the people is one, and they have
all one language; and this they begin to do; and now nothing
will be restrained from them, which they have imagined to do.
• “Go to, let us go down, and there confound their language,
that they may not understand one another's speech.
• “Therefore is the name of it called Babel; because the Lord
did there confound the language of all the earth.”
Machine translation
• Proposals for mechanical translators
of languages pre-date the invention of
the digital computer. The first
recognisable application was a
dictionary look-up system developed
at Birkbeck College, London in 1948.
Code breaking
• Warren Weaver had been involved in code-breaking during
the Second World War.
• A simple idea: given that humans of all nations are much the
same (in spite of speaking a variety of languages), a
document in one language could be viewed as having been
written in code.
• Once this code was broken, it would be possible to output the
document in another language.
• From this point of view, Chinese was English in code.
• “… one naturally wonders if the problem of translation could
conceivably be treated as a problem in cryptography. When I
look at an article in Russian, I say: "This is really written in
English, but it has been coded in some strange symbols. I will
now proceed to decode."”
• http://www.mt-archive.info/Weaver-1949.pdf
It failed
• US funding of machine translation research
had cost the US public $20 million by the
mid-1960s. The Automatic Language Processing
Advisory Committee (ALPAC) produced a
report on the results of the funding and
concluded that "there had been no machine
translation of general scientific text, and
none is in immediate prospect".
It failed again?
• There was renewed interest in the 1980s with
the emergence of the ‘artificial intelligence’
idea.
• But the results are still poor, at least if
Google Translate is anything to go by:
– Seinen Lebensabend verbrachte in bad kleinen,
in der Nähe seiner Geburtsstadt Wismar.
– His life was spent in small bathroom, near his
hometown of Wismar.
– (Bad Kleinen is a town near Wismar, not a small
bathroom.)
Why it is difficult
• The teacher sent the boy to the
headmaster because
– he wanted to see him
– he had been throwing stones
– he was fed up with his bad behaviour
The computerised Babel
• In the beginning was the mainframe
– Keep the ‘meaning’ of every symbol in just one place, and
have everything else inside the system point to it directly (a
‘pointer’ is simply a mechanical means of moving from one
address to another)
– Force users either to check their translation by means of a
‘compiler’ (this is for users called ‘programmers’)
– or have them enter information by means of a menu
system that allows only acceptable choices (for
common-or-garden users).
– This worked reasonably well until the 1990s
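As a rough illustration of the mainframe idea, the sketch below keeps every symbol's meaning in one dictionary and rejects any record that uses a symbol the dictionary does not define, much as a compiler would. The field names (`CCY`, `AMT`, `AMOUNT`) and the `validate` helper are hypothetical and chosen only for the example.

```python
# Hypothetical central dictionary: each symbol is defined exactly once,
# and every other part of the system refers back to it.
DATA_DICTIONARY = {
    "CCY": "ISO 4217 currency code",
    "AMT": "transaction amount in minor units",
}

def validate(record: dict) -> list[str]:
    """Return the field names in `record` that the dictionary does not define."""
    return [field for field in record if field not in DATA_DICTIONARY]

# A record that uses only defined symbols passes the check.
ok = validate({"CCY": "GBP", "AMT": 1250})

# A record from a system with its own vocabulary is caught at the boundary.
bad = validate({"CCY": "GBP", "AMOUNT": 12.50})
```

Once each specialised system carries its own dictionary, there is no single place left in which to run this check.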
The tower crumbles
• The 1980s and 1990s saw increasing
specialisation of systems
– General ledger systems
– Payment systems
– Loan systems
– Claims systems etc
• They couldn’t talk to each other 
The modern Babel
• A modern bank or insurance company
contains dozens, perhaps hundreds of
disparate systems.
• There is no ‘compiler’ to allow
communication between them
• Spreadsheets are the solution to this
communication problem
Deceptively difficult problems
• Deceptively difficult problem: a problem
whose solution seems easy
– particularly by the application of ‘technology’
• But isn’t
• As we saw, communication between systems
is incredibly difficult
– not like ‘code-breaking’ at all
• But it seems easy
– I say: "This is really written in English, but it has
been coded in some strange symbols. I will now
proceed to decode."
Apparently easy solutions (1)
• The Internet
– The Internet became embedded in popular
consciousness in the 1990s and 2000s
– The problem of sending data from one place to
another seemed to be solved
– But it didn’t solve the communication problem
– The Chinese send a letter to English speakers,
who receive it OK. But no one understands it.
Apparently easy solutions (2)
• Data warehouses
– An apparently simple solution
– Send all the data from disparate source systems into one
place (the ‘warehouse’)
– Then you have it all in one place
• But the problem remains – you have all the different
languages in one room
– And no one understands each other
– Even worse, when the translation was done on
spreadsheets, at least the users understood what was
going on
– Now nobody does
Large spreadsheet systems
• Spreadsheet systems are becoming huge
– We saw a 600-spreadsheet system last year.
That seemed big.
– Then we saw a 1,000-sheet system. That was
even bigger.
– Then we found a 9,000-sheet system. That was
awesome.
• What do we do?
Dangers of large systems
• Large spreadsheet systems are like
mainframes
– But they don’t have a central compiler
– The embedded risks are huge
Examples
• Hard-coded references passing
unchecked through many
spreadsheets
– Date, source, and type of data are
completely opaque
– Nature of transformations completely
unclear
– Location of transformations unknown
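One possible first step towards making such links visible is to pull the external workbook references out of formula text. The sketch below is a heuristic only: it assumes references written in the common Excel form `[workbook]sheet!cell`, the file and sheet names are invented for the example, and the pattern would misfire on other uses of square brackets such as structured table references.

```python
import re

# Heuristic pattern for external references such as
#   'C:\Reports\[rates.xlsx]Curves'!$B$4   or   [fx.xlsx]Spot!A1
# Captured groups: workbook name, sheet name, cell address.
EXTERNAL_REF = re.compile(r"\[([^\]]+)\]([^'!]+)'?!(\$?[A-Z]{1,3}\$?\d+)")

def external_refs(formula: str) -> list[tuple[str, str, str]]:
    """Return (workbook, sheet, cell) for each external reference found."""
    return EXTERNAL_REF.findall(formula)

# Hypothetical formulas of the kind found in a feeder sheet.
refs_a = external_refs(r"='C:\Reports\[rates.xlsx]Curves'!$B$4*1.05")
refs_b = external_refs("=[fx.xlsx]Spot!A1+[fx.xlsx]Fwd!A2")
```

A listing like this shows where the data comes from, but not its date, its type, or what has already been done to it upstream.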
Examples (continued)
• Senior management sees only
immediate source sheets
– Under a dozen seems manageable
– But they don’t see the hundreds or
thousands of sheets that are feeding the
dozen.
– This is the tip of the iceberg
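The gap between what management sees and what actually feeds a report can be sketched as a graph traversal. The example assumes the links between sheets have already been extracted into a mapping from each sheet to its direct sources; the sheet names and the `all_upstream` helper are hypothetical.

```python
from collections import deque

def all_upstream(feeds: dict[str, list[str]], start: str) -> set[str]:
    """Return every sheet that feeds `start`, directly or indirectly."""
    seen: set[str] = set()
    queue = deque(feeds.get(start, []))
    while queue:
        sheet = queue.popleft()
        if sheet in seen:
            continue  # already visited; also guards against circular links
        seen.add(sheet)
        queue.extend(feeds.get(sheet, []))
    return seen

# Hypothetical map: each sheet and the sheets it reads from.
feeds = {
    "board_report": ["summary_a", "summary_b"],
    "summary_a": ["desk_1", "desk_2"],
    "summary_b": ["desk_2", "desk_3"],
    "desk_1": ["raw_extract"],
}

direct = feeds["board_report"]                   # what management sees
upstream = all_upstream(feeds, "board_report")   # what actually feeds the report
```

In this toy map the report has two visible sources but six sheets upstream; in the systems described above the same ratio runs from a dozen to hundreds or thousands.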
Solving the problem
• [this page deliberately left blank]
Questions & Comments