The impact of disclosure control on the Special Workplace Statistics

The effect of disclosure control on the SWS data
Eileen Howes
30 September 2004
With much assistance from:
Giorgio Finella
Bill Armstrong
Summary of presentation
• Early concerns with output area matrix data
• Early comparison with Theme Table 10
• Commissioned aggregations
• Recent comparisons with commissioned table
• What can we do next?
Data Management and Analysis Group
OA matrix data quality
• Major problems with output area matrix data
• Clementswood and Hyde Park wards – differences between
Theme Table 10 (TT10) and SWS301 in means of travel to
work for all workers
Clementswood ward – workers

                 SWS301    TT10   Difference
Total workers     8,810   8,059          751
Underground         303     263           40
Car driver        4,646   4,005          641
Bus               1,567   1,554           13
Hyde Park ward – workers

                 SWS301    TT10   Difference
Total workers    17,024  16,712          312
Underground       5,739   5,298          441
Car driver        2,779   2,954         -175
Bus               1,835   1,852          -17
Why are they different?
Disclosure control applied to a large number of flows
30 OAs in Clementswood ward, all flows into each OA (about
25,000)
Problem with disappearing flows
Commissioned aggregations
Part of Table SWS301 for all OAs of workplace in London
Areas of residence aggregated to:
Wards
Districts
Inner and Outer London
GORs
Counties/former counties in East and SE
Commissioned aggregations
Tables received
Big files
Supertable format
One big mistake in the spec – no area codes
More of a challenge
Nightmare at City Hall
Export csv files from Supertable
Files with 15,200,000 records
Adding area codes
Checking data
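Adding area codes to an export of this size can be done one row at a time, so the whole file never has to be held in memory. The following is a minimal sketch only. It assumes each exported row carries an area name in a column called `area_name` and that a name-to-code lookup is available. The function name, column names and the code `X001` are hypothetical and do not describe the actual Supertable layout.

```python
import csv
import io

def add_area_codes(rows_in, rows_out, lookup,
                   name_col="area_name", code_col="area_code"):
    """Stream CSV rows, appending a code column looked up from the area name.

    Returns the sorted area names that had no code, so they can be checked.
    """
    reader = csv.DictReader(rows_in)
    fieldnames = list(reader.fieldnames) + [code_col]
    writer = csv.DictWriter(rows_out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    missing = set()
    for row in reader:
        code = lookup.get(row[name_col])
        if code is None:
            missing.add(row[name_col])   # remember names that need checking
        row[code_col] = code or ""
        writer.writerow(row)
    return sorted(missing)

# Illustrative input: two made-up areas, only one of which has a code.
source = io.StringIO("area_name,workers\nAlpha,3\nBeta,0\n")
target = io.StringIO()
unmatched = add_area_codes(source, target, {"Alpha": "X001"})
print(target.getvalue())
print("areas without a code:", unmatched)
```

Returning the unmatched names gives a starting point for the checking step, since any area left without a code shows up straight away.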
Commissioned table C0310
Finally, we have csv and SASPAC system files for:
OA of workplace in London, ward of residence
Means of travel to work
Comparisons of C0310 with OA to OA SWS data

                                   C0310   SWS301   % diff.
Residents of Clementswood ward
who are in work                    3,843    3,846      -0.1
Works at/from home                   355      357      -0.1
Travels by:
  Underground                        463      483      -4.3
  Train                              605      621      -2.6
  Bus                                424      432      -1.9
Comparisons of C0310 with OA to OA SWS data (cont.)

                                   C0310   SWS301   % diff.
  Taxi                                27       15     +44.0
  Car driver                       1,351    1,340      +0.8
  Car passenger                      145      135      +7.0
  Motor cycle                         18       27     -50.0
  Bicycle                             33       33       0.0
  On foot                            404      391      +3.2
  Other                               18       12     +33.0
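Most of the % diff. figures appear consistent with taking the difference between the two tables as a percentage of the C0310 value. This is inferred from the printed numbers and is not stated on the slide. A short sketch of that calculation, using three rows from the table:

```python
def pct_diff(c0310, sws301):
    """Difference between the two tables as a percentage of the C0310 value."""
    return (c0310 - sws301) / c0310 * 100

# (C0310, SWS301) pairs taken from the comparison table.
rows = {
    "Car driver": (1351, 1340),
    "Motor cycle": (18, 27),
    "On foot": (404, 391),
}
for label, (c, s) in rows.items():
    print(f"{label:12} {pct_diff(c, s):+.1f}")
```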
Comparisons of C0310 with SWS data
Origin
Ward of residence (Clementswood in the London Borough of Redbridge)
C0310: aggregated by ONS, rounded once
SWS: OAs aggregated by the user, thousands of rounded numbers
Destination
London (sum of all flows into all OAs in London – quite a lot of small
numbers rounded differently)
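The contrast between a total rounded once and a total built from thousands of separately rounded numbers can be illustrated with a small simulation. This is a sketch under a stated assumption: counts of 1 or 2 are adjusted independently to 0 or 3, with the chance of rounding up chosen so that the expected value is unchanged. The presentation does not give the actual adjustment rules, and the flows below are made up.

```python
import random

def adjust_small_cell(count, rng):
    """Assumed rule: counts of 1 or 2 become 0 or 3; other counts are unchanged.

    The chance of rounding up is count/3, so the expected value is preserved.
    """
    if count in (1, 2):
        return 3 if rng.random() < count / 3 else 0
    return count

def aggregate_then_adjust(flows, rng):
    """Sum the true flows first, then adjust the single total (rounded once)."""
    return adjust_small_cell(sum(flows), rng)

def adjust_then_aggregate(flows, rng):
    """Adjust every small flow separately, then sum the adjusted values."""
    return sum(adjust_small_cell(f, rng) for f in flows)

rng = random.Random(2004)
true_flows = [1] * 300 + [2] * 100 + [5] * 20   # made-up flows into many OAs
print("true total:            ", sum(true_flows))
print("aggregated by supplier:", aggregate_then_adjust(true_flows, rng))
print("aggregated by user:    ", adjust_then_aggregate(true_flows, rng))
```

Under this assumed rule the supplier's total is exact whenever the true total is 3 or more, while the user's total carries the accumulated error of every adjusted small flow.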
Commissioned Table C0310
Clementswood ward to OAs in London

Number of flows in table      24,140
Number of zero flows          23,502
Number of non-zero flows         638
Flows with value of 3            424 (66%)
Flows with value of 4             21
Flows with value of 5             10
Flows with value of 6             70
Flows with value of 7             17
Flows with value of 8             12 etc
Comparisons of C0310 with SWS301
Clementswood ward to OAs in London
Number of non-zero flows C0310 638
Number of non-zero flows SWS301 595
But:
Only 412 flows are common to both tables
182 out of 595 flows in SWS301 are not in C0310 (31%)
226 out of 638 flows in C0310 are not in SWS301 (35%)
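A comparison of this kind reduces to set operations on the non-zero flows of each table. The sketch below assumes each table is held as a mapping from an (origin, destination) pair to a count. The function name, area labels and values are made up for illustration.

```python
def compare_flows(table_a, table_b):
    """Compare two flow tables keyed by (origin, destination).

    Zero flows are ignored. Returns the keys common to both tables and
    the keys found in only one of them.
    """
    a = {k for k, v in table_a.items() if v > 0}
    b = {k for k, v in table_b.items() if v > 0}
    return {"common": a & b, "only_a": a - b, "only_b": b - a}

# Made-up flows from one ward to four OAs (labels are placeholders).
c0310 = {("W1", "OA1"): 3, ("W1", "OA2"): 4, ("W1", "OA3"): 0, ("W1", "OA4"): 3}
sws301 = {("W1", "OA1"): 3, ("W1", "OA2"): 0, ("W1", "OA3"): 3, ("W1", "OA4"): 6}

result = compare_flows(c0310, sws301)
same_value = [k for k in result["common"] if c0310[k] == sws301[k]]
print("common:", len(result["common"]))
print("only in C0310:", len(result["only_a"]))
print("only in SWS301:", len(result["only_b"]))
print("common with the same value:", len(same_value))
```

The last line covers the further question of how many common flows carry the same value in both tables.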
Comparisons of C0310 with SWS301
• 214 flows value 4+ on C0310
• 63 of these are not on SWS301
• 29% not on SWS301
• Values of these 63 are NOT all multiples of 3 (some are 4, 5, 7, 8 or 10)
Comparisons of C0310 with SWS301
• 217 flows of 4+ on SWS301
• 66 of these flows are not on C0310
• 30% not on C0310
• Values of these 66 are all multiples of 3
Comparisons of C0310 with SWS301
• Of the 151 common flows (on both C0310 and SWS301):
• Only 28 (19%) have the same value in both tables
• Some are very different
Comparisons of C0310 with SWS301
• Fewer than 5% of non-zero flows have the same value in
each table
• 95% have different values or are missing from one of the tables
And if you thought the ward to ward data would be better:

Ward to ward SWS
Total number of flows in data              2,123,432
Flows which appear in only 1 table           294,053
Flows which appear in only 2 tables          351,352
Flows which appear in only 3 tables          335,690
Flows which appear in only 4 tables          259,260
Flows which appear in only 5 tables          222,628
Flows which appear in all 6 tables           660,449
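A tally like this can be produced by counting, for every flow, how many of the tables hold a non-zero value for it. The sketch below assumes each table is a mapping from an (origin, destination) pair to a count. The function name and the three small tables are made up. The final lines check that the figures on the slide add up to the stated total.

```python
from collections import Counter

def tables_per_flow(tables):
    """Tally how many flows appear (non-zero) in exactly 1, 2, ... tables."""
    appearances = Counter()
    for table in tables:
        for key, value in table.items():
            if value > 0:
                appearances[key] += 1
    return Counter(appearances.values())

# Made-up example with three tables and three ward-to-ward flows.
t1 = {("A", "B"): 3, ("A", "C"): 3, ("B", "C"): 0}
t2 = {("A", "B"): 6, ("A", "C"): 0, ("B", "C"): 0}
t3 = {("A", "B"): 3, ("A", "C"): 3, ("B", "C"): 3}
print(dict(tables_per_flow([t1, t2, t3])))

# Figures from the slide: number of tables -> number of flows.
slide_counts = {1: 294053, 2: 351352, 3: 335690, 4: 259260, 5: 222628, 6: 660449}
total = sum(slide_counts.values())
print("total flows:", total)
print(f"share appearing in all 6 tables: {slide_counts[6] / total:.1%}")
```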
Is this data fit for purpose?
• No
So what can we do next?
More of this type of analysis
Lobby for separate class of user
Public sector/academic users
1920 Census Act