Using Mobile Phone Meta Data For
National Statistics
An introduction
May Offermans, Martijn Tennekes, Alex Priem, Shirley Ortega and Nico Heerschap
Content
1 Data Sources
‐ Event Data Records (EDR)
‐ Customer databases
2 Privacy and processing
3 Results
‐ Applications in statistics
• Daytime population
• Tourism
4 Conclusions
Source
Call Detail Records/ Event Data Detail Records
Call Detail Records can contain many variables, such as:
– the phone number of the subscriber originating the call (calling party)
– the phone number receiving the call (called party)
– the starting time of the call (date and time)
– the call duration
– the billing phone number that is charged for the call
– the identification of the telephone exchange or equipment writing the record
– a unique sequence number identifying the record
– the disposition or result of the call, indicating, for example, whether or not the call was connected
– call type (voice, SMS, etc.)
Each exchange manufacturer decides which information is emitted on the tickets and how it is formatted. Examples:
Timestamp
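The variables listed above can be pictured as one record type. The following is a minimal sketch only: the class name, field names and example values are hypothetical, and real records differ per exchange manufacturer, as noted above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class CallType(Enum):
    VOICE = "voice"
    SMS = "sms"


@dataclass(frozen=True)
class CallDetailRecord:
    """One CDR with the variables named on the slide (field names are illustrative)."""
    calling_party: str      # phone number originating the call
    called_party: str       # phone number receiving the call
    start_time: datetime    # date and time the call started
    duration: timedelta     # call duration
    billing_number: str     # number charged for the call
    exchange_id: str        # exchange or equipment writing the record
    sequence_number: int    # unique number identifying the record
    connected: bool         # disposition: was the call connected?
    call_type: CallType     # voice, SMS, etc.

    @property
    def end_time(self) -> datetime:
        # Derived value: start time plus duration.
        return self.start_time + self.duration


# Made-up example values, for illustration only.
example = CallDetailRecord(
    calling_party="31600000001",
    called_party="31600000002",
    start_time=datetime(2013, 5, 15, 20, 45, 0),
    duration=timedelta(seconds=150),
    billing_number="31600000001",
    exchange_id="EX-01",
    sequence_number=1,
    connected=True,
    call_type=CallType.VOICE,
)
print(example.end_time)
```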
Source – Mobile Phone Metadata
Call Detail Records/ Event Data Detail Records
– About 4 billion Event Data/Detail Records per month from 6-7 million users, containing information on:
‐ Antenna location
‐ Time indicator
‐ Incoming or outgoing
‐ Technology information (data, SMS, call, dual/UMTS)
‐ Roaming (foreign devices)
– Customer database (number of unique foreign callers per month)
Applications under research
‐ Daytime population
‐ Mobility, of which tourism
‐ Safety
‐ Demographics
‐ Border traffic
‐ Economic activity
‐ Disaster management or safety planning
‐ Use of public services
‐ Sociology (calling patterns)
‐ Health
‐ Population
Source: Vodafone/SN
Privacy & Process (1)
– Problems with big data
‐ Dynamic data source that keeps on growing
‐ Daily change of antenna locations (4G)
‐ Software
‐ Transporting data
‐ Security issues
‐ Privacy
‐ Costs
Privacy & Process (2)
‐ Micro data from the mobile network will be transferred to a new server system.
‐ During this process most sensitive variables are hashed or deleted.
‐ Only Mezuro has access to the process to collect aggregated anonymized data.
[Diagram: Vodafone solution, controlled by Vodafone -> anonymized aggregated data -> Mezuro]
[Diagram, processing steps from input to output: Traffic data (Events = CDRs) -> Replace User-IDs (Anonymisation – phase 1) -> Automated ‘blind’ analysis -> Aggregation & validation (Anonymisation – phase 2) -> Validated output for mobility reporting]
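The two anonymisation phases described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Vodafone/Mezuro implementation: the keyed hash (HMAC-SHA256), the secret key, the event layout `(user_id, antenna_id, hour)` and the minimum cell size are all hypothetical choices made for the example.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret, assumed to stay inside the operator's environment.
SECRET_KEY = b"hypothetical-secret-kept-by-the-operator"
# Hypothetical suppression threshold; the source does not state one.
MIN_CELL_SIZE = 15


def pseudonymise(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Phase 1: replace a user ID by a keyed hash."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate(events, min_cell_size: int = MIN_CELL_SIZE) -> dict:
    """Phase 2: count distinct devices per (antenna, hour) and
    drop cells with fewer than min_cell_size devices.

    events: iterable of (user_id, antenna_id, hour) tuples.
    """
    devices = defaultdict(set)
    for user_id, antenna_id, hour in events:
        devices[(antenna_id, hour)].add(pseudonymise(user_id))
    return {
        cell: len(ids)
        for cell, ids in devices.items()
        if len(ids) >= min_cell_size
    }


# Tiny made-up input; a low threshold is used only to show the effect.
sample = [
    ("a", "A1", 9),
    ("b", "A1", 9),
    ("a", "A1", 9),  # same device again: counted once
    ("c", "A2", 9),  # single device: cell is suppressed
]
print(aggregate(sample, min_cell_size=2))
```

Only the aggregated counts leave the function, which mirrors the idea that the analyst receives aggregated, anonymized output rather than micro data.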
Privacy & Process (3)
– Advantages
‐ Safe, quick, cheap, limits the risks, and involves no personal data
– Disadvantages
‐ Does not fit current methodological practice
• No personal data, so it cannot be linked to other personal data.
• Persons are not followed directly
• No direct weighting
Research
– ‘New’ statistics -> Daytime population
– Tourism statistics -> Inbound tourism
Results (1) - Daytime Population
Source: Vodafone/Mezuro, compiled by SN
Results (2) - Daytime Population
Municipal Personal Records Database
Source: Vodafone/Mezuro, compiled by SN
Almere: commuter town?
Tourism
Inbound tourism
Roaming data
Results (1) Tourism
– German tourists (= devices)
Source: Vodafone/Mezuro, compiled by SN
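Counting foreign devices from roaming data, as in "German tourists (= devices)", could look roughly like the sketch below. It is not the method used for the figures in this presentation. It assumes each event carries an already pseudonymised device ID and the mobile country code (MCC) of the device's home network; the event layout and function name are hypothetical. MCC 204 (Netherlands), 262 (Germany) and 268 (Portugal) are the standard codes.

```python
from collections import defaultdict
from datetime import date

# Small illustrative mapping; a real table would cover all country codes.
MCC_TO_COUNTRY = {"262": "Germany", "268": "Portugal"}


def roaming_devices_per_day(events, home_mcc: str = "204") -> dict:
    """Count distinct roaming devices per (country, day).

    events: iterable of (device_id, mcc, day) tuples, where device_id is
    assumed to be pseudonymised already and mcc is the home-network code.
    """
    seen = defaultdict(set)
    for device_id, mcc, day in events:
        if mcc == home_mcc:
            continue  # domestic device, not roaming
        country = MCC_TO_COUNTRY.get(mcc, "Other")
        seen[(country, day)].add(device_id)
    return {key: len(ids) for key, ids in seen.items()}


# Made-up events for illustration.
d = date(2013, 5, 15)
roaming_sample = [
    ("h1", "262", d),
    ("h1", "262", d),  # repeated event of the same device
    ("h2", "262", d),
    ("h3", "268", d),
    ("h4", "204", d),  # domestic, ignored
]
print(roaming_devices_per_day(roaming_sample))
```

Note that the count is of devices, not persons, which is why the slide equates tourists with devices.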
Tourism (2)
German tourists at the coast [figure series: devices, rainfall]
Source: Vodafone/Mezuro, compiled by SN
Tourism (3) Portuguese roaming
Portuguese roaming data during the 2013 UEFA Europa League final, Benfica (Portugal) - Chelsea (England)
Source: Vodafone/Mezuro, compiled by SN
Tourism (4)
Source: Vodafone/Mezuro, compiled by SN
Tourism (5) Different types of communication
Source: Vodafone/Mezuro, compiled by SN
Conclusions for tourism
– Potential
‐ Replace existing statistics and produce new statistics
‐ Smaller areas and shorter time frames
‐ Events
‐ Also, when the 24-hour limit is dropped:
• Day trips and number of overnight stays
• Flows of tourists
• Tourist-related areas
– Trends rather than volumes (benchmarking)
– Privacy issues, but also access (telecom providers)
– New methodological issues/new framework (representativeness)
– Role of national statistical offices?
– Revolutionary or evolutionary?