Location Privacy in Mobile Computing


(ECE 256: Wireless Networking and Mobile Computing)
Location Privacy in Mobile Computing
Topics:
Pseudonyms, CliqueCloak, Path Confusion, CacheCloak …
1
Context
Better localization technology
+
Pervasive wireless connectivity
=
Location-based pervasive applications
2
Location-Based Apps
 For Example:




GeoLife shows grocery list on phone when near WalMart
Micro-Blog allows querying people at a desired region
Location-based ad: Phone gets coupon at Starbucks
…
 Location expresses context of user
 Facilitating content delivery
It’s as if location is the IP address for content
3
Double-Edged Sword
While location drives this new class of applications,
it also violates the user’s privacy
The sharper the location, the richer the app, the deeper the violation
4
The Location Based Service Workflow
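[Figure: LBS workflow. The Client sends a request to the Server: “Retrieve all available services in client’s location”. The Server forwards it to the local service, the LBS (Location Based Service) Database: “Retrieve all available services in location”. The Database replies to the Server, and the Server replies to the Client.]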
5
The Location Anonymity Problem
[Figure: Client, Server, and LBS (Location Based Service) Database. Request: “Retrieve all bus lines from location to address”. Outcome: Privacy Violated.]
6
Double-Edged Sword
Moreover, a range of apps are PUSH-based.
They require continuous location information
Phone detected at Starbucks, PUSH a coffee coupon
Phone located on highway, query traffic congestion
7
Location Privacy
 Problem:
Continuous location exposure
a serious threat to privacy
 Research:
Preserve privacy without
sacrificing the quality of
continuous loc. based apps
8
Just Call Yourself “Freddy”
 Pseudonyms
 Effective only when location exposure is infrequent
 Else, spatio-temporal patterns are enough to deanonymize
… think breadcrumbs
John
Leslie
Jack
Susan
Alex
Romit’s Office
9
A Customizable k-Anonymity Model for Protecting
Location Privacy
Paper by:
B. Gedik, L. Liu
(Georgia Tech)
Slides adapted from: Tal Shoseyov
10
Location Anonymity
“A message from a client to a database is called
location anonymous if the client’s identity cannot be
distinguished from other users based on the client’s
location information.”
Database
11
k-Anonymity
“A message from a client to a database is called
location k-anonymous if the client cannot be identified
by the database based on the client’s location from other
k-1 clients.”
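The definition can be read as a membership count. Below is a minimal sketch, assuming the location the database sees is an axis-aligned box and that every client's true position is known to whoever runs the check. The names `Box`, `clients_in_box`, and `is_location_k_anonymous` are illustrative and do not come from the paper.

```python
from typing import Dict, Set, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), an assumed shape


def clients_in_box(locations: Dict[str, Point], box: Box) -> Set[str]:
    """Return the ids of all clients whose location falls inside the box."""
    x_min, y_min, x_max, y_max = box
    return {
        cid
        for cid, (x, y) in locations.items()
        if x_min <= x <= x_max and y_min <= y <= y_max
    }


def is_location_k_anonymous(
    sender: str, locations: Dict[str, Point], box: Box, k: int
) -> bool:
    """True if the sender shares the reported box with at least k-1 other clients."""
    inside = clients_in_box(locations, box)
    return sender in inside and len(inside) >= k
```

With clients at (1, 1), (2, 2) and (9, 9), the box (0, 0, 3, 3) is 2-anonymous for the first client but not 3-anonymous.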
12
Implementation of Location Anonymity
1. Client sends plain request data to the server
2. Server transforms the message by “anonymizing” the location data in the message
3. Server sends the “anonymized” message to the Database
4. Database executes request according to the received anonymous data
5. Database replies to server with compiled data
6. Server forwards data to the client
13
Implementation of Location k-Anonymity
Spatial Cloaking – Setting a range of space to be a single box, where all clients located within the range are said to be in the “same location”.
Temporal Cloaking – Setting a time interval, where all the clients in a specific location sending a message in that time interval are said to have sent the message in the “same time”.
[Figure: axes x, y, t]
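One simple way to picture both kinds of cloaking is to snap exact values to a fixed grid. The sketch below assumes a uniform grid cell size and a fixed interval length; `spatial_cloak`, `temporal_cloak`, `cell`, and `interval` are hypothetical names chosen for illustration.

```python
import math
from typing import Tuple


def spatial_cloak(x: float, y: float, cell: float) -> Tuple[float, float, float, float]:
    """Replace an exact point with the grid cell (x_min, y_min, x_max, y_max) containing it."""
    x_min = math.floor(x / cell) * cell
    y_min = math.floor(y / cell) * cell
    return (x_min, y_min, x_min + cell, y_min + cell)


def temporal_cloak(t: float, interval: float) -> Tuple[float, float]:
    """Replace an exact timestamp with the interval (t_start, t_end) containing it."""
    t_start = math.floor(t / interval) * interval
    return (t_start, t_start + interval)
```

Two clients at (12.3, 47.9) and (18.0, 41.0) both map to the same 10 x 10 cell, so the database sees them in the "same location".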
14
Implementation of Location k-Anonymity
Spatial-Temporal Cloaking –
Setting a range of space and a time interval, where all the messages sent by clients inside the range in that time interval are treated as sent from the same location at the same time. This spatial and temporal area is called a “cloaking box”.
[Figure: cloaking box in (x, y, t) space]
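A cloaking box can be sketched as the tightest (x, y, t) box around a group of messages. This is an illustration under that assumption, not the paper's construction; `Message`, `CloakingBox`, and `bounding_cloaking_box` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Message:
    sender: str
    x: float
    y: float
    t: float


@dataclass(frozen=True)
class CloakingBox:
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    t_min: float
    t_max: float

    def contains(self, m: Message) -> bool:
        """True if the message's position and timestamp fall inside the box."""
        return (
            self.x_min <= m.x <= self.x_max
            and self.y_min <= m.y <= self.y_max
            and self.t_min <= m.t <= self.t_max
        )


def bounding_cloaking_box(messages: Iterable[Message]) -> CloakingBox:
    """Smallest axis-aligned space-time box that covers every message."""
    msgs = list(messages)
    if not msgs:
        raise ValueError("need at least one message")
    return CloakingBox(
        x_min=min(m.x for m in msgs), x_max=max(m.x for m in msgs),
        y_min=min(m.y for m in msgs), y_max=max(m.y for m in msgs),
        t_min=min(m.t for m in msgs), t_max=max(m.t for m in msgs),
    )
```

Every message in the group then reports the same box instead of its own (x, y, t).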
15
Previous solutions
M. Gruteser, D. Grunwald (2003) – For a fixed k value, the server finds the smallest area around the client’s location that potentially contains k-1 other clients, and monitors that area over time until such k-1 clients are found.
Drawback: Fixed anonymity value for all clients (service dependent)
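The fixed-k idea can be illustrated with a deliberately simplified search: grow a square around the client until it holds k clients. This is an assumption-laden stand-in, not the algorithm from the 2003 paper; `smallest_square_with_k`, `step`, and `max_half_width` are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def smallest_square_with_k(
    client: Point,
    others: List[Point],
    k: int,
    step: float = 1.0,
    max_half_width: float = 1000.0,
) -> Optional[Box]:
    """Grow a square centred on the client until it holds the client plus k-1 others."""
    cx, cy = client
    half = 0.0
    while half <= max_half_width:
        count = 1 + sum(
            1 for (x, y) in others if abs(x - cx) <= half and abs(y - cy) <= half
        )
        if count >= k:
            return (cx - half, cy - half, cx + half, cy + half)
        half += step
    return None  # not enough clients nearby: the request would have to wait
```

Note that k is a single value for every client here, which is exactly the drawback named above.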
16
The CliqueCloak Approach
Definitions:
Constraint Area:
For a message m, a constraint area is a
spatial-temporal area that contains the
sending client’s location. A client sends
his message along with a constraint area
to prevent the database from sending the
client useless information on locations
outside the constraint area.
[Figure: message m with k=3 in the x-y plane]
17
The CliqueCloak Approach
Definitions:
Cloaking Box:
A spatial and temporal area assigned to a transformed message. A valid cloaking box must comply with the following conditions:
1. The client that sent the message m is located in the cloaking box.
2. The number of different clients inside the cloaking box must be at least m.k (the anonymity level of the message).
3. The cloaking box must be included inside the message’s constraint area.
[Figure: messages m1, m2, m4 with anonymity levels k=2 and k=3 in the x-y plane]
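The three conditions translate directly into a validity check. The sketch below assumes boxes are axis-aligned in (x, y, t) and that each client has one known position; `Box3` and `is_valid_cloaking_box` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Point3 = Tuple[float, float, float]  # (x, y, t)


@dataclass(frozen=True)
class Box3:
    lo: Point3  # lower corner (x, y, t)
    hi: Point3  # upper corner (x, y, t)

    def contains_point(self, p: Point3) -> bool:
        return all(l <= v <= h for l, v, h in zip(self.lo, p, self.hi))

    def contains_box(self, other: "Box3") -> bool:
        return self.contains_point(other.lo) and self.contains_point(other.hi)


def is_valid_cloaking_box(
    box: Box3,
    sender: str,
    k: int,
    constraint_area: Box3,
    positions: Dict[str, Point3],
) -> bool:
    """Check the three validity conditions for a message with anonymity level k."""
    # 1. The sender is located in the cloaking box.
    if not box.contains_point(positions[sender]):
        return False
    # 2. At least k different clients are inside the cloaking box.
    clients_inside = sum(1 for p in positions.values() if box.contains_point(p))
    if clients_inside < k:
        return False
    # 3. The cloaking box lies inside the message's constraint area.
    return constraint_area.contains_box(box)
```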
18
The CliqueCloak Approach
Definitions:
Constraint Graph:
Each mobile node is a vertex in the graph, and 2 nodes are connected iff each of them is inside the other node’s constraint area.
Approach:
An l-clique in that graph such that l ≥ mi.k for each i is mapped by the algorithm to a spatial cloaking box, where all messages in the clique will be transformed using the cloaking box, making each of the messages’ senders indistinguishable from one another.
[Figure: messages m1, m2, m3, m4 with anonymity levels k=2 and k=3 in the x-y plane]
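The constraint graph and the clique condition can be sketched with a brute-force search. This assumes each constraint area is a square of half-width `reach` centred on the sender, which is a simplification introduced here; `Msg`, `build_constraint_graph`, and `find_clique` are hypothetical names, and the exhaustive search is only practical for small neighbourhoods.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Msg:
    mid: str
    x: float
    y: float
    k: int        # requested anonymity level
    reach: float  # assumed: constraint area is a square of this half-width

    def covers(self, other: "Msg") -> bool:
        """True if the other sender lies inside this message's constraint area."""
        return abs(other.x - self.x) <= self.reach and abs(other.y - self.y) <= self.reach


def build_constraint_graph(msgs: List[Msg]) -> Dict[str, Set[str]]:
    """Connect two messages iff each sender is inside the other's constraint area."""
    graph: Dict[str, Set[str]] = {m.mid: set() for m in msgs}
    for a, b in combinations(msgs, 2):
        if a.covers(b) and b.covers(a):
            graph[a.mid].add(b.mid)
            graph[b.mid].add(a.mid)
    return graph


def find_clique(msgs: List[Msg], graph: Dict[str, Set[str]], target: str) -> Optional[List[str]]:
    """Find the smallest clique containing `target` whose size l satisfies l >= k for every member."""
    by_id = {m.mid: m for m in msgs}
    neighbours = sorted(graph[target])
    for size in range(max(1, by_id[target].k), len(neighbours) + 2):
        for others in combinations(neighbours, size - 1):
            members = [target, *others]
            is_clique = all(b in graph[a] for a, b in combinations(members, 2))
            if is_clique and all(by_id[m].k <= size for m in members):
                return members
    return None
```

Because every sender in a clique lies inside every other member's rectangular constraint area, the bounding box of the clique's senders also lies inside each of those areas.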
19
The CliqueCloak Algorithm
The Idea:
• For each plain message, along with its constraints and anonymity level k, we try to find a k-clique in the constraint graph and convert the clique into a spatial cloaking box.
• Each of the messages inside the cloaking box will be converted into transformed messages, replacing their location values with the cloaking box.
• We try finding a cloaking box for a message until it expires (exceeds its temporal constraints).
[Figure: cloaking box in (x, y, t) space]
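The last two steps, transforming the messages of a clique and giving up on expired ones, might look like the sketch below. It assumes the clique has already been found and that each message carries a `deadline` field; `PlainMessage`, `TransformedMessage`, `transform_clique`, and `drop_expired` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass(frozen=True)
class PlainMessage:
    sender: str
    x: float
    y: float
    deadline: float  # assumed field: latest time at which cloaking is still useful
    payload: str


@dataclass(frozen=True)
class TransformedMessage:
    box: Box      # the cloaking box replaces the exact location
    payload: str  # the sender id is deliberately not carried over


def drop_expired(pending: List[PlainMessage], now: float) -> List[PlainMessage]:
    """Keep only messages whose temporal constraint has not been exceeded."""
    return [m for m in pending if m.deadline >= now]


def transform_clique(clique: List[PlainMessage]) -> List[TransformedMessage]:
    """Replace every member's location with the clique's shared bounding box."""
    xs = [m.x for m in clique]
    ys = [m.y for m in clique]
    box = (min(xs), min(ys), max(xs), max(ys))
    return [TransformedMessage(box, m.payload) for m in clique]
```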
20
Does CliqueCloak solve the location
privacy problem?
Any further concerns? Doubts?
21
Add Noise
 K-anonymity and CliqueCloak
 Convert location to a space-time bounding box
 Ensure K users in the box
 Location Apps reply to boxed region
 Issues
 Poor quality of location
 Degrades in sparse regions
 Not real-time
[Figure: bounding box around “You” with K=4]
22
Confuse Via Mixing
 Path intersections are an opportunity for privacy
 If users intersect in space-time, cannot say who is who later
 Issues
 Users may not be collocated in space and time
 Mixing still possible at the expense of delay
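Whether two paths can mix depends on meeting in both space and time. A minimal sketch, assuming each path is a list of (cell, timestamp) samples and that `max_dt` is the largest time gap still counted as "together"; all names are hypothetical.

```python
from typing import List, Tuple

Cell = Tuple[int, int]
Sample = Tuple[Cell, float]  # (grid cell, timestamp)


def space_time_intersections(
    path_a: List[Sample], path_b: List[Sample], max_dt: float
) -> List[Tuple[Cell, float, float]]:
    """Return (cell, t_a, t_b) for every shared cell visited within max_dt of each other."""
    hits = []
    for cell_a, t_a in path_a:
        for cell_b, t_b in path_b:
            if cell_a == cell_b and abs(t_a - t_b) <= max_dt:
                hits.append((cell_a, t_a, t_b))
    return hits
```

Two paths that cross the same cell 280 seconds apart do not mix with `max_dt=30`; they only mix if one user's update is delayed enough to raise the tolerated gap, which is the delay cost noted above.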
23
Existing solutions seem to suggest:
Privacy and Quality of Localization (QoL)
is a zero-sum game
Need to sacrifice one to gain the other
24
Ideal Solution Should
Break away from this tradeoff
Target:
Spatial accuracy
Real-time updates
Privacy guarantees
Even in sparse populations
Another Idea: CacheCloak
25
CacheCloak Intuition
Exploit mobility prediction to create
future path intersections
Users’ paths are like crossroads of breadcrumbs
App knows precise locations, but doesn’t know the user
26
CacheCloak
 Assume trusted privacy provider
 Reveal location to CacheCloak
 CacheCloak exposes anonymized location to Loc. App
Loc. App1
Loc. App2
Loc. App3
Loc. App4
CacheCloak
27
CacheCloak Design
 User A drives down path P1
 P1 is a sequence of locations
 CacheCloak has cached response for each location
 User A takes a new turn (no cached response)
 CacheCloak predicts mobility
 Deliberately intersects predicted path with another path P2
 Exposes predicted path to application
 Application replies to queries for entire path
 CacheCloak always knows user’s current location
 Forwards cached responses for that precise location
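The cache-or-predict flow above can be sketched as follows. This is an illustration only: straight-line extrapolation stands in for real mobility prediction, locations are grid cells, and `CacheCloakSketch`, `query_app`, `predict_path`, and `lookup` are hypothetical names, not the system's API.

```python
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]


class CacheCloakSketch:
    def __init__(self, query_app: Callable[[Cell], str], max_steps: int = 50):
        self.cache: Dict[Cell, str] = {}  # responses already fetched for known paths
        self.query_app = query_app        # the untrusted location-based application
        self.max_steps = max_steps        # assumed cap on prediction length (>= 1)

    def predict_path(self, start: Cell, heading: Cell) -> List[Cell]:
        """Extrapolate from `start` until the path meets an already cached path."""
        path: List[Cell] = []
        cell = start
        for _ in range(self.max_steps):
            if cell in self.cache:
                break  # predicted path now intersects an existing path
            path.append(cell)
            cell = (cell[0] + heading[0], cell[1] + heading[1])
        return path

    def lookup(self, cell: Cell, heading: Cell) -> str:
        """Serve from cache; on a miss, query the app for the whole predicted path."""
        if cell not in self.cache:
            for c in self.predict_path(cell, heading):
                self.cache[c] = self.query_app(c)
        return self.cache[cell]
```

The application only ever sees queries for whole path segments that end on another cached path, while the user receives the response for the exact current cell.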
28
CacheCloak Design
 Adversary confused
 New path intersects paths P1 and P2 (crossroads)
 Not clear where the user came from or turned onto
Example …
29
Example
30
Benefits
 Real-time
 Response ready when user
arrives at predicted location
 High QoL
 Responses can be specific to location
 Overhead on the wired backbone (caching helps)
 Entropy guarantees
 Entropy increases at traffic intersections
 In low regions, desired entropy possible via false branching
 Sparse population
 Can be handled with dummy users
31
Quantifying Privacy
 City converted into grid of small squares (pixels)
 Users are located at a pixel at a given time
 Each pixel associated with 8x8 matrix
 Element (x, y) = probability that user enters x and exits y
 Probabilities diffuse
 At intersections
 Over time
[Figure: a pixel, labeled x and y]
 Privacy = entropy
$E_{\text{user}} = -\sum_{i \in \text{pixels}} p_i \log p_i$
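The entropy measure is straightforward to compute. The sketch below assumes base-2 logarithms (bits), which the slide does not specify, and normalizes the input in case the probabilities do not sum exactly to one; `location_entropy` is a hypothetical name.

```python
import math
from typing import Iterable


def location_entropy(probs: Iterable[float]) -> float:
    """Entropy -sum(p_i * log2(p_i)) over the pixels where the user might be."""
    values = [p for p in probs if p > 0]
    total = sum(values)
    if total <= 0:
        raise ValueError("need at least one positive probability")
    return -sum((p / total) * math.log2(p / total) for p in values)
```

A user pinned to one pixel has entropy 0; a user equally likely to be in any of four pixels has entropy 2 bits.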
32
Diffusion
 Probability of user’s presence diffuses
 Diffusion gradient computed based on history
 i.e., what fraction of users take right turn at this intersection
Time t1
Time t2
Time t3
Road
Intersection
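One diffusion step can be sketched as pushing probability mass along historical turn fractions. This assumes the road network is a small graph of named nodes and that the fractions out of each node sum to one; `diffuse_step` and `transitions` are hypothetical names.

```python
from typing import Dict


def diffuse_step(
    probs: Dict[str, float], transitions: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Move each node's probability to its successors, weighted by turn fractions."""
    nxt: Dict[str, float] = {}
    for node, p in probs.items():
        # A node with no recorded transitions keeps its probability in place.
        for dest, frac in transitions.get(node, {node: 1.0}).items():
            nxt[dest] = nxt.get(dest, 0.0) + p * frac
    return nxt
```

If 70% of past users went straight and 30% turned right at an intersection, a user known to be at that intersection at time t1 is spread 0.7 / 0.3 over the two roads at time t2.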
33
Evaluation
 Trace based simulation
 VanetMobiSim + US Census Bureau trace data
 Durham map with traffic lights, speed limits, etc.
6km x 6km
10m x 10m pixel
1000 cars
 Vehicles follow Google map paths
 Performs collision avoidance
34
Results
 High average entropy
 Quite insensitive to user density (good for sparse regions)
 Minimum entropy reasonably high
35
Results
 Per-user entropy
 Increases quickly over time
 No user is starved of location privacy
36
Issues and Limitations
 CacheCloak overhead
 Application replies to lots of queries
 However, overhead on wired infrastructure
 Caching reduces this overhead significantly
 CacheCloak assumes same, indistinguishable query
 Different queries can deanonymize
 Need more work
 Per-user privacy guarantee not yet supported
 Adaptive branching & dummy users
37
Closing Thoughts
Two nodes may intersect in space but not in time
Mixing not possible, without sacrificing timeliness
Mobility prediction creates space-time intersections
Enables virtual mixing in future
38
Closing Thoughts
CacheCloak
Implements the prediction and caching function
Significant entropy attained
even under sparse population
Spatio-temporal accuracy
remains uncompromised
39
Final Take Away
Chasing a car is easier on highways …
Much harder in Manhattan crossroads
CacheCloak tries to turn a highway into
a virtual Manhattan
… Well, sort of …
40
Questions?
41
 Emerging trends in content distribution
 Content delivered to a location / context
 As opposed to a destination address
 Thus, “location” is a key driver of content delivery
IP address : Internet
=
Location : CDN
 New wave of applications
43
Example
44
Location Privacy
 Problem:
Continuous location exposure
deprives user of her privacy.
45
Location Frequency
 Some location apps are reactive / infrequent
 E.g., List Greek restaurants around me now (PULL)
 But, many emerging apps are proactive
 E.g., Phone detected at Starbucks, PUSH a coffee coupon
Proactive apps require
continuous location
Opportunity for Big Bro to track you
over space and time
47
Categorizing Apps
 Some location apps are reactive
 You ask, App answers
 E.g., Pull all Greek restaurants around your location
 But, many emerging apps are proactive
 E.g., Phone detected at Starbucks, PUSH a coffee coupon
Proactive apps require
continuous location
49