Map-Matching for Outdoor Tracking Guobin (Jacky) Shen [email protected] Microsoft Research Asia Mobile Location Sensing Tutorial, Mobisys’13.
Applications
• Navigation
• Personal location diary:
  – Keep track of trips and location-based notes
• Traffic estimation from crowd/fleet sources
• Package and trash tracking
  – Low-power embedded chip
• Etc.

What is map matching?
• Map matching is the process of aligning a sequence of observed user positions with the road network on a digital map.
• Essentially, a map-constrained filtering process
[Figure: raw sensor data → location samples → matching engine (with map data) → trajectory]

Typical sensors and techniques
• Sensors:
  – GPS
  – WiFi
  – Cellular
  – IMU (accelerometer, compass, gyroscope)
• Techniques:
  – Hidden Markov Model
[Figure: relative energy consumption of GPS, WiFi, IMU and Cellular; accuracy axis marked 5-10m, 75m, 150m, 300m]

A trivial task when using GPS?
• Seemingly trivial
  – GPS accuracy: < 10m
  – “Nearest road” algorithm
• But this does not always hold.
  – Sampling frequency:
    • Continuous location acquisition is unaffordable
    • GPS is energy hungry
  – Canyon effect

More challenging using other sensors
The Case of WiFi
The Case of Cellular

BRIEF INTRODUCTION TO HMM

Markov Models
• Set of states: $\{s_1, s_2, \ldots, s_N\}$
• The process moves from one state to another, generating a sequence of states: $s_{i_1}, s_{i_2}, \ldots, s_{i_k}, \ldots$
• Markov chain property:
  – The probability of each subsequent state depends only on the previous state: $P(s_{i_k} \mid s_{i_1}, s_{i_2}, \ldots, s_{i_{k-1}}) = P(s_{i_k} \mid s_{i_{k-1}})$
• To define a Markov model, the following probabilities are needed:
  – Transition probabilities: $a_{ij} = P(s_j \mid s_i)$, the probability of moving from $s_i$ to $s_j$
  – Initial probabilities: $\pi_i = P(s_i)$

Example of Markov Model
[Figure: two-state transition diagram; P(‘Rain’|‘Rain’) = 0.3, P(‘Dry’|‘Rain’) = 0.7, P(‘Rain’|‘Dry’) = 0.2, P(‘Dry’|‘Dry’) = 0.8]
• Two states: ‘Rain’ and ‘Dry’.
• Transition probabilities, e.g., P(‘Dry’|‘Rain’) = 0.7
• Initial probabilities:
  – P(‘Rain’) = 0.4, P(‘Dry’) = 0.6
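As a rough illustration of the Rain/Dry example above, the Python sketch below stores the initial and transition probabilities in dictionaries and multiplies them along a state sequence, which is what the Markov chain property permits. The names `INITIAL`, `TRANSITION` and `sequence_probability` are made up for this sketch; the four transition values are the ones shown in the example diagram.

```python
# Two-state Markov chain from the Rain/Dry example.
INITIAL = {"Rain": 0.4, "Dry": 0.6}

# TRANSITION[prev][nxt] = P(nxt | prev); each row sums to 1.
TRANSITION = {
    "Rain": {"Rain": 0.3, "Dry": 0.7},
    "Dry": {"Rain": 0.2, "Dry": 0.8},
}


def sequence_probability(states):
    """Probability of a state sequence under the Markov chain property."""
    if not states:
        raise ValueError("need at least one state")
    # Start with the initial probability of the first state...
    prob = INITIAL[states[0]]
    # ...then multiply one transition probability per consecutive pair.
    for prev, nxt in zip(states, states[1:]):
        prob *= TRANSITION[prev][nxt]
    return prob


if __name__ == "__main__":
    print(sequence_probability(["Dry", "Dry", "Rain", "Rain"]))
```

For the sequence Dry, Dry, Rain, Rain this evaluates 0.6 × 0.8 × 0.2 × 0.3, which is about 0.0288.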
Calculation of sequence probability
• By the Markov chain property, the probability of a state sequence is:
  $P(s_{i_1}, s_{i_2}, \ldots, s_{i_k}) = P(s_{i_k} \mid s_{i_1}, \ldots, s_{i_{k-1}}) \, P(s_{i_1}, \ldots, s_{i_{k-1}})$
  $= P(s_{i_k} \mid s_{i_{k-1}}) \, P(s_{i_1}, \ldots, s_{i_{k-1}}) = \cdots$
  $= P(s_{i_k} \mid s_{i_{k-1}}) \, P(s_{i_{k-1}} \mid s_{i_{k-2}}) \cdots P(s_{i_2} \mid s_{i_1}) \, P(s_{i_1})$
• Probability of the state sequence {‘Dry’, ‘Dry’, ‘Rain’, ‘Rain’}:
  P({‘Dry’, ‘Dry’, ‘Rain’, ‘Rain’}) = P(‘Rain’|‘Rain’) P(‘Rain’|‘Dry’) P(‘Dry’|‘Dry’) P(‘Dry’) = 0.3 × 0.2 × 0.8 × 0.6 = 0.0288

Hidden Markov models
• Markov model:
  – Set of states, sequence of states
  – Markov chain property
  – Transition probabilities $A$, initial probabilities $\pi$
• Hidden Markov model:
  – States are not visible, but each state randomly generates one of M observable (visible) symbols $\{v_1, v_2, \ldots, v_M\}$
• To define an HMM, additional probabilities are needed:
  – Emission/observation probabilities $B$: $b_{mi} = b_i(v_m) = P(v_m \mid s_i)$
• HMM: $M = (A, B, \pi)$

Example of Hidden Markov Model
[Figure: hidden states ‘Low’ and ‘High’ with transition probabilities 0.3, 0.7, 0.2, 0.8, each emitting ‘Rain’ or ‘Dry’]
• Two states: ‘Low’ and ‘High’ atmospheric pressure.
• Two observations: ‘Rain’ and ‘Dry’.
• Transition probabilities: same values as before
• Observation probabilities: P(‘Rain’|‘Low’) = 0.6, P(‘Dry’|‘Low’) = 0.4, P(‘Rain’|‘High’) = 0.4, P(‘Dry’|‘High’) = 0.6
• Initial probabilities: P(‘Low’) = 0.4, P(‘High’) = 0.6

Main problems using HMMs
• Given the HMM $M = (A, B, \pi)$ and the observation sequence $O = o_1 o_2 \ldots o_K$:
  – Evaluation problem: calculate the probability that the model M generated the sequence O.
  – Decoding problem: calculate the most likely sequence of hidden states $s_i$ that produced the observation sequence O.
• Learning problem: given some training observation sequences O and the general structure of the HMM (numbers of hidden and visible states), determine the HMM parameters $A$, $B$, $\pi$ that best fit the training data.

Decoding problem
• Given the HMM $M = (A, B, \pi)$ and the observation sequence $O = o_1 o_2 \ldots o_K$, find the state sequence $Q = q_1, \ldots, q_K$ that maximizes $P(Q \mid O)$, or equivalently $P(Q, O)$.
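To make the decoding problem concrete, here is a minimal Python sketch that decodes the Low/High example with the Viterbi dynamic program, which the slides that follow describe. It rests on two assumptions that are labeled in the code: that “same as before” means ‘Low’ takes the transition values of ‘Rain’ and ‘High’ those of ‘Dry’, and that P(‘Dry’|‘High’) = 0.6 so that each emission row sums to one. All names (`STATES`, `viterbi`, and so on) are chosen for illustration only.

```python
STATES = ["Low", "High"]
INITIAL = {"Low": 0.4, "High": 0.6}

# TRANSITION[prev][nxt] = P(nxt | prev).
# Assumption: Low reuses the Rain row and High the Dry row of the earlier example.
TRANSITION = {
    "Low": {"Low": 0.3, "High": 0.7},
    "High": {"Low": 0.2, "High": 0.8},
}

# EMISSION[state][obs] = P(obs | state).
# Assumption: P(Dry | High) = 0.6, so that the row sums to 1.
EMISSION = {
    "Low": {"Rain": 0.6, "Dry": 0.4},
    "High": {"Rain": 0.4, "Dry": 0.6},
}


def viterbi(observations):
    """Return (best_path, joint_probability) maximizing P(Q, O)."""
    if not observations:
        raise ValueError("need at least one observation")
    # Initialization: delta[s] = pi_s * b_s(o_1).
    delta = {s: INITIAL[s] * EMISSION[s][observations[0]] for s in STATES}
    backpointers = []
    # Forward recursion: extend the best path into each state by one step.
    for obs in observations[1:]:
        new_delta, pointers = {}, {}
        for nxt in STATES:
            best_prev = max(STATES, key=lambda p: delta[p] * TRANSITION[p][nxt])
            pointers[nxt] = best_prev
            new_delta[nxt] = (
                delta[best_prev] * TRANSITION[best_prev][nxt] * EMISSION[nxt][obs]
            )
        delta = new_delta
        backpointers.append(pointers)
    # Termination: pick the best final state, then backtrack predecessors.
    last = max(STATES, key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


if __name__ == "__main__":
    print(viterbi(["Dry", "Rain"]))
```

Under these assumptions, the observations Dry, Rain decode to the path High, High, with joint probability 0.6 × 0.6 × 0.8 × 0.4, about 0.1152. The run time grows linearly with the number of observations, whereas enumerating every state sequence grows exponentially.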
Trellis representation
[Figure: trellis with states $s_1, s_2, \ldots, s_i, \ldots, s_N$ as rows and time steps $1, \ldots, k, k+1, \ldots, K$ as columns, one observation $o_1, \ldots, o_k, o_{k+1}, \ldots, o_K$ per time step, and transitions $a_{1j}, a_{2j}, a_{ij}, a_{Nj}$ leading into state $s_j$]

Viterbi algorithm (1)
• General idea:
  – If the best path ending in $q_k = s_j$ goes through $q_{k-1} = s_i$, then it must coincide with the best path ending in $q_{k-1} = s_i$.
  [Figure: states $s_1, \ldots, s_i, \ldots, s_N$ at time k-1 connected to state $s_j$ at time k by transitions $a_{1j}$, $a_{ij}$, $a_{Nj}$]
  – To backtrack the best path, record that the predecessor of $s_j$ was $s_i$.

Viterbi algorithm (2)
• Define $\delta_k(i)$ as the maximum probability of producing the observation sequence $o_1 o_2 \ldots o_k$ when moving along any hidden state sequence $q_1 \ldots q_{k-1}$ and ending in $q_k = s_i$:
  $\delta_k(i) = \max P(q_1 \ldots q_{k-1}, q_k = s_i, o_1 o_2 \ldots o_k)$,
  where the max is taken over all possible paths $q_1 \ldots q_{k-1}$.
• Recursion formula, with $a_{ij}$ the probability of moving from $s_i$ to $s_j$:
  $\delta_k(j) = \max P(q_1 \ldots q_{k-1}, q_k = s_j, o_1 o_2 \ldots o_k) = \max_i [a_{ij} \, b_j(o_k) \, \delta_{k-1}(i)]$

Viterbi algorithm (3)
• Initialization: $\delta_1(i) = P(q_1 = s_i, o_1) = \pi_i \, b_i(o_1)$, $1 \le i \le N$.
• Forward recursion: $\delta_k(j) = \max_i [a_{ij} \, b_j(o_k) \, \delta_{k-1}(i)]$, $1 \le j \le N$, $2 \le k \le K$.
• Termination: choose the best path ending at time K, i.e., $\max_i [\delta_K(i)]$.
• Backtrack the best path.

THE CASE OF GPS
Paul Newson and John Krumm, “Hidden Markov Map Matching Through Noise and Sparseness,” ACM SIGSPATIAL GIS 2009, pp. 336-343.

A trivial task when using GPS?
• Seemingly trivial
  – GPS accuracy: < 10m
  – “Nearest road” algorithm
• But this does not always hold.
  – Sampling frequency:
    • Continuous location acquisition is unaffordable
    • GPS is energy hungry
  – Canyon effect
Many people have encountered loops in navigation when using GPS.

Three Insights
1. Correct matches tend to be nearby
2. Successive correct matches tend to be linked by simple routes
3. Some points are junk, and the best thing to do is to ignore them

Mapping to an HMM

Three Insights, Three Choices
1. Match Candidate Probabilities
2. Route Transition Probabilities
3.
“Junk” Points

Obtaining probabilities
[Figure, left panel: GPS Difference Probability (emission probability model); data histogram with a fitted Gaussian distribution; x-axis: distance between GPS and matched point (meters)]
[Figure, right panel: Distance Difference Probability (transition probability model); data histogram with a fitted exponential distribution; x-axis: abs(great circle distance - route distance) (meters)]

Match Candidate Limitation
• Don’t consider roads “unreasonably” far from the GPS point
• “Junk Points”

Route Candidate Limitation
• Route distance limit
• Absolute speed limit
• Relative speed limit

Robustness to Sparse Data
• The HMM map matcher works “perfectly” up to a 30-second sampling period
• The HMM map matcher is reasonably good up to a 90-second sampling period
[Figure: Error vs. Sampling Period; y-axis: route mismatch fraction (0 to 1); x-axis: sampling period (seconds), from 1 to 600]
Data: http://research.microsoft.com/en-us/um/people/jckrumm/MapMatchingData/data.htm

THE CASE OF WIFI
Arvind Thiagarajan et al., “VTrack: Accurate, Energy-Aware Road Traffic Delay Estimation Using Mobile Phones,” in Sensys’09, pages 85-98.

Background
• Road traffic problem:
  – Causes inefficiency, fuel waste, frustration
• Trends:
  – More cars on the road, more time wasted in traffic
• Smartphone capabilities:
  – Massive deployment
  – GPS, WiFi, and cellular localization capabilities
• The ask:
  – Accurate, energy-aware road traffic delay estimation using mobile phones?

VTrack
• VTrack is a system for travel time estimation using smartphone sensors (GPS, WiFi).
• Key challenges
  – Sensor unreliability
    • GPS outages
    • WiFi is only applicable when APs are available
  – Energy consumption
    • WiFi is more energy efficient (2x) but less accurate
    • Need to find the optimal sampling rate
Key idea: HMM map matching to produce the actual trajectories using WiFi location estimation

System Architecture
• Users with smartphones (crowdsourcing)
  – Run the VTrack reporting application
  – Report position data periodically to the server

VTrack algorithm
• Preprocess the position samples to remove outliers
  – Speed constraint (< 200 mph)
• HMM map matching
• Bad zone removal
  – Viterbi decoding confidences
[Figure: raw trace → outlier removal & interpolation → Viterbi matching → bad zone removal]

HMM Map Matching
• Hidden states: road segments
• Observables: position samples
• Emission probability:
  – Assumed to be uniform, 1/N
• Transition probabilities $p = P(S_{j,t} \mid S_{i,t-1})$:
  – If i = j, p = ε (a constant set to ≤ 1/(d_max + 1), where d_max is the maximum out-degree of any intersection).
  – If j does not start where i ends, p = 0.
  – If j does start where i ends, p = ε or 0.

Results
• HMM-based map matching
  – Robust to noise
  – Error < 10%
    • When only WiFi is used
    • With 40m Gaussian noise
• When GPS is available and accurate, periodic sampling helps with both

THE CASE OF CELLULAR
Arvind Thiagarajan et al., “Accurate, Low-Energy Trajectory Mapping for Mobile Devices,” NSDI 2011.

Problem statement
• Can we process cellular signal information to produce accurate trajectories? How accurate?

Motivation
• Push down the energy consumption
  – GPS is energy-intensive
    • Frequent GPS sampling drains the battery fast
  – Cellular consumes almost no additional energy
• But existing cellular location systems aren’t good enough to find tracks
• Existing map-matching algorithms perform poorly with cellular signals.

Key insights
• Proper preprocessing is crucial!
  – Translating raw sensor data to location stamps is a kind of preprocessing, but not necessarily a good one.
• Cellular has very low localization accuracy and introduces too much noise.
[Figure: raw sensor data → sequencing & smoothing → location samples → matching engine (with map data) → trajectory]

Overall workflow
[Figure: overall workflow. Inputs: cell tower fingerprints, sensor hints, training database, road map. Stages: grid sequencing → sequence of grids → smoothing and interpolation → sequence of coordinates → segment matching → sequence of road segments]
• Do not convert radio fingerprints to (lat, lon) coordinates and then match to a map
• Instead, first sequence GSM fingerprints on a spatial grid

Three key steps
• HMM: fingerprints to grid sequence
• Smoothing
• HMM: grid to road segments

HMM for grid sequencing
• Given a sequence of GSM fingerprints, find the most likely sequence of grid cells
  – Hidden state: grid cell
  – Observables: <CellTowerID, RSSI>
• Emission score:
  – P(Signature | Grid Cell)
• Transition score:
  – $P(GC_i \mid GC_{i-1})$
• Decoding:
  – max(Emission Score × Transition Score)

Smoothing and interpolation
• Smoothing:
  – Step 1: centroid(training points in a grid)
  – Step 2: centroid($C_g$ in a sliding window)
• Interpolation: increase the sample rate so that the minimum road segment (~30m) has one sample

HMM for road segment matching
• Hidden state: road segment
• Observables: <lat, lon>
• Emission score: $\text{Score} = e^{-d^2}$
• Transition score:
  – $P(RS_i \mid RS_{i-1})$

Add motion state and turn hints
• Binary motion states: still or moving (from the accelerometer)
• Binary turn hints: turn or no turn (from the compass)
• Consider the hints in the transition score
  – $P(RS_i \mid RS_{i-1}) \cdot P(\text{Move} \mid \text{Move Hint}) \cdot P(\text{Turn} \mid \text{Turn Hint})$

Evaluation results
• 75% precision, 2.4x better than existing approaches
• Sequencing first, before converting to <lat, lon>, is critical
• Sensor hints lead to a small improvement in system metrics, but can correct some systematic errors (e.g., kinks, looping)

THE CASE OF IMU
Santanu Guha et al., “AutoWitness: Locating and Tracking Stolen Property While Tolerating GPS and Radio Outages,” Sensys’10.
AutoWitness
• Objectives
  – Detection of theft
  – Tracking of the stolen tag
  – Pinpointing of the location
• Challenges: cost and energy
  – Hardware selection
  – Theft classifier
  – Tracking method

System Overview
[Figure: embedded tag workflow. Dormant → motion detection (vibration dosimeter) → classification of vehicular movement (accelerometer) → distance / turn estimate (accelerometer / gyro) → transfer to server (GSM / GPRS) when sufficient time and RF power are available → path reconstruction]

Turn Angle Estimate
• Gyro: single integration gives the change in attitude
• Heading angle detection:
  – Absolute value of the 1st difference
    • Thresholds $D_h$ and $D_l$
    • Activation time
  – From $D_h$ to $D_l$

Distance Estimate
• 2nd-order Butterworth filter to remove noise
• Median of 20 samples, then mean of 10 such medians
• Accelerometer double integration
  – Leverage still states to reset accelerometer drift
  – Handling curved roads: angular rotation info

HMM Map Matching
• Hidden states: (compound) road segments
• Observables: as unique to the hidden state as possible
  – Distance between successive stops
  – Total length of the segment (between successive turns)
• Initial state:
  – Assumed to be uniform, 1/N, over all intersections within radius r of the tag’s deployed position.
• Transition probabilities $p = P(S_{j,t} \mid S_{i,t-1})$:
  – Obtained using Bayesian inference
  – Jointly determined by the distance, $d$, traveled on segment j and the angle of turn, $\theta$, from segment i to j.
• Emission probabilities

Transition probability inference
• Associate estimation margins $\delta_d$ for distance and $\delta_\theta$ for turn
• For travel distance $d$, road segments intersecting within $d \pm \delta_d$ and with turn within $\theta \pm \delta_\theta$ form the candidate set $C_i$.
• Set the uniform prior probability $P_i(j) = 1/|C_i|$.
• Assume a Gaussian distribution $P_i(d \mid j) \sim N(\mu, \sigma^2)$, trained from experimental data.
• Transition probability: $P_i(j \mid d) = \dfrac{P_i(d \mid j) \, P_i(j)}{P(d)}$

Emission probability
• Cause: distance estimation error
• Let $P(y \mid x)$ denote the probability of obtaining x from distance estimation when the true distance is y.
  – $P(y \mid x) = 0$ if $|y - x| > \delta_d$ (the distance estimation margin),
  – $P(y \mid x) = 1 - \dfrac{|y - x|}{y}$ otherwise.
• Pinning to the traffic light

Evaluation

THE CASE OF CELLULAR + IMU
He Wang et al., “WheelLoc: Enabling Continuous Location Service on Mobile Phone for Outdoor Scenarios,” Infocom’13.

Motivation
• Many applications call for location service as a first-class citizen of a modern mobile OS
  – Always available, good accuracy
  – Examples: personal diary, geo-fencing, location-based reminder
  – GPS: energy hungry, long cold-start time, canyon effect
  – WiFi: energy hungry, coverage problem, low accuracy
  – Cellular: energy efficient, very low accuracy

WheelLoc
• Seeks to leverage energy-efficient IMU sensors, cellular, and the road map for a continuous location service
[Figure: raw sensor data → more advanced pre-processing → distance & direction → matching engine (with map data: road map; cell tower distance and direction) → trajectory]

Overview
[Figure: components: sensor readings, motion state detection, distance and direction, cell ID, cell tower location DB, road map, map matching, interpolation and extrapolation]

Motion State Detection
• Framework: sensor readings → feature extraction {F1, F2, F3, …} → decision tree
• Driving vs. still: slight vibration

Mobility Trace Estimation – Driving
• Speed and Distance Estimation
[Figure: acceleration a over time, with samples a1 and a2 around a large horizontal acceleration difference; heading vector = a1 – a2]

Mobility Trace Estimation – Driving
• Speed and Distance Estimation

Mobility Trace Estimation – Driving
• Direction Estimation
[Figure: phone axes x, y, z, compass and heading direction]
Calculate the direction from the magnetic field directly

Mobility Trace Estimation – Driving
• Direction Estimation
[Figure: phone axes x, y, z; heading vector; magnetic field; projection of the magnetic field]

Mobility Trace Estimation – Cycling
• Speed and Distance Estimation
[Figure: slow (1 Hz) and fast (2 Hz) cycling traces]

Mobility Trace Estimation – Cycling
• Direction Estimation
Take the right chances!

Mobility Trace Estimation – Evaluation
Distance error is within 10% most of the time.
Direction error is within 25 degrees most of the time.

Map Matching – HMM
[Figure: HMM with observations O1, O2, O3, ... and states S1, S2, S3, ...]
• States in WheelLoc: road segments on the track
• Emission: cell tower
• Transition: distance, direction

The probabilities
• Initial states: determined by cell tower coverage
• Emission probability
  – A road segment has a large intersection with the coverage of multiple cell towers
  – $P = N(\mathrm{dist}(lt, ct), R)$
• Transition probability: also Bayesian inference
  – Similar to AutoWitness
    • Prune candidates with estimated distances and turn angles
    • Error distribution of estimated distances (as the prior probability)
  – Apply the cell tower coverage constraint

System Evaluation – Location Accuracy
[Figure: location accuracy results; plot labels 376 and 30]

APPLICATION-AWARE ENERGY SMARTNESS
Kaisen Lin et al., “Energy-Accuracy Trade-off for Continuous Mobile Device Location,” Mobisys’10.

Location is important
• Social networking
  – Show the closest friends
• Mobile search/ads
  – Nearby restaurants
  – Offers for nearby products/tasks from your calendar or shopping lists
  – Nearest movie show times, events
• Track-based apps
• Context-aware apps
• But continuous accurate location is energy intensive

Observations
• Not all apps need high accuracy all the time
• Sensor accuracy varies across sensors and changes with time/space
[Figure: accuracy needed for restaurant search in Portland, OR]

A-Loc Location Service
[Figure: inputs: dynamic accuracy requirement, sensor energy model, dynamic sensor accuracy model, sensor data → sensor selection algorithm → output: location]

Sensor Accuracy Model
• Error at location x, learned as a function of 2D location
  – Used in the conditional distribution p(z|x) for the algorithm
• Characterizes the expected accuracy of each sensor in different regions
  – Only load the city around the current location to save memory
[Figure: error surface over 2D location (X1, X2)]

Sensor Energy Models
• Energy to obtain a sensor reading
  – Switching energy is included if the radio is not already on for other usage (e.g.,
WiFi power on: 115 mJ and off: 65 mJ)
  – Cold and warm start differ for GPS
[Figure: minimum and maximum energy per sensor; y-axis: energy (mJ), log scale from 1 to 100000]

Dynamic Accuracy Requirements
• The accuracy required depends on the density of interesting entities
  – Search: density of vendors
  – Ads: density of advertisers for products/tasks mined from the user’s calendar or shopping lists
    • E.g., density of movie theatres, density of competitor franchises
  – Social networks: density of nearby friends
  – Context-based device optimizations: geographical separation between home, office, malls, etc.

Accuracy Requirement
• Can be determined using known Yellow Pages/POI data for most cases
• Easy to determine if the accurate location is known
  – Need to determine the accuracy requirement using only the estimated location
    • Guaranteed to yield the correct list of nearest entities
[Figure: e.g., accuracy needed for pizza restaurant search in Portland, OR. Darker shades represent higher accuracy needed.]

A-Loc Location Service
[Figure: inputs: dynamic accuracy requirement, sensor energy model, dynamic sensor accuracy model, sensor data → sensor selection algorithm → output: location]

Sensor Selection Algorithm
• Predict the error for each modality
  – Suppose x = location (estimate) and z = observation (2D vectors)
    • If modality i were to be selected: $p(x \mid z_i) \propto p(z_i \mid x) \, p(x \mid z(t-1))$
      – $p(x \mid z_i)$: distribution of the predicted location
      – $p(z_i \mid x)$: sensor model at x
      – $p(x \mid z(t-1))$: belief of location based on past observations
    • Predicted error $e_i$: spread of the distribution $p(x \mid z_i)$
    • Take the weighted mean over the predicted $z_i$, since it is computed before sensing

Algorithm (contd.)
• $p(x \mid z(t-1))$: location belief based on prior observations
  – Obtained via any existing prediction method
    • 2nd-order HMM (considers the direction of motion)
    • Linear extrapolation in new locations (HMM not yet learned)
  – Can incorporate shared knowledge on where mobile users are more likely to be (land use data, prior locations)
• Using the error expected from each modality, select the one that meets the required error at the lowest energy

Bayesian Estimation Algorithm
[Figure: user data and global data feed the location model (prior) $p(x \mid z(t-1))$, built from past observations; combined with the sensor model $p(z_i \mid x)$ it gives the posterior $p(x \mid z_i) \propto p(z_i \mid x) \, p(x \mid z(t-1))$; the selection algorithm uses the posterior, the energy model and the accuracy constraint from the user query to choose the sensor observation that yields the location]

Summary
• Covered several representative works
• HMM is an effective map-matching method
• Proper pre-processing (and post-processing) is crucial for improved system performance
• Multi-source sensor fusion and map matching is one promising direction towards continuous and accurate location provisioning.

Major References
• Paul Newson and John Krumm, “Hidden Markov Map Matching Through Noise and Sparseness,” ACM SIGSPATIAL GIS 2009, pp. 336-343.
• Arvind Thiagarajan, Lenin Ravindranath, Katrina LaCurts, Samuel Madden, Hari Balakrishnan, Sivan Toledo, and Jakob Eriksson, “VTrack: Accurate, Energy-Aware Road Traffic Delay Estimation Using Mobile Phones,” in Sensys’09, pages 85-98.
• Arvind Thiagarajan, Lenin Ravindranath, Hari Balakrishnan, Samuel Madden, and Lewis Girod, “Accurate, Low-Energy Trajectory Mapping for Mobile Devices,” in NSDI 2011.
• Santanu Guha, Kurt Plarre, Daniel Lissner, Somnath Mitra, Bhagavathy Krishna, Prabal Dutta, and Santosh Kumar, “AutoWitness: Locating and Tracking Stolen Property While Tolerating GPS and Radio Outages,” in Sensys’10.
• Arvind Thiagarajan, James Biagioni, Tomas Gerlich, and Jakob Eriksson, “Cooperative Transit Tracking Using Smart-phones,” in Sensys’10.
• He Wang, Zhiyang Wang, Guobin Shen, Fan Li, and Feng Zhao, “WheelLoc: Enabling Continuous Location Service on Mobile Phone for Outdoor Scenarios,” in Infocom’13.
• Kaisen Lin, Aman Kansal, Dimitrios Lymberopoulos, and Feng Zhao, “Energy-Accuracy Trade-off for Continuous Mobile Device Location,” in Mobisys’10.