
Processing Sequential Sensor Data
John Krumm
Microsoft Research
Redmond, Washington USA
[email protected]


Interpret a Sequential Signal

[Figure: a 1-D signal plotted against time (seconds)]

Signal is
• Often a function of time (as above)
• Often from a sensor

Pervasive/Ubicomp Examples

Signal sources:
• Accelerometer
• Light sensor
• Gyro sensor
• Indoor location
• GPS
• Microphone
• …

Interpretations:
• Speed
• Mode of transportation
• Location
• Moving vs. not moving
• Proximity to other people
• Emotion
• …

Goals of this Tutorial

• Confidence to add sequential signal processing to your research
• Ability to assess research with simple sequential signal processing
• Know the terminology
• Know the basic techniques
  • How to implement them
  • Where they're appropriate
• Assess numerical results in an accepted way
• At least give the appearance that you know what you're talking about

Not Covering

• Regression – fit a function to data
• Classification – classify things based on measured features
• Statistical tests – determine if data support a hypothesis

Outline

• Introduction (already done!)
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results

Signal Dimensionality

1-D: $z(t)$

2-D: $\mathbf{z}(t) = \begin{pmatrix} z_1(t) \\ z_2(t) \end{pmatrix}$ (bold means vector)

[Figures: a 1-D signal plotted against time (seconds), and a 2-D signal plotted with axes $z_1$ (meters) and $z_2$ (meters)]

Sampled Signal

We cannot measure or store a continuous signal, so we take samples instead:

$[\, z(0), z(\Delta), z(2\Delta), \ldots, z((n-1)\Delta) \,] = [\, z_1, z_2, z_3, \ldots, z_n \,]$

$\Delta$ = sampling interval, e.g. 1 second, 5 minutes, …

[Figure: a sampled 1-D signal vs. time (seconds), with $\Delta$ = 0.1 seconds]

Signal + Noise

$z_i = x_i + v_i$

where $z_i$ is the measurement from a noisy sensor, $x_i$ is the actual (but unknown) value, and $v_i$ is a random number representing sensor noise.

Noise
• Often assumed to be Gaussian
• Often assumed to be zero mean
• Often assumed to be i.i.d. (independent, identically distributed)
• $v_i \sim N(0, \sigma)$ for zero-mean, Gaussian, i.i.d. noise, where $\sigma$ is the standard deviation

[Figure: a noisy 1-D signal vs. time (seconds)]
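As a minimal sketch of this measurement model in Python/NumPy (the true signal, sample count, and noise level below are made-up values, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100        # number of samples (assumed)
delta = 1.0    # sampling interval in seconds (assumed)
sigma = 3.0    # noise standard deviation (assumed)

t = np.arange(n) * delta           # sample times
x = 50 + 20 * np.sin(t / 10)       # a hypothetical true signal x_i
v = rng.normal(0.0, sigma, n)      # i.i.d. zero-mean Gaussian noise, v_i ~ N(0, sigma)
z = x + v                          # noisy measurements, z_i = x_i + v_i
```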

Running Example

• Track a moving person in (x, y)
• 1000 (x, y) measurements, $\Delta$ = 1 second

$\mathbf{z}_i = \mathbf{x}_i + \mathbf{v}_i$ (measurement vector = actual location + noise)

$\mathbf{x}_i = \begin{pmatrix} x_i \\ y_i \end{pmatrix} = (x_i, y_i)^T$

$\mathbf{v}_i = \begin{pmatrix} v_i^{(x)} \\ v_i^{(y)} \end{pmatrix} \sim \begin{pmatrix} N(0, 3) \\ N(0, 3) \end{pmatrix}$ (zero mean, standard deviation = 3 meters)

Also 10 randomly inserted outliers with $N(0, 15)$.

[Figure: "Actual Path and Measured Locations" — the actual path and noisy measurements, x (meters) vs. y (meters), with the start point and an outlier marked]

Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results

Mean Filter

• Also called a “moving average” or “box car” filter
• Apply to the x and y measurements separately

[Figure: the filtered version of a point is the mean of the points in a sliding window ending at that point (z vs. t)]

• A “causal” filter because it doesn’t look into the future
• Causes lag when values change sharply; this can be helped with decaying weights, e.g. weighting recent points more heavily
• Sensitive to outliers, i.e. one really bad point can cause the mean to take on any value
• Simple and effective (I will not vote to reject your paper if you use this technique)
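A minimal sketch of a causal mean filter in Python/NumPy (the default window of 10 matches the “10 points in each mean” used in the plots below; the function name is my own):

```python
import numpy as np

def mean_filter(z, window=10):
    """Causal moving average: each output is the mean of the current
    sample and the (window - 1) samples before it, so it never looks
    into the future."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    for i in range(len(z)):
        out[i] = z[max(0, i - window + 1):i + 1].mean()
    return out

# Apply to x and y measurements separately:
# x_filtered = mean_filter(z[:, 0])
# y_filtered = mean_filter(z[:, 1])
```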

Mean Filter

[Figures: "Actual Path and Measured Locations" and the mean-filtered path (10 points in each mean), x (meters) vs. y (meters), with the outlier marked]

• The outlier has a noticeable impact
• If only there were some convenient way to fix this …

Median Filter

[Figure: the filtered version of a point is the median (not the mean) of the points in a sliding window (z vs. t), which makes it insensitive to the value of any single point]

The median is far less sensitive to outliers than the mean:

median(1, 3, 4, 7, 1×10¹⁰) = 4
mean(1, 3, 4, 7, 1×10¹⁰) ≈ 2×10⁹
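The same sketch with the mean swapped for the median (again, the window size and function name are mine):

```python
import numpy as np

def median_filter(z, window=10):
    """Causal median filter: like the mean filter, but one bad point
    in the window can no longer drag the output arbitrarily far."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    for i in range(len(z)):
        out[i] = np.median(z[max(0, i - window + 1):i + 1])
    return out
```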

Median Filter

[Figures: "Actual Path and Measured Locations" and the median-filtered path (10 points in each median), x (meters) vs. y (meters), with the outlier marked]

The outlier has noticeably less impact.

Mean and Median Filter

[Figure: mean-filtered vs. median-filtered paths, x (meters) vs. y (meters)]

Editorial: mean vs. median — The median is almost always better to use than the mean.

Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results

Kalman Filter

• Mean and median filters assume smoothness
• The Kalman filter adds an assumption about the trajectory (the figure shows an assumed parabolic trajectory)
• Weighs the data against assumptions about the system's dynamics

(My favorite book on Kalman filtering is shown on the slide.)

Big difference #1: the Kalman filter includes (helpful) assumptions about the behavior of the measured process.

Kalman Filter

The Kalman filter separates measured variables from state variables.

Measure: $\mathbf{z}_i = \begin{pmatrix} z_i^{(x)} \\ z_i^{(y)} \end{pmatrix}$ — running example: measure the (x, y) coordinates (noisy)

Infer state: $\mathbf{x}_i = \begin{pmatrix} x_i \\ y_i \\ v_i^{(x)} \\ v_i^{(y)} \end{pmatrix}$ — running example: estimate location and velocity (velocity is never measured!)

Big difference #2: the Kalman filter can include state variables that are not measured directly.

Kalman Filter Measurements

Measurement vector is related to state vector by a matrix multiplication plus noise.

$\mathbf{z}_i = H_i \mathbf{x}_i + \mathbf{v}_i$

Running example:

$\begin{pmatrix} z_i^{(x)} \\ z_i^{(y)} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_i \\ y_i \\ v_i^{(x)} \\ v_i^{(y)} \end{pmatrix} + N(\mathbf{0}, R_i)$

$z_i^{(x)} = x_i + N(0, \sigma_r)$
$z_i^{(y)} = y_i + N(0, \sigma_r)$

• In this case, the measurements are just noisy copies of the actual location
• Makes the sensor noise explicit, e.g. GPS has a σ of around 5 meters

Sleepy eyes threat level: orange

Kalman Filter Dynamics

Insert a bias for how we think the system will change through time:

$\mathbf{x}_i = \Phi_{i-1} \mathbf{x}_{i-1} + \mathbf{w}_{i-1}$

$\begin{pmatrix} x_i \\ y_i \\ v_i^{(x)} \\ v_i^{(y)} \end{pmatrix} = \begin{pmatrix} 1 & 0 & \Delta t_i & 0 \\ 0 & 1 & 0 & \Delta t_i \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_{i-1} \\ y_{i-1} \\ v_{i-1}^{(x)} \\ v_{i-1}^{(y)} \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ N(0, \sigma_s) \\ N(0, \sigma_s) \end{pmatrix}$

$x_i = x_{i-1} + \Delta t_i \, v_i^{(x)}$ — location is standard straight-line motion

$v_i^{(x)} = v_{i-1}^{(x)} + N(0, \sigma_s)$ — velocity changes randomly (because we don't have any idea what it actually does)

Kalman Filter Ingredients

$H$ matrix: gives the measurements for a given state

$H_i = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}$

Measurement noise $N(\mathbf{0}, R_i)$: sensor noise

$\Phi$ matrix: gives the time dynamics of the state

$\Phi_i = \begin{pmatrix} 1 & 0 & \Delta t_i & 0 \\ 0 & 1 & 0 & \Delta t_i \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$

Process noise $N(\mathbf{0}, Q_i)$: uncertainty in the dynamics model

Kalman Filter Recipe

$\mathbf{x}_i^{(-)} = \Phi_{i-1} \mathbf{x}_{i-1}^{(+)}$

$P_i^{(-)} = \Phi_{i-1} P_{i-1}^{(+)} \Phi_{i-1}^T + Q_{i-1}$

$K_i = P_i^{(-)} H_i^T \left( H_i P_i^{(-)} H_i^T + R_i \right)^{-1}$

$\mathbf{x}_i^{(+)} = \mathbf{x}_i^{(-)} + K_i \left( \mathbf{z}_i - H_i \mathbf{x}_i^{(-)} \right)$

$P_i^{(+)} = \left( I - K_i H_i \right) P_i^{(-)}$

• Just plug in the measurements and go
• Recursive filter – the current time step uses the state and error estimates from the previous time step

Sleepy eyes threat level: red

Big difference #3: the Kalman filter gives an uncertainty estimate in the form of a Gaussian covariance matrix.
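A minimal sketch of this recipe in Python/NumPy for the running example (the concrete σ_s value is an assumption; and since the slides write N(0, σ) with σ as a standard deviation, the covariance matrices below use σ²):

```python
import numpy as np

def kalman_step(x, P, z, Phi, Q, H, R):
    """One predict/update cycle of the recipe above."""
    # Predict: the "minus" (prior) estimates.
    x_minus = Phi @ x
    P_minus = Phi @ P @ Phi.T + Q
    # Kalman gain.
    K = P_minus @ H.T @ np.linalg.inv(H @ P_minus @ H.T + R)
    # Update: the "plus" (posterior) estimates.
    x_plus = x_minus + K @ (z - H @ x_minus)
    P_plus = (np.eye(len(x)) - K @ H) @ P_minus
    return x_plus, P_plus

# Running-example matrices; dt and sigma_r follow the slides, sigma_s is assumed.
dt, sigma_r, sigma_s = 1.0, 3.0, 0.1
Phi = np.array([[1, 0, dt, 0],
                [0, 1, 0, dt],
                [0, 0, 1,  0],
                [0, 0, 0,  1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)
R = sigma_r**2 * np.eye(2)                   # measurement noise covariance
Q = np.diag([0, 0, sigma_s**2, sigma_s**2])  # process noise covariance
```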

Kalman Filter

Velocity model: $v_i^{(x)} = v_{i-1}^{(x)} + N(0, \sigma_s)$

• Smooth
• Tends to overshoot corners
• Too much dependence on the straight-line velocity assumption
• Too little dependence on the data

[Figure: "Kalman Filter" — the Kalman-filtered path, x (meters) vs. y (meters)]

Velocity model: $v_i^{(x)} = v_{i-1}^{(x)} + N(0, \sigma_s)$

• Hard to pick the process noise $\sigma_s$
• Process noise models our uncertainty in the system dynamics
• Here it accounts for the fact that the motion is not a straight line

"Tuning" $\sigma_s$ (by trying a bunch of values) gives a better result.

[Figure: "Kalman Filter" — untuned vs. tuned Kalman-filtered paths, x (meters) vs. y (meters)]
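A hedged sketch of that tuning loop, reusing the kalman_step function and the simulated z and x_true from the earlier sketches (the candidate values and initialization are my own):

```python
import numpy as np

def mean_tracking_error(sigma_s, z, x_true, dt=1.0, sigma_r=3.0):
    """Run the Kalman filter over all measurements and return the mean
    Euclidean error, so different sigma_s values can be compared."""
    Phi = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                    [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
    H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
    R = sigma_r**2 * np.eye(2)
    Q = np.diag([0, 0, sigma_s**2, sigma_s**2])
    x = np.array([z[0, 0], z[0, 1], 0, 0], dtype=float)  # start at first measurement
    P = 100.0 * np.eye(4)                                # broad initial uncertainty
    errors = []
    for i in range(1, len(z)):
        x, P = kalman_step(x, P, z[i], Phi, Q, H, R)
        errors.append(np.linalg.norm(x[:2] - x_true[i]))
    return np.mean(errors)

# "Tuning": try a bunch of sigma_s values and keep the best one.
# best = min([0.01, 0.03, 0.1, 0.3, 1.0, 3.0],
#            key=lambda s: mean_tracking_error(s, z, x_true))
```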

Kalman Filter

Editorial: Kalman filter — The Kalman filter was fine back in the old days. But I really prefer more modern methods that are not saddled with the Kalman filter's restriction to continuous state variables and its linearity assumptions.

Outline

• Introduction (already done!)
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results

Particle Filter

[Figure: Dieter Fox et al. — WiFi tracking in a multi-floor building]

• Multiple "particles" serve as hypotheses
• Particles move based on a probabilistic motion model
• Particles live or die based on how well they match the sensor data

Particle Filter

[Figure: Dieter Fox et al.]

• Allows multi-modal uncertainty (the Kalman filter is a unimodal Gaussian)
• Allows continuous and discrete state variables (e.g. "3rd floor")
• Allows a rich dynamic model (e.g. must follow the floor plan)
• Can be slow, especially if the state vector dimension is too large (e.g. (x, y, identity, activity, next activity, emotional state, …))

Particle Filter Ingredients

$p(\mathbf{z}_i \mid \mathbf{x}_i)$ — e.g. the measured speed (in z) will be slower if the emotional state (in x) is "tired"

• z = measurement, x = state; they are not necessarily the same
• The probability distribution of a measurement given the actual value
• Can be anything, not just a Gaussian like the Kalman filter
• But we use a Gaussian for the running example, just like the Kalman filter

[Figure: a Gaussian over $z_i$ centered on $x_i$ — for the running example, the measurement is a noisy version of the actual value]

Particle Filter Ingredients

$p(\mathbf{x}_i \mid \mathbf{x}_{i-1})$

• Probabilistic dynamics: how the state changes through time
• Can be anything, e.g.
  • Tend to go slower up hills
  • Avoid left turns
  • Attracted to Scandinavian people
• A closed form is not necessary
• Just need a dynamic simulation with a noise component
• But we use a Gaussian for the running example, just like the Kalman filter

[Figure: $\mathbf{x}_i$ is $\mathbf{x}_{i-1}$ plus a random vector]

Home Example

Rich measurement and state dynamics models

Measurements: z = ((x, y) location in the house from WiFi)ᵀ
State (what we want to estimate): x = (room, activity)

$p(\mathbf{z}_i \mid \mathbf{x}_i)$:
• p((x, y) in kitchen | in bathroom) = 0

$p(\mathbf{x}_i \mid \mathbf{x}_{i-1})$:
• p(sleeping now | sleeping previously) = 0.9
• p(cooking now | working previously) = 0.02
• p(watching TV & sleeping | *) = 0
• p(bedroom 4 | master bedroom) = 0

Particle Filter Algorithm

Start with N instances of the state vector $x_i^{(j)}$, i = 0, j = 1 … N

1. i = i + 1
2. Take a new measurement $z_i$
3. Propagate the particles forward in time with $p(x_i \mid x_{i-1})$, i.e. generate new, random hypotheses
4. Compute the importance weights $w_i^{(j)} = p(z_i \mid x_i^{(j)})$, i.e. how well does the measurement support each hypothesis?
5. Normalize the importance weights so they sum to 1.0
6. Randomly pick new particles based on the importance weights
7. Go to 1

Compute the state estimate with the
• Weighted mean (assumes unimodal), or
• Median

Sleepy eyes threat level: orange
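A minimal sketch of this algorithm for the running example (a bootstrap particle filter; the sigma_s value, the initialization, and the tiny weight floor are my own assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

def particle_filter(z, n_particles=1000, dt=1.0, sigma_r=3.0, sigma_s=0.5):
    """Bootstrap particle filter; each particle's state is (x, y, vx, vy)."""
    n = len(z)
    # Start with N state hypotheses scattered around the first measurement.
    particles = np.zeros((n_particles, 4))
    particles[:, :2] = z[0] + rng.normal(0, sigma_r, (n_particles, 2))
    estimates = np.zeros((n, 2))
    estimates[0] = z[0]
    for i in range(1, n):
        # Step 3: propagate with p(x_i | x_{i-1}) -- straight-line motion
        # with a random velocity change, as in the Kalman example.
        particles[:, 2:] += rng.normal(0, sigma_s, (n_particles, 2))
        particles[:, :2] += dt * particles[:, 2:]
        # Step 4: importance weights from the Gaussian p(z_i | x_i).
        d2 = np.sum((particles[:, :2] - z[i]) ** 2, axis=1)
        w = np.exp(-d2 / (2 * sigma_r**2)) + 1e-300  # floor avoids divide-by-zero
        # Step 5: normalize the weights so they sum to 1.0.
        w /= w.sum()
        # Step 6: resample particles in proportion to their weights.
        particles = particles[rng.choice(n_particles, n_particles, p=w)]
        # State estimate: plain mean of the resampled (equally weighted) particles.
        estimates[i] = particles[:, :2].mean(axis=0)
    return estimates
```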

Particle Filter

[Figure, repeated: Dieter Fox et al. — WiFi tracking in a multi-floor building]

• Multiple "particles" serve as hypotheses
• Particles move based on a probabilistic motion model
• Particles live or die based on how well they match the sensor data

Particle Filter Running Example

$p(\mathbf{z}_i \mid \mathbf{x}_i)$: the measurement model reflects the true, simulated measurement noise. Same as the Kalman filter in this case.

$p(\mathbf{x}_i \mid \mathbf{x}_{i-1})$: straight-line motion with a random velocity change. Same as the Kalman filter in this case.

$x_i = x_{i-1} + \Delta t_i \, v_i^{(x)}$ — location is standard straight-line motion
$v_i^{(x)} = v_{i-1}^{(x)} + N(0, \sigma_s)$ — velocity changes randomly

[Figure: "Particle Filter" — the actual path vs. particle-filter estimates with 1,000 and 1,000,000 particles, x (meters) vs. y (meters)]

Sometimes increasing the number of particles helps.

Particle Filter Resources

(References shown on the slide: a UbiComp 2004 paper, and a book — especially Chapter 1.)

Particle Filter

Editorial: Particle filter — The particle filter is wonderfully rich and expressive if you can afford the computations. Be careful not to let your state vector get too large.

Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results

Hidden Markov Model (HMM)

• Big difference from the previous methods: the states are discrete, e.g.
  • Spoken phonemes
  • {walking, driving, biking, riding bus}
  • {moving, still}
  • {cooking, sleeping, watching TV, playing game, …}

[Photo: Andrey Markov, 1856–1922 — shown first unhidden, then hidden]

(Unhidden) Markov Model

[Figure: a state diagram over walk, drive, and bus, with arrows labeled by transition probabilities (self-transitions such as 0.9 and 0.7; cross-transitions such as 0.0, 0.1, and 0.2)]

• Move to a new state (or not)
  • at every time click
  • when finished with the current state
• Transition probabilities control the state transitions

Example inspired by: UbiComp 2003

Hidden Markov Model

[Figure: the same walk/drive/bus state diagram, but now observed through an accelerometer]

We can "see" the states only via a noisy sensor.

HMM: Two Parts

Two parts to every HMM:

1) Observation probabilities $P(X_i^{(j)} \mid z_i)$ – the probability of state j given the measurement at time i
2) Transition probabilities $a_{jk}$ – the probability of a transition from state j to state k

[Figure: a trellis of initial state probabilities $P(X_0^{(j)})$, followed by alternating transition probabilities $a_{jk}$ and observation probabilities $P(X_1^{(j)} \mid z_1)$, $P(X_2^{(j)} \mid z_2)$, $P(X_3^{(j)} \mid z_3)$, …]

• Find the path that maximizes the product of the probabilities (observation & transition)
• Use the Viterbi algorithm to find that path efficiently

Smooth Results with HMM

[Figure: "Signal Strength" vs. time (sec.), with still and moving segments marked]

• The signal strength has a higher noise variance when moving → observation probabilities
• Transitions between states are relatively rare (made-up numbers) → transition probabilities:
  still → still = moving → moving = 0.99989
  still → moving = moving → still = 0.00011

Smooth Results with HMM

[Figure: a still/moving trellis with observation probabilities at each time step (e.g. still 0.4 / moving 0.6, still 0.2 / moving 0.8, still 0.9 / moving 0.1, still 0.3 / moving 0.7) and the rare transition probabilities (0.99989 staying, 0.00011 switching)]

The Viterbi algorithm finds the path with the maximum product of observation and transition probabilities.

[Figure: "Still vs. Moving Estimate" — actual, inferred, and inferred-and-smoothed-with-HMM states vs. time (seconds), 0–1000]

This results in fewer false transitions between states, i.e. smoother and slightly more accurate.
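A minimal sketch of the Viterbi algorithm for this two-state still/moving smoother (computed in log space for numerical safety; the observation-probability inputs would come from, e.g., the windowed signal-strength variance described above, and the uniform initial probabilities are my own assumption):

```python
import numpy as np

def viterbi(log_obs, log_trans, log_init):
    """log_obs[i, j] = log observation probability of state j at time i,
    log_trans[j, k] = log P(state k | state j). Returns the state path
    that maximizes the product of observation and transition probabilities."""
    n, k = log_obs.shape
    score = log_init + log_obs[0]          # best log-score ending in each state
    back = np.zeros((n, k), dtype=int)     # backpointers to the best predecessor
    for i in range(1, n):
        cand = score[:, None] + log_trans  # k x k: predecessor -> current state
        back[i] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_obs[i]
    path = [int(score.argmax())]           # trace the best path backwards
    for i in range(n - 1, 0, -1):
        path.append(int(back[i][path[-1]]))
    return path[::-1]                      # e.g. 0 = still, 1 = moving

# Transition probabilities from the slide; uniform initial probabilities assumed.
log_trans = np.log(np.array([[0.99989, 0.00011],
                             [0.00011, 0.99989]]))
log_init = np.log(np.array([0.5, 0.5]))
```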

Running Example

• Discrete states are 10,000 1 m × 1 m squares
• Observation probabilities spread as a Gaussian over nearby squares, as per the measurement noise model
• Transition probabilities go to the 8-connected neighbors:

0.011762  0.136136  0.011762
0.139640  0.401401  0.139640
0.011762  0.136136  0.011762

[Figure: "Hidden Markov Model" — the HMM-smoothed path, x (meters) vs. y (meters)]

HMM Reference

(Reference shown on the slide.)
• Good description of the Viterbi algorithm
• Also covers how to learn the model from data

Hidden Markov Model

Editorial: Hidden Markov Model — The HMM is great for certain applications when your states are discrete.

Tracking an airplane in (x, y, z) with an HMM?
• Huge state space (→ slow)
• Long dwells
• Interactions with other airplanes

Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results

Presenting Continuous Performance Results

$e_i = \lVert \hat{\mathbf{x}}_i - \mathbf{x}_i \rVert$ — the Euclidean distance between the estimated value $\hat{\mathbf{x}}_i$ and the actual value $\mathbf{x}_i$

[Figure: "Tracking Error vs. Filter" — mean and median error for the measured, mean, median, Kalman (untuned), Kalman (tuned), particle, and HMM filters]

• Plot the mean or median of the Euclidean distance error
• The median is less sensitive to error outliers

Note: Don't judge these filtering methods based on these plots. I didn't spend much time tuning the methods to improve their performance.

Presenting Continuous Performance Results

• Cumulative error distribution
• Shows how the errors are distributed
• More detailed than just a mean or median error

[Figure: "Cumulative Error Distribution" — fraction of time vs. error (meters) for the mean, median, Kalman (untuned), Kalman (tuned), particle, and HMM filters]

• 95% of the time, the particle filter gives an error of 6 meters or less (the 95th percentile error)
• 50% of the time, the particle filter gives an error of 2 meters or less (the median error)

Presenting Discrete Performance Results

Techniques like the particle filter and the HMM can classify sequential data into discrete classes.

[Table: confusion matrix of actual vs. inferred activities (sitting, standing, walking, upstairs, downstairs, elevator down, elevator up, brushing teeth); the correct-classification rates on the diagonal are 75%, 55%, 95%, 79%, 89%, 87%, 87%, and 85% respectively. From Pervasive 2006.]

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results

End

[Figures, repeated from earlier: "Actual Path and Measured Locations" (x in meters) and "Tracking Error vs. Filter" (mean and median error for the mean, median, Kalman untuned, Kalman tuned, particle, and HMM filters)]


Ubiquitous Computing Fundamentals, CRC Press, © 2010