Path-State Modeling for Time Series Anomaly Detection Matt Mahoney

Outline
• Review of time series anomaly detection
– Gecko
– Compression
– Path modeling
• Piecewise linear approximation of path
• Fast testing using state
• Experimental results on NASA valve data
Problem: How to Detect Anomalies in Time Series Data
• Normal Marotta fuel valve solenoid current (used on Space Shuttle)
• Abnormal (poppet partially blocked)
Goal
• Reduce human workload in specifying the “normal” model
• Editable rule-based model (in SCL)
• Real-time testing (1K-10K samples per second)
Manual Method
• Identify features (zero crossings, peaks…)
• Specify correct behavior using SCL rules
[Figure: Recorded Current Signature of a Known Good Valve. Valve current vs. time (seconds), with labeled phases De-Energized, Energizing, Energized, and De-Energizing.]
Gecko (Stan Salvador)
• Identify model states (parabolic segments)
– Multiple training series are averaged by dynamic time warping
• Classify points (x,dx,d2x) using RIPPER
• Construct linear state machine
• Pass/fail test result
Compression Model
[Figure: four panels: Normal, uncompressed; Abnormal, uncompressed; Normal, compressed; Abnormal, compressed]
[Figure: labels Normal 1, Normal 1 or 2, Normal 2, Abnormal]
[Figure: TEK Compression Anomaly Scores. Scores (axis 0 to 1.2) for Nor 0, Nor 1, Ab 0, and Ab 1 under GZIP, PAQ3, and RK]
Goal Evaluation
                  Manual   Gecko   Compression
Reduce workload   No       Yes     Yes
Real time         Yes      Yes     Possible
Editable model    Yes      Yes     No
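Problem with Gecko/RIPPER: State Machine May Underconstrain Model
Training
– Segment 1: x = 0, dx = 0
– Segment 2: 0 < x < 1, dx = 1
Test
– Segment 1: x = 0, dx = 0
– Segment 2: 0 < x < 1, dx = 3
State machine: State 1 → (dx > 0.5) → State 2 → Accept
• The test series is accepted even though its slope (dx = 3) differs from training (dx = 1)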
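The underconstraint can be illustrated with a small sketch. It assumes a hypothetical two-state machine whose only transition rule is the slide's dx > 0.5; the function names and sample series are illustrative and are not Gecko's actual implementation.

```python
def accepts(points, threshold=0.5):
    """Toy 2-state machine: stay in State 1 until dx > threshold, then State 2.

    Accepts if the series ends in State 2. Hypothetical stand-in for a
    learned rule, not Gecko's real state machine.
    """
    state = 1
    for _x, dx in points:
        if state == 1 and dx > threshold:
            state = 2
    return state == 2


def ramp(slope, flat=3, steps=4):
    """Segment 1: x = 0, dx = 0. Segment 2: 0 < x < 1 with dx = slope."""
    pts = [(0.0, 0.0)] * flat
    pts += [(i / steps, slope) for i in range(1, steps)]
    return pts


training = ramp(1.0)   # dx = 1 in segment 2
test = ramp(3.0)       # dx = 3 in segment 2, three times too steep

# Both series satisfy dx > 0.5, so the rule cannot tell them apart.
print(accepts(training), accepts(test))
```

Because the rule only bounds dx from below, any sufficiently steep test series reaches the accepting state.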
Path Model
[Figure: Training Path (scaled to unit cube) and Test Path (d^2 = 4), plotted as dx vs. x]
Path Model Example
[Figure: paths in (x, dx, d2x) space for Training, Normal, Too steep, and Too low series, with anomaly score]
Example TEK Results
[Figure: anomaly score for TEK 0, TEK 1, TEK 10, TEK 11, and TEK 12; panel labels include (Training) and (Normal)]
Problems with Path Modeling
• Testing is slow, O(n^2)
– Compares each of n test points to all n training points
• Model is complex (stores n points)
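The quadratic cost can be seen in a minimal brute-force sketch. It assumes simple finite differences for dx and d2x and omits the filtering and unit-cube scaling mentioned elsewhere in the talk; the function names are hypothetical.

```python
def path_points(series):
    """Map a series to (x, dx, d2x) points using finite differences (an assumption)."""
    pts = []
    for i in range(2, len(series)):
        x = series[i]
        dx = series[i] - series[i - 1]
        d2x = series[i] - 2 * series[i - 1] + series[i - 2]
        pts.append((x, dx, d2x))
    return pts


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def anomaly_scores(train_pts, test_pts):
    """Squared distance from each test point to its nearest training point.

    Every test point is compared with every training point, hence O(n^2).
    Returns (maximum, total) over the test points.
    """
    per_point = [min(sq_dist(t, p) for p in train_pts) for t in test_pts]
    return max(per_point), sum(per_point)


train = path_points([0, 0, 0, 1, 2, 3])
steep = path_points([0, 0, 0, 3, 6, 9])
print(anomaly_scores(train, train))  # a series scored against itself
print(anomaly_scores(train, steep))  # a steeper series scores higher
```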
Proposed Solution
• Piecewise linear approximation of path
– Editable (k segments, k << n)
– Faster testing, O(kn)
• State machine model (nearest segment)
– Fast testing, O(n) (same as Gecko)
– Local minima problem (same as Gecko)
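A sketch of the O(kn) test under these ideas: each test point is compared with every segment of a piecewise linear path. The point-to-segment distance is standard geometry; the function names and the use of squared distance as the score are assumptions.

```python
def seg_sq_dist(p, a, b):
    """Squared distance from point p to the segment a-b (any dimension)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Project p onto the line a-b and clamp to the segment.
        t = max(0.0, min(1.0, sum(u * v for u, v in zip(ap, ab)) / denom))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return sum((pi - ci) ** 2 for pi, ci in zip(p, closest))


def score_all_segments(vertices, test_pts):
    """Compare each of n test points with all k segments: O(kn)."""
    k = len(vertices) - 1
    per_point = [
        min(seg_sq_dist(p, vertices[i], vertices[i + 1]) for i in range(k))
        for p in test_pts
    ]
    return max(per_point), sum(per_point)


path = [(0, 0), (1, 0), (1, 1)]
print(score_all_segments(path, [(0.5, 0.0), (0.5, 0.5)]))
```

The model stored here is only the k + 1 vertices, which is what makes it small enough to edit by hand.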
Piecewise Approximation Algorithm
• Repeat n – k times
– Remove vertex with lowest cost = d·h^2
• Run time is O(n log n) using doubly linked heap
[Figure: diagram labeling h and d at a vertex]
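The removal loop can be sketched with a binary heap plus doubly linked neighbor indices. The geometry is an assumption, since the slide only labels h and d: here d is taken as the distance between a vertex's two neighbors and h as the vertex's distance from the line through them. Following the slide literally, n - k vertices are removed, leaving k vertices. All names are hypothetical.

```python
import heapq
import math


def removal_cost(prev, cur, nxt):
    """Assumed cost d * h^2 (d: neighbor-to-neighbor distance, h: height of cur)."""
    chord = [n - p for p, n in zip(prev, nxt)]
    d = math.sqrt(sum(c * c for c in chord))
    if d == 0:
        return 0.0
    rel = [c - p for p, c in zip(prev, cur)]
    t = sum(r * c for r, c in zip(rel, chord)) / (d * d)
    h2 = sum((r - t * c) ** 2 for r, c in zip(rel, chord))
    return d * h2


def simplify(points, k):
    """Remove the cheapest interior vertex until k vertices remain: O(n log n)."""
    n = len(points)
    prev = list(range(-1, n - 1))   # doubly linked list of live vertices
    nxt = list(range(1, n + 1))
    alive = [True] * n
    version = [0] * n               # used to skip stale heap entries
    heap = [(removal_cost(points[i - 1], points[i], points[i + 1]), 0, i)
            for i in range(1, n - 1)]
    heapq.heapify(heap)
    remaining = n
    while remaining > k and heap:
        _cost, ver, i = heapq.heappop(heap)
        if not alive[i] or ver != version[i]:
            continue                # entry was superseded by a newer cost
        alive[i] = False
        remaining -= 1
        p, q = prev[i], nxt[i]
        nxt[p], prev[q] = q, p      # unlink vertex i
        for j in (p, q):            # neighbors' costs changed; endpoints stay
            if 0 < j < n - 1:
                version[j] += 1
                c = removal_cost(points[prev[j]], points[j], points[nxt[j]])
                heapq.heappush(heap, (c, version[j], j))
    return [points[i] for i in range(n) if alive[i]]


print(simplify([(0, 0), (1, 0), (2, 5), (3, 0), (4, 0)], 3))
```

Lazy deletion (version counters) is one way to get the O(n log n) bound with Python's heapq, which has no decrease-key operation.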
Test k: compare to all segments
[Figure: nearest segment (0-19) along the path in x and dx, with anomaly score, for TEK0 (training), TEK3 (near normal), TEK12 (stuck poppet), and TEK16 (late release)]
Paths (not segmented)
[Figure: unsegmented paths of TEK 0, TEK 3, TEK 12, and TEK 16 in (x, dx, d2x) space]
TEK 0 approximation with k = 20 segments
Test 2: compare only to current and next segment (fails)
[Figure: four panels: TEK 0 training; TEK 3 OK; TEK 12 local minima; TEK 16 local minima]
Test 4 segments (previous, current, next 2) succeeds
[Figure: four panels: Training; OK; Skips past minimum; Transitions back]
Test 4 fails with k = 50
[Figure: four panels: Training; OK; Not complete; Delayed completion]
Test 5 (previous, current, next 2, and one random segment) succeeds
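A sketch of this state-based test: the current segment index is the state, and each test point is compared only with the previous, current, and next two segments plus one random segment, so the work per point is constant. Drawing the random segment once per test point, the seeded generator, and all names are assumptions.

```python
import random


def seg_sq_dist(p, a, b):
    """Squared distance from point p to the segment a-b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    t = 0.0 if denom == 0 else max(
        0.0, min(1.0, sum(u * v for u, v in zip(ap, ab)) / denom))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return sum((pi - ci) ** 2 for pi, ci in zip(p, closest))


def score_with_state(vertices, test_pts, seed=0):
    """Test each point against 5 candidate segments; returns (maximum, total)."""
    rng = random.Random(seed)
    k = len(vertices) - 1
    state = 0                       # index of the current segment
    per_point = []
    for p in test_pts:
        candidates = {state - 1, state, state + 1, state + 2, rng.randrange(k)}
        best, best_d = state, None
        for s in sorted(candidates):
            if 0 <= s < k:
                d = seg_sq_dist(p, vertices[s], vertices[s + 1])
                if best_d is None or d < best_d:
                    best, best_d = s, d
        state = best                # move to the nearest candidate segment
        per_point.append(best_d)
    return max(per_point), sum(per_point)


path = [(0, 0), (1, 0), (1, 1), (0, 1)]
on_path = [(0.2, 0), (0.8, 0), (1, 0.5), (0.5, 1)]
print(score_with_state(path, on_path))
```

The random candidate gives the state a chance to escape a segment it is stuck on, which is the local minima failure seen with the smaller windows.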
Path Fitting (optimal if no sharp bends)
• Repeat n – k times
– Remove lowest cost vertex (cost = d·h^2)
– Move adjacent vertices by h/4 toward removed vertex
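One removal step of path fitting might look like the sketch below. Several details are assumptions not stated on the slide: h is taken as the removed vertex's distance from the line through its neighbors, each neighbor moves along the straight line toward the removed vertex, and the step is capped so a neighbor never overshoots it. The function name is hypothetical, and math.dist requires Python 3.8 or later.

```python
import math


def remove_and_fit(path, i):
    """Remove interior vertex i and move both neighbors h/4 toward it."""
    prev, cur, nxt = path[i - 1], path[i], path[i + 1]

    # h: distance from cur to the line through prev and nxt (assumed meaning).
    chord = [n - p for p, n in zip(prev, nxt)]
    rel = [c - p for p, c in zip(prev, cur)]
    d2 = sum(c * c for c in chord)
    t = 0.0 if d2 == 0 else sum(r * c for r, c in zip(rel, chord)) / d2
    h = math.sqrt(sum((r - t * c) ** 2 for r, c in zip(rel, chord)))

    def nudge(v):
        dist = math.dist(v, cur)
        if dist == 0:
            return tuple(v)
        step = min(h / 4, dist)     # never move past the removed vertex
        return tuple(a + (c - a) * step / dist for a, c in zip(v, cur))

    return path[:i - 1] + [nudge(prev), nudge(nxt)] + path[i + 2:]


fitted = remove_and_fit([(0.0, 0.0), (2.0, 4.0), (4.0, 0.0)], 1)
print(fitted)
```

Pulling the neighbors toward the removed vertex spreads the approximation error over adjacent segments instead of concentrating it where the vertex was.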
Vertex Removal vs. Path Fitting
• TEK 0 self anomaly scores
– Path fitting better for k > 50
– Vertex removal better for k < 50
        Vertex removal          Path fitting
k       Maximum    Total        Maximum    Total
200     0.000008   0.000656     0.000005   0.000350
100     0.000057   0.005802     0.000019   0.003903
50      0.000345   0.027968     0.000542   0.025327
20      0.010298   0.601229     0.015872   0.961845
Path Modeling vs. Gecko
• Data: Voltage Test 1 at 14V, 16V, 18V... to 32V
– 10 x 20K points
– 31 sets of 1-3 training files
• Gecko
– Transition threshold = 3
– Error threshold = 10 or 20
– Results: pass at 10 (P), pass at 20 (P/F) or fail
• Path Modeling
– Filter delay 2 x 50 samples per dimension
– k = 50 segments
– Test 5 (last, current, next 2, and random)
– Results: maximum and total anomaly score
Typical Results
Test file (+ = Train)        Maximum    Total
V37898 V14 T21 R00s.txt      0.041018   58.254755
V37898 V16 T21 R00s.txt      0.021778   43.696323
V37898 V18 T21 R00s.txt      0.006596   26.814669
V37898 V20 T21 R00s.txt +    0.000913   0.705107
V37898 V22 T21 R00s.txt      0.008819   48.095410
V37898 V24 T21 R00s.txt      0.006635   23.487464
V37898 V26 T21 R00s.txt +    0.000361   0.593473
V37898 V28 T21 R00s.txt      0.009032   48.236476
V37898 V30 T21 R00s.txt      0.033475   194.134671
V37898 V32 T21 R00s.txt      0.076193   448.467580
Gecko results shown: P, P/F, P, P
Gecko Summary (Stan)
• Gecko
– 1 training file: correct behavior
• 10 self: 10 P (100% correct)
• 90 others: 3 P/F, 87 F (97-100% correct)
– 2-3 training files: some generalization
• 26 self: 23 P, 3 F (14V, 14V, 16V) (88% correct)
– 14V is too different from the others
• 22 “between”: 8 P, 6 P/F, 8 F (36-63% correct)
• 162 others: 1 P/F, 161 F (99-100% correct)
Path Model Summary
• Anomaly score proportional to training-test difference (correct)
• Multiple training sets: no generalization (expected)
Run Time Performance
• Tested on data set 1 (218 x 20K points)
– 50 training files = 10^6 samples
– 168 test files = 3.36 x 10^6 samples
• 750 MHz Duron, tsad4.cpp, g++ 2.95.2 -O
– Read and filter 10^6 points: 23 sec
– Approximate to k = 100 segments: 30 sec
– Test k: 162 sec (500 ns per point per segment)
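The per-comparison figure can be checked from the numbers above. This assumes the test compares every test sample with all k = 100 segments; the variable names are illustrative.

```python
test_samples = 3.36e6      # 168 test files x 20K points
segments = 100             # k
test_seconds = 162

comparisons = test_samples * segments
ns_per_comparison = test_seconds / comparisons * 1e9
print(round(ns_per_comparison))  # close to the quoted 500 ns
```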
Summary
                  Path Model                          Gecko
Meets all goals   Yes                                 Yes
Output            Numeric                             Pass/fail
Training speed    O(n log n)                          O(n^2) (DTW)
Test speed        O(n)                                O(n)
Parameters        Filter delay, number of segments    Transition and error thresholds
Local minima      Yes                                 Yes
Generalization    No                                  Some
Future Work
• Test path modeling with other data sets
– UCR archive, http://www.cs.ucr.edu/~eamonn/TSDMA/
– Power load profiles, http://www.delelect.com/pdfs/Del-Res.txt
• Test with multiple dimensions
• Generalization?
Thank You
Further Reading
http://cs.fit.edu/~mmahoney/nasa/