Transcript Slide 1

Introduction
• Time is an essential component of everyday life and can
act as a rich source of information for any smart home.
• We hypothesize that anomaly detection and enhanced
prediction are possible using temporal relations.
• TempAl suite:
– Identify temporal relations in smart home datasets.
– Use temporal information to aid prediction of events in a
smart home environment.
– Use temporal information to identify anomalous events in
a smart home environment.
Smart Homes: Goals
Adapt to Needs
Cost Effective and Reliable
Maximum Comfort and Security
VJ
AI@WSU © 2007
3
MavHome: Smart Home Project
• Project Unique
– Focus on entire home
• House perceives and acts
– Sensors
– Controllers for devices
– Connections to the mobile user and Internet
• Unified project incorporating varied AI
techniques, cross disciplinary with mobile
computing, databases, multimedia, and
others
Experimentation Environment
MavLAB Argus Sensor Network
– Around 100 sensors.
– Sensors include motion, device, light, pressure,
humidity and more.
The real and synthetic datasets consist of the timestamp
of each activity, the activity name and the state it is
in.
Parameter Setting
Datasets    No of Days   No of Events   No of Intervals Identified   Size of Data
Synthetic   60           8              1729                         106KB
Real        60           17             1623                         104KB
What is a temporal relation?
Food “Contains” Water
or
Water “Before” Pills
or
Food “Meets” Pills
or
Food “Contains” Water “before” Pills
“It is common to describe scenarios using time intervals
rather than time points”
- James F. Allen
Reminder Assistance
• Reminder system based on temporal relations.
Anomaly Detection
• If pills are to be taken “After” food, we can detect a violation of this ordering.
Maintenance
• If the cooker breaks down, should we call emergency services or schedule a normal repair?
Temporary Need Analysis
• If the oven is used for turkey, is there turkey at home?
Improve Prediction
• Increase prediction accuracy with association rules!
Allen’s 13 Temporal Relations
Before
After
During
Contains
Overlaps
Overlapped-By
Meets
Met-by
Starts
Started-By
Finishes
Finished-By
Equals
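The thirteen relations above can be told apart by comparing interval endpoints. The following is a minimal sketch, assuming each interval is a (start, end) pair with start < end; the function name and the upper-case labels are illustrative choices, not part of TempAl.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end), assumed to satisfy start < end


def allen_relation(x: Interval, y: Interval) -> str:
    """Return the Allen relation that holds from interval x to interval y."""
    xs, xe = x
    ys, ye = y
    if xe < ys:
        return "BEFORE"
    if ye < xs:
        return "AFTER"
    if xe == ys:
        return "MEETS"
    if ye == xs:
        return "MET-BY"
    if xs == ys and xe == ye:
        return "EQUALS"
    if xs == ys:
        return "STARTS" if xe < ye else "STARTED-BY"
    if xe == ye:
        return "FINISHES" if xs > ys else "FINISHED-BY"
    if ys < xs and xe < ye:
        return "DURING"
    if xs < ys and ye < xe:
        return "CONTAINS"
    # Only partial overlap remains at this point.
    return "OVERLAPS" if xs < ys else "OVERLAPPED-BY"


# Hypothetical times (minutes) for the Food / Water / Pills example.
food, water, pills = (0, 30), (10, 12), (30, 31)
print(allen_relation(food, water))   # Food "Contains" Water
print(allen_relation(water, pills))  # Water "Before" Pills
print(allen_relation(food, pills))   # Food "Meets" Pills
```

The times for Food, Water and Pills are invented so that the three statements from the earlier slide hold.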
Experimentation Process
• Experiment 1: Evaluate TempAl’s ability to
detect anomalies using temporal relations.
• Experiment 2: Evaluate TempAl’s event
prediction using association rules.
• Experiment 3: Evaluate TempAl’s event
prediction using temporal relations in ALZ.
Architecture
Temporal Intervals
• Process raw data to form temporal intervals.
Raw Sensor Data
Timestamp               Sensor State   Sensor ID
3/3/2003 11:18:00 AM    OFF            E16
3/3/2003 11:23:00 AM    ON             G12
3/3/2003 11:23:00 AM    ON             G11
3/3/2003 11:24:00 AM    OFF            G12
Identify Time Intervals
Date         Sensor ID   Start Time   End Time
03/02/2003   G11         01:44:00     01:48:00
03/02/2003   G19         02:57:00     01:48:00
03/02/2003   G13         04:06:00     01:48:00
03/02/2003   G19         04:43:00     01:48:00
Associated Temporal Relations
Date time               Sensor ID   Temporal Relation   Sensor ID
3/3/2003 12:00:00 AM    G12         DURING              E16
3/3/2003 12:00:00 AM    E16         BEFORE              I14
3/2/2003 12:00:00 AM    G11         FINISHESBY          G11
4/2/2003 12:00:00 AM    J10         STARTSBY            J12
Raw Sensor Data → Interval Data → Temporal Relations Data
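The step from raw sensor readings to intervals can be sketched as follows. This is an illustrative reading of the step, assuming an interval opens at a sensor's ON reading and closes at its next OFF reading; the function name, the sensor ids and the ON/OFF pairing in the sample are hypothetical.

```python
from datetime import datetime

STAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"  # matches stamps such as "3/3/2003 11:18:00 AM"


def to_intervals(readings):
    """Pair each ON reading with the next OFF reading of the same sensor.

    `readings` is an iterable of (timestamp string, sensor id, state) tuples.
    Returns a list of (sensor id, start, end) tuples in the order they close.
    """
    parsed = sorted(
        (datetime.strptime(stamp, STAMP_FORMAT), sensor, state)
        for stamp, sensor, state in readings
    )
    open_since = {}
    intervals = []
    for stamp, sensor, state in parsed:
        if state == "ON":
            open_since.setdefault(sensor, stamp)  # keep the earliest unmatched ON
        elif state == "OFF" and sensor in open_since:
            intervals.append((sensor, open_since.pop(sensor), stamp))
    return intervals


# Hypothetical readings, made up for illustration only.
readings = [
    ("3/3/2003 11:18:00 AM", "G11", "ON"),
    ("3/3/2003 11:23:00 AM", "G12", "ON"),
    ("3/3/2003 11:23:00 AM", "G11", "OFF"),
    ("3/3/2003 11:24:00 AM", "G12", "OFF"),
]
intervals = to_intervals(readings)
for sensor, start, end in intervals:
    print(sensor, start.time(), end.time())
```

An OFF reading with no preceding ON is ignored in this sketch; a real pipeline would need a policy for such unmatched readings.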
Experiment 1: Temporal relations
based anomaly detection
• Step 1: Collect the temporal relations based data.
• Step 2: Identify the most frequently occurring events in
the data.
• Step 3: Use the frequency of temporal relations to
calculate the probability of the observed event.
• Step 4: Flag observed events that have a sufficiently low
probability as anomalous.
Find Frequent Events: The Apriori Algorithm
Example
Database D
Day          Events
23/10/2005   TV, Cooker, Lamp
24/10/2005   Oven, Cooker, Fan
25/10/2005   TV, Oven, Cooker, Fan
26/10/2005   Oven, Fan
Scan D → C1
Itemset    Sup.
{TV}       2
{Oven}     3
{Cooker}   3
{Lamp}     1
{Fan}      3
L1
Itemset    Sup.
{TV}       2
{Oven}     3
{Cooker}   3
{Fan}      3
C2
Itemset
{TV Oven}
{TV Cooker}
{TV Fan}
{Oven Cooker}
{Oven Fan}
{Cooker Fan}
Scan D → C2
Itemset         Sup.
{TV Oven}       1
{TV Cooker}     2
{TV Fan}        1
{Oven Cooker}   2
{Oven Fan}      3
{Cooker Fan}    2
L2
Itemset         Sup.
{TV Cooker}     2
{Oven Cooker}   2
{Oven Fan}      3
{Cooker Fan}    2
C3
Itemset
{Oven Cooker Fan}
Scan D → L3
Itemset             Sup.
{Oven Cooker Fan}   2
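The Apriori pass over Database D can be reproduced with a short sketch. It assumes a minimum support count of 2, which the slide does not state but which is consistent with the itemsets it keeps and drops; the function name is illustrative.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    candidates = [frozenset([item]) for item in {i for t in transactions for i in t}]
    frequent = {}
    k = 1
    while candidates:
        # Scan the database once to count each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that contain exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# Database D from the slide.
D = [
    {"TV", "Cooker", "Lamp"},
    {"Oven", "Cooker", "Fan"},
    {"TV", "Oven", "Cooker", "Fan"},
    {"Oven", "Fan"},
]
frequent = apriori(D, min_count=2)  # min_count=2 is an assumption
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

The prune step is what removes {TV Oven Cooker} and {TV Cooker Fan} before the final scan, leaving {Oven Cooker Fan} as the only 3-itemset candidate.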
Anomaly Detection Process
1] Calculate the evidence of occurrence with other frequent events:
$$P(Z \mid Y) = \frac{|\text{Before}(Y,Z)| + |\text{Contains}(Y,Z)| + |\text{Overlaps}(Y,Z)| + |\text{Meets}(Y,Z)| + |\text{Starts}(Y,Z)| + |\text{StartedBy}(Y,Z)| + |\text{Finishes}(Y,Z)| + |\text{FinishedBy}(Y,Z)| + |\text{Equals}(Y,Z)|}{|Y|}$$
Event sequence: X A B
$$P(B \mid A \cup X) = \frac{P(B \cap (A \cup X))}{P(A \cup X)} = \frac{P((B \cap A) \cup (B \cap X))}{P(A) + P(X) - P(A \cap X)} \quad \text{[Distributive Rule]}$$
$$= \frac{P(B \mid A)\,P(A) + P(B \mid X)\,P(X)}{P(A) + P(X) - P(A \cap X)} \quad \text{[Multiplication Rule]}$$
(The last step treats B ∩ A and B ∩ X as disjoint.)
2] Calculate the anomaly:
• $\text{Anomaly}_Z = 1 - P(Z \mid Y)$
3] Check whether the calculated anomaly is greater than or equal to (mean + 2 × St. Dev.).
If so, declare it an anomaly.
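Steps 1 and 2 can be sketched in a few lines. This is a minimal illustration, assuming the temporal relations are available as (event, relation, event) triples and that |Y| is the number of occurrences of Y; the triples, the count and the function names below are hypothetical.

```python
# The nine relations that appear in the evidence formula.
EVIDENCE_RELATIONS = {
    "BEFORE", "CONTAINS", "OVERLAPS", "MEETS", "STARTS",
    "STARTEDBY", "FINISHES", "FINISHEDBY", "EQUALS",
}


def evidence(relations, y, z, y_count):
    """Estimate P(Z|Y): evidence-relation instances from Y to Z, divided by |Y|."""
    hits = sum(
        1 for first, rel, second in relations
        if first == y and second == z and rel in EVIDENCE_RELATIONS
    )
    return hits / y_count


def anomaly_score(p_z_given_y):
    """Anomaly_Z = 1 - P(Z|Y)."""
    return 1.0 - p_z_given_y


# Hypothetical relation triples, for illustration only.
relations = [
    ("Cooker", "BEFORE", "Lamp"),
    ("Cooker", "AFTER", "Lamp"),    # AFTER is not one of the nine evidence relations
    ("Cooker", "BEFORE", "Fan"),
    ("Fan", "MEETS", "Lamp"),
]
p = evidence(relations, y="Cooker", z="Lamp", y_count=4)  # y_count=4 is assumed
print(p, anomaly_score(p))
```

A low evidence value gives a high anomaly score, which step 3 then compares against the mean + 2 × St. Dev. cut-off.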
VJ © AI LAB EECS@WSU 2007
Experimentation Results
Anomaly Detection on Real Dataset
Frequent Event   Evidence   Anomaly   Detected
J10              0.45       0.55      No
J11              0.32       0.68      No
A11              0.33       0.67      No
A15              0.24       0.76      No
A11              0.23       0.77      No
A15              0.22       0.78      No
I11              0.27       0.73      No
I14              0.34       0.66      No
Anomaly Mean                0.7
Anomaly St. Dev.            0.071764
Anomaly Cut-off Threshold   0.8435
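The summary rows of the real-dataset table can be checked directly from its Anomaly column. The sketch below uses the population standard deviation, which is the variant that appears to match the reported St. Dev.; the variable names are illustrative.

```python
from statistics import mean, pstdev

# Anomaly column of the real-dataset table.
real_anomaly = [0.55, 0.68, 0.67, 0.76, 0.77, 0.78, 0.73, 0.66]

mu = mean(real_anomaly)
sigma = pstdev(real_anomaly)   # population standard deviation
cutoff = mu + 2 * sigma        # mean + 2 * St. Dev.

# An event is flagged when its anomaly value reaches the cut-off.
flags = [value >= cutoff for value in real_anomaly]

print(round(mu, 4), round(sigma, 6), round(cutoff, 4))
print(flags)
```

None of the eight values reaches the cut-off, which agrees with the "No" entries in the Detected column.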
Anomaly Detection on Synthetic Dataset
Frequent Event   Evidence   Anomaly   Detected
Lamp             0.3        0.7       NO
Lamp             0.23       0.77      NO
Lamp             0.01       0.99      YES
Fan              0.32       0.68      NO
Cooker           0.29       0.71      NO
Lamp             0.45       0.55      NO
Lamp             0.23       0.77      NO
Lamp             0.01       0.99      YES
Lamp             0.23       0.77      NO
Fan              0.3        0.7       NO
Cooker           0.34       0.66      NO
Lamp             0.33       0.67      NO
Lamp             0.2        0.8       NO
Lamp             0.02       0.98      NO
Lamp             0.002      0.998     YES
Fan              0.34       0.66      NO
Cooker           0.42       0.58      NO
Anomaly Mean                0.763412
Anomaly St. Dev.            0.135626
Anomaly Cut-off Threshold
1
Experiment 2: Using Temporal
relations for identifying association
rules for prediction.
• Step 1: Collect the temporal relations based data.
• Step 2: Use Weka workbench to generate
association rules on the temporal relations data.
• Step 3: Rules are used in the form “IF X THEN Y”.
Use rules with the existing system for prediction.
Association rule generation using
Weka
• Use association rule mining in the Weka workbench to find
predictive rules.
• Weka generates the best rules for a given minimum support and
confidence.
• We use the Apriori algorithm, vary the support and confidence,
and collect the output.
Parameter settings for rule generation using Weka on
Real Datasets
RUN   MINIMUM SUPPORT   MINIMUM CONFIDENCE   NO OF BEST RULES FOUND
1     0.00              0.5                  100
2     0.01              0.5                  6
3     0.02              0.5                  2
4     0.05              0.5                  1
[Figure: Association Rule Mining on Real Data. No. of best rules, minimum confidence and minimum support for runs 1 to 4.]
Parameter settings for rule generation using Weka on
Synthetic Datasets
RUN   MINIMUM SUPPORT   MINIMUM CONFIDENCE   NO OF BEST RULES FOUND
1     0.00              0.5                  100
2     0.01              0.5                  10
3     0.02              0.5                  5
4     0.05              0.5                  3
[Figure: Association Rule Mining on Synthetic Data. No. of best rules found, minimum confidence and minimum support for runs 1 to 4.]
Rule Generation
• Because the datasets are small, we use the top rules generated
with a minimum confidence of 0.5 and a minimum support
of 0.01.
• Settings with confidence above 0.5 and support above 0.05 could not
be used, as they did not yield any viable rules.
Sample of best rules observed in real datasets:
Activity=C11 Relation=CONTAINS 36 ==> Activity=A14 36
Activity=D15 Relation=FINISHES 32 ==> Activity=D9 32
Activity=D15 Relation=FINISHESBY 32 ==> Activity=D9 32
Activity=C14 Relation=DURING 18 ==> Activity=B9 18
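The sample rule lines can be turned into structured "IF X THEN Y" rules with a small parser. This sketch assumes that the two integers are the antecedent count and the count of the full rule, so that their ratio is the rule's confidence; the regular expression and function name are illustrative and are not part of Weka.

```python
import re

# "<attr=value ...> <count> ==> <attr=value ...> <count>"
RULE = re.compile(
    r"^(?P<lhs>.+?)\s+(?P<lhs_n>\d+)\s+==>\s+(?P<rhs>.+?)\s+(?P<rule_n>\d+)$"
)


def parse_rule(line):
    """Return (antecedent dict, consequent dict, confidence) for one rule line."""
    match = RULE.match(line.strip())
    if match is None:
        raise ValueError(f"not a rule line: {line!r}")
    lhs = dict(pair.split("=", 1) for pair in match["lhs"].split())
    rhs = dict(pair.split("=", 1) for pair in match["rhs"].split())
    confidence = int(match["rule_n"]) / int(match["lhs_n"])
    return lhs, rhs, confidence


lines = [
    "Activity=C11 Relation=CONTAINS 36 ==> Activity=A14 36",
    "Activity=D15 Relation=FINISHES 32 ==> Activity=D9 32",
    "Activity=D15 Relation=FINISHESBY 32 ==> Activity=D9 32",
    "Activity=C14 Relation=DURING 18 ==> Activity=B9 18",
]
rules = [parse_rule(line) for line in lines]
for lhs, rhs, confidence in rules:
    print(f"IF {lhs} THEN {rhs} (confidence {confidence:.2f})")
```

Under that reading of the counts, all four sample rules have a confidence of 1.0, well above the 0.5 minimum.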
Step 3: Temporal Rules Enhancement
to the Prediction.
Input: Output of ALZ Predictor a, Best Rules r, Temporal Dataset
Repeat
  If a != null
    Repeat
      Set r1 to the first event in the relation rule
      If (r[i].relationoccur == a) Then read r[i].relationpredict, if any,
        then predict;
        If (a == testevent) Then increment correctcount,
        then insert a in trie;
      Else
        Continue;
      End if.
    Until end of rules.
  End if.
Loop until end of input.
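One possible reading of the loop above, sketched in Python: when the ALZ prediction matches the antecedent of a rule, the rule's consequent is predicted instead. The Rule type, its field names and the refine function are hypothetical, and the trie update and correctness counting are left out.

```python
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    occurs: str    # antecedent activity (r[i].relationoccur in the pseudocode)
    relation: str  # temporal relation named in the rule
    predicts: str  # consequent activity (r[i].relationpredict in the pseudocode)


def refine(alz_prediction: Optional[str], rules: List[Rule]) -> Optional[str]:
    """Return the consequent of the first rule whose antecedent matches the
    ALZ prediction, or the ALZ prediction unchanged if no rule applies."""
    if alz_prediction is None:
        return None
    for rule in rules:
        if rule.occurs == alz_prediction:
            return rule.predicts
    return alz_prediction


# Rules taken from the sample of best rules for the real datasets.
best_rules = [
    Rule("C11", "CONTAINS", "A14"),
    Rule("D15", "FINISHES", "D9"),
    Rule("C14", "DURING", "B9"),
]
print(refine("C11", best_rules))  # a rule applies, so its consequent is used
print(refine("J10", best_rules))  # no rule applies, so the ALZ output is kept
```

Taking the first matching rule is a simplification; ranking matching rules by confidence would be another reasonable choice.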
Experiment 2 Results
• Comparing ActiveLeZi based prediction with
and without temporal rules
DATA SET                    ACCURACY%   ERROR%
REAL (WITHOUT RULES)        55          45
SYNTHETIC (WITHOUT RULES)   64          36
REAL (WITH RULES)           56          44
SYNTHETIC (WITH RULES)      69          31
[Figure: Accuracy for Real (Without Rules), Real (With Rules), Synthetic (Without Rules) and Synthetic (With Rules).]
Experiment 3: Using Temporal
relations for prediction using Alz.
• Step 1: Collect the temporal relations based data.
• Step 2: Use existing ActiveLeZi Predictor and also
calculate the temporal probability.
• Step 3: Use Prediction by Partial Match employed by
ALZ with temporal probability for prediction.
ALZ
• Uses the LZ78 compression algorithm together with the Prediction by
Partial Match (PPM) family of predictors (Alz-TDAG).
• Acts as a sequential predictor and predicts the next most
probable event based on a trie.
• The way it calculates probability is that a is next event is
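For orientation, plain LZ78 phrase parsing can be sketched as below. This shows only the underlying LZ78 idea, not Active LeZi itself: the sliding window and the PPM-style blending of orders are omitted, and the function names and input string are illustrative.

```python
from collections import Counter


def lz78_phrases(sequence):
    """Split a sequence into LZ78 phrases: each phrase is the shortest
    prefix of the remaining input that has not been seen as a phrase."""
    seen = set()
    phrases = []
    current = ""
    for symbol in sequence:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    return phrases


def next_symbol_counts(phrases, context):
    """Count which symbols follow `context` inside the parsed phrases."""
    counts = Counter()
    width = len(context)
    for phrase in phrases:
        for i in range(len(phrase) - width):
            if phrase[i:i + width] == context:
                counts[phrase[i + width]] += 1
    return counts


# Illustrative event string; each letter stands for one sensor event.
phrases = lz78_phrases("aaababbbbbaabccddcbaaaa")
print(phrases)
print(next_symbol_counts(phrases, "a"))
```

Counts like these are the raw material a trie-based predictor ranks when choosing the most probable next event.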
Why use temporal probability?
• At higher orders in a phrase, temporal
probability includes more information than
sequential probability.
• $$\text{Prediction}_C = P(C \mid B) = P_{\text{SEQ}}(C \mid B)\big|_{\text{order } 0} + P_{\text{TEMPORAL}}(C \mid B)\big|_{\text{order } 1\text{-}n} + P_{\text{SEQ}}(C \mid B)\big|_{\text{order } n}$$
(the temporal term is applied at each order in the phrase)
• $$P(B \mid A) = \frac{|\text{After}(B,A)| + |\text{During}(B,A)| + |\text{OverlappedBy}(B,A)| + |\text{MetBy}(B,A)| + |\text{Starts}(B,A)| + |\text{StartedBy}(B,A)| + |\text{Finishes}(B,A)| + |\text{FinishedBy}(B,A)| + |\text{Equals}(B,A)|}{|A|}$$
Prediction on Real Datasets
Dataset (Learning Algorithm)    Train   Test   Correct   Prediction Accuracy (%)   Prediction Error (%)
Real (Alz)                      100     1      0         0%                        100%
Real (Alz+Tempal)               100     1      1         100%                      0%
Real (Alz)                      100     10     6         60%                       40%
Real (Alz+Tempal)               100     10     6         60%                       40%
Real (Alz)                      750     40     29        72.50%                    27.50%
Real (Alz+Tempal)               750     40     29        72.50%                    27.50%
Cross Validation (Alz)          787     83     48        57.96%                    42.04%
Cross Validation (Alz+Tempal)   787     83     49        58.92%                    41.08%
Prediction on Synthetic Datasets
Dataset (Learning Algorithm)    Train   Test   Correct   Prediction Accuracy   Prediction Error
Synthetic (Alz)                 100     1      1         100%                  0%
Synthetic (Alz+Tempal)          100     1      1         100%                  0%
Synthetic (Alz)                 100     10     10        100%                  0%
Synthetic (Alz+Tempal)          100     10     10        100%                  0%
Synthetic (Alz)                 1400    90     89        98.88%                1.12%
Synthetic (Alz+Tempal)          1400    90     90        100%                  0%
Synthetic (Alz)                 13905   1544   1532      99.22%                0.78%
Synthetic (Alz+Tempal)          13905   1544   1532      99.22%                0.78%
Cross Validation (Alz)          13905   1544   1292      83.68%                16.32%
Cross Validation (Alz+Tempal)   13905   1544   1292      83.64%                16.36%
Test Case Scenario
• To highlight the true potential of leveraging
temporal relations for enhancing prediction.
• The test set consists of two typical test events,
which were also checked by observation.
• One has many predictive temporal
relations and the other has few.
Test Case Scenario Results
• Alz mispredicts both test events.
• TempAl predicts one correctly and mispredicts the other because too little temporal
information is available in the form of temporal relations.
Training Set:
a ON
a OFF
a ON
b ON
a ON
b ON
b ON
b ON
b ON
b ON
a ON
a ON
b ON
c ON
c ON
d ON
d OFF
c ON
b ON
a ON
c OFF
a OFF
Temporal Relations on Training Set:
a BEFORE a, a BEFORE b, a BEFORE b, a BEFORE b, a BEFORE a, a
BEFORE b, a BEFORE c, a BEFORE d, a BEFORE c, a BEFORE a, a
AFTER a, a OVERLAPS b, a BEFORE b, a BEFORE b, a BEFORE a, a
BEFORE b, a BEFORE c, a BEFORE d, a BEFORE c, a BEFORE a, b
AFTER a, b OVERLAPPEDBY a, b MEETS b, b BEFORE b, b BEFORE a, b
BEFORE b, b BEFORE c, b BEFORE d, b BEFORE c, b BEFORE a, b
AFTER a, b AFTER a, b AFTER b, b METBY b, b BEFORE a, b BEFORE
b, b BEFORE c, b BEFORE d, b BEFORE c, b BEFORE a, a AFTER a, a
AFTER a, a AFTER b, a AFTER b, a AFTER b, a BEFORE b, a BEFORE
c, a BEFORE d, a BEFORE c, a BEFORE a, b AFTER a, b AFTER a, b
AFTER b, b AFTER b, b AFTER b, b AFTER a, b CONTAINS c, b
CONTAINS d, b OVERLAPS c, b BEFORE a, c AFTER a, c AFTER a, c
AFTER b, c AFTER b, c AFTER b, c AFTER a, c DURING b, c BEFORE
d, c BEFORE c, c BEFORE a, d AFTER a, d AFTER a, d AFTER b, d
AFTER b, d AFTER b, d AFTER a, d DURING b, d AFTER c, d BEFORE
c, d BEFORE a, c AFTER a, c AFTER a, c AFTER b, c AFTER b, c
AFTER b, c AFTER a, c DURING b, c AFTER c, c AFTER d, c
FINISHES a, a AFTER a, a AFTER a, a AFTER b, a AFTER b, a AFTER
b, a AFTER a, a AFTER b, a AFTER c, a AFTER d, a FINISHESBY c
ALZ Prediction: b b
Test Set:
a ON
d OFF
Alz +TempAl Prediction: a b
Conclusion
• A new approach that leverages temporal
information.
• It shows that temporal information aids the prediction
and anomaly detection processes in a smart environment.
• Anomaly detection enables a decision maker to identify a changed
pattern in activities based on an anomaly, or simply to discard the
event. It can also be used for reminder assistance.
• Prediction can be used to predict the next event or next set of
events (modeled with spatial information).
• It acts as a guiding foundation for spatio-temporal models for
smart environments.
Future Direction
• Expand the set of temporal relations to include until, since,
next, and so forth.
• Incorporate different dimensions of ON and OFF states.
• Collect larger real datasets and
experiment with them.
• Planning problems for smart home activities?
• Use a time window or devise a cut-off or
threshold criterion.
• The work could expand into a graph mining problem
and could involve link prediction.
• Can be extended to form spatio-temporal models.
• Temporal relations visualization problem.
• Pattern analysis for lifestyle and behavior improvements
and feedback on the interestingness of a pattern.
Acknowledgements
• I thank my advisor and all my professors & teachers whom I
interacted with!
• I thank my family.
• I thank all my friends and peers in the research labs which I
was a part of.
• I would like to thank Washington State University and
University of Texas at Arlington.
• I would like to thank all other people who directly or indirectly
played a role in encouraging me to work towards the
completion of my Thesis.
Publications
2007
• Vikramaditya R. Jakkula, Aaron S. Crandall, and Diane J. Cook, "Knowledge Discovery in Entity Based Smart Environment Resident Data Using Temporal Relations Based Data Mining", ICDM Workshop on Spatial and Spatio-Temporal Data Mining, Omaha, Nebraska, 2007 (acceptance rate: 20%).
• Vikramaditya R. Jakkula, "Predictive Data Mining to Learn Health Vitals of Residents in a Smart Home", ICDM IEEE Workshop of Data Mining in Medicine, Omaha, Nebraska, 2007.
• Vikramaditya R. Jakkula and Diane J. Cook, "Mining Sensor Data in Smart Environment for Temporal Activity Prediction", Poster session of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Workshop on Knowledge Discovery from Sensor Data (sensor-KDD 2007), San Jose, August 2007 (acceptance rate: 32%).
• Vikramaditya R. Jakkula and Diane J. Cook, "Using Temporal Relations in Smart Home Data for Activity Prediction", International Conference on Machine Learning (ICML) Workshop on the Induction of Process Models (IPM/ICML 2007), Corvallis, June 2007.
• Vikramaditya R. Jakkula, Diane J. Cook, and Aaron S. Crandall, "Temporal pattern discovery for anomaly detection in smart homes", Proceedings of the 3rd IET International Conference on Intelligent Environments (IE 07), Germany, September 2007.
• Vikramaditya R. Jakkula and Diane J. Cook, "Learning Temporal Relations in Smart Home Data", Proceedings of the Second International Conference on Technology and Aging, Canada, June 2007.
• Vikramaditya R. Jakkula, Diane J. Cook, and Gaurav Jain, "Prediction Models for a Smart Home Based Health Care System", Proceedings of 21st IEEE International Conference on Advanced Information Networking and Applications Workshops, Canada, May 2007.
2006
• Vikramaditya R. Jakkula, Michael G. Youngblood and Diane J. Cook, "Identification of Lifestyle Behavior Patterns with Prediction of the Happiness of an Inhabitant in a Smart Home", AAAI Workshop on Computational Aesthetics: Artificial Intelligence Approaches to Beauty and Happiness, Boston, July 2006.
• Gaurav Jain, Diane J. Cook and Vikramaditya R. Jakkula, "Monitoring Health by Detecting Drifts and Outliers for a Smart Environment Inhabitant", 4th International Conference on Smart Homes and Health Telematics, June 2006.
JOURNALS (PENDING)
• Vikramaditya Jakkula and Diane J. Cook, "Anomaly Detection Using Temporal Data Mining in a Smart Home Environment", Medicine and Medical Informatics journal special issue on Smart Homes, 2008.
• Vikramaditya Jakkula and Diane J. Cook, "Prediction Models for a Smart Home Based Health Care System", Special Issue of International Journal of Telemedicine and Applications (IJTA), 2008.
• Diane J. Cook, Juan Augusto and Vikramaditya Jakkula, "Ambient Intelligence: Current trends and technologies", PMC Journal, 2007.
BOOK CHAPTERS (PENDING)
• Vikramaditya Jakkula and Diane J. Cook, "Mining Temporal Relations in Smart Environment Data using TempAl", Knowledge Discovery from Sensor Data, Taylor and Francis/CRC Press, 2008.
• Vikramaditya Jakkula, Diane J. Cook and Aaron Crandall, "Enhancing Anomaly Detection for Smart Homes Using Temporal Pattern Discovery", Advanced Intelligent Systems, 2008.
• Vikramaditya Jakkula and Diane J. Cook, "Learning Temporal Relations in Smart Home Data", Technology and Aging, IOS Press: Assistive Technology Research Series, 2008.
• Gaurav Jain, Diane J. Cook and Vikramaditya R. Jakkula, "Monitoring Health by Detecting Drifts and Outliers for a Smart Environment Inhabitant", Smart Homes And Beyond, IOS Press: Assistive Technology Research Series, volume 19, pp. 114-121, 2006. ISBN 978-1-58603-623-2.
Thank You
Test Case 2 Results
#   # of Instances   # of Events   # of Anomaly   # of Patterns   Train   Test   Alz + TempAl # Correct Prediction   Alz # Correct Prediction   Alz + TempAl Accuracy (%)   Alz Accuracy (%)
1   6000             25            5              20              5000    1000   603                                 582                        60.3                        58.2
2   6000             10            15             20              5000    1000   411                                 425                        41.1                        42.5
3   6000             10            5              50              5000    1000   757                                 763                        75.7                        76.3
4   6000             10            5              20              5000    1000   913                                 914                        91.3                        91.4
5   3250             10            5              20              2500    750    668                                 716                        89.06                       95.4
P-value: 0.120213224 > 0.05.
This shows the improvement is not statistically significant; with a value closer to 0.05, it would be a
good enhancement model.
Visualization : Earlier Model
Visualization Enhanced