Smart Phone-Based Sensor Mining


www.cis.fordham.edu/wisdm or wisdmproject.com
Gary M. Weiss
Comp & Info Science Dept
Fordham University
[email protected]

Data Mining:
 Extraction of knowledge from data via automated
methods

Smart phone sensor mining:
 Extraction of useful knowledge from the data
generated by smart phone sensors
1/11/2012
Gary M. Weiss
ICCS 2012
2

What sensors are found on smart phones?
 Audio sensor (microphone)
 Image sensor (camera, video recorder)
 Tri-Axial Accelerometer
 Location sensor (GPS, cell tower, WiFi)
 Infrared proximity sensor; Light sensor
 Magnetic compass; Temperature sensor; Touch sensor
 Virtual/calculated sensors:
▪ Proximity (via light), gravity, orientation, gyroscope

Learning about smart phone users
 Security requires understanding how devices are used
 Main focus of talk not on security but on what can
be learned about smart phone users

Smart phone based biometric identification
 Can be considered a security application

Many news stories about abuses
 Apps to spy on your spouse; iPhone location fiasco

Activity recognition (what are you doing?)
 Are you walking, jogging, sitting, standing, etc.?

Biometric identification (who are you?)
 Are you John Smith?

Trait identification (who are you at a different level?)
 Are you male? Are you tall? What do you weigh?

Data miners want to learn everything about you
 Somehow that info will be useful
 Develop useful apps, marketing leads, etc.
 Many positive uses
▪ That is why NSF funded WISDM's activity recognition work
through its “Health and Well Being” program
 But obviously issues with privacy and abuse
Approach to Predictive Data Mining
1. Collect labeled (sensor) training data
2. Apply data mining method to build predictive model
3. Apply predictive model to future unlabeled data
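
The three steps above can be sketched in a few lines of Python. This is only an illustration: the slide does not name a method for step 2, so a small k-nearest-neighbor classifier stands in for it, and the feature vectors (a made-up mean and standard deviation of acceleration) and the names train and predict are assumptions, not part of the WISDM code.

```python
from collections import Counter
import math

def train(labeled_examples):
    """Step 2: 'build' an instance-based model by storing the labeled examples."""
    return list(labeled_examples)

def predict(model, features, k=3):
    """Step 3: label new data by majority vote among the k nearest stored examples."""
    nearest = sorted(model, key=lambda ex: math.dist(ex[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Step 1: labeled training data. The numbers are invented for illustration
# (e.g. mean and standard deviation of acceleration over one sample).
training_data = [
    ((9.8, 0.2), "standing"),
    ((9.7, 0.3), "standing"),
    ((9.9, 0.1), "standing"),
    ((10.5, 3.1), "walking"),
    ((10.8, 2.8), "walking"),
    ((10.2, 3.4), "walking"),
]

model = train(training_data)
print(predict(model, (9.8, 0.25)))   # unlabeled sample close to the standing examples
print(predict(model, (10.6, 3.0)))   # unlabeled sample close to the walking examples
```

Storing the examples and deferring all work to prediction time is what makes this an instance-based method; other learners would do real work in step 2.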

Why is it useful?
 Context-sensitive applications
▪ Context influences handling of phone calls or music to play
 Health applications
▪ Track activity levels or detect falls in elderly

Approaches to activity recognition
 Use multiple accelerometers
 Use custom devices (pedometer, FitBit)
 Our approach: use existing smart phones

Accelerometer data from Android phone
 Walking
 Jogging
 Climbing Stairs
 Lying Down
 Sitting
 Standing
Gravity included
Impersonal (Universal) Model
Single model trained and used for everyone
Data Mining Method: Instance Based Learning (WEKA IB3)
Accuracy: 72.4%

Confusion matrix (rows: actual class; columns: predicted class)

Actual        Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking          2209       46     789        2         4           0
Jogging            45     1656     148        1         0           0
Stairs            412       54     869        3         1           0
Sitting            10        0      47      553        30         241
Standing            8        0      57        6       448           3
Lying Down          5        1       7      301        13         131
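
As a check on how the 72.4% figure relates to the impersonal-model confusion matrix above, the following sketch recomputes overall accuracy (correct predictions on the diagonal divided by all predictions) and the per-class recall. The counts are copied from the slide; the function names are illustrative assumptions.

```python
LABELS = ["Walking", "Jogging", "Stairs", "Sitting", "Standing", "Lying Down"]

# Rows are the actual class, columns the predicted class (impersonal model).
IMPERSONAL = [
    [2209, 46, 789, 2, 4, 0],
    [45, 1656, 148, 1, 0, 0],
    [412, 54, 869, 3, 1, 0],
    [10, 0, 47, 553, 30, 241],
    [8, 0, 57, 6, 448, 3],
    [5, 1, 7, 301, 13, 131],
]

def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. predicted class == actual class."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_recall(matrix, labels):
    """For each actual class, the fraction of its samples predicted correctly."""
    return {label: row[i] / sum(row)
            for i, (label, row) in enumerate(zip(labels, matrix))}

print(f"accuracy: {accuracy(IMPERSONAL):.1%}")
for label, recall in per_class_recall(IMPERSONAL, LABELS).items():
    print(f"{label:<11} {recall:.1%}")
```

The per-class view shows where the impersonal model loses accuracy: most errors are walking confused with stairs and sitting confused with lying down.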
Personal Model: Model Built per User
Data Mining Method: Instance Based Learning (WEKA IB3)
Accuracy: 98.4%

Confusion matrix (rows: actual class; columns: predicted class)

Actual        Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking          3033        1      24        0         0           0
Jogging             4     1788       4        0         0           0
Stairs             42        4    1292        1         0           0
Sitting             0        0       4      870         2           6
Standing            5        0      11        1       509           0
Lying Down          4        0       8        7         0         442
Identification based on physical/behavioral traits

 Fingerprints, DNA, iris, gait, etc.

Biometrics for everyone
 Equipment smaller & cheaper (sensors + processing)
▪ Laptops currently perform face recognition

Gait-based recognition
 Most work is camera-based

Some applications
 device security, customization & personalization

Used for identification and authentication
 Identification means predicting identity from pool of
users (36 in initial study and 200 in recent study)
 Authentication is a binary class prediction
▪ Is it you or an imposter?

We evaluate walking and other activities as well as
unclassified activities

Predictions made on individual 10 sec. samples
but also combine “votes” to exploit larger samples
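
One simple way to combine per-sample "votes" is to take the most frequent prediction over a longer stretch of data. The sketch below assumes each 10 second sample has already been classified; the user IDs and the function name are hypothetical, and the slides do not say how ties were handled.

```python
from collections import Counter

def vote(sample_predictions):
    """Return the identity predicted most often across the individual samples.

    Ties go to the identity that appeared first, which is how
    Counter.most_common orders equal counts.
    """
    if not sample_predictions:
        raise ValueError("need at least one prediction")
    return Counter(sample_predictions).most_common(1)[0][0]

# Hypothetical per-sample predictions for one stretch of walking data.
predictions = ["user_07", "user_07", "user_12", "user_07", "user_03"]
print(vote(predictions))
```

Individual samples can be wrong (two of the five above are) while the combined decision is still right, which is why voting over larger samples helps.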
Identification accuracy (%), based on 10 second test samples

             Unclassified   Walk    Jog     Up   Down
J48                  72.2   84.0   83.0   65.8   61.0
Neural Net           69.5   90.9   92.2   63.3   54.5
Straw Man             4.3    4.2    5.0    6.5    4.7

Users correctly identified, based on most frequent prediction for 5-10 minutes of data

             Unclassified   Walk    Jog       Up   Down
J48                 36/36  36/36  31/32    31/31  28/31
Neural Net          36/36  36/36  32/32  28.5/31  25/31

Authentication results even better (~90% with 10 sec samples)
Recent unpublished results demonstrate 100% accuracy with 200 users!


Soft biometrics: traits can aid with biometrics
As data miners we want to know everything
about a person
 Marketing applications: ads based on sex
 Inferred weight to predict calories burned

Normally think about traits as being:
 Unchanging: race, skin color, eye color, etc.
 Slowly changing: height, weight, etc.

But want to know everything about a person:
 What they wear, how they feel, if they are tired, etc.
 Have never seen this goal for mobile sensor mining


Work in early stages
Data initially collected from ~70 people, now 200
 Accelerometer and survey data
 Survey data includes anything we could think of that
might somehow be predictable
▪ Sex, height, weight, age, race, handedness, disability
▪ Shoe size, footwear type, size of heels, type of clothing
▪ # hours academic work, # hours exercise
 Too few subjects to investigate all factors
▪ Many were not predictable (maybe with more data)
Results for IB3 classifier. For height and weight, middle categories removed.

Sex (accuracy 71.2%)
          Male  Female
Male        31       7
Female      12      16

Height (accuracy 83.3%)
          Short  Tall
Short        15     2
Tall          5    20

Weight (accuracy 78.9%)
          Light  Heavy
Light        13      2
Heavy         7     17

Security policies vary widely by OS & platform
 Symbian requires properly signed keys to remove
restrictions on using certain APIs
 iPhone apps have relatively strict oversight
 Android OS has few restrictions and Marketplace
has essentially no oversight or restrictions
▪ WISDM project has had no problem tapping into sensors
and transmitting results. Just pay $25 for account.

Android notifies user of services
 SYSTEM PERMISSIONS FOR WISDM SensorCollector
▪ Coarse location, fine location, internet access, keep from
sleeping, modify/delete USB storage

Applications routinely access sensitive services
 Fandango: fine GPS location, read phone state &
identity, modify/delete USB storage, internet access
 Angry Birds: identical permissions!
 Notifications probably next to useless given this!

Even legitimate applications have to be
concerned with privacy & security
 WISDM will encrypt data in transit, encrypt on
phone, include secure accounts & passwords, etc.
 Need to ensure that any aggregated info is made
public only if it cannot be traced to an individual
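
One common safeguard for published aggregates is to suppress any group with too few contributors. The sketch below illustrates that idea only; the slides do not describe how WISDM does this, and the threshold of 5, the record layout, and the function name are assumptions. Small-group suppression on its own does not guarantee that data cannot be traced to a person.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 5  # hypothetical threshold, not taken from the slides

def publishable_averages(records, min_group_size=MIN_GROUP_SIZE):
    """Average values per group, keeping only groups with enough distinct users.

    records: iterable of (user_id, group, value) tuples.
    """
    users = defaultdict(set)
    values = defaultdict(list)
    for user_id, group, value in records:
        users[group].add(user_id)
        values[group].append(value)
    return {
        group: sum(vals) / len(vals)
        for group, vals in values.items()
        if len(users[group]) >= min_group_size
    }

# Invented example: minutes of activity per user.
records = [(f"u{i}", "walking", 30.0 + i) for i in range(5)]
records += [("u0", "jogging", 12.0), ("u1", "jogging", 8.0)]

print(publishable_averages(records))  # the two-user jogging group is withheld
```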

Good Policies:
 Make it clear what you are monitoring and storing
 Provide application level control for the user
▪ Allow user to turn on/off monitoring of specific sensors
▪ If they use an option to upload the information to Facebook
then little privacy!
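
Application-level control over individual sensors could look roughly like the sketch below. The sensor names, the MonitoringSettings class, and the choice to start with every sensor switched off are illustrative assumptions, not a description of any WISDM app.

```python
from dataclasses import dataclass, field

# Hypothetical sensor names; a real app would list whatever it can monitor.
DEFAULT_SENSORS = ("accelerometer", "location", "microphone")

@dataclass
class MonitoringSettings:
    # Every sensor starts off, so monitoring is opt-in (a design assumption).
    enabled: dict = field(
        default_factory=lambda: {sensor: False for sensor in DEFAULT_SENSORS}
    )

    def set_sensor(self, sensor, on):
        """Turn monitoring of one sensor on or off."""
        if sensor not in self.enabled:
            raise KeyError(f"unknown sensor: {sensor}")
        self.enabled[sensor] = bool(on)

    def filter_readings(self, readings):
        """Keep only readings from sensors the user has switched on."""
        return [r for r in readings if self.enabled.get(r["sensor"], False)]

settings = MonitoringSettings()
settings.set_sensor("accelerometer", True)

readings = [
    {"sensor": "accelerometer", "value": (0.1, 9.8, 0.3)},
    {"sensor": "location", "value": (40.86, -73.89)},
]
kept = settings.filter_readings(readings)
print([r["sensor"] for r in kept])
```

Filtering at the point of collection, before anything is stored or transmitted, keeps the user's choice from depending on later processing steps.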

Since legitimate and illegitimate apps function
alike, no easy way to distinguish them
 Could try to use only certified apps, but quite limiting

WISDM is building & deploying the Actitracker
service to track your activities in real time and
display them via a web-based interface
 Useful health information and thus supported by
NSF Grant & Google faculty research award
 Actitracker.com online and should have basic
functionality shortly

WISDM research group
 Current Members
▪ Anthony Alcaro, Alex Armero, Shaun Gallagher, Andrew
Grosner, Margo Flynn, Jeff Lockhart, Paul McHugh, Luigi
Patruno, Tony Pulickal, Greg Rivas, Priscilla Twum, Bethany
Wolff, Zach Wyhowanec, Jack Xue
 Key Former Members
▪ Jennifer Kwapisz, Sam Moore, Shane Skowron, Alvan Wong
 Funders: NSF, Google, and Fordham
1. J.R. Kwapisz, G.M. Weiss, and S.A. Moore. 2010. Activity recognition using cell phone accelerometers, in Proceedings of the Fourth International Workshop on Knowledge Discovery from Sensor Data, 10-18.
2. J.R. Kwapisz, G.M. Weiss, and S.A. Moore. 2010. Cell phone-based biometric identification, in Proceedings of the IEEE Fourth International Conference on Biometrics: Theory, Applications and Systems.
3. J.W. Lockhart, G.M. Weiss, J.C. Xue, S.T. Gallagher, A.B. Grosner, and T.T. Pulickal. 2011. Design considerations for the WISDM smart phone-based sensor mining architecture, in Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data, San Diego, CA.
4. G.M. Weiss and J.W. Lockhart. 2011. Identifying user traits by mining smart phone accelerometer data, in Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data, San Diego, CA.
For more information go to wisdmproject.com
Gary Weiss
[email protected]