Extracting Dense Regions From Hurricane Trajectory Data

Praveen Kumar Tripathi
Madhuri Debnath
Ramez Elmasri
Department of Computer Science and Engineering
University of Texas at Arlington
First International ACM Workshop on
Managing and Mining Enriched Geo-spatial Data
GeoRich 2014
Contents:
• Spatio-temporal data:
  - Examples
  - Applications
  - Data mining tasks
• Extracting dense regions from hurricane trajectory data:
  - Proposed concept
  - DBSCAN: core concepts
  - Experiments and analysis: spatial, and spatial with non-spatial
• Temporal extension for the analysis:
  - Temporal DBSCAN
  - Proposed relative time attribute
  - Experimental analysis of spatio-temporal clustering
Spatio temporal data: Hurricane JIG 1950
• Spatial and temporal attributes are the key attributes.
Example:

ADV  Year  Name          LAT   LON    TIME       WIND  PR  STAT
1    1950  HurricaneJIG  24.3  -47.2  10/11/12Z  40    -   TROPICAL
2    1950  HurricaneJIG  24.2  -48.2  10/11/18Z  40    -   TROPICAL
3    1950  HurricaneJIG  24.2  -49.2  10/12/00Z  45    -   TROPICAL
4    1950  HurricaneJIG  24.2  -50.2  10/12/06Z  45    -   TROPICAL
5    1950  HurricaneJIG  24.2  -51.2  10/12/12Z  50    -   TROPICAL
6    1950  HurricaneJIG  24.4  -52.4  10/12/18Z  55    -   TROPICAL
7    1950  HurricaneJIG  24.7  -53.7  10/13/00Z  55    -   TROPICAL
8    1950  HurricaneJIG  25    -54.6  10/13/06Z  60    -   TROPICAL
9    1950  HurricaneJIG  25.4  -55.5  10/13/12Z  65    -   HURRICANE-1
10   1950  HurricaneJIG  26.1  -56.7  10/13/18Z  70    -   HURRICANE-1
11   1950  HurricaneJIG  26.9  -57.8  10/14/00Z  75    -   HURRICANE-1
12   1950  HurricaneJIG  27.6  -58.6  10/14/06Z  75    -   HURRICANE-1
13   1950  HurricaneJIG  28.4  -59.3  10/14/12Z  80    -   HURRICANE-1
14   1950  HurricaneJIG  29.5  -60    10/14/18Z  80    -   HURRICANE-1
15   1950  HurricaneJIG  30.8  -60.5  10/15/00Z  85    -   HURRICANE-2
16   1950  HurricaneJIG  32    -60.2  10/15/06Z  90    -   HURRICANE-2
17   1950  HurricaneJIG  33.2  -59.2  10/15/12Z  100   -   HURRICANE-3
18   1950  HurricaneJIG  34.2  -58.2  10/15/18Z  105   -   HURRICANE-3
19   1950  HurricaneJIG  35.1  -57.2  10/16/00Z  100   -   HURRICANE-3
20   1950  HurricaneJIG  35.9  -56.2  10/16/06Z  95    -   HURRICANE-2
21   1950  HurricaneJIG  36.8  -55    10/16/12Z  90    -   HURRICANE-2
22   1950  HurricaneJIG  38.8  -51.5  10/16/18Z  90    -   HURRICANE-2
23   1950  HurricaneJIG  40.8  -47.1  10/17/00Z  85    -   HURRICANE-2
24   1950  HurricaneJIG  41.9  -44.5  10/17/06Z  65    -   HURRICANE-1
25   1950  HurricaneJIG  43    -42    10/17/12Z  60    -   EXTRATROPICAL
26   1950  HurricaneJIG  44.1  -39.9  10/17/18Z  60    -   EXTRATROPICAL
Hurricane data
Dataset*
• Attributes:
  - ADV: Tracking ID
  - LAT: Position in latitude
  - LON: Position in longitude
  - TIME: Track time point
  - WIND: Wind speed in knots (1 knot = 1.15 mph)
  - PR: Pressure in millibars
  - STAT: State category (Tropical Storm, Hurricane, ...)
• Temporal coverage: 1950–1999 (50 years; 15319 data points, 496 trajectories)
• Spatial coverage: North Atlantic
• Interval: 6-hourly (0000, 0600, 1200, 1800)
• Data format: txt file
*http://weather.unisys.com/hurricane/atlantic
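As a small illustration of the attributes listed above, the sketch below turns a TIME stamp such as '10/11/12Z' (month/day/hour in UTC) plus the Year column into a timestamp, and converts WIND from knots to mph with the factor quoted on the slide. The function names are hypothetical and the exact layout of the txt file is not specified in the slides, so this only covers individual field values.

```python
from datetime import datetime, timezone

# Conversion factor quoted on the slide (1 knot = 1.15 mph).
KNOT_TO_MPH = 1.15


def parse_track_time(year: int, stamp: str) -> datetime:
    """Combine the Year column with a 'MM/DD/HHZ' stamp into a UTC datetime.

    Assumption: the trailing 'Z' marks UTC and the hour is a two-digit
    value such as 00, 06, 12 or 18, as in the Hurricane JIG example.
    """
    month, day, hour = stamp.rstrip("Z").split("/")
    return datetime(year, int(month), int(day), int(hour), tzinfo=timezone.utc)


def knots_to_mph(knots: float) -> float:
    """Convert a wind speed in knots to miles per hour."""
    return knots * KNOT_TO_MPH


if __name__ == "__main__":
    first = parse_track_time(1950, "10/11/12Z")
    second = parse_track_time(1950, "10/11/18Z")
    print(second - first)      # gap between two consecutive track points
    print(knots_to_mph(40))    # wind speed of the first JIG track point in mph
```

Consecutive track points of Hurricane JIG parsed this way are six hours apart, which matches the stated sampling interval.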
Spatio temporal data
• Examples:
  - Global positioning system (GPS) data:
    - Cell phone tracking, vehicle traffic data, emergency vehicle data, transportation data.
    - Hurricane and storm tracking data.
    - Animal movement data.
  - Environmental data:
    - Land use data.
    - Demographic: population data.
    - Soil moisture data.
    - Temperature data.
Spatio temporal data (contd.)
• Applications:
  - Managing vehicle traffic patterns.
  - Monitoring and predicting weather conditions.
  - Examining wild animal behavior.
  - Analyzing spread of diseases.
  - Monitoring land use, climate change.
• Data mining tasks:
  - Clustering.
  - Flocking behavior mining.
  - Convoy detection.
  - Classification.
Extracting Dense Regions from Hurricane Trajectory data
• The data set is treated as a point data set.
• Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [1] is used to identify the dense regions of hurricane activity, i.e., hot spots.
• For comparative analysis, clustering is done in two ways: using only the spatial attributes (latitude and longitude), and using the spatial attributes together with a non-spatial attribute (latitude, longitude and wind speed).
• Clustering results are evaluated using clustering evaluation measures.
[1] Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), pp. 226–231.
Contd.
• The clustering results are post-processed to obtain:
  - Regions where storms are most likely to originate.
  - Regions where storms are most likely to land.
  - Key regions through which storms are most likely to traverse.
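One plausible way to sketch this post-processing step is to count, for each cluster, how many trajectories start in it and how many end in it. The slides do not describe the authors' actual procedure, so the function name, the input layout, and the use of a track's last point as a stand-in for where a storm lands are all assumptions made for illustration.

```python
from collections import Counter

NOISE = -1  # label assumed for points that belong to no cluster


def origin_and_end_counts(storm_ids, labels):
    """Count, per cluster label, how many tracks start and end in it.

    storm_ids[i] and labels[i] describe the i-th point. Points of one
    storm are assumed to appear in time order (increasing ADV).
    """
    first, last = {}, {}
    for i, storm in enumerate(storm_ids):
        first.setdefault(storm, i)   # index of the storm's first point
        last[storm] = i              # overwritten until the final point
    origins = Counter(labels[i] for i in first.values() if labels[i] != NOISE)
    ends = Counter(labels[i] for i in last.values() if labels[i] != NOISE)
    return origins, ends


if __name__ == "__main__":
    # Toy input: three storms and made-up cluster labels.
    storm_ids = ["A", "A", "A", "B", "B", "C"]
    labels = [0, 1, 2, 0, 2, NOISE]
    origins, ends = origin_and_end_counts(storm_ids, labels)
    print(origins)  # clusters ranked by number of track starts
    print(ends)     # clusters ranked by number of track ends
```

Clusters with high origin counts would be candidate origin regions, clusters with high end counts candidate end regions, and the remaining dense clusters candidate transit regions.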
DBSCAN: Core Concepts
Cluster definition: a cluster $C$ is a subset of objects satisfying the following:
• Connected: $\forall p, q \in C$, $p$ and $q$ are density-connected.
• Maximal: $\forall p, q$, if $p \in C$ and $q$ is density-reachable from $p$, then $q \in C$.
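The two properties above can be made concrete with a minimal pure-Python sketch of DBSCAN as described in [1]. It is an illustration, not the code used for the experiments: the parameter names `eps` and `min_pts` follow common usage, and plain Euclidean distance on coordinate tuples is an assumption (for latitude and longitude in degrees it is only an approximation).

```python
import math

NOISE = -1  # label for points that end up in no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point (0, 1, ... or NOISE)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE            # may later become a border point
            continue
        cluster += 1                     # i is a core point: start a new cluster
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point: reachable, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is core: keep growing (maximality)
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(dbscan(pts, eps=1.5, min_pts=3))
```

Expanding the cluster from every core point it reaches is what enforces maximality, and every point added is density-connected to the starting core point.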
Experiments and Analysis: Spatial analysis
Experiments and Analysis (1): Spatial and non-spatial analysis
[2] Quality measure =
Two distances used for two quality measures :
• Spatial distance (latitude and longitude)
• Non-spatial distance (wind speed).
Temporal extension to DBSCAN
• Spatio-temporal data has time as another key attribute.
• DBSCAN needs to be modified to incorporate the time domain.
• Possible kinds of temporal analysis:
  - Absolute temporal analysis (vehicle traffic analysis)
  - Periodic temporal analysis (repeating every day)
  - Relative temporal analysis (hurricane data analysis)
• The present analysis uses relative temporal analysis:
  - Hurricanes do not occur at the same time, so absolute temporal analysis is not possible.
Relative temporal attribute
Let there be two trajectories $Tr_1$ and $Tr_2$ of lengths $l$ and $m$, where $l$ and $m$ may be different:
• Absolute relative temporal attribute:
  - $Tr_1 = [\langle x_{11}, y_{11}, 1 \rangle, \langle x_{12}, y_{12}, 2 \rangle, \dots, \langle x_{1l}, y_{1l}, l \rangle]$
  - $Tr_2 = [\langle x_{21}, y_{21}, 1 \rangle, \langle x_{22}, y_{22}, 2 \rangle, \dots, \langle x_{2m}, y_{2m}, m \rangle]$
• Length-normalized relative temporal attribute:
  - $Tr_1 = [\langle x_{11}, y_{11}, 0 \rangle, \langle x_{12}, y_{12}, \tfrac{1}{l-1} \rangle, \dots, \langle x_{1l}, y_{1l}, 1 \rangle]$
  - $Tr_2 = [\langle x_{21}, y_{21}, 0 \rangle, \langle x_{22}, y_{22}, \tfrac{1}{m-1} \rangle, \dots, \langle x_{2m}, y_{2m}, 1 \rangle]$
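The two relative time attributes can be sketched in a few lines. The function names are hypothetical, and the handling of a single-point trajectory (time 0) is an assumption, since the slide only defines the case of two or more points.

```python
def absolute_relative_time(length):
    """Step index 1, 2, ..., length for a trajectory with `length` points."""
    return list(range(1, length + 1))


def length_normalized_time(length):
    """Evenly spaced values from 0 to 1: the k-th point gets (k - 1) / (length - 1)."""
    if length < 2:
        return [0.0] * length  # assumption: a lone point is placed at time 0
    return [k / (length - 1) for k in range(length)]


def attach_time(track, normalized=True):
    """Turn [(x, y), ...] into [(x, y, t), ...] using one of the two attributes."""
    n = len(track)
    times = length_normalized_time(n) if normalized else absolute_relative_time(n)
    return [(x, y, t) for (x, y), t in zip(track, times)]


if __name__ == "__main__":
    # First three (LAT, LON) points of Hurricane JIG from the example table.
    jig = [(24.3, -47.2), (24.2, -48.2), (24.2, -49.2)]
    print(attach_time(jig, normalized=False))
    print(attach_time(jig, normalized=True))
```

With the normalized variant every trajectory runs from 0 to 1, so points at the same stage of their storm's life are close in time regardless of how long each storm lasted.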
Extending DBSCAN
• In [2], DBSCAN is extended to incorporate an additional neighborhood criterion:
  - Let $D = \{d_1, d_2, \dots, d_n\}$ be the data set, where $d_i = (x_i, y_i, a_i, b_i)$.
  - $dist_s = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$, over the spatial attributes $(x_i, y_i)$
  - $dist_{ns} = \sqrt{(a_1 - a_2)^2 + (b_1 - b_2)^2}$, over the non-spatial attributes $(a_i, b_i)$
• Let $\epsilon_1$ and $\epsilon_2$ be two user-given thresholds for the spatial and non-spatial neighborhoods. (1)
• There are now two neighborhoods (spatial and non-spatial):
  - $N_{\epsilon_1}(d_k) = \{ d_j \in D \mid dist_s(d_k, d_j) \le \epsilon_1 \}$
  - $N_{\epsilon_2}(d_k) = \{ d_j \in D \mid dist_{ns}(d_k, d_j) \le \epsilon_2 \}$
• The final common neighborhood is then:
  - $N = N_{\epsilon_1}(d_k) \cap N_{\epsilon_2}(d_k)$ (2)
[2] Birant, D., and Kut, A. ST-DBSCAN: An algorithm for clustering spatial-temporal data. Data & Knowledge Engineering, 2007, pp. 208–221.
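The common neighborhood in (2) can be sketched directly as a set intersection. The tuple layout (x, y, a, b) follows the definition of $d_i$ above; the function names and the toy data are assumptions made for illustration.

```python
import math


def spatial_dist(p, q):
    """Euclidean distance over the spatial attributes (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def non_spatial_dist(p, q):
    """Euclidean distance over the non-spatial attributes (a, b)."""
    return math.hypot(p[2] - q[2], p[3] - q[3])


def common_neighborhood(data, k, eps1, eps2):
    """Indices of points that are within eps1 spatially AND eps2 non-spatially of data[k]."""
    n1 = {j for j, d in enumerate(data) if spatial_dist(data[k], d) <= eps1}
    n2 = {j for j, d in enumerate(data) if non_spatial_dist(data[k], d) <= eps2}
    return n1 & n2


if __name__ == "__main__":
    data = [
        (0, 0, 1, 1),  # reference point
        (0, 1, 1, 2),  # close in space and in the non-spatial attributes
        (0, 1, 9, 9),  # close in space only
        (5, 5, 1, 1),  # close in the non-spatial attributes only
    ]
    print(common_neighborhood(data, 0, eps1=1.5, eps2=2.0))
```

Only points that pass both thresholds survive the intersection, so a point that is near in space but very different in its non-spatial attributes does not count toward the density of the reference point.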
Temporal extension
• We use the formulation in (2), where the threshold $\epsilon_1$ is used for the spatial neighborhood and $\epsilon_2$ for the temporal neighborhood.
• Incorporating both the spatial and the temporal neighborhood into the basic DBSCAN algorithm gives the spatio-temporal version of DBSCAN.
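A minimal sketch of this spatio-temporal variant is given below, assuming each point is a tuple (x, y, t) where t is a relative time attribute, and that the temporal distance is the absolute difference of the t values. It illustrates the idea only; it is not the authors' implementation, and the names `st_neighbors` and `st_dbscan` are hypothetical.

```python
import math

NOISE = -1  # label for points that end up in no cluster


def st_neighbors(points, i, eps1, eps2):
    """Indices within eps1 in space AND eps2 in relative time of points[i]."""
    x, y, t = points[i]
    return [
        j for j, (qx, qy, qt) in enumerate(points)
        if math.hypot(x - qx, y - qy) <= eps1 and abs(t - qt) <= eps2
    ]


def st_dbscan(points, eps1, eps2, min_pts):
    """DBSCAN whose neighborhood is the intersection of a spatial and a temporal one."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = st_neighbors(points, i, eps1, eps2)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster          # border point joins the cluster
            if labels[j] is not None:
                continue                     # already handled
            labels[j] = cluster
            neighbors = st_neighbors(points, j, eps1, eps2)
            if len(neighbors) >= min_pts:
                seeds.extend(neighbors)      # core point: keep expanding
    return labels


if __name__ == "__main__":
    # Same three locations visited early (t near 0) and late (t near 1).
    pts = [(0, 0, 0.0), (0, 1, 0.05), (1, 0, 0.1),
           (0, 0, 0.9), (0, 1, 0.95), (1, 0, 1.0)]
    print(st_dbscan(pts, eps1=1.5, eps2=0.2, min_pts=3))
    print(st_dbscan(pts, eps1=1.5, eps2=1.0, min_pts=3))
```

With a tight temporal threshold the early and late visits to the same area fall into separate clusters, while a temporal threshold covering the whole range reduces the result to purely spatial clustering.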
Spatio temporal clustering results
References:
1. Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), pp. 226–231.
2. Birant, D., and Kut, A. ST-DBSCAN: An algorithm for clustering spatial-temporal data. Data & Knowledge Engineering, 2007, pp. 208–221.
3. Lee, J.-G., and Han, J. Trajectory clustering: A partition-and-group framework. In SIGMOD 2007, pp. 593–604.
4. Gudmundsson, J., Thom, A., and Vahrenhold, J. Of motifs and goals: Mining trajectory data. In Proceedings of the 20th International Conference on Advances in Geographic Information Systems (SIGSPATIAL '12), pp. 129–138.
5. Lee, J.-G., Han, J., Li, X., and Gonzalez, H. TraClass: Trajectory classification using hierarchical region-based and trajectory-based clustering. VLDB 2008.
6. Vieira, M., Bakalov, P., and Tsotras, V. On-line discovery of flock patterns in spatio-temporal data. In Proceedings of the 17th ACM SIGSPATIAL International Conference (GIS '09), pp. 286–295.
7. Jeung, H., Yiu, M., Zhou, X., Jensen, C., and Shen, H. Discovery of convoys in trajectory databases. PVLDB, 2008.
8. Kalnis, P., Mamoulis, N., and Bakiras, S. On discovering moving clusters in spatio-temporal data. In SSTD 2005, pp. 364–381.