const_poproutes_unc_tra_adesai_10_24_12


Constructing Popular Routes from Uncertain Trajectories

Authors of paper: Ling-Yin Wei (National Chiao Tung University, Hsinchu), Yu Zheng (Microsoft Research Asia), Wen-Chih Peng (National Chiao Tung University, Hsinchu)

Paper reviewed by: Aniruddha Desai (University of Washington, Tacoma)

Applications

Scope: Infer popular routes from a set of uncertain trajectories

◦ Trip planning (travel / tourism)
◦ Traffic management (transportation)
◦ Animal movement studies

Spatial Trajectories

What is a trajectory?

Sequence of points: location (lat, long) and time-stamp

What are the moving objects?

Humans, vehicles, animals, etc.

How are the trajectories collected?

Ubiquitous location acquisition technologies / devices using GPS
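As a minimal sketch of the definition above, a trajectory can be held as a time-ordered list of points. The names `Point` and `sampling_gaps` are illustrative assumptions, not taken from the paper; the helper only reports the time gaps between consecutive samples.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Point:
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees
    t: float    # time-stamp in seconds


def sampling_gaps(trajectory: List[Point]) -> List[float]:
    """Return the time gaps between consecutive points, ordered by time-stamp."""
    ordered = sorted(trajectory, key=lambda p: p.t)
    return [b.t - a.t for a, b in zip(ordered, ordered[1:])]


# Hypothetical example: three samples with uneven gaps (10 min, then 30 min).
traj = [Point(40.78, -73.97, 0), Point(40.76, -73.98, 600), Point(40.75, -73.99, 2400)]
print(sampling_gaps(traj))
```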

Uncertainty and Inference

 Trajectories generated at low or irregular frequencies.

 Routes between consecutive points on trajectories are uncertain.

 To infer a popular route we need to find similarity between two uncertain trajectories – this is hard to measure.

“RICK”

Route Inference framework based on Collective Knowledge

Approach: aggregate uncertain trajectories in a mutually reinforcing way: uncertain + uncertain => certain

Datasets:

◦ Real datasets used for conducting extensive experiments
◦ Check-in dataset from Foursquare: 6,600 trajectories from Manhattan (3 check-ins minimum)
◦ 15,000 taxi trajectories in Beijing

How does it work?

RICK overview: a user-specified query consists of a location sequence and a time span; RICK infers the top-k popular routes that pass through these locations within the given time span.

Region Construction

Historical uncertain trajectories are used to construct a routable graph in a gridded space, based on spatio-temporal characteristics.

Grid cell size (“l”) represents the granularity of inferences.

Data points (or grid “cells”) are “spatially close” if: |x - x’| <= 1 and |y - y’| <= 1
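The gridding step and the “spatially close” test can be sketched as follows. This assumes planar coordinates with the grid origin at (0, 0) and reads x, y as integer cell indices; `to_cell` and `spatially_close` are hypothetical helper names.

```python
import math
from typing import Tuple

Cell = Tuple[int, int]  # (x index, y index) of a grid cell


def to_cell(x: float, y: float, l: float) -> Cell:
    """Map planar coordinates to the index of the grid cell with side length l."""
    return (math.floor(x / l), math.floor(y / l))


def spatially_close(a: Cell, b: Cell) -> bool:
    """Cells are spatially close if their indices differ by at most 1 on each axis."""
    return abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1


c1 = to_cell(120.0, 340.0, 100.0)
c2 = to_cell(215.0, 399.0, 100.0)
c3 = to_cell(450.0, 340.0, 100.0)
print(c1, c2, spatially_close(c1, c2), spatially_close(c1, c3))
```

A larger l gives fewer, coarser cells; a smaller l gives finer inferences at the cost of a bigger grid.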

Region Construction (cont’d…)

Data points are “st-correlated” (spatio-temporally correlated) if they are spatially close (Rule 1 or Rule 2) and they mutually satisfy a temporal constraint q.

The connection support of a cell pair is compared against a threshold C to decide connectivity in the graph.

Neighbor: if the connection support of a cell pair is >= C, the two cells are neighbors.
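The slides do not define connection support precisely, so the sketch below makes a loud assumption: the support of a cell pair is the number of trajectories in which two consecutive points fall in those two spatially close cells within q seconds of each other. The paper's exact definition may differ; `connection_support` and `neighbor_pairs` are hypothetical names.

```python
from collections import Counter
from typing import FrozenSet, List, Set, Tuple

Cell = Tuple[int, int]
Visit = Tuple[Cell, float]  # (cell index, time-stamp in seconds)


def connection_support(trajectories: List[List[Visit]], q: float) -> Counter:
    """Count, per unordered cell pair, the trajectories that link the pair (assumed definition)."""
    support: Counter = Counter()
    for traj in trajectories:
        linked: Set[FrozenSet[Cell]] = set()
        for (a, ta), (b, tb) in zip(traj, traj[1:]):
            close = abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1
            if a != b and close and 0 <= tb - ta <= q:
                linked.add(frozenset((a, b)))
        support.update(linked)  # each trajectory counts once per pair
    return support


def neighbor_pairs(support: Counter, C: int) -> Set[FrozenSet[Cell]]:
    """Cell pairs whose connection support reaches the threshold C."""
    return {pair for pair, s in support.items() if s >= C}
```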

Region Construction (cont’d…)

Region: regions are constructed from the individual cell pairs whose connection support meets the specified threshold ‘C’.

Cell pairs are merged into regions using an efficient recursive algorithm.

Time complexity: O(cnm²), where
◦ c = minimum loop iterations
◦ n = size (cardinality) of the set of cells in the grid space
◦ m = size (cardinality) of the dataset
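To illustrate what merging neighbor cell pairs into regions produces, here is a sketch that uses union-find to group cells connected by neighbor pairs. This is a swapped-in technique chosen for brevity, not the recursive algorithm from the paper, and `build_regions` is a hypothetical name.

```python
from typing import Dict, Iterable, List, Set, Tuple

Cell = Tuple[int, int]


def build_regions(pairs: Iterable[Tuple[Cell, Cell]]) -> List[Set[Cell]]:
    """Group cells into regions: two cells share a region if a chain of neighbor pairs links them."""
    parent: Dict[Cell, Cell] = {}

    def find(c: Cell) -> Cell:
        parent.setdefault(c, c)
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb  # merge the two groups

    regions: Dict[Cell, Set[Cell]] = {}
    for c in list(parent):
        regions.setdefault(find(c), set()).add(c)
    return list(regions.values())


pairs = [((0, 0), (0, 1)), ((0, 1), (1, 1)), ((5, 5), (5, 6))]
print(build_regions(pairs))
```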

Edge Inference

 After the regions are constructed we infer edges.

Two types of edges:
◦ Edges within each region
◦ Edges among regions

Edge Inference (cont’d…)

Each vertex represents a cell, and each edge indicates a transition relationship and has two attributes:
◦ Transition support
◦ Travel time

Virtual bidirected edges between cells (vertices) are generated if the cells are neighbors in a region.

A shortest-path inference approach is used. The direction, transition support and travel time of each edge on the shortest path are stored.

Redundant edges and edges whose transition support is 0 are eliminated.
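The edge attributes and the final pruning step can be sketched as below. Only the removal of zero-support edges is shown; what counts as a “redundant” edge is not specified in the slides, so it is left out. `Edge` and `prune_zero_support` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Cell = Tuple[int, int]


@dataclass
class Edge:
    transition_support: int  # how many transitions back this directed edge
    travel_time: float       # travel time along the edge, in seconds (assumed unit)


def prune_zero_support(edges: Dict[Tuple[Cell, Cell], Edge]) -> Dict[Tuple[Cell, Cell], Edge]:
    """Keep only the directed edges with a positive transition support."""
    return {key: e for key, e in edges.items() if e.transition_support > 0}


# A virtual bidirected edge where only one direction is supported by the data.
edges = {
    ((0, 0), (0, 1)): Edge(transition_support=3, travel_time=42.0),
    ((0, 1), (0, 0)): Edge(transition_support=0, travel_time=40.0),
}
print(prune_zero_support(edges))
```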

Route Inference

Two phases:
◦ Route generation
◦ Route refinement

Route generation:
◦ Top-k coarse routes are discovered with the routable graph

Route Inference (cont’d…)

If a query location cannot be mapped to a graph vertex, MINDIST (nearest neighbor algorithm) is used to find the cells close to the query location.

Local routes: the top-k local routes between any two consecutive cells in the cell sequence are searched by an A*-like algorithm.

The route score is computed based on the range of the time interval between the two query locations.

Based on the top-k local routes, the top-k global routes are searched by a branch-and-bound approach.
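A top-k local route search between two cells can be sketched with a best-first search over partial paths. This is a simplification under stated assumptions: the heuristic is zero (so it behaves like uniform-cost search rather than a tuned A*), and total edge cost stands in for the paper's route score, which the slides do not spell out. `top_k_routes` is a hypothetical name.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # vertex -> {successor: non-negative edge cost}


def top_k_routes(graph: Graph, source: str, target: str, k: int) -> List[Tuple[float, List[str]]]:
    """Return up to k cheapest loop-free routes from source to target, cheapest first."""
    heap: List[Tuple[float, List[str]]] = [(0.0, [source])]
    found: List[Tuple[float, List[str]]] = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for nxt, w in graph.get(node, {}).items():
            if nxt not in path:  # keep routes loop-free
                heapq.heappush(heap, (cost + w, path + [nxt]))
    return found


graph = {"A": {"B": 1, "C": 4}, "B": {"C": 1, "D": 5}, "C": {"D": 1}, "D": {}}
print(top_k_routes(graph, "A", "D", 2))
```

Because edge costs are non-negative, complete routes leave the heap in non-decreasing cost order. The search enumerates partial paths, so it is only practical on small graphs; pruning the search space is where a lower bound (as in branch-and-bound) would come in.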

Route Inference (cont’d…)

Two-Layer Routing Algorithm

Before searching for local routes, region sequences are generated to reduce the search space, using a lower bound of the transition times between the regions with respect to two given cells.

Thus, multiple region sequences are possible.

Route Inference (cont’d…)

Route Refinement:

Use the historical data points (from trajectories that traverse the cells on the rough route) that lie in the cells on the generated route.

Apply linear regression to the set of points in each cell to derive a line segment.

Concatenate the line segments in the order of the inferred route.
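The refinement steps above can be sketched with an ordinary least-squares fit per cell. The assumptions here: points are planar (x, y) pairs, a cell needs at least two points to yield a segment, and segments are not reoriented to follow the travel direction. `fit_segment` and `refine_route` are hypothetical names.

```python
from typing import Dict, List, Tuple

Pt = Tuple[float, float]
Cell = Tuple[int, int]
Segment = Tuple[Pt, Pt]


def fit_segment(points: List[Pt]) -> Segment:
    """Fit y = a*x + b by least squares and return the segment over the points' x-range."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:  # all points share one x value: use a vertical segment
        ys = [y for _, y in points]
        return (mx, min(ys)), (mx, max(ys))
    a = sum((x - mx) * (y - my) for x, y in points) / sxx
    b = my - a * mx
    x0 = min(x for x, _ in points)
    x1 = max(x for x, _ in points)
    return (x0, a * x0 + b), (x1, a * x1 + b)


def refine_route(route: List[Cell], points_by_cell: Dict[Cell, List[Pt]]) -> List[Segment]:
    """One fitted segment per route cell, in route order; cells with under 2 points are skipped."""
    return [fit_segment(points_by_cell[c]) for c in route if len(points_by_cell.get(c, [])) >= 2]
```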

Performance Evaluation

     Inferred routes are compared against ground-truth from raw-trajectories.

Two metrics used:
◦ NDTW: normalized dynamic time warping distance
◦ MD: maximum distance between the inferred route and the raw trajectory of the ground truth

RICK is compared with an existing approach, MPR (Most Popular Route), as the baseline.

Time efficiency is tested (average query time 0.5 seconds).

RICK outperforms the baseline by generating routes 300-700 m closer to the ground truth than those of the baseline.
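The two evaluation metrics named above can be sketched as follows. The dynamic time warping recurrence is the standard one; the normalization (dividing by the longer sequence length) and the reading of MD (largest distance from any inferred-route point to its nearest ground-truth point) are assumptions, since the slides do not give the exact formulas. Points are planar (x, y) pairs.

```python
import math
from typing import List, Tuple

Pt = Tuple[float, float]


def dtw(a: List[Pt], b: List[Pt]) -> float:
    """Classic dynamic time warping distance with Euclidean point cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def ndtw(a: List[Pt], b: List[Pt]) -> float:
    """DTW divided by the longer sequence length (assumed normalization)."""
    return dtw(a, b) / max(len(a), len(b))


def max_distance(route: List[Pt], truth: List[Pt]) -> float:
    """Largest distance from a route point to its nearest ground-truth point (assumed reading of MD)."""
    return max(min(math.dist(p, q) for q in truth) for p in route)


route = [(0, 0), (1, 0), (2, 0)]
truth = [(0, 1), (1, 1), (2, 1)]
print(ndtw(route, truth), max_distance(route, truth))
```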

Visualization of Results

Visualization of the query “Central Park -> The Museum of Modern Art -> Times Square -> Empire State Building -> SoHo”, for the top-1 (most popular) route inferred by RICK.

Note: The route does not just connect the query locations, but passes through other attractions along the “inferred” most popular route.

Strengths

Thorough / Credible: The authors have conducted extensive experiments on real data. Their results show that the route inference framework is effective, efficient and measurably accurate.

Organized / Easy to understand: The content of the paper is very well organized and can be easily understood even by a naïve reader.

Illustrations (where provided) are very effective in describing spatial concepts.

Weaknesses

Connection Support: Not explained sufficiently; diagrams would have helped explain this key concept.

Route generated using A*-like algorithm: The role of the A*-like algorithm is not explained adequately in the context of the inferred route generated.

NDTW: “Normalized dynamic time warping” distance is not explained adequately; diagrams would have helped explain this key performance metric better.

Thank you!

Q&A