Transcript MO14

Critical Analysis Presentation:
T-Drive: Driving Directions based
on Taxi Trajectories
Authors of Paper: Jing Yuan, Yu Zheng, Chengyang Zhang,
Weilei Xie, Xing Xie, Guangzhong Sun, Yan Huang
Presentation by: Yashu Chamber, Zhe Jiang
CSCI: 8715
Instructor: Dr. Shashi Shekhar
Problem Statement
• INPUT: Taxi trajectory data collected using GPS; a
user query with a start point qs, an end point qd, and a
departure time td.
• OUTPUT: A route from qs to qd with departure
time td in a dynamic road network Gr = (Vr, Er) which is
learned from a trajectory archive A.
• OBJECTIVE: The route should be the fastest one.
• CONSTRAINTS:
– dynamic road network Gr = (Vr, Er) in which travel time
costs change over time and space
– low sample rates of GPS points in trajectory archives
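The problem statement above amounts to a fastest-path query on a graph whose edge costs depend on the time of entry. The following is a minimal sketch of such a query using a time-dependent variant of Dijkstra's algorithm. It is not the authors' routing algorithm; the function name `fastest_route`, the `cost(u, v, t)` callback, and the toy network are assumptions made for illustration, and the approach is only correct if leaving later never lets a driver arrive earlier (the FIFO property).

```python
import heapq

def fastest_route(graph, cost, qs, qd, td):
    """Time-dependent Dijkstra (illustrative sketch, assumes FIFO edge costs).

    graph: dict mapping node -> list of neighbour nodes
    cost:  hypothetical callback, cost(u, v, t) = travel time of edge (u, v)
           when it is entered at time t
    Returns (path, total_travel_time) or (None, None) if qd is unreachable.
    """
    best = {qs: td}          # earliest known arrival time per node
    prev = {}                # predecessor on the best known route
    heap = [(td, qs)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == qd:
            break
        if t > best.get(u, float("inf")):
            continue         # stale heap entry
        for v in graph.get(u, []):
            arrival = t + cost(u, v, t)
            if arrival < best.get(v, float("inf")):
                best[v] = arrival
                prev[v] = u
                heapq.heappush(heap, (arrival, v))
    if qd not in best:
        return None, None
    path = [qd]
    while path[-1] != qs:
        path.append(prev[path[-1]])
    return path[::-1], best[qd] - td

# Toy network (made up): edge B->D is congested before t = 30.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

def toy_cost(u, v, t):
    if (u, v) == ("B", "D"):
        return 20 if t < 30 else 5
    return {("A", "B"): 5, ("A", "C"): 10, ("C", "D"): 10}[(u, v)]

print(fastest_route(graph, toy_cost, "A", "D", 0))   # (['A', 'C', 'D'], 20)
print(fastest_route(graph, toy_cost, "A", "D", 40))  # (['A', 'B', 'D'], 10)
```

The same start and end points yield different routes at different departure times, which is the behaviour the dynamic road network is meant to capture.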
Significance To Course
• Trajectory data is both spatial and temporal in
nature. This work could complement the
course content as a novel technique in spatial
and temporal data mining.
• This paper shows an interesting application of
spatial and temporal data mining to “smart
driving directions” recommendation.
Challenges
• difficulty in modeling intelligence of taxi
drivers based on taxi historical trajectories,
• the sparseness and low coverage of trajectory
data
• low sampling rate in GPS points.
Major Contributions
1. Introduce the notion of “landmark graph” that
models the intelligence of taxi drivers based on
taxi trajectories.
2. Propose a “Variance-Entropy-Based
Clustering” method to summarize the
distributions of travel times between any two
landmarks at any time.
3. Build a system to suggest the fastest driving
directions based on a real-world dataset, and evaluate
its performance with both synthetic queries and
in-field experiments.
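The slides do not spell out how the Variance-Entropy-Based Clustering works. As a rough illustration of the variance half only, the sketch below assumes that observed travel times on one landmark edge are sorted and split at the point that minimizes the weighted average variance of the two resulting groups. The function name `best_variance_split` and the sample data are hypothetical, and the entropy-based part (grouping by time of day) is not shown.

```python
from statistics import pvariance

def best_variance_split(travel_times):
    """Split sorted travel times into two groups with minimal weighted
    average variance (illustrative sketch, not the authors' exact method)."""
    times = sorted(travel_times)
    n = len(times)
    if n < 2:
        return times, []     # nothing to split
    best_idx, best_wav = 1, float("inf")
    for i in range(1, n):
        left, right = times[:i], times[i:]
        # Weighted average variance of the two candidate groups.
        wav = (len(left) * pvariance(left) + len(right) * pvariance(right)) / n
        if wav < best_wav:
            best_idx, best_wav = i, wav
    return times[:best_idx], times[best_idx:]

# Made-up travel times in minutes: a "free-flow" and a "congested" mode.
fast, slow = best_variance_split([5, 6, 5, 7, 20, 22, 21])
print(fast, slow)  # [5, 5, 6, 7] [20, 21, 22]
```

Applying the split recursively to each group would yield several travel-time clusters per landmark edge, which is one way to summarize a multi-modal travel-time distribution.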
Key Concepts
1. Trajectory: A sequence of GPS points pertaining to one
trip.
2. Road Segments: A directed edge (one-way or bidirectional)
between two segment terminal points.
3. Landmarks: The k most frequently traversed road segments.
4. Landmark Edges: An edge connecting two landmarks if a
sufficient number of trajectories (at least δ) pass through both
landmarks and the travel time between them is less than tmax.
5. Rough Routing: The sequence of landmark edges forming the
fastest route for a given user (based on the user’s speed).
6. Refined Routing: Each landmark edge in the rough route
is transformed into a sequence of road segments forming the
fastest route.
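The landmark and landmark-edge definitions above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it assumes each trip has already been map-matched to a time-ordered list of `(segment_id, timestamp)` pairs, and that an edge is only considered between landmarks visited consecutively within a trip. The function name `build_landmark_graph` and the parameter names `k`, `delta`, and `t_max` (standing in for k, δ, and tmax) are chosen for this example.

```python
from collections import Counter, defaultdict

def build_landmark_graph(trips, k, delta, t_max):
    """Illustrative landmark graph construction.

    trips: list of trips; each trip is a time-ordered list of
           (segment_id, timestamp) pairs (assumed already map-matched)
    Returns (landmarks, edges) where edges maps (u, v) to a list of
    (departure_time, travel_time) observations.
    """
    # Count each road segment once per trip that traverses it.
    freq = Counter(seg for trip in trips for seg in {s for s, _ in trip})
    landmarks = {seg for seg, _ in freq.most_common(k)}

    samples = defaultdict(list)
    for trip in trips:
        visits = [(s, t) for s, t in trip if s in landmarks]
        for (u, tu), (v, tv) in zip(visits, visits[1:]):
            if u != v and tv - tu < t_max:
                samples[(u, v)].append((tu, tv - tu))

    # Keep only edges supported by at least delta trajectories.
    edges = {e: obs for e, obs in samples.items() if len(obs) >= delta}
    return landmarks, edges

# Made-up trips; timestamps are in seconds.
trips = [
    [("r1", 0), ("r2", 60), ("r3", 120)],
    [("r1", 1000), ("r2", 1050), ("r3", 1130)],
    [("r1", 2000), ("r4", 2100), ("r3", 2200)],
    [("r2", 3000), ("r3", 3070)],
]
landmarks, edges = build_landmark_graph(trips, k=3, delta=2, t_max=150)
print(sorted(landmarks))  # ['r1', 'r2', 'r3']
print(sorted(edges))      # [('r1', 'r2'), ('r2', 'r3')]
```

The per-edge observations keep the departure time alongside the travel time, so that a later step can summarize how the cost of a landmark edge varies over the day.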
Approach
• Trajectory Preprocessing
– Trajectory segmentation: Segment the GPS log of taxis into
individual trips.
• Landmark Graph Construction
– Identify the k most frequently traversed road segments, which
the authors denote as “landmarks”.
– Two landmarks are connected by an edge (“landmark
edge”) if a sufficient number of trajectories (at least δ) pass through
both landmarks and the travel time between them is less than tmax.
• Route Computing.
– Rough Routing: Identify a sequence of landmark edges
representing the fastest route.
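– Refined Routing: Transform each landmark edge in the rough
route into a sequence of road segments forming the fastest route.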
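The trajectory segmentation step in the approach above could look roughly like the following. The slides do not state the criterion used to separate trips, so this sketch assumes a simple heuristic: a new trip starts whenever the gap between consecutive GPS points exceeds a threshold. The function name `segment_trips`, the `(timestamp, lat, lon)` point layout, and the default of 600 seconds are all assumptions for illustration.

```python
def segment_trips(gps_log, max_gap_s=600):
    """Split one taxi's GPS log into trips at large time gaps.

    gps_log: iterable of (timestamp_s, lat, lon) tuples (assumed layout)
    max_gap_s: hypothetical gap threshold in seconds
    """
    trips, current = [], []
    for point in sorted(gps_log, key=lambda p: p[0]):
        # A long silence between two points closes the current trip.
        if current and point[0] - current[-1][0] > max_gap_s:
            trips.append(current)
            current = []
        current.append(point)
    if current:
        trips.append(current)
    return trips

# Made-up log: three points, a long pause, then two more points.
log = [
    (0, 39.90, 116.40), (120, 39.91, 116.41), (300, 39.92, 116.42),
    (2000, 39.95, 116.45), (2100, 39.96, 116.46),
]
print([len(trip) for trip in segment_trips(log)])  # [3, 2]
```

Other signals, such as a taxi's occupancy flag or long stays at one location, could replace or complement the time-gap rule.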
Figures Describing The Approach
Source: T-Drive: Driving Directions Based on Taxi Trajectories. Author: Jing Yuan
Validation Methodology
1. Evaluating landmark graphs.
– RESULT: Used to validate the modeling
efficiency of the method.
2. Evaluation based on synthetic queries.
– The authors randomly generate 1200 queries and
compare against the speed-constraint-based method
(SC), which is used in Google and Bing Maps, and
the real-time-traffic-analysis-based method (RT).
– RESULT: The suggested method outperforms the
state-of-the-art methods RT and SC.
3. In-the-field evaluation.
i. The same driver traverses routes suggested by the authors' method and
competing techniques.
ii. Different drivers (with similar skills) travel different routes suggested
by different methods.
Strengths and Weaknesses of the methodology.
The authors did a good job of evaluating with different methodologies,
including both in-field and synthetic evaluations on a real dataset.
Assumptions
1. Normal users are allowed on the same roads
where taxi trajectories exist. The travel time cost of
taxis is the same as that of other types of vehicles.
2. Travel time cost is periodic over the
week, i.e., accidents, road works, emergencies,
and other events that change traffic flow patterns are
rare.
3. Taxi drivers know the fastest route
and do not make “roundabout trips”.
4. *Historical taxi trajectories cover all potential
points.
Suggestions
• Remove data sparseness and low coverage from the
challenges section and mention them in a scope
paragraph, since the paper does not overcome them.
• Provide more justification for determining a user's travel time
cost based on the user optimism index.
• Provide a theoretical cost model of the suggested
method.
• Add more in-field evaluations; the current sample
size (two) is too small.