Approximate Shortest Distance Computing: A Query-Dependent Local Landmark Scheme
Abstract
Shortest distance query is a fundamental operation in large-scale networks. Many existing
methods in the literature take a landmark embedding approach, which selects a set of graph nodes
as landmarks and computes the shortest distances from each landmark to all nodes as an
embedding. To answer a shortest distance query, the precomputed distances from the landmarks to
the two query nodes are used to compute an approximate shortest distance based on the triangle
inequality.
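The landmark embedding described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes an unweighted, undirected graph stored as an adjacency dictionary, and the names `bfs_distances`, `build_embedding`, and `estimate_distance` are hypothetical. The estimate is the triangle-inequality upper bound d(s, t) <= d(s, l) + d(l, t), minimized over all landmarks l.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_embedding(graph, landmarks):
    """Offline step: one BFS per landmark gives the embedding."""
    return {l: bfs_distances(graph, l) for l in landmarks}


def estimate_distance(embedding, s, t):
    """Online step: smallest d(s, l) + d(l, t) over all landmarks l."""
    return min(
        (d[s] + d[t] for d in embedding.values() if s in d and t in d),
        default=float("inf"),
    )


# Toy graph (an assumption for illustration): r - a - b, with s and t attached to b.
graph = {
    "r": ["a"],
    "a": ["r", "b"],
    "b": ["a", "s", "t"],
    "s": ["b"],
    "t": ["b"],
}
embedding = build_embedding(graph, ["r"])
estimate = estimate_distance(embedding, "s", "t")  # upper bound via landmark r
exact = bfs_distances(graph, "s")["t"]             # true distance for comparison
```

On this toy graph the single landmark r is far from both query nodes, so the estimate (6) is well above the exact distance (2).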
• In this paper, we analyze the factors that affect the accuracy of distance estimation in landmark
embedding. In particular, we find that a globally selected, query-independent landmark set may
introduce a large relative error, especially for nearby query nodes. To address this issue, we
propose a query-dependent local landmark scheme, which identifies a local landmark close to
both query nodes and provides more accurate distance estimation than the traditional global
landmark approach.
• We propose efficient local landmark indexing and retrieval techniques, which achieve low offline
indexing complexity and online query complexity. Two optimization techniques on graph
compression and graph online search are also proposed, with the goal of further reducing index
size and improving query accuracy. Furthermore, the challenge of immense graphs whose index
may not fit in memory leads us to store the embedding in a relational database, so that a query
Existing System
• As graphs from various application domains grow dramatically in size, the number of
nodes may reach hundreds of millions or even more. Due to the massive size, even
simple graph queries become
challenging tasks. One of them, the shortest distance query, has been extensively studied
during the last four decades. Querying shortest paths or shortest distances between
nodes in a large graph has important applications in many domains including road
networks, social networks, communication networks, the Internet, and so on.
• For example, in road networks, the goal is to find shortest routes between locations; in
social networks, the goal is to find the closest social relationships such as friendship or
collaboration between users; while in the Internet, the goal is to find the nearest server
to reduce access latency for clients.
• Although classical algorithms like breadth-first search (BFS), Dijkstra's algorithm [1], and
A* search [2], [3], [4] can compute the exact shortest paths in a network, the massive
size of modern information networks and the online nature of such queries make it
infeasible to apply the classical algorithms online. On the other hand, it is
space-inefficient to precompute and store the shortest paths between all pairs of nodes.
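To make the cost of the classical approach concrete, here is a minimal sketch of Dijkstra's algorithm using Python's standard `heapq` module. The function name `dijkstra` and the toy `roads` graph are assumptions for illustration. Each query runs a fresh search that may visit a large part of the graph, which is why answering such queries online becomes impractical on very large networks.

```python
import heapq


def dijkstra(graph, source, target):
    """Exact shortest distance on a graph with non-negative edge weights.

    graph maps each node to a list of (neighbor, weight) pairs.
    Returns float('inf') if target is unreachable from source.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter route to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


# Toy directed road network (an assumption for illustration).
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 5)],
    "C": [("B", 2)],
    "D": [],
}
shortest = dijkstra(roads, "A", "D")  # route A -> C -> B -> D
```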
Architecture Diagram
System Specification
• HARDWARE REQUIREMENTS
• Processor: Intel Pentium IV
• RAM: 512 MB
• Hard Disk: 80 GB HDD
• SOFTWARE REQUIREMENTS
• Operating System: Windows XP / Windows 7
• Front End: Java
• Back End: MySQL 5
CONCLUSION
• In this paper, we propose a novel shortest path tree-based local landmark
scheme, which finds a node close to the query nodes as a query-specific
local landmark for a triangulation-based shortest distance estimation.
Specifically, a local landmark is defined as the least common ancestor (LCA)
of the query nodes in a shortest path tree rooted at a global landmark.
• Efficient algorithms for indexing and retrieving LCAs are introduced, which
achieve low offline indexing complexity and online query complexity.
• This strategy significantly reduces the distance estimation error compared
with global landmark embedding. We also study the local landmark
scheme on a relational database for better scalability. Extensive experimental
results on large-scale social networks and road networks demonstrate the
effectiveness and efficiency of the proposed local landmark scheme.
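The local landmark idea summarized above can be illustrated with a small sketch. It assumes an unweighted, undirected graph, builds the shortest path tree by BFS, and finds the LCA by walking parent pointers. That naive walk is a stand-in for the efficient LCA indexing mentioned above, not a reproduction of it, and the names `shortest_path_tree`, `lca`, and `local_landmark_estimate` are hypothetical.

```python
from collections import deque


def shortest_path_tree(graph, root):
    """BFS from root; returns (parent, depth) describing a shortest path tree."""
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in parent:
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return parent, depth


def lca(parent, depth, s, t):
    """Least common ancestor of s and t, found by walking parent pointers."""
    while depth[s] > depth[t]:
        s = parent[s]
    while depth[t] > depth[s]:
        t = parent[t]
    while s != t:
        s, t = parent[s], parent[t]
    return s


def local_landmark_estimate(parent, depth, s, t):
    """d(s, a) + d(a, t) for a = LCA(s, t), read off the tree depths."""
    a = lca(parent, depth, s, t)
    return depth[s] + depth[t] - 2 * depth[a]


# Toy graph (an assumption for illustration): r - a - b, with s and t attached to b.
graph = {
    "r": ["a"],
    "a": ["r", "b"],
    "b": ["a", "s", "t"],
    "s": ["b"],
    "t": ["b"],
}
parent, depth = shortest_path_tree(graph, "r")   # r plays the global landmark
local_landmark = lca(parent, depth, "s", "t")
local_estimate = local_landmark_estimate(parent, depth, "s", "t")
global_estimate = depth["s"] + depth["t"]        # d(s, r) + d(r, t)
```

On this toy graph the local landmark is b, so the local estimate is 2, which matches the true distance, while the estimate through the global landmark r is 6.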
THANK YOU