
Index Structures for Multimedia Data
Feature-based Approach
Jaruloj Chongstitvatana
2301474 Advanced Data Structures

Multimedia Data
Feature-based approach, with examples:
• Image/Voice data: movies, music
• Sequence data: gene sequences
• Geometric data: shapes (CAD)
• Text descriptor: documents

Queries for Multimedia Data
• Point queries
  • Given a data object, find an exact match.
• Range queries
  • Given a data object, find similar data within a range.
• Nearest-neighbor queries
  • Given a data object, find the most similar data.

Feature Transformation
• A mapping from an object to a d-dimensional vector, called a feature vector.
• What is this mapping function?
  • For image data: color histogram, etc.
  • For sequence data: the number of occurrences of each element
  • For geometric data: the slopes of the segments of the perimeter
  • For text descriptors: the number of occurrences of each keyword
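
The mappings listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `ACGT` alphabet, the whitespace tokenization, and the 4-bin gray-level histogram (standing in for a color histogram) are all hypothetical choices, not taken from the slides.

```python
from collections import Counter

def sequence_features(seq, alphabet="ACGT"):
    """Map a sequence to a vector of counts, one per alphabet symbol (assumed alphabet)."""
    counts = Counter(seq)
    return [counts[s] for s in alphabet]

def keyword_features(text, keywords):
    """Map a text descriptor to a vector of counts, one per keyword.
    Assumes lower-case keywords and simple whitespace tokenization."""
    words = Counter(text.lower().split())
    return [words[k] for k in keywords]

def gray_histogram(pixels, bins=4, max_value=256):
    """Map a non-empty list of gray levels in [0, max_value) to a normalized histogram."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // max_value] += 1  # index of the bucket that contains p
    return [h / len(pixels) for h in hist]

print(sequence_features("GATTACA"))
print(keyword_features("index the index tree", ["index", "tree", "grid"]))
print(gray_histogram([0, 63, 64, 255]))
```

Each function returns a vector of fixed dimension d (4, the number of keywords, and `bins`, respectively), so objects of the same kind can be compared with a single distance function.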

Similarity Measure: distance function
• Given two data objects x and y:
  • Let $\delta(x,y)$ be the distance function.
  • $\delta(x,y)$ indicates the similarity between x and y: the smaller the distance, the more similar the objects.
• Usually $\delta(x,y)$ is based on a distance between the feature vectors of x and y.

Similarity Queries
• Point queries
  • Given an object x, find any object y such that $\delta(x,y) = 0$.
• Range queries
  • Given an object x and a threshold $\varepsilon$, find any object y such that $\delta(x,y) < \varepsilon$.
• Nearest-neighbor queries
  • Given an object x, find an object y such that $\delta(x,y) \le \delta(x,z)$ for any object z in the database.
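
As a baseline, the three query types can be answered by a linear scan over all feature vectors, without any index. The sketch below assumes the database is a plain Python list of tuples and uses the Euclidean distance (`math.dist`, Python 3.8+) as the default distance function; the function names are hypothetical.

```python
import math

def point_query(db, x, dist=math.dist):
    """Return every y in db whose distance to x is exactly 0.
    With real-valued features, a small tolerance may be needed instead of == 0."""
    return [y for y in db if dist(x, y) == 0]

def range_query(db, x, eps, dist=math.dist):
    """Return every y in db whose distance to x is below the threshold eps."""
    return [y for y in db if dist(x, y) < eps]

def nearest_neighbor(db, x, dist=math.dist):
    """Return one y in db that minimizes the distance to x (db must be non-empty)."""
    return min(db, key=lambda y: dist(x, y))

db = [(0, 0), (3, 4), (1, 1), (6, 8)]
print(point_query(db, (3, 4)))
print(range_query(db, (0, 0), 2))
print(nearest_neighbor(db, (5, 7)))
```

Every query here computes n distances for a database of n objects; index structures aim to avoid examining most of them.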

Distance Measure
• Euclidean distance: $\delta(x,y) = \left( \sum_{i=1}^{d} (x_i - y_i)^2 \right)^{1/2}$
• Manhattan distance: $\delta(x,y) = \sum_{i=1}^{d} |x_i - y_i|$
• Maximum distance: $\delta(x,y) = \max_{i=1,\dots,d} |x_i - y_i|$
• Weighted Euclidean distance: $\delta(x,y) = \left( \sum_{i=1}^{d} w_i (x_i - y_i)^2 \right)^{1/2}$
• Ellipsoid distance: $\delta(x,y) = (x-y)^T W (x-y)$
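
The five distance measures translate directly into code. The following is a minimal sketch that assumes feature vectors are equal-length sequences of numbers and that W is given as a d-by-d nested list; the function names are hypothetical. The ellipsoid distance is computed as the quadratic form exactly as written on the slide, without a square root.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def maximum(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def weighted_euclidean(x, y, w):
    # w[i] is the weight of dimension i
    return math.sqrt(sum(wi * (a - b) ** 2 for wi, a, b in zip(w, x, y)))

def ellipsoid(x, y, W):
    # Quadratic form (x - y)^T W (x - y), with W a d-by-d nested list
    diff = [a - b for a, b in zip(x, y)]
    d = len(diff)
    return sum(diff[i] * W[i][j] * diff[j] for i in range(d) for j in range(d))

x, y = (0, 0), (3, 4)
print(euclidean(x, y), manhattan(x, y), maximum(x, y))
print(weighted_euclidean(x, y, (1, 0)))
print(ellipsoid(x, y, [[2, 0], [0, 1]]))
```

With a diagonal W, the ellipsoid distance equals the square of the weighted Euclidean distance whose weights are the diagonal entries of W.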

Other Similarity Queries
• k-Nearest-neighbor queries
  • Given an object x and an integer k, find k objects $y_1, y_2, \dots, y_k$ such that, for $i = 1, 2, \dots, k$, $\delta(x,y_i) \le \delta(x,z)$ for any object z in the database other than $y_1, \dots, y_k$.
• Approximate nearest-neighbor queries
• Approximate k-nearest-neighbor queries
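
An exact k-nearest-neighbor query can also be answered by a linear scan. The sketch below is one possible baseline, not an index-based method: it assumes a list of tuples as the database, the Euclidean distance as default, and uses the standard-library `heapq.nsmallest` to keep the k closest objects. The function name is hypothetical.

```python
import heapq
import math

def k_nearest_neighbors(db, x, k, dist=math.dist):
    """Return the k objects of db closest to x, ordered by increasing distance."""
    return heapq.nsmallest(k, db, key=lambda y: dist(x, y))

db = [(0, 0), (3, 4), (1, 1), (6, 8)]
print(k_nearest_neighbors(db, (0, 0), 2))
```

If k exceeds the size of the database, `heapq.nsmallest` simply returns all objects in sorted order.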

Range Queries
On:
• k-d-B trees
• Grid files
• Quad trees
• R-trees
Already discussed.

Nearest-neighbor Queries
On:
• k-d-B trees
• Grid files
• Quad trees
• R-trees
Let's discuss.