PowerPoint Template

Download Report

Transcript PowerPoint Template

Evaluation of Conditional
Preference Queries
Fabíola S. Fernandes, Sandra de Amo
UNIVERSIDADE
FEDERAL DE UBERLÂNDIA
PÓS-GRADUAÇÃO EM
CIÊNCIA DA COMPUTAÇÃO
MOTIVATION
Movies ( Title, Genre, Years, Director, Actor )
My preferences:
(1) I prefer those movies produced in the 90’s rather than from 80’s,
if both belongs to the same category (genre);
(2) For the movies produced in the 80’s I prefer dramas to comedies;
(3) For Woody Allen’s films of the same genre and decade, I prefer
those staring the actress Charlotte Rampling than those staring
Mia Farrow.
Page  2
UFU
MOTIVATION
Movies ( Title, Genre, Years, Director, Actor )
Queries:
 Give the titles of the films which most fulfill my
wishes among those stored in the database, provided
they are not romance films.
 Give the 4 films, among those stored in the database,
which most fulfill my wishes.
Page  3
UFU
OUTLINE
 The CPref-SQL Language
 Related Work and Contributions
 Evaluation of Preference Queries (cp-queries)
 Top-K cp-queries in the RDBMS
 Experimental Results
 Conclusion and Further Work
Page  4
UFU
THE CPref-SQL LANGUAGE
 An extension of SQL able to express conditional
preference queries
 The queries incorporate the usual hard constraints
(WHERE) as well as soft constraints (preference rules)
Goals:
 Express preferences over a database
 Filter the answer to queries according to user preferences
Page  5
UFU
THE CPref-SQL LANGUAGE
Express preferences over a database as follows:
CREATE PREFERENCES MyPrefs
FROM Movies AS
Y=90 > Y=80 [T,D,A] AND
IF D=Woody Allen THEN A=Charlotte Rampling > A=Mia Farrow [T] AND
IF G=comedy and Y=80 THEN D=Joel Coen > D=Woody Allen [T,A] AND
IF Y=80 THEN G=drama > G=comedy [T]
Page  6
UFU
THE CPref-SQL LANGUAGE
Filter the answers to queries according to user
preferences
SELECT title
FROM movies
WHERE genre <> ‘romance’
ACCORDING TO PREFERENCES MyPrefs, 4
Page  7
UFU
OUTLINE
 The CPref-SQL Language
 Related Work and Contributions
 Evaluation of Preference Queries (cp-queries)
 Top-K cp-queries in the RDBMS
 Experimental Results
 Conclusion and Further Work
Page  8
UFU
RELATED WORK AND CONTRIBUTIONS
Topic
Modeling and reasoning of
preferences
SQL extensions
Preference Operators
State of the Art
CP-Net
TCP-Net
Conditional Preference
Formulas
Preference SQL
CPref-SQL
Skyline
Winnow
Select-Best
Algorithms for evaluation
of preference queries
BNL – skyline queries
SFS
BNL+, BNL++
BNL*, …
Top-K queries
Top N, Top K – ranking
Top K dominating
Implementation
On-top, Centrist, Built-in
Page  9
UFU
RELATED WORK AND CONTRIBUTIONS
Topic
Modeling and reasoning of
preferences
SQL extensions
Preference Operators
State of the Art
CP-Net
TCP-Net
Conditional Preference
Formulas
Preference SQL
CPref-SQL
Skyline
Winnow
Select-Best
Proposal
Conditional Preference
Formulas
CPref-SQL
SelectK-Best
Algorithms for evaluation
of preference queries
BNL – skyline queries
SFS
BNL+, BNL++
BNL*, …
BNL** e R-BNL**
Top-K queries
Top N, Top K – ranking
Top K dominating
Top-K cp-queries
Implementation
On-top, Centrist, Built-in
Built-in
Page  10
UFU
CONTRIBUTIONS
 Top-K cp-queries
 Algorithms BNL** and R-BNL** for evaluating the SelectBest and SelectK-Best operators
 Implementation in the core of the RDBMS PostgreSQL
 Experiments comparing the built-in approach with the
translation into standard SQL
Page  11
UFU
OUTLINE
 The CPref-SQL Language
 Related Work and Contributions
 Evaluation of Preference Queries (cp-queries)
 Top-K cp-queries in the RDBMS
 Experimental Results
 Conclusion and Further Work
Page  12
UFU
THE PREFERENCE MODEL
t1
R1: (Y = 90) > (Y = 80) [{T,D,A}],
R2: (D = wa)  (A = cr) > (A = mf) [{T}],
t2
R3: (G = c) ^ (Y = 80)  (D = jc) > (D = wa) [{T,A}],
t5
t3
R4: (Y = 80)  (G = d) > (G = c) [{T}].
Page  13
t6
t4
Title
Genre
Years
Director
Actor
t1
Titanic
drama
90
James Cameron
Bill Paxton
t2
Stardust Memories
drama
80
Woody Allen
Charlotte Rampling
t3
New York Stories
drama
80
Woody Allen
Mia Farrow
t4
A Midsummer
Night's Sex Comedy
comedy
80
Woody Allen
Mia Farrow
t5
Crimewave
comedy
80
Joel Coen
Louise Lasser
t6
Avatar
action
00
James Cameron
Zoe Saldana
UFU
THE PREFERENCE MODEL
t1
R1: (Y = 90) > (Y = 80) [{T,D,A}],
R2: (D = wa)  (A = cr) > (A = mf) [{T}],
t2
R3: (G = c) ^ (Y = 80)  (D = jc) > (D = wa) [{T,A}],
R2
t5
t3
R4: (Y = 80)  (G = d) > (G = c) [{T}].
Page  14
t6
t4
Title
Genre
Years
Director
Actor
t1
Titanic
drama
90
James Cameron
Bill Paxton
t2
Stardust
Memories
drama
80
Woody Allen
Charlotte
Rampling
t3
New York Stories
drama
80
Woody Allen
Mia Farrow
t4
A Midsummer
Night's Sex Comedy
comedy
80
Woody Allen
Mia Farrow
t5
Crimewave
comedy
80
Joel Coen
Louise Lasser
t6
Avatar
action
00
James Cameron
Zoe Saldana
UFU
THE PREFERENCE MODEL
t1
R1: (Y = 90) > (Y = 80) [{T,D,A}],
R2: (D = wa)  (A = cr) > (A = mf) [{T}],
t2
R3: (G = c) ^ (Y = 80)  (D = jc) > (D = wa) [{T,A}],
R4: (Y = 80)  (G = d) > (G = c) [{T}].
Page  15
t6
t5
R2
t3
R4
t4
Title
Genre
Years
Director
Actor
t1
Titanic
drama
90
James Cameron
Bill Paxton
t2
Stardust Memories
drama
80
Woody Allen
Charlotte Rampling
t3
New York Stories
drama
80
Woody Allen
Mia Farrow
t4
A Midsummer
Night's Sex
Comedy
comedy
80
Woody Allen
Mia Farrow
t5
Crimewave
comedy
80
Joel Coen
Louise Lasser
t6
Avatar
action
00
James Cameron
Zoe Saldana
UFU
THE PREFERENCE MODEL
t1
R1: (Y = 90) > (Y = 80) [{T,D,A}],
R2: (D = wa)  (A = cr) > (A = mf) [{T}],
t2
R3: (G = c) ^ (Y = 80)  (D = jc) > (D = wa) [{T,A}],
R4: (Y = 80)  (G = d) > (G = c) [{T}].
Page  16
R2
t6
t5
t3
R4
t4
Title
Genre
Years
Director
Actor
t1
Titanic
drama
90
James Cameron
Bill Paxton
t2
Stardust Memories
drama
80
Woody Allen
Charlotte Rampling
t3
New York Stories
drama
80
Woody Allen
Mia Farrow
t4
A Midsummer
Night's Sex Comedy
comedy
80
Woody Allen
Mia Farrow
t5
Crimewave
comedy
80
Joel Coen
Louise Lasser
t6
Avatar
action
00
James Cameron
Zoe Saldana
UFU
THE PREFERENCE MODEL
t1
R1: (Y = 90) > (Y = 80) [{T,D,A}],
R2: (D = wa)  (A = cr) > (A = mf) [{T}],
t2
R3: (G = c) ^ (Y = 80)  (D = jc) > (D = wa) [{T,A}],
R4: (Y = 80)  (G = d) > (G = c) [{T}].
Page  17
R2
t6
t5
t3
R4
t4
Title
Genre
Years
Director
Actor
t1
Titanic
drama
90
James Cameron
Bill Paxton
t2
Stardust Memories
drama
80
Woody Allen
Charlotte Rampling
t3
New York Stories
drama
80
Woody Allen
Mia Farrow
t4
A Midsummer
Night's Sex Comedy
comedy
80
Woody Allen
Mia Farrow
t5
Crimewave
comedy
80
Joel Coen
Louise Lasser
t6
Avatar
action
00
James Cameron
Zoe Saldana
UFU
THE PREFERENCE MODEL
t1
R1: (Y = 90) > (Y = 80) [{T,D,A}],
R2: (D = wa)  (A = cr) > (A = mf) [{T}],
t2
R3: (G = c) ^ (Y = 80)  (D = jc) > (D = wa) [{T,A}],
R4: (Y = 80)  (G = d) > (G = c) [{T}].
Page  18
R2
t6
t5
t3
R4
t4
Title
Genre
Years
Director
Actor
t1
Titanic
drama
90
James Cameron
Bill Paxton
t2
Stardust Memories
drama
80
Woody Allen
Charlotte Rampling
t3
New York Stories
drama
80
Woody Allen
Mia Farrow
t4
A Midsummer
Night's Sex Comedy
comedy
80
Woody Allen
Mia Farrow
t5
Crimewave
comedy
80
Joel Coen
Louise Lasser
t6
Avatar
action
00
James Cameron
Zoe Saldana
UFU
CONSISTENCY TEST [Wilson 2004]
When a cp-theory is consistent?
R1: (Y = 90) > (Y = 80) [{T,D,A}],
R2: (D = wa)  (A = cr) > (A = mf) [{T}],
R3: (G = c) ^ (Y = 80)  (D = jc) > (D = wa) [{T,A}],
R4: (Y = 80)  (G = d) > (G = c) [{T}],
R5: (G = c)  (Y=80) > (Y = 90) [{T}]
Dependency Graph
Y
D
Local Consistency
Years
G
90
80
A
(x, c, 90, y, z) ? (x, c, 80, y, z)
T
Page  19
UFU
CPref-SQL ALGEBRA OPERATORS
R
Select-Best
K
R
Page  20
SelectK-Best
Tuples that do not have any
other tuple over them in the
preference hierarchy
K tuples with the less
number of tuples above them
in the preference hierarchy
UFU
CPref-SQL ALGEBRA OPERATORS
π
SELECT < attribute-list >
SelectBest / SelectK-Best
FROM < tables >

WHERE < where-conditions (hard conditions) >
ACCORDING TO PREFERENCES
< preference (soft conditions) >
|X|
R1
Page  21
...
Rn
UFU
ALGORITHMS BNL** E R-BNL**
SelectBest
SelectK-Best
BNL**
R-BNL**
 Follows the lines of the Blocked Nested Loop (BNL) algorithm
(BORZSONYI et al. ICDE 2001)
 Uses the structure of a Datalog program to compare tuples
Page  22
UFU
ALGORITHMS BNL** E R-BNL**
Dominance Test
Input: tuples t1, t2
Output: t1 > t2 or t2 > t1 or t1 ~ t2 (incomparable)
Method
1. Projection
2. CP-Theory = Datalog program
Test t1 > t2 : datalog goal
Page  23
UFU
ALGORITHMS BNL** E R-BNL**
Dominance Test: t1 > t2? t2 > t1? t1 ~ t2?
1. Projection
R (A, B, C, D)
dom(A) = {a1, a2, a3}
dom(B) = {b1, b2}
dom(C) = {c1, c2, c3}
dom(D) = {d1, d2}
Page  24
CP-theory
1.
A = a1 -> C = c1 > C = c2
2.
B = b1 -> C = c2 > C = c3
t1, t2
Are comparable?
(a1,b1,c1,d1) and (a1,b1,c1,d2)
No
(a2,b1,c1,d1) and (a1,b1,c3,d1)
No
(a2,b2,c2,d1) and (a2,b2,c3,d1)
Yes
(a2,b2,c2) and (a2,b2,c3)
(a1,b1,c2,d1) and (a1,b1,c3,d1)
Yes
(a1,b1,c2) and (a1,b1,c3)
Projections
UFU
ALGORITHMS BNL** E R-BNL**
Dominance Test: t1 > t2? t2 > t1? t1 ~ t2?
2. CP-theory = Datalog Program
 Converts the preference rules in a Datalog program P
 Each dominance test of 2 comparable tuples is a goal for P
 Finds the solution (goal) using SLD resolution method
Page  25
UFU
ALGORITHMS BNL** E R-BNL**
Dominance Test: t1 > t2? t2 > t1? t1 ~ t2?
CP-Theory
1.
A = a1 -> C = c1 > C = c2
2.
B = b1 -> C = c2 > C = c3
Datalog Program
1.
pref(x1, y1, z1, x2, y2, z2 ) <- x1 = a1, x2 = a1, y1 = y2, z1 = c1 , z2 = c2
2.
pref(x1, y1, z1, x2, y2, z2 ) <- x1 = x2, y1 = b1, y2, = b1, z1 = c2 , z2 = c3
3.
dom( x1, y1, z1, x2, y2, z2 ) <- pref( x1, y1, z1, x2, y2, z2 )
4.
dom( x1, y1, z1, x2, y2, z2 ) <- pref( x1, y1, z1, x3, y3, z3 ),
dom( x3, y3, z3, x2, y2, z2 )
Page  26
UFU
ALGORITHMS BNL** E R-BNL**
Dominance Test: t1 > t2? t2 > t1? t1 ~ t2?
2. CP-theory = Datalog Program
Page  27
Test
Goal
(a1, b1, c1) > (a1, b1, c2)
dom( a1, b1, c1, a1, b1, c2 )
UFU
ALGORITHM BNL**
Procedure MostPref(r)
1. clear the in-memory page W and the temporary table F
2. make r the input
3. repeat the following until the input is empty
4.
5.
6.
7.
8.
9.
10.
for every tuple t in the input
if t is dominated by a tuple in W
then ignore t
if t dominates some tuples in W
then eliminate the dominated tuples and insert t into W
if t is incomparable with all tuples in W
then insert t into W if there is room, otherwise add t to F
11.
insert in S the tuples of W which were added there when F was empty
12.
make F the input, clear the temporary table
13. return S
Page  28
UFU
ALGORITHM R-BNL**
t1[]
l(t) =
 t’  r| t’ > t
otherwise
0
max { l(t’) | t’ > t } + 1
t2[t1]
t3[t1,t2]
t4[t1,t2,t3,t5]
t5[]
t6[]
0
1
t1
0
t6
0 t5
t2
Output S
2
t3
t4
3
SelectK-Best ( 4, R ) = { t1, t5, t6, t2}
Page  29
UFU
ALGORITHM R-BNL**
Procedure topK(r)
1. clear the in-memory page W and the temporary table F
2. make r the input
3. repeat the following until the input is empty
4.
for every tuple t in the input
for every tuple t’ in W
5.
if t is dominated by t’
6.
then add t’ into MorePref(t)
7.
if t dominates t’
8.
9.
then add t into MorePref(t’)
10.
insert t into W if there is room, otherwise add t to F
11.
insert in S the tuples of W which were added there when F was empty
12.
make F the input, clear the temporary table
13. Return S
Page  30
UFU
OUTLINE
 The CPref-SQL Language
 Related Work and Contributions
 Evaluation of Preference Queries (cp-queries)
 Top-K cp-queries in the RDBMS
 Experimental Results
 Conclusion and Further Work
Page  31
UFU
TOP-K CP-QUERIES IN THE RDBMS
 Extension for PostgreSQL 8.4
 Linux Operational System
 C Language
 Directly implemented in the Postgres back-end: built-in approach
Page  32
UFU
TOP-K CP-QUERIES IN THE RDBMS
CREATE PREFERENCES
Postgres
back-end
Page  33
UFU
TOP-K CP-QUERIES IN THE RDBMS
ACCORDING
TO
PREFERENCES
Postgres
back-end
Page  34
UFU
TOP-K CP-QUERIES IN THE RDBMS
Page  35
UFU
OUTLINE
 The CPref-SQL Language
 Related Work and Contributions
 Evaluation of Preference Queries (cp-queries)
 Top-K cp-queries in the RDBMS
 Experimental Results
 Conclusion and Further Work
Page  36
UFU
EXPERIMENTAL RESULTS
 Benchmark TPC-H (http://www.tpc.org/tpch/)
 Synthetic database
 Suite with 22 SQL queries
 Queries adaptation:
 insertion of the preference clause
 removal of aggregate functions (group by, having, …)
 changes on the terms of the WHERE clause
 Performance and scalability evaluations of CPref-SQL
queries and their translations to SQL
 All CPref-SQL query can be translated into SQL queries
Page  37
with recursion
UFU
EXPERIMENTAL RESULTS
Conversion CPref-SQL <-> SQL
CREATE OR REPLACE VIEW Rules
(title,genre,years,director,actor,tit,gen,yea,dir,act) AS
CREATE PREFERENCES mypref
FROM movies AS
genre = ‘drama > genre =
‘musical’ [1,5]
AND
years = 90 > years = 80 [1,4,5]
AND
IF years = 80 THEN genre =
‘drama’ > genre = ‘comedy’ [1]
Page  38
(SELECT *
FROM movies M, movies M1
WHERE M.genre = ‘drama' AND M1.genre = ‘musical'
AND M.director = M1.director AND M.years =
M1.years)
UNION
(SELECT *
FROM movies M, movies M1
WHERE M.years = 90 AND M1.years = 80 and M.genre
= M1.genre)
UNION
(SELECT *
FROM movies M, movies M1
WHERE M.years = 80 and M1.years = 80 and M.genre =
‘drama’ and M1.genre = ‘comedy’ and
M.director = M1.director and M.actor =
M1.actor);
UFU
EXPERIMENTAL RESULTS
Conversion CPref-SQL <-> SQL
WITH RECURSIVE Recursion
( tit, gen, yea, dir, act, title, genre, years, director, actor ) AS (
SELECT *
FROM movies
WHERE genre <> ‘romance’
ACCORDING TO PREFERENCES
mypref
( SELECT * FROM Rules )
UNION
( SELECT M.title, M.genre, M.years, M.director, M.actor,
R.title, R.genre, R.years, R.director, R.actor
FROM Rules M, Recursion R
WHERE M.tit = R.tit AND
M.gen = R.gen AND
M.yea = R.yea
M.dir = R.dir AND
M.act = R.act ) )
SELECT *
FROM movies
WHERE genre <> ‘romance’
EXCEPT
SELECT R.title, R.genre, R.years, R.director, R.actor
FROM Recursion R;
Page  39
UFU
EXPERIMENTAL RESULTS
Performance
Page  40
UFU
EXPERIMENTAL RESULTS
Scalability
Page  41
UFU
OUTLINE
 The CPref-SQL Language
 Related Work and Contributions
 Evaluation of Preference Queries (cp-queries)
 Top-K cp-queries in the RDBMS
 Experimental Results
 Conclusion and Further Work
Page  42
UFU
CONCLUSION AND FURTHER WORK
Top-K cp-queries
Algorithms BNL** e R-BNL**
Implementation in the core of the PostgreSQL
Ongoing research:
 Development of algorithms under the approach on-top
 Supporting to other built-in predicates (>, <, >=,…)
Future research:
 Incorporating aggregate operations in the CPref-SQL block
Optimization of the execution plan – rules rewrite
Page  43
UFU
Evaluation of Conditional
Preference Queries
Fabíola S. Fernandes, Sandra de Amo
[email protected], [email protected]
!! FIM !!
UNIVERSIDADE
FEDERAL DE UBERLÂNDIA
PÓS-GRADUAÇÃO EM
CIÊNCIA DA COMPUTAÇÃO
ALGORITHM BNL**
Temporary table
F
Input Table
Output page
The Preferred
tuples = query
answer
Block of Pages
Window W
Buffer
Page  45
UFU
ALGORITHM BNL**
t1
t2
t6
t5
t3
t1t6
t4
t1
t2
t3
Buffer W
t4
t5
t6
t5
t6
INPUT
Output S
Temporary table F
Page  46
UFU
ALGORITHM R-BNL**
Temporary table
F
Input Table
Output page
All tuples with
their respective
MorePref lists
Block of Pages
Window W
Buffer
Page  47
UFU
ALGORITHM R-BNL**
t1
t2
t6
t5
t3
t4
t1[]
t2[]
t2[t1]
Buffer W
t3[]
t3[t1,t2]
t4[]
t4[t1,t2]
t5[]
t6[]
INPUT
Output S
Temporary table F
Page  48
UFU
ALGORITHM R-BNL**
t1
t2
t6
t5
t3
t1[]
t4
t2[t1]
Buffer W
t3[t1,t2]
t4[t1,t2]
t5[]
INPUT
Output S
t6[]
Temporary table F
Page  49
UFU
ALGORITHM R-BNL**
t1[]
l(t) =
 t’  r| t’ > t
otherwise
0
max { l(t’) | t’ > t } + 1
t2[t1]
t3[t1,t2]
t4[t1,t2,t3,t5]
t5[]
t6[]
0
1
t1
0
t6
0 t5
t2
Output S
2
t3
t4
3
SelectK-Best ( 4, R ) = { t1, t5, t6, t2}
Page  50
UFU
TOP-K CP-QUERIES IN THE RDBMS
Create Preferences
 Catalog structure:
 Each column is an attribute of the relation movies
 Each row represents a rule
ID
Rule
title
genre
years
director
actor
1
NIL
*
90|80
NIL
NIL
2
NIL
*
*
Woody Allen
Charlotte Rampling|Mia
Ferrow
3
NIL
comedy
80
Joel Coen|Woody
Allen
NIL
4
NIL
drama|comedy
80
*
*
Page  51
UFU
TOP-K CP-QUERIES IN THE RDBMS
Create Preferences
Page  52
UFU
TOP-K CP-QUERIES IN THE RDBMS
Insertion of the operators in the query processor
Page  53
UFU