幻灯片 1 - Home, WAMDM, Database Group at Renmin

Transcript 幻灯片 1 - Home, WAMDM, Database Group at Renmin

Architecture and Design of
Distributed Database Systems
WAMDM Cloud Computing Group
Haiping Wang
2010-05-08
2015/7/20
WAMDM Cloud Group
1
Outline
• Architecture
– Client/Server, P2P
– DDBMS,MDBS
• Design
– Strategies
– Issues
– Fragmentation
– Allocation
• Conclusion
2015/7/20
WAMDM Cloud Group
2
Client/Server Reference Architecture
• Begin at 1990’s
• Communication at the level of SQL
statement
• Client
– Application , user interface, DBMS Client
responsible for cache management( cached
data and cached transaction locks)
– Possible for consistency checking of user
queries(not common)
• Server
– Query processing, optimization,
transaction and storage management
– One server
• Comparison with centralized DB
– Multiple Server
• client manages its own connection(heavy client)
• Client only know it “home server”(light client)
2015/7/20
WAMDM Cloud Group
3
Physical Data Organization in P2P
Distributed Systems
• Data usually fragmented and
replicated
• Three layer organization
– LIS (local internal schema)
– LCS(local conceptual schema)
– GCS ( global conceptual
schema)
– ESs (external schema), for user
applications and access
2015/7/20
WAMDM Cloud Group
4
Functional Schematic of an Integrated
DDBMS
For data independence
Extended the ANSI/SPARC
GD/D for global mapping
LD/D for local mapping
2015/7/20
WAMDM Cloud Group
5
Components of a DDBMS
• User processor
– User interface handler
– Semantic data controller
– Global query optimizer and
decomposer
– Distributed execution monitor
• Data processor
– Local query optimizer
– Local recovery manager
– Run-time support processor
•
2015/7/20
WAMDM Cloud Group
6
MDBS Architecture
With a GCS
Without GCS
GCS,GES may use different data model
Local system layer
and language
Homogeneous:
Multidatabase layer
Unilingual(example:MULITIBASE)
 Multilingual(permit each user to
access the global data by means of
external schema)
2015/7/20
WAMDM Cloud Group
heterogeneous
7
Difference between MDBS and DDBMS
Definition difference
DDBMS
MDBs
Global conceptual schema
(fundamental difference)
The conceptual view of the
entire database, mapping
the global schema to local
conceptual schema
The collection of some of
the local DB that wants to
share, mapping the local
conceptual schema to a
global schema
Global database
Equal to the union of local
databases
A subset of the union of
the local databases
Design strategy
Usually designed by topdown process
Usually designed by
bottom-up process
2015/7/20
WAMDM Cloud Group
8
Outline
• Architecture
– Client/Server, P2P
– DDBMS,MDBS
• Design
– Strategies
– Issues
– Fragmentation
– Allocation
• Conclusion
2015/7/20
WAMDM Cloud Group
9
Framework of Distribution
• Level of sharing
– Share nothing
– Share data
– Share data+ program
• Access pattern
– Static
– Dynamic
• Level of knowledge on
access pattern
– Know nothing
– The designer know partial
information
– The designer know complete
information
2015/7/20
WAMDM Cloud Group
10
Top-down design process
• An iteration of five steps:
– Requirement analysis
– Conceptual design and
view integration
E-R representation and
translation to RDB schema
– Distribution design
Data fragmentation and
allocation
– Physical design
– Tuning
2015/7/20
WAMDM Cloud Group
11
Bottom-up design process
• Primarily for multidatabase applications
• Terms related
– Data integration/fusion
– P2P
• Not covered by this course
2015/7/20
WAMDM Cloud Group
12
Distribution Design Issues
•
•
•
•
•
Why to fragment
How to fragment
Correctness of fragmentation
How to allocate
The information for fragmentation
2015/7/20
WAMDM Cloud Group
13
Reasons for fragmentation
To enhance intra-query concurrency
To increase the throughput
But extra cost for queries involving
more than one segment residing at
different sites
2015/7/20
WAMDM Cloud Group
14
Correctness for rules of fragmentation
2015/7/20
WAMDM Cloud Group
15
Horizontal Partition relation PROJ
2015/7/20
WAMDM Cloud Group
16
Horizontal fragmentation
• Primary horizontal fragmentation – to partition a
relation by using predicates on that relation
• Derived horizontal fragmentation – to partition a
relation by using predicates on another relation
• Two aspects affecting fragmentation
– data information, and
– application information
2015/7/20
WAMDM Cloud Group
17
Data information
2015/7/20
WAMDM Cloud Group
18
Application information
• Both qualitative and quantitative are needed
• Definition
– simple predicate
example: LOC= “New York”
– Minterm predicate
2015/7/20
WAMDM Cloud Group
19
Application information
Minterm selectivity – sel(mi)
The number of tuples of a relation returned by a
query specified by the minterm mi
Access frequency – acc(mi)
The frequency with which user applications access
data using a query specified by the minterm mi
2015/7/20
WAMDM Cloud Group
20
Primary horizontal fragmentation
2015/7/20
WAMDM Cloud Group
21
The first step for primary horizontal
fragmentation
• To find a set of simple predicates with the
properties of Completeness and Minimal
• Completeness
A set of predicate Pr is complete iff there is an
equal probability of access by any application to any
two tuples belonging to any minterm fragment that
is defined according to Pr.
2015/7/20
WAMDM Cloud Group
22
Minimal
2015/7/20
WAMDM Cloud Group
23
Steps for primary horizontal
fragmentation
• Use COM_MIN algorithm to generate a
complete and minimal set of predicates Pr’
given a set of simple predicates
• Derive the set of minterm predicates
• Elimination of some of the meaningless
minterm fragments
2015/7/20
WAMDM Cloud Group
24
COM_MIN
Rule1 fundamental rule of completeness and minimality :
a relation or fragment is partitioned “ into at least two
parts which are accessed differently by at least one
application”
Input: R: relation; Pr: set of simple predicates
Output: Pr’: set of simple predicates
Function: generate a complete and minimal set of predicates Pr’
given a set of simple predicates
2015/7/20
WAMDM Cloud Group
25
COM_MIN
2015/7/20
WAMDM Cloud Group
26
PHORIZONTAL
2015/7/20
WAMDM Cloud Group
27
Example for primary horizontal
fragmentation
2015/7/20
WAMDM Cloud Group
28
Example for primary horizontal
fragmentation
2015/7/20
WAMDM Cloud Group
29
Example for primary horizontal
fragmentation
• Use COM_MIN algorithm to get a complete
and minimal simple predicates
Pr'   p1, p5 , p2 , p3 , p4
2015/7/20
WAMDM Cloud Group
30
Example for primary horizontal
fragmentation
2015/7/20
WAMDM Cloud Group
31
Example for primary horizontal
fragmentation
2015/7/20
WAMDM Cloud Group
32
Example for primary horizontal
fragmentation
2015/7/20
WAMDM Cloud Group
33
Example for primary horizontal
fragmentation
2015/7/20
WAMDM Cloud Group
34
Derived horizontal fragmentation
2015/7/20
WAMDM Cloud Group
35
Vertical fragmentation
SELECT BUDGET
FROM PROJ
WHERE PNO=Value
SELECT PNAME
FROM PROH
WHERE LOC=Value
SELECT PNAME,BUDGET
FROM PROJ
SELECT SUM(BUDGET)
FROM PROJ
WHERE LOC=Value
2015/7/20
WAMDM Cloud Group
36
Factors to be considered
• Attributes usage(AUM)
• Attributes affinity(AAM)
– AAM :affinity between two attributes
– AM: used for grouping the attributes of a relation
based on AAM(suggested in 1975,1984 )
– bond
• How to fragment by attributes
2015/7/20
WAMDM Cloud Group
37
AU Matrix
1 if
use(qi , Aj )  
0
attribute AJ
A1
q1
q2
q3
q4
aff ( Ai , Aj ) 
2015/7/20
A2
1
0

0

0
is referenced by query qi
otherwise
A3
A4
0 1 0
1 1 0 
1 0 1

0 1 1
( AUM )

 ref (q )acc (q )
k |use ( qk , Ai ) 1use ( qk , A j ) 1 PAYl
WAMDM Cloud Group
l
k
l
k
38
AA Matrix
acc1 (q1 )  15
acc1 (q2 )  5
acc1 (q3 )  25
acc1 (q4 )  3
acc2 (q1 )  20
acc2 (q2 )  0
acc2 (q3 )  25
acc2 (q4 )  0
acc3 (q1 )  10
acc3 (q2 )  0
acc3 (q3 )  25
acc3 (q4 )  0
A1
A1
q1
q2
q3
q4
1
0

0

0
2015/7/20
A2
A3
A4
0 1 0
1 1 0 
1 0 1

0 1 1
( AUM )
A1
sites  3
refl (qk )  1
A2
A3
A4
A2
A3
A4
 45 0 45 0 
 0 80 5 75 


 45 5 53 3 


 0 75 3 78 
( AAM )
WAMDM Cloud Group
39
AM(global affinity measure)
n
n
AM   aff ( Ai , Aj )[aff ( Ai , Aj 1 )  aff ( Ai , Aj 1 )  aff ( Ai 1 , Aj )  aff ( Ai 1 , Aj )]
i 1 j 1
where
aff ( A0 , Aj )  aff ( Ai , A0 )  aff ( An 1 , Aj )  aff ( Ai , An 1 )  0
A1
A1
A2
A3
A4
A2
A3
A4
 45 0 45 0 
 0 80 5 75 


 45 5 53 3 


 0 75 3 78 
( AAM )
AA is symmetric
n
n
AM   aff ( Ai , Aj )[aff ( Ai , Aj 1 )  aff ( Ai , Aj 1 )]
i 1 j 1
aff ( A0 , Aj )  aff ( Ai , An 1 )  0
2015/7/20
WAMDM Cloud Group
40
AM(global affinity measure)
n
n
AM   aff ( Ai , Aj )[aff ( Ai , Aj 1 )  aff ( Ai , Aj 1 )]
i 1 j 1
n
 n

AM    aff ( Ai , Aj )aff ( Ai , Aj 1 )   aff ( Ai , Aj )aff ( Ai , Aj 1 ) 
j 1  i 1
i 1

n
n
bond ( Ax , Ay )   aff ( Az , Ax )aff ( Az , Ay )
z 1
n
AM  [bond ( Aj , Aj 1 )  bond ( Aj , Aj 1 )]
i 1
2015/7/20
WAMDM Cloud Group
41
Bond energy algorithm
input: AA: attributes affinity matrix
output: CA: clustered affinity matrix
Function: determine groups of similar items
Steps:
1.Initialization,place and fix two of the columns of AA into CA
2. Iteration. Pick each of the remaining n-i columns and find the proper
palace to insert the column to CA, By compute the cont
3. Row ordering. Make CA to be symmetrically
Note:
The final group are insensitive to the order
The computation time is O(n*n)
2015/7/20
WAMDM Cloud Group
42
Cont
A1 A2 ... Ai 1 Ai Aj Aj 1... An
AM '
AM ''
AMold  AM '  AM ''  bond ( Ai1, Ai )  bond ( Ai , Aj )  bond ( Aj , Ai )  bond ( Aj , Aj 1)
AMnew  AM  AM  bond ( Ai1, Ai )  bond ( Aj , Aj1)  2bond ( Ai , Ak )  2bond ( Ak , Aj )
'
'
cont ( Ai , Ak , Aj )  AM new  AM old
 2bond ( Ai , Ak )  2bond ( Ak , Aj )  2bond ( Ai , Aj )
2015/7/20
WAMDM Cloud Group
43
Calculation of CA Matrix
A1
A1
A2
A3
A4
A2
A3
A4
2015/7/20
A1
 45 0
 0 80

 45 5

 0 75
(a)
A1
A1
A2
A3






A2
A1
A2
A3
A4
A4
 45 45 0 0 
 0 5 80 75


 45 53 5 3 


 0 3 75 78
(c )
A3
A2
A4
WAMDM Cloud Group
A2
 45 45 0
 0 5 80

 45 53 5

 0 3 75
(b)
A1
A1
A3
A3
A2






A4
 45 45 0 0 
 45 53 5 3 


 0 5 80 75


 0 3 75 78
(d )
44
Definition for Partition
  ref
AQ(qi )  { A j | use(qi , A j )  1}
CQ 
TQ  {qi | AQ (qi )  TA}
CTQ 
BQ  {qi | AQ(qi )  BA}
OQ  Q  {TQ
qi Q S j
CBQ 
BQ}
  ref
j
(qi )acc j (qi )
  ref
j
(qi )acc j (qi )
  ref
j
(qi )acc j (qi )
qi BQ S j
COQ 
qi OQ S j
A1
• Two complications
– Splitting (2-way, m-way, 2m )
– The location of the block of attributes
that should form one fragment
Shift
2015/7/20
(qi )acc j (qi )
qi TQ S j
z  CTQ * CBQ  COQ2
•
j
WAMDM Cloud Group
A1
A3
A2
A4
A3
A2
A4
 45 45 0 0 
 45 53 5 3 


 0 5 80 75


0
3
75
78


45
Caculate z
A1 A3
A2 A4
TA  { A1 , A3 }
BA  { A2 , A4 }
Q  {q1 , q2 , q3 , q4 }
AQ(q1 )  { A1 , A3 }
AQ(q2 )  { A2 , A3 }
AQ(q3 )  { A2 , A4 }
AQ(q4 )  { A3 , A4 }
TQ  {q1}
BQ  {q3 }
OQ  {q2 , q4 }
2015/7/20
acc1 (q1 )  15
acc1 (q2 )  5
acc1 (q3 )  25
acc1 (q4 )  3
acc2 (q1 )  20
acc2 (q2 )  0
acc2 (q3 )  25
acc2 (q4 )  0
acc3 (q1 )  10
acc3 (q2 )  0
acc3 (q3 )  25
acc3 (q4 )  0
sites  3
refl (qk )  1
CQ  15  20  10  5  25  25  25  3  128
CTQ  15  20  10  45
CBQ  25  25  25  75
COQ  5  3  7
z  45*75  7 2  3326
WAMDM Cloud Group
46
Partition algorithm
Input: CA: clustered affinity matrix; R: relation;
ref: attribute usage matrix; acc: access frequency matrix
Output: F: set of fragments
Function: part the relation R using Vertical fragmentation
Steps:
1. determine the z value for the first column
2. determine the best partition
3. do the partition operation
2015/7/20
WAMDM Cloud Group
47
Hybrid fragmentation
Tree Structured partition
bigtable
H
tablet-1
V
V
CF1
2015/7/20
WAMDM Cloud Group
CF2
H
H
tablet-2
V
CF1
V
CF2
tablet-3
V
CF1
V
CF2
48
Allocation
fragments: F  F1, F2 ,...,Fn 
sites：S  S1 , S2 ,...,Sm 
queries：Q  q1, q2 ,...,qq 
Finding the “optimal” distribution of F to S ?
2015/7/20
WAMDM Cloud Group
49
Optimally
• Minimal cost
– Storage cost
– Query cost
– Update cost
– Data communication
• Performance
– Minimize the response time
– Maximize the system throughput at each site
2015/7/20
WAMDM Cloud Group
50
Assumptions and Definitions
• A single fragment Fk
• Query read  only：T  t , t ,...,t 
1 2
m
update：U  u1 , u2 ,...,um 
• Communication cost
CT   {c12 , c13 ,...,c1m ,...,cm1,m}
C (U )  {c , c ,...,c ,...,c
'
'
12
'
13
'
1m
'
m1,m
}
• Storage cost D  {d1, d2 ,...,dm}
2015/7/20
WAMDM Cloud Group
51
Assumptions and Definitions
I S
1 if
xj  
0
the
faragment
Fk is assigned to site S j
otherwise
m

'
min (  x j u j cij  t j mincij )   x j d j 
j|S j I
i

1
j
|
S

I
j|S j I


j
2015/7/20
WAMDM Cloud Group
52
Not suitable for DDB design
• NP-complete
• Reasons
– Cannot treat fragments as individual
– No consideration of integrity enforcement
– No consideration of concurrency control
mechanisms
2015/7/20
WAMDM Cloud Group
53
Information required
• Database info.
– Number of tuples: card (Fj )
– Fragment size: size(Fj )  card(Fj ) * length(Fj )
• Application info. URij & RRij
• Site info: USCk & LPCk
• Network info: cost per frame gij
2015/7/20
WAMDM Cloud Group
54
Allocation model
1 if
xj  
0
the
faragment
TOC 
Fk is assigned to site S j
otherwise
 QPC   
qi Q
i
Sk S Fj F
STC jk
STC jk  USCk * size(Fj )* x jk
2015/7/20
WAMDM Cloud Group
55
Allocation model
QPCi  PCi  TCi
PCi  ACi  IEi  CCi
ACi 
 
Sk S F j F
(uij *URij  rij * RRij ) * x jk * LPCk
TCi  TCUi  TCRi
TCU i 
TCRi 
2015/7/20

 
S k F F j F
F j F
uij * xij * g o (i ),k 
S k S S k F
min(rij * x jk * go (i ),k  rij * x jk *
Sk S
WAMDM Cloud Group
 
u ij *x jk * g k ,o (i )
sel ( Fj )* length( Fj )
fsize
* g k ,o ( i ) )
56
Constrains
• Execution time of qi  maximum response
time of qi
• Storage constrains
• Processing constrains
2015/7/20
WAMDM Cloud Group
57
Solution
• assume all candidate partitions known, select
the “best ” partitioning
• ignore replication at first
• sliding window on fragments
2015/7/20
WAMDM Cloud Group
58
Conclusion
• Three architecture of DDBMS
• Two fragment methods
• The cost of allocation
2015/7/20
WAMDM Cloud Group
59
Thank you!!!
2015/7/20
WAMDM Cloud Group
60