Optimizing of data access using replication technique
Renata Słota (1), Darin Nikolow (1), Łukasz Skitał (2), Jacek Kitowski (1,2)
(1) Institute of Computer Science AGH-UST, Cracow
(2) ACC CYFRONET AGH, Cracow
Agenda
Motivation of the work: why does today's grid computing need replication?
Replication basics
Clusterix Data Management System: architecture, optimization and replication algorithms
Optimization example
Replication example
Summary, conclusions
Site-level vs. Grid-level replication
Site-level replication
  Replicas in one site
  Implementation examples: RAID, HSM
Grid-level replication
  Data management systems
  Replicas spread over many sites
Motivation of the work: why does today's grid computing need replication?
Data protection and availability
  Malfunction of one storage system does not affect the data itself; only performance is affected
Performance
  Low-level optimization and replication (RAID, HSM) are not sufficient
  Limited network bandwidth
  Limited storage performance
Replication scenarios
Static replication
  Decision made by the system administrator or user
  Limited system support: replica selection, replica coherency, replica ordering
Dynamic replication
  Decision made by a dedicated grid component, based on the users' current data access pattern
  Full system support
Replication consequences
Optimal replica selection algorithm
Replica creation and removal algorithm
Cost of replica creation, update and storage
Replica coherency
Clusterix
National Cluster of Linux Systems
Project aim: to develop a set of tools and procedures for building a productive Grid environment based on local PC clusters spread across independent supercomputing centers
Network layer: Pionier (Polish optical network)
Clusterix Data Management System Architecture
Optimization Algorithm
Selects the optimal storage element for:
  data access
  replica creation
Takes into account the current state of the system
The optimal storage element is the one with the maximal weight
Weight W(s,d):
$$W(s,d) = \min\bigl((1 - \mathrm{NetLoad}(s)) \cdot \mathrm{Bandwidth}(s,d),\ (1 - \mathrm{Sload}(s)) \cdot \mathrm{Sbandwidth}(s)\bigr)$$
s – storage element
d – destination node
NetLoad(s) – network interface load of s
Bandwidth(s,d) – available bandwidth between s and d
Sload(s) – storage system load
Sbandwidth(s) – storage system bandwidth
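The weight and the "pick the maximal weight" rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the loads are taken to be fractions between 0 and 1, the two factors in each term are assumed to be multiplied, and the names `StorageElement`, `weight` and `select_storage_element` are hypothetical, not part of CDMS.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class StorageElement:
    """Hypothetical snapshot of one storage element s (not a CDMS type)."""
    name: str
    net_load: float           # NetLoad(s), assumed to be a fraction in [0, 1]
    storage_load: float       # Sload(s), assumed to be a fraction in [0, 1]
    storage_bandwidth: float  # Sbandwidth(s), e.g. in MB/s


def weight(se: StorageElement, link_bandwidth: float) -> float:
    """W(s,d): the smaller of the free network and free storage throughput.

    link_bandwidth stands for Bandwidth(s,d), in the same unit as
    storage_bandwidth.
    """
    network_term = (1.0 - se.net_load) * link_bandwidth
    storage_term = (1.0 - se.storage_load) * se.storage_bandwidth
    return min(network_term, storage_term)


def select_storage_element(
    candidates: Iterable[StorageElement],
    bandwidth_to_dest: Dict[str, float],
) -> StorageElement:
    """Return the candidate with the maximal weight for one destination d.

    bandwidth_to_dest maps a storage element name to Bandwidth(s,d).
    """
    return max(candidates, key=lambda se: weight(se, bandwidth_to_dest[se.name]))
```

Taking the minimum of the two terms means the slower of the network path and the storage system limits the score, so a lightly loaded disk behind a saturated link does not win.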
Automatic replication algorithm
Takes into account the gain from replication G(), the cost of replica creation C(), the cost of replica updates U(), and an administrative factor A().
Replication profit:
$$P(d,R,S,f) = G(d,R,S,f) + C(d,R,f) + U(d,R,S,f) + A(d,f)$$
d – storage element for which the profit is computed
R – set of storage elements containing replicas of f
S – statistical data (history of file usage)
f – the considered file
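A minimal Python sketch of the profit formula follows. The slides do not define G, C, U or A, so they are passed in as caller-supplied functions with the argument lists shown in the formula. Two further points are assumptions made only for this illustration: costs are expressed as non-positive numbers (since the formula adds them), and a replica is proposed only where the profit is strictly positive. The names `replication_profit` and `pick_replica_target` are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Optional


def replication_profit(
    d: str,
    replicas: FrozenSet[str],
    stats: dict,
    f: str,
    *,
    gain: Callable[[str, FrozenSet[str], dict, str], float],
    creation_cost: Callable[[str, FrozenSet[str], str], float],
    update_cost: Callable[[str, FrozenSet[str], dict, str], float],
    admin_factor: Callable[[str, str], float],
) -> float:
    """P(d,R,S,f) = G(d,R,S,f) + C(d,R,f) + U(d,R,S,f) + A(d,f)."""
    return (
        gain(d, replicas, stats, f)
        + creation_cost(d, replicas, f)
        + update_cost(d, replicas, stats, f)
        + admin_factor(d, f)
    )


def pick_replica_target(
    candidates: Iterable[str],
    replicas: FrozenSet[str],
    stats: dict,
    f: str,
    **components: Callable[..., float],
) -> Optional[str]:
    """Return the storage element with the highest positive profit, or None.

    Storage elements that already hold a replica of f are skipped.
    The "profit must be positive" threshold is an assumption of this sketch.
    """
    best, best_profit = None, 0.0
    for d in candidates:
        if d in replicas:
            continue
        profit = replication_profit(d, replicas, stats, f, **components)
        if profit > best_profit:
            best, best_profit = d, profit
    return best
```

Keeping the four components as separate callables mirrors the formula and lets each term be tuned or replaced without touching the decision logic.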
Storage oriented problems
Data-intensive applications for Clusterix:
  Simulation of transonic flow past wing tips
  Visualization of complex multidimensional structures
  Ecosystem modeling and simulation
Optimization Example
Node A needs file F stored on SE1, SE2 and SE3
[Diagram: Node A, CDMS, and storage elements SE1, SE2 and SE3, each holding a copy of file F; NMS and JIMS components are shown next to the nodes.]
Optimization Example
Node A sends request to CDMS
Optimization Example
CDMS uses the Optimizer to choose the optimal SE
Optimization Example
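Optimizer is working…
$$W(s_1,d) = \min\bigl((1 - \mathrm{NetLoad}(s_1)) \cdot \mathrm{Bandwidth}(s_1,d),\ (1 - \mathrm{Sload}(s_1)) \cdot \mathrm{Sbandwidth}(s_1)\bigr)$$
$$W(s_2,d) = \min\bigl((1 - \mathrm{NetLoad}(s_2)) \cdot \mathrm{Bandwidth}(s_2,d),\ (1 - \mathrm{Sload}(s_2)) \cdot \mathrm{Sbandwidth}(s_2)\bigr)$$
$$W(s_3,d) = \min\bigl((1 - \mathrm{NetLoad}(s_3)) \cdot \mathrm{Bandwidth}(s_3,d),\ (1 - \mathrm{Sload}(s_3)) \cdot \mathrm{Sbandwidth}(s_3)\bigr)$$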
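To make the Optimizer step concrete, here is a worked calculation with purely hypothetical measurements; none of these numbers come from the slides. Assume d is Node A, the available bandwidth from every storage element to Node A is 100 MB/s, and the loads are fractions between 0 and 1:

```latex
\begin{aligned}
W(s_1,d) &= \min\bigl((1 - 0.5)\cdot 100,\ (1 - 0.2)\cdot 80\bigr)   = \min(50,\ 64) = 50 \\
W(s_2,d) &= \min\bigl((1 - 0.1)\cdot 100,\ (1 - 0.9)\cdot 200\bigr)  = \min(90,\ 20) = 20 \\
W(s_3,d) &= \min\bigl((1 - 0.2)\cdot 100,\ (1 - 0.25)\cdot 80\bigr)  = \min(80,\ 60) = 60
\end{aligned}
```

Under these assumed values SE3 has the maximal weight, so it would be the storage element chosen for Node A. SE2 has the fastest storage system but is heavily loaded, which is why it scores lowest.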
Automatic replication example
Situation:
  3 clusters
  4 storage elements (SE1, SE2, SE3, SE4), 2 of which contain a replica of F
  A set of applications running on these clusters and accessing file F
Automatic replication example
[Diagram: SE1 and SE2 hold file F; SE3 and SE4 do not. CDMS (Optimizer, Replication Module, Statistic Module) evaluates the gain, cost of replication, cost of update and administrative factor.]
Automatic replication example
[Diagram: CDMS (Optimizer, Replication Module, Statistic Module) reaches a decision involving SE2 and SE4.]
Automatic replication example
[Diagram: SE1, SE2 and SE4 hold file F; SE3 does not. CDMS (Optimizer, Replication Module, Statistic Module) is sleeping.]
Summary
Conclusions
Simulation of replication vs. real system implementation
Replication should be designed to meet the specific profile of Clusterix applications
Data availability
Replication drawbacks
Publications
Extended functionality of Virtual Storage System for grid
Renata Słota, Darin Nikolow, Łukasz Skitał, Jacek Kitowski
Cracow Grid Workshop 2004, poster no. 13
Application of data replication methods in Clusterix project (in Polish)
Renata Słota, Darin Nikolow, Łukasz Skitał, Jacek Kitowski
Pionier 2004, 19-20 May, Poznań, electronic publication
Implementation of replication methods in the Grid Environment
Renata Słota, Darin Nikolow, Łukasz Skitał, Jacek Kitowski
Submitted to European Grid Conference