Database Concepts - Syracuse University

DISTRIBUTED DATABASES AND DDBMS

Learning Objectives

• Understand the concept of “Distributed Data”
• Describe various Distributed Data and DDBMS implementations
• Explain how database design affects the DDBMS environment
• Apply DDBMS principles to solve problems

Definitions 

Distributed Database:

A single logical database that is spread physically across computers in multiple locations that are connected by a data communications link 

Decentralized Database:

A collection of independent databases on non-networked computers

They are not the same thing!

What are we talking about here?

Key Questions:

• Are components of the application in more than one place?
• Are the data in more than one place?
• Does the app use more than one DBMS or “system” for data management?
• Which facets, if any, are transparent to users?

Why distribute your app or data?

• It’s hard.
• It’s complex.
• So why do it?
  • Scalability.
  • Redundancy.

Application Complexity

Monolithic

• Everything works / is contained within one computer.
• Ex. MS Word

Distributed

• Various working pieces are in different physical places, working over a computer network.
• Ex. Google Docs

Data Distribution

Single Site Data (Simple)

• All data stored in / retrieved from one place on a network.
• Ex. WordPress

Multi-Site Data (Complex)

• Various parts of the data come from various sites on a network.
• Ex. MySlice, DNS

Data Complexity

Homogeneous (Easier)

• All data associated with the application is stored in the same DBMS
• Ex. WordPress

Heterogeneous (More Difficult)

• Various data components of the application are stored in different DBMSes
• Ex. SU Blackboard, Facebook

Multisite Data DBMS Options

• Horizontal Partitioning – Distributing data by row
• Vertical Partitioning – Distributing data by table or column
• Replication – Copying data either on a schedule or in real-time

Summary: The taxonomy

App: Monolithic | Distributed
Data Distribution: Single Site | Multi Site
Data Complexity: Homogeneous | Heterogeneous
Multi Site: Replicated | Horiz. Partitioned | Vert. Partitioned
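The taxonomy can also be written down as a small data structure, which makes the later classification exercise mechanical. This is only a sketch: the class and field names (`AppLayout`, `DataSites`, `DbmsMix`, `MultiSiteStrategy`, `Classification`) are invented for illustration, and the rule that a multi-site strategy applies only to multi-site data is an assumption drawn from how the summary groups the terms.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AppLayout(Enum):
    MONOLITHIC = "monolithic"
    DISTRIBUTED = "distributed"


class DataSites(Enum):
    SINGLE_SITE = "single site"
    MULTI_SITE = "multi site"


class DbmsMix(Enum):
    HOMOGENEOUS = "homogeneous"
    HETEROGENEOUS = "heterogeneous"


class MultiSiteStrategy(Enum):
    REPLICATED = "replicated"
    HORIZONTAL = "horizontally partitioned"
    VERTICAL = "vertically partitioned"


@dataclass(frozen=True)
class Classification:
    """One application's position in the taxonomy (hypothetical helper)."""
    app: AppLayout
    sites: DataSites
    dbms: DbmsMix
    strategy: Optional[MultiSiteStrategy] = None

    def __post_init__(self):
        # Assumption: a strategy is required for multi-site data and
        # meaningless for single-site data.
        if self.sites is DataSites.SINGLE_SITE and self.strategy is not None:
            raise ValueError("single-site data has no multi-site strategy")
        if self.sites is DataSites.MULTI_SITE and self.strategy is None:
            raise ValueError("multi-site data needs a strategy")

    def describe(self) -> str:
        parts = [self.app.value, self.sites.value, self.dbms.value]
        if self.strategy is not None:
            parts.append(self.strategy.value)
        return ", ".join(parts)


# A made-up application, classified for illustration only.
example = Classification(
    AppLayout.DISTRIBUTED,
    DataSites.MULTI_SITE,
    DbmsMix.HOMOGENEOUS,
    MultiSiteStrategy.HORIZONTAL,
)
print(example.describe())
```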

Homogeneous == Same DBMS

User’s View of Db

CRM Db: Customers, Sales Staff, Orders

Actual Implementation

N. America (Oracle): Customers, Sales Staff
Europe (Oracle): Orders

Heterogeneous == Multiple DBMS

User’s View of Db

CRM Db: Customers, Sales Staff, Orders

Actual Implementation

N. America (Oracle): Customers, Sales Staff
Europe (File System): Orders Invoices
Europe (MySQL): Orders

Example of Replication

User’s View of Db

CRM Db: Customers, Sales Staff, Orders

Actual Implementation

N. America (Master): All Customers, All Sales Staff, All Orders
Europe (Replica): All Customers, All Sales Staff, All Orders
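The difference between real-time and scheduled replication can be sketched with in-memory dictionaries standing in for the two sites. The `Master` and `Replica` classes and their methods are hypothetical, not part of any DBMS; a real product would also handle conflicts, ordering, and network failure, all of which this sketch ignores.

```python
class Replica:
    """A copy of the master's data (in-memory stand-in for a remote site)."""

    def __init__(self):
        self.rows = {}


class Master:
    def __init__(self, replicas, real_time):
        self.rows = {}
        self.replicas = replicas
        self.real_time = real_time
        self.pending = []  # changes written locally but not yet copied

    def write(self, key, value):
        self.rows[key] = value
        if self.real_time:
            # Real-time replication: copy the change as part of the write.
            self._copy([(key, value)])
        else:
            # Scheduled replication: queue the change for the next sync run.
            self.pending.append((key, value))

    def sync(self):
        """Push every change queued since the last run to all replicas."""
        self._copy(self.pending)
        self.pending = []

    def _copy(self, changes):
        for replica in self.replicas:
            for key, value in changes:
                replica.rows[key] = value


europe = Replica()
north_america = Master([europe], real_time=False)
north_america.write("customer:1", "Acme")
stale = dict(europe.rows)   # snapshot taken before the scheduled sync
north_america.sync()
fresh = dict(europe.rows)   # snapshot taken after the scheduled sync
```

With `real_time=False` the replica lags until `sync()` runs; with `real_time=True` every write reaches the replica immediately, at the cost of doing the copy on each write.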

Example of Horizontal Partitioning

User’s View of Db

CRM Db: Customers, Sales Staff, Orders

Actual Implementation

N. America: NA Customers, NA Sales Staff, NA Orders
Europe: E Customers, E Sales Staff, E Orders
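Horizontal partitioning, as in the example above, amounts to routing each row to a site based on one of its values. A minimal sketch, assuming each customer row carries a `region` field (a hypothetical column name) and that two Python lists stand in for the two sites:

```python
# Each site holds only the rows for its own region.
sites = {"NA": [], "E": []}


def insert_customer(sites, customer):
    """Store the row at the site that matches its region."""
    sites[customer["region"]].append(customer)


def all_customers(sites):
    """Rebuild the user's view: the union of the rows at every site."""
    return [row for rows in sites.values() for row in rows]


insert_customer(sites, {"id": 1, "name": "Acme", "region": "NA"})
insert_customer(sites, {"id": 2, "name": "Globex", "region": "E"})
```

Every site has the same columns; only the set of rows differs, which is why the user's view is a union.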

Example of Vertical Partitioning

User’s View of Db

ERP System: Financials, Customer Service, Prod. Support, Human Resources

Actual Implementation

N. America: Financials, Human Resources
Europe: Customer Service, Prod. Support
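Vertical partitioning by table is just a lookup from table name to site, as in the example above. Partitioning by column is slightly more involved: each site keeps a subset of the columns plus the key, and the full row is rebuilt by joining on that key. The sketch below shows both; the function names and the employee columns are invented for illustration.

```python
# By table: each table lives at exactly one site (mapping taken from the example).
TABLE_SITE = {
    "financials": "N. America",
    "human_resources": "N. America",
    "customer_service": "Europe",
    "prod_support": "Europe",
}


def split_columns(row, key, site_a_columns):
    """By column: divide one row into two fragments that share the key."""
    fragment_a = {col: row[col] for col in site_a_columns}
    fragment_b = {col: val for col, val in row.items() if col not in site_a_columns}
    # Both fragments carry the key so they can be joined again later.
    fragment_a[key] = row[key]
    fragment_b[key] = row[key]
    return fragment_a, fragment_b


def rejoin(fragment_a, fragment_b):
    """Rebuild the user's view of the row from its two fragments."""
    return {**fragment_a, **fragment_b}


employee = {"id": 7, "name": "Ada", "salary": 90000, "office": "Berlin"}
payroll_part, directory_part = split_columns(employee, "id", ["salary"])
```

Here `payroll_part` holds `id` and `salary`, while `directory_part` holds `id`, `name`, and `office`; joining them on `id` gives back the original row.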

5 Typical Distributed Databases

• Centralized with Single Site Data
• Replicated with Snapshots (in real time)
• Replicated with Synchronization (on demand, or on a schedule)
• Integrated Partitions (Partitioning in data center)
• Independent Partitions (Geographically distributed partitioning)

Transparency

• Location Transparency: User/application does not need to know where data resides
• Replication Transparency: User/application does not need to know about duplication of data
• Failure Transparency: Either all or none of the actions of a transaction are committed
• Transparency is difficult but important. The greater the distribution of data, the more there will be a need for transparency to offset the complexity.
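Two of these ideas can be sketched in a few lines. For location transparency, a catalog maps each table to a site so that callers name only the table. For failure transparency, a multi-site write first asks every involved site whether it can commit and applies nothing unless all agree. The `Site` and `DistributedDb` classes are hypothetical, and the prepare-then-apply check is a simplification for illustration, not a complete commit protocol.

```python
class Site:
    """In-memory stand-in for one database site."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.tables = {}

    def prepare(self):
        # The site votes on whether it is able to commit.
        return self.available


class DistributedDb:
    def __init__(self, catalog):
        self.catalog = catalog  # table name -> Site

    def read(self, table):
        # Location transparency: the caller names the table, never the site.
        return list(self.catalog[table].tables.get(table, []))

    def write_all(self, rows_by_table):
        # Failure transparency: apply every write, or none of them.
        involved = {self.catalog[table] for table in rows_by_table}
        if not all(site.prepare() for site in involved):
            return False
        for table, rows in rows_by_table.items():
            self.catalog[table].tables.setdefault(table, []).extend(rows)
        return True


north_america = Site("N. America")
europe = Site("Europe", available=False)
db = DistributedDb({"customers": north_america, "orders": europe})

# Europe cannot commit, so the customers write is not applied either.
committed = db.write_all({"customers": ["Acme"], "orders": ["order-1"]})
```

Because the caller only ever passes table names, moving `orders` to another site means changing the catalog, not the calling code.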

Applying The Concepts Via Example:

• Monolithic or Distributed?
• Single Site or Multi Site data?
• If multi-site:
  • H / V Partitioned or Replicated?
• Homogeneous or Heterogeneous?
• Location Transparency?
• Replication Transparency?
• Failure Transparency?

Questions?