Lecture notes - Department of Computer Science and Technology, Nanjing University

Distributed Systems
Xining Li
Dept. of Computing and Information Science
University of Guelph
Canada
Chapter 1: Introduction
• What is a distributed system?
• Why do we need distributed systems?
• History of distributed systems
• Applications of distributed systems
• Goals and objectives

[Figure: Computer processor performance evolution]
[Figure: Gordon Moore's (1965) prediction]
Definition of a Distributed System (Tanenbaum):
• A distributed system is a collection of independent computers that appears to its users as a single coherent system.
Centralised Systems:
• System shared by users all the time
• All resources accessible
• Software runs in a single process
• Single physical location
• Single point of control
• Single point of failure
Decentralised Systems:
• Multiple autonomous components
• Components shared by users
• Some resources may not be accessible
• Software can run in concurrent processes on different processors
• Multiple physical locations
• Multiple points of control
• Multiple points of failure
• No global time
• No shared memory (in most cases)
Killer applications:
Computing-dominated problems (distributed computing): mathematical computations, environmental and biological modeling, economic and financial modeling, graphics rendering for visualization, network simulations.
Storage-dominated problems (distributed data): data mining, image processing, information retrieval, insurance analysis.
Communications-dominated problems (network computing): transaction processing, video on demand, e-commerce, electronic banking, electronic shopping.
Common distributed computing examples
• rlogin or telnet (for remote access)
• Network file system, network printer, etc.
• ATM (cash machine)
• Distributed databases
• Network computing
• Global positioning systems
• Retail point-of-sale terminals
• Air-traffic control
• Enterprise computing
• WWW
SETI: Search for Extraterrestrial Intelligence
• Goal: to look for aliens
• Radio telescope: Arecibo (305 m)
• Located in Puerto Rico
• Accepts 4,000,000 radio bands
• A screen saver does the analysis
• UC Berkeley
[Figure: Arecibo radio telescope (305 m), Puerto Rico]
Distributed.net RC5
• Goal: to find the correct solution for RSA Labs' secret keys
• Award: $10,000 USD
• RC5-56: 250 days (1997)
• RC5-64: 1,757 days (2002)
• RC5-72: ?
• Example: RC5-64 has 18,446,744,073,709,551,616 keys
• 160,000 PCs are working on this project
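The key count quoted for the 64-bit challenge follows from the key length: an n-bit key has 2^n possible values. A minimal sketch of that arithmetic is given here. The function names are illustrative, and the throughput figures assume, purely for illustration, that the whole 64-bit keyspace was swept in the 1,757 days listed and that the work was spread evenly over 160,000 PCs; the slide states neither.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def keyspace(bits: int) -> int:
    """Number of distinct keys for a key of the given bit length."""
    return 2 ** bits


def average_rate(bits: int, days: int) -> float:
    """Keys per second needed to sweep the full keyspace in `days` days.

    Assumption: the entire keyspace is searched; a real search can stop
    as soon as the key is found.
    """
    return keyspace(bits) / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    print(f"64-bit keyspace: {keyspace(64):,} keys")
    print(f"72-bit keyspace is {keyspace(72) // keyspace(64)} times larger")
    rate = average_rate(64, 1757)
    print(f"Average rate for a full sweep: {rate:.3e} keys/s")
    # Assumption: the load is divided evenly over 160,000 machines.
    print(f"Per machine across 160,000 PCs: {rate / 160_000:,.0f} keys/s")
```

Each extra key bit doubles the search space, which is why the 72-bit challenge is a much larger job than the 64-bit one even with more machines.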
CERN: European Organization for Nuclear Research
• CERN is the world's largest particle physics centre. Here physicists come to explore what matter is made of and what forces hold it together.
• A new collider: Large Hadron Collider (2007)
• Expected data: 10,000,000 GB
• Would need 20,000,000 CDs to store
• Solution: distributed systems
Brief History of Distributed Systems:
System           Organization           Network           Computer  Date
CM*              Carnegie Mellon Univ.  Hierarchical Bus  PDP       1975
Cambridge DCS    Cambridge Univ.        Cambridge Ring    LSI-4     1979
Locus            UCLA                   Ethernet          PC        1980
V System         Stanford Univ.         Ethernet          Sun       1982
Mach             Carnegie Mellon Univ.  Ethernet          Sun, PC   1985
CORBA            OMG                    Internet          Any       1990
Distributed COM  Microsoft              Internet          PC        1996
JINI             SUN                    Internet          Any       2000
Basics of Distributed Systems:
• Networked computers (closely or loosely coupled) that provide a degree of operation transparency
• Distributed computer system = independent processors + networking infrastructure
• Communication between processes (on the same or different computers) using message-passing technologies is the basis of distributed computing
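Message passing can be illustrated with a small request/reply exchange. The sketch below is not taken from the lecture: it assumes a length-prefixed framing (a 4-byte length followed by the payload), uses a thread as a stand-in for the second process, and the upper-casing service is hypothetical. The same send and receive functions would work over a TCP connection between two machines.

```python
import socket
import struct
import threading


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send one message: a 4-byte big-endian length, then the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a stream socket may return fewer per call."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock: socket.socket) -> bytes:
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def echo_server(sock: socket.socket) -> None:
    """Hypothetical service: answer one request with its upper-cased text."""
    request = recv_msg(sock)
    send_msg(sock, request.upper())
    sock.close()


def request_reply(text: str) -> str:
    """Run one request/reply exchange over a connected socket pair."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=echo_server, args=(server,))
    worker.start()
    try:
        send_msg(client, text.encode("utf-8"))
        reply = recv_msg(client).decode("utf-8")
    finally:
        client.close()
        worker.join()
    return reply


if __name__ == "__main__":
    print(request_reply("hello from the client"))
```

The two sides share no variables; everything the server learns about the request arrives in the message, which is the property that lets the same code span separate computers.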
Goals of Distributed Systems:
• Resource sharing: easy for users to access remote resources.
• Transparency: to hide the fact that processes and resources are physically distributed across multiple computers.
• Openness: to offer services according to standard rules.
• Scalability: easy to expand and manage.
ISO RM-ODP: forms of transparency:
Transparency  Description
Access        Hide differences in data representation and how a resource is accessed
Location      Hide where a resource is located
Migration     Hide that a resource may move to another location
Relocation    Hide that a resource may be moved to another location while in use
Replication   Hide that a resource is replicated
Concurrency   Hide that a resource may be shared by several competitive users
Failure       Hide the failure and recovery of a resource
Persistence   Hide whether a (software) resource is in memory or on disk
Scalability Problems: Centralized paradigm
Concept                 Example
Centralized services    A single server for all users
Centralized data        A single on-line telephone book
Centralized algorithms  Doing routing based on complete information
Scalability Problems: Decentralized paradigm
• No machine has complete information about the system state.
• Machines make decisions based only on local information.
• Failure of one machine does not ruin the algorithm.
• There is no implicit assumption that a global clock exists.
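As an illustration of these properties, the sketch below simulates pairwise gossip averaging, a technique chosen here as an example and not one named in the lecture. Each node holds one number and repeatedly averages it with a neighbour; no node ever sees the full state, and the exchanges can happen in any order, so no global clock is assumed. The ring topology, the function names, and the single-machine simulation are all assumptions made for the example. It does not model machine failure.

```python
import random


def gossip_average(values, neighbours, steps, seed=0):
    """Pairwise gossip: at each step one node averages with one neighbour.

    values     -- initial local value held by each node
    neighbours -- neighbours[i] lists the nodes that node i can talk to
    """
    rng = random.Random(seed)
    state = list(values)
    for _ in range(steps):
        i = rng.randrange(len(state))
        j = rng.choice(neighbours[i])
        # Only the two participants' local values are read or written.
        mean = (state[i] + state[j]) / 2
        state[i] = state[j] = mean
    return state


def ring(n):
    """Hypothetical topology: each node knows only its two ring neighbours."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


if __name__ == "__main__":
    initial = [10.0, 0.0, 4.0, 6.0, 0.0, 16.0]
    final = gossip_average(initial, ring(len(initial)), steps=2000)
    print("true mean:", sum(initial) / len(initial))
    print("node estimates:", [round(v, 3) for v in final])
```

Every exchange preserves the sum of the two values involved, so the nodes' estimates drift toward the global mean even though none of them ever computes it directly.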
User Requirements:
• What services can the system provide?
• How easy is the system to use and manage?
• What benefits can the system offer?
• What is the performance/cost ratio?
• How reliable is the system?
• How secure is the system?
Distributed Computer System Metrics
• Latency – network delay before any data is sent
• Bandwidth – maximum channel capacity (Hz for analogue communication, bps for digital communication)
• Granularity – relative size of the units of processing required. Distributed systems generally operate best with coarse-grained work because communication is slow compared to processing speed
• Processor speed – MIPS, FLOPS
• Reliability – ability to continue operating correctly for a given time
• Fault tolerance – resilience to partial system failure
• Security – policy to deal with threats to the communication or processing of data in a system
• Administrative/management domains – issues concerning the ownership of and access to distributed system components
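The interplay of latency, bandwidth and granularity can be made concrete with a first-order model, assumed here for illustration and not given in the lecture: the time to deliver a message is the latency plus the message size divided by the bandwidth. Comparing compute time per message with that transfer time gives a rough granularity indicator. The function names and the numbers are illustrative only.

```python
def transfer_time(size_bits: float, latency_s: float, bandwidth_bps: float) -> float:
    """Simple model: fixed latency plus the time to push the bits through."""
    return latency_s + size_bits / bandwidth_bps


def compute_to_comm_ratio(compute_s: float, size_bits: float,
                          latency_s: float, bandwidth_bps: float) -> float:
    """Granularity indicator: compute time per unit of communication time."""
    return compute_s / transfer_time(size_bits, latency_s, bandwidth_bps)


if __name__ == "__main__":
    # Illustrative numbers only: 50 ms latency, 100 Mbps link, 1 MB message.
    latency, bandwidth, size = 0.05, 100e6, 8e6
    print(f"transfer time: {transfer_time(size, latency, bandwidth):.3f} s")
    for work in (0.001, 10.0):  # fine-grained vs coarse-grained task
        ratio = compute_to_comm_ratio(work, size, latency, bandwidth)
        print(f"{work} s of work per message -> ratio {ratio:.2f}")
```

A ratio well below 1 means the machines spend most of their time waiting on the network, which is the reason coarse-grained work suits distributed systems better.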