Unit 1 Introduction to DBMS

Download Report

Transcript Unit 1 Introduction to DBMS

Unit 1
Introduction to DBMS
(Database Management Systems)
Application
program
DBMS
End-user
1-1
Outline of Unit 1
1.1 Information Systems
1.2 An Overview of a Database System
1.3 An Architecture for a Database System
1.4 Database Technology Trends
Wei-Pang Yang, Information Management, NDHU
1-2
1.1 Information Systems
1-3
Stages of Information System
 Stage 0: Manual Information System
• Records
• Files
• Index Cards
 Stage 1: Sequential Information Systems
• Tapes
• Files
• slow, non-interactive, redundancy,... .
 Stage 2: File Based Information Systems
• Disk (direct access)
• application program has its own file
• data redundancy
data dependence
 Stage 3: DBMS based Information Systems
• Generalized data management software
• Transaction processing
Wei-Pang Yang, Information Management, NDHU
1-4
File Based Information Systems
 Conventional Data Processing techniques:
Enterprise:
Application
System A
Application
System B
Application
Program A
Application
Program B
File System A
Customer
File System B
Invoice
File System N
Inventory
…
Application
System N
Wei-Pang Yang, Information Management, NDHU
Application
Program N
1-5
File Based Information Systems (cont.)
Customer
Customer No.
Customer Name
Customer Addr.
Social Security ID
Inventory
Part No.
Part Description
Unit Price
Supplier
Quantities Remain
Quantities Ordered
Wei-Pang Yang, Information Management, NDHU
Invoice
Customer No.
Customer Name
Part No.
Quantities
Unit Price
Parts
Part No.
Part Description
Supplier
Quantities Ordered
Customer Name
Unit Price
1-6
User A1
User A2
User B1
User B2
User B3
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
C, Pascal
DSL (Data Sub Language)
e.g. SQL
1
2
3
External View
@ #
&
External
schema
External
schema
A
B
External/conceptual
mapping A
Conceptual
schema
<
External View
B
External/conceptual
mapping B
Conceptual
View
Database
management
system
dictionary
(DBMS) e.g. system
catalog
Conceptual/internal
mapping
DBA
(Build and
maintain
schemas
and
mappings)
Storage
structure
definition
(Internal
schema)
Wei-Pang Yang, Information Management, NDHU
1 2 3
Stored database (Internal View)
#
...
100
& @
1-7
DBMS based Information Systems:
Basic Approach - Integration
 (1) Integration of Information
• Description of the integrated view of data is the "Conceptual
Schema" of the database
DBMS
Application
program
Wei-Pang Yang, Information Management, NDHU
End-user
1-8
DBMS based Information Systems:
Basic Approach – Simple views and High level language
 (2) Provide simple views (External Schema) and high level language (e.g.
SQL) for users to manipulate (handle) data
• High level language: e.g. SQL (Structured Query Language)
<e.g.>:
SELECT SNAME
FROM S
WHERE S#= 'S4';
• Description of user's view of data is the "external schema" or
"subschema" or "view".
• High-level languages (Query Language): SQL
(1) Data Definition Language:
define format
(2) Data Manipulation Language:
retrieve, insert, delete, update
• Emphasize: EASE OF USE !!
Wei-Pang Yang, Information Management, NDHU
S
S# name
1-9
DBMS based Information Systems:
Basic Approach - Storage/Access Method
 (3) Efficient Storage/Access Techniques:
• implemented once rather than duplicated in all application
programs.
User: query in SQL
Language Processor
DBMS
Access Methods Calls
Access Method
(B+ tree, Dynamic Hashing)
I/O calls
Wei-Pang Yang, Information Management, NDHU
1-10
DBMS based Information Systems:
Basic Approach - Transaction Management
 (4) Provide Transaction Management:
• Concurrency Control
• Recovery
• Security
• .
:
Wei-Pang Yang, Information Management, NDHU
1-11
Example: A Simple Query Processing
Query in SQL:
SELECT CUSTOMER. NAME
FROM CUSTOMER, INVOICE
WHERE REGION = 'N.Y.' AND
AMOUNT > 10000 AND
CUTOMER.C#=INVOICE.C
DBMS
Language Processor
Internal Form :
( (S
SP)
Optimizer
Operator :
SCAN C using region index, create C
SCAN I using amount index, create I
SORT C?and I?on C#
JOIN C?and I?on C#
EXTRACT name field
Language
Processor
Operator Processor
Calls to Access Method:
OPEN SCAN on C with region index
GET next tuple
.
.
.
Calls to file system:
GET10th to 25th bytes from
block #6 of file #5
Access Method
e.g.B-tree; Index;
Hashing
Access
Method
File System
1-12
Wei-Pang Yang, Information Management, NDHU
database
1.2 An Overview of a
Database System
1-13
An Example

The Wine Cellar Database:
Cellar:
Bin
2
3
6
12
16
43
50
51

Wine
Chardonnay
Chardonnay
Chardonnay
Jo. Riesling
Jo. Riesling
Cab. Sauvignon
Pinot Noir
Pinot Noir
Producer
Year Bottle Ready
Buena Vista
83
1
85
Louis Martini
81
5
84
Chappellet
82
4
85
Buena Vista
82
1
83
Sattui
82
1
83
Robt. Mondavi 77
12
87
Mirassou
77
3
85
Ch. St. Jean
78
2
86
Comments
Thanksgiving
Late Harvest
Very dry
Harvest
Retrieval:
• DML (Data Manipulation Language):
SELECT Wine, Bin, Producer
FROM
Cellar
WHERE Ready = 85;
• Result:
Wine
Bin
Chardonnay 2
Chardonnay 6
Pinot Noir
50
Wei-Pang Yang, Information Management, NDHU
Producer
Buena Vista
Chappellet
Mirassou
1-14
An Example (cont.)

Deletion:
• DML:
DELETE FROM Cellar
WHERE Ready < 86;
• Result:
Bin
Wine
43 Cab. Sauvignon
51 Pinot Noir

Producer
Robt. Mondavi
Ch. St. Jean
Year Bottle Ready Comments
77
78
12
2
87
86
Insertion:
• DML: INSERT INTO Cellar
VALUES (53, 'Pinot Noir', 'Franciscan', 79, 1, 86, 'for Joan');
• Result:
Bin
Wine
43 Cab. Sauvignon
51 Pinot Noir
53 Pinot Noir
Wei-Pang Yang, Information Management, NDHU
Producer
Robt. Mondavi
Ch. St. Jean
Franciscan
Year Bottle Ready
77
78
79
12
2
1
87
86
86
Comments
for Joan
1-15
An Example (cont.)

Update
• DML: UPDATE
Cellar
SET Bottles = 4
WHERE Bin = 51;
• Result:
Bin
Wine
43 Cab. Sauvignon
51 Pinot Noir
53 Pinot Noir
Wei-Pang Yang, Information Management, NDHU
Producer
Robt. Mondavi
Ch. St. Jean
Franciscan
Year
77
78
79
Bottle Ready Comments
12
4
1
87
86
86
for Joan
1-16
What is a Database System?
DBMS
Application
program
End-user
Wei-Pang Yang, Information Management, NDHU
1-17
What is a Database System? (cont.)

Major components of a database system:
•
•
•
•
Data: integrated and shared.
Hardware: disk, CPU, Main Memory, ...
Software: DBMS
Users:
1. Application programmers
2. End users
3. Database administrator (DBA)
• Defining external schema
• Defining conceptual schema
• Defining internal schema
• Liaison with users
• Defining security and integrity checks
• Defining backup and recovery procedures
• Monitoring performance and changing requirements
Wei-Pang Yang, Information Management, NDHU
1-18
Why Database ?

Redundancy can be reduced

Inconsistency can be avoided

The data can be shared

Standards can be enforced

Security restrictions can be applied

Integrity can be maintained

Provision of data independence
objective !
Wei-Pang Yang, Information Management, NDHU
1-19
Data Independence


Application Program
 Data Structure
Immunity of application to change in
storage structure and access strategy.
Wei-Pang Yang, Information Management, NDHU
1-20
Data Dependence vs. Data Independence

S
Data Dependent
S#
e.g. SELECT CITY
S1
S2
:
Sn
FROM S
WHERE ITEM = 'X';
• Linked list:
S
Top
TOP
s1
sn
s2
if item = TOP
. item then .........
• Tree:
.
if item < root.data then root := root .left ..........
• Array:
if A[I] = item then ............
• Storage structure changed  program changed
Wei-Pang Yang, Information Management, NDHU
1-21
1.3 An Architecture for a
Database System
1-22
User A1
User A2
User B1
User B2
User B3
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
(Data Sub Language)
e.g. SQL, QUEL
External View
@ #A &
External
schema
A
External/conceptual
mapping A
Conceptual
schema
<
External View
B
External/conceptual
mapping B
Conceptual
@ View#A &
Database
management
system
(DBMS)
Conceptual/internal
mapping
DBA
(Build and
maintain
schemas
and
mappings)
External
schema
B
Storage
structure
definition
(Internal
schema)
Stored database (Internal View)
Wei-Pang Yang, Information Management, NDHU
#
& @
1-23
An Example of the Three Levels

Internal level:
STORED_EMP length = 18
PREFIX
TYPE = BYTE(6), OFFSET = 0, INDEX = EMPX
EMP#
TYPE = BYTE(6), OFFSET=0,
DEPT#
TYPE = BYTE(4), OFFSET = 12
PAY
TYPE = FULLWORD, OFFSET = 16

Conceptual level:
EMPLOYEE
EMPLOYEE_NUMBER
DEPARTMENT_NUMBER
SALARY

CHARACTER(6)
CHARACTER(4)
NUMERIC(5)
External level:
• ( PL /I )
DCL 1 EMP
2 EMP#
2 SAL
CHAR(6)
FIXED BIN(31)
• (COBOL)
01 EMPC
02 EMPNO PIC X(4)
02 DEPTNO PIC X(4)
Wei-Pang Yang, Information Management, NDHU
1-24
Functions of the DBMS

Data Definition Language (DDL)

Data Manipulation Language (DML)

Data Security and Integrity

Data Recovery and Concurrency

Data Dictionary

Performance
Wei-Pang Yang, Information Management, NDHU
1-25
1.4 Database Technology Trends
1960s to
Mid-1970s
1970s to
Mid-1980s
Late
1980s
Future
Merging data models,
knowledge representation,
and programming
languages
Network
Hierarchical
Relational
Semantic
Object-oriented
Logic
Database
Hardware
Mainframes
Mainframes
Minis
PCs
Faster PCs
Workstations
Database machines
User
Interface
None
Forms
Query
languages
- SQL, QUEL
Graphics
Menus
Query-by-forms
Natural language
Speech input
Program
Interface
Procedural
4GL
Logic programming
Integrated database
and programming
language
Presentation
and display
processing
Reports
Processing
data
Business graphics
Image output
Knowledge
processing
Generalized display
managers
Distributed knowledge
processing
Data Model
Wei-Pang Yang, Information Management, NDHU
Embedded
query
language
Report
generators
Information
and transaction
processing
Parallel processing
Optical memories
1-26
Distributed Databases
Database

Distributed System
Distributed database is a database that is not stored
in its entirety at a single physical location, but
rather is spread across a network of computer.
< e.g.>
Hsinchu
computers
Taipei
Kaoshiung
communication
links
Taichung
Wei-Pang Yang, Information Management, NDHU
1-27
Distributed Databases (cont.)

Advantages:
 efficiency of local processing
 data sharing

Disadvantages:
 communication overhead
 implementation difficulties

Reference:
S. Ceri and G. Pelagatti
"Distributed Databases: principles and systems"
Wei-Pang Yang, Information Management, NDHU
1-28
Multi-Database/Heterogeneous Database
Multidatabase
IMS
Hierarchical
Model
• semantic inconsistency
INGRES
Relational
Model
.. .
ORION
ObjectOriented
Model
• data incompleteness
• global schema
Wei-Pang Yang, Information Management, NDHU
1-29
DB + AI
Query
Database
AI
DBMS
Language Processor
Query Optimizer
Knowledge
Base
Operator Processor
Access Method
File Manager
Distributed
DB design
Knowledge
Base
Wei-Pang Yang, Information Management, NDHU
Logical
DB design
Knowledge
Base
1-30
KBMS
Database
Logic
 A Combined Model :
Logic Programming + Relational DB
 Three layers :
User Program
Knowledge
management
program
IDB
relational
interface
Relational DB
management
program EDB
Wei-Pang Yang, Information Management, NDHU
Query :
? :- ancestor (taro, Y)
? :- grandfather (?, c)
IDB:
ancestor(X,Y):- parent(X,Y)
ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y)
parent(X,Y):-edb(father(X,Y))
parent(X,Y):-edb(mother(X,Y))
grandfather(X,Z):- father(X,Y) ^ father(Y,Z)
EDB:
fathe
r
father son
A
B
X
Y
.
. .
B
mother
.
C
1-31
Database Object-Oriented
OODB

A typical Document : MEMO [Woelk86, SIGMOD]
MCC
To: W. Kim
From: D. Woolk
Date: September 18, 1992
Subject: Workstations
can be heard
In the computer center of National Chiao-Tung University, there are
a lot of workstations. There are HP RS serials, SUNs, Apollo, and so
on. The students in NCTU learn to use workstation since they are freshmen. The configuration of the workstations follows:
Workstation
.
speaker
voice message
associated
.
.
Workstation
}
text
}
}
graphics
Database
Server
In the course introduction to Computer Science? students do their
homework's on workstations.
image
Wei-Pang Yang, Information Management, NDHU
1-32
Use of a Database Management System in
Design and Application
Design
Release
Control
Detailed
Design
Database Manufacturing
Fabrication
Assembly
Info
Analysis
Models
Preliminarydesign
INTERFACE
DBMS
Database Management
System
Graphic Interface
Test /
Inspection
Language Interface
APPLICATION
Design
Analysis
Design
Verification
Manufacturing
Planning
Wei-Pang Yang, Information Management, NDHU
Evaluation
Production
Control
Synthesis
Release
Project
Management
1-33
Fuzzy Database
Database
Fuzzy Set Theory
 Fuzzy Query
<e.g.>
SELECT
FROM
WHERE
STUDENT.NAME
STUDENT
SEX = M
AND HEIGH = TALLER
AND WEIGH = SLIMMER
STUDENT:
NAME SEX HEIGHT WEIGHT IQ
Mary
Linda
.
.
.
F
F
.
.
.
<e.g.>
Wei-Pang Yang, Information Management, NDHU
158
165
.
.
.
SELECT
FROM
WHERE
55
55
.
.
.
High
Medium
.
.
.
STUDENT.NAME
STUDENT
IQ >= 130
1-34
More?
AI
DB
1993
DB
?
?
2001
VR
DB
?
1995
DB
1997
1999
?
2004
??? SARS
?
DB
Mobile/video
?
DNA/BioInfo
.
DB
WWW
DB
Wei-Pang Yang, Information Management, NDHU
DB
2003
?
911/ Anthrax
2006
???
?
1-35