CS 195 Course Outline & Introduction to Java

Download Report

Transcript CS 195 Course Outline & Introduction to Java

CSC 411/511:
DBMS Design
The Relational Model
(Chapter 3)
Dr. Nan Wang
CSC411_L3_Relational Model
1
Relational Database: Definitions
• Relational database: a set of relations with distinct
relation names
– Database: a set of tables with distinct table names
• Relation (table): made up of 2 parts:
–
–
Instance : a table, with rows and columns.
#Rows = cardinality, #fields = degree / arity.
Schema : specifies name of relation (table), plus name and
type of each column.
• E.G. Students(sid: string, name: string, login: string,
age: integer, gpa: real).
• Can think of a relation as a set of rows or tuples
(i.e., all rows are distinct).
Dr. Nan Wang
CSC411_L3_Relational Model
2
Example Instance of Students Relation
FIELDS (ATTRIBUTES, COLUMNS)
TUPLES
(RECORDS,
ROWS)
sid
53666
53688
53650
name
login
Jones jones@cs
Smith smith@eecs
Smith smith@math
age
18
18
19
gpa
3.4
3.2
3.8

Cardinality = 3, degree = 5, all rows distinct

Do all columns in a relation instance have to
be distinct?
Dr. Nan Wang
CSC411_L3_Relational Model
3
Relational Query Languages
• A major strength of the relational model
– supports simple, powerful querying of data.
• Queries can be written intuitively, and the DBMS is
responsible for efficient evaluation.
–
–
The key: precise semantics for relational queries.
Allows the optimizer to extensively re-order operations, and
still ensure that the answer does not change.
Dr. Nan Wang
CSC411_L3_Relational Model
4
The SQL Query Language
• Structured query language (SQL)
• Developed by IBM (system R) in the 1970s
• Need for a standard since it is used by many
vendors
• Standards (ANSI):
–
–
–
–
SQL-86
SQL-89 (minor revision)
SQL-92 (major revision)
SQL-99 (major extensions, current standard)
Dr. Nan Wang
CSC411_L3_Relational Model
5
The SQL Query Language
• SQL uses table, row, and column for relation,
tuple, and attribute, respectively.
• DDL (Data Definition Language)
– A subset of SQL that supports the creation, deletion, and
modification of tables
– Commands CREATE, DROP, and ALTER are used for data
definition
• DML (Data Manipulate Language)
– The SELECT-FROM-WHERE structure of SQL queries
SELECT <attribute list>
FROM <table list>
WHERE <condition>
Dr. Nan Wang
CSC411_L3_Relational Model
6
DDL
•
•
•
•
Create database
Create table
Alter table
Drop table
Dr. Nan Wang
CSC411_L3_Relational Model
7
DDL
• Create database [if not exists] db_name;
• Use db_name;
– e.g.
– Create database db_university;
– Use db_university;
Dr. Nan Wang
CSC411_L3_Relational Model
8
Creating Relations in SQL
• Creates the Students
relation.
– Observe that the
type
(domain) of each field
is specified, and enforced by
the DBMS whenever tuples
are added or modified.
• As another example, the
Enrolled table holds
information about
courses that students
take.
Dr. Nan Wang
CREATE TABLE Students
(sid: CHAR(20),
name: CHAR(20),
login: CHAR(10),
age: INTEGER,
gpa: REAL)
CREATE TABLE Enrolled
(sid: CHAR(20),
cid: CHAR(20),
grade: CHAR(2))
CSC411_L3_Relational Model
9
Destroying and Altering Relations
DROP TABLE [if exists] tbl_name;
DROP TABLE Students;
DROP TABLE Enrolled;
• Destroys the relation Students. The schema
information and the tuples are deleted.
Dr. Nan Wang
CSC411_L3_Relational Model
10
Destroying and Altering Relations
ALTER TABLE [if exists] tbl_name
ADD COLUMN col_name: domain
DROP COLUMN col_name
ALTER TABLE Students
ADD COLUMN firstYear: integer

The schema of Students is altered by adding a
new field; every tuple in the current instance
is extended with a null value in the new field.
Dr. Nan Wang
CSC411_L3_Relational Model
11
DML: Adding Tuples
• Can insert a single tuple using:
– INSERT INTO TABLE_NAME
[ (col1, col2, col3,...colN)]
VALUES (value1, value2, value3,...valueN);
INSERT INTO Students (sid, name, login, age, gpa)
VALUES (53688, ‘Smith’, ‘smith@ee’, 18, 3.2)
We can optionally omit the list of column name in the
INTO clause and list the values in the appropriate
order.
Students
(53688, ‘Smith’, ‘smith@ee’, 18, 3.2)
INSERT INTO
VALUES
Dr. Nan Wang
CSC411_L3_Relational Model
12
DML: Deleting Tuples
DELETE FROM table_name
[WHERE condition];
Can delete all tuples satisfying some condition
(e.g., name = Smith):
DELETE FROM Students
WHERE name = ‘Smith’
* Powerful variants of these commands are available; more later!
Dr. Nan Wang
CSC411_L3_Relational Model
13
The SQL Query Language
SELECT column_list
FROM table-name
[WHERE Clause]
• To find all 18 year old students, we can write:
SELECT *
FROM Students S
WHERE S.age=18
sid
name
53666 Jones
login
jones@cs
age gpa
18
3.4
53688 Smith smith@ee 18
3.2
•To find just names and logins, replace the first line:
SELECT S.name, S.login
FROM Students S
WHERE S.age=18
Dr. Nan Wang
CSC411_L3_Relational Model
14
Querying Multiple Relations
• What does the
following query
compute?
SELECT S.name, E.cid
FROM Students S, Enrolled E
WHERE S.sid=E.sid AND E.grade=‘A’
Given the following instances
of Enrolled and Students:
sid
53666
53688
53650
name
login
age gpa
Jones jones@cs
18 3.4
Smith smith@eecs 18 3.2
Smith smith@math 19 3.8
Dr. Nan Wang
sid
53831
53831
53650
53666
cid
grade
Carnatic101
C
Reggae203
B
Topology112
A
History105
B
we get:
S.name E.cid
Smith
Topology112
CSC411_L3_Relational Model
15
DML: updating data within a table
UPDATE table_name
SET column_name1 = value1,
column_name2 = value2, ...
[WHERE condition]
Dr. Nan Wang
CSC411_L3_Relational Model
16
SQL
• Data Definition Language (DDL) statements
are used to define the database structure or
schema. Some examples:
– CREATE - to create objects in the database
– ALTER - alters the structure of the database
– DROP - delete objects from the database
• Data Manipulation Language (DML)
statements are used for managing data within
schema objects. Some examples:
–
–
–
–
SELECT - retrieve data from the a database
INSERT - insert data into a table
UPDATE - updates existing data within a table
DELETE - deletes all records from a table, the space for
the records remain
Dr. Nan Wang
CSC411_L3_Relational Model
17
Integrity Constraints (ICs)
• IC: condition that must be true for any instance
of the database; e.g., domain constraints.
–
ICs are specified when schema is defined.
–
ICs are checked when relations are modified.
• A legal instance of a relation is one that
satisfies all specified ICs.
–
DBMS should not allow illegal instances.
• A DBMS enforces integrity constrains
– If the DBMS checks ICs, stored data is more faithful to
real-world meaning.
–
Avoids data entry errors, CSC411_L3_Relational
too!
Model
Dr. Nan Wang
18
Primary Key Constraints
• A set of fields is a key for a relation if :
1. No two distinct tuples can have same values in all key fields,
and
2. This is not true for any subset of the key.
– Part 2 false? A superkey.
•
–
A set of fields that contain a key
If there are two or more keys for a relation, one of the keys is
chosen (by DBA) to be the primary key.
• E.g., sid is a key for Students. (What about
name?) The set {sid, gpa} is a superkey.
Dr. Nan Wang
CSC411_L3_Relational Model
19
Primary and Candidate Keys in SQL
• Possibly many candidate keys (specified using
UNIQUE), one of which is chosen as the primary
key.


CREATE TABLE Enrolled
(sid CHAR(20)
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid,cid) )
“For a given student and
course, there is a single grade.”
vs. “Students can take only one
course, and receive a single
grade for that course; further,
CREATE TABLE Enrolled
no two students in a course
(sid CHAR(20)
receive the same grade.”
cid CHAR(20),
Used carelessly, an IC can
grade CHAR(2),
prevent the storage of database
PRIMARY KEY (sid),
instances that arise in practice!
UNIQUE (cid, grade) )
Dr. Nan Wang
CSC411_L3_Relational Model
20
Dr. Nan Wang
CSC411_L3_Relational Model
21
Foreign Keys, Referential Integrity
• Foreign key : Set of fields in one relation that is
used to `refer’ to a tuple in another relation.
(Must correspond to primary key of the second
relation.) Like a `logical pointer’.
• E.g. sid is a foreign key referring to Students:
–
–
Enrolled(sid: string, cid: string, grade: string)
If all foreign key constraints are enforced, referential
integrity is achieved, i.e., no dangling references.
Dr. Nan Wang
CSC411_L3_Relational Model
22
Foreign Keys in SQL
• Only students listed in the Students relation should
be allowed to enroll for courses.
CREATE TABLE Enrolled
(sid CHAR(20), cid CHAR(20), grade CHAR(2),
PRIMARY KEY (sid,cid),
FOREIGN KEY (sid) REFERENCES Students )
Enrolled
sid
53666
53666
53650
53666
cid
grade
Carnatic101
C
Reggae203
B
Topology112
A
History105
B
Dr. Nan Wang
Students
sid
53666
53688
53650
name
login
Jones jones@cs
Smith smith@eecs
Smith smith@math
CSC411_L3_Relational Model
age
18
18
19
gpa
3.4
3.2
3.8
23
Enforcing Referential Integrity
• Consider Students and Enrolled; sid in Enrolled is a
foreign key that references Students.
• What should be done if an Enrolled tuple with a nonexistent student id is inserted? (Reject it!)
• What should be done if a Students tuple is deleted?
–
–
–
–
Also delete all Enrolled tuples that refer to it.
Disallow deletion of a Students tuple that is referred to.
Set sid in Enrolled tuples that refer to it to a default sid.
(In SQL, also: Set sid in Enrolled tuples that refer to it to a
special value null, denoting `unknown’ or `inapplicable’.)
• Similar if primary key of Students tuple is updated.
Dr. Nan Wang
CSC411_L3_Relational Model
24
Referential Integrity in SQL
• SQL/92 and SQL:1999
CREATE TABLE Enrolled
support all 4 options on
(sid CHAR(20),
deletes and updates.
cid CHAR(20),
– Default is NO ACTION
(delete/update is rejected) grade CHAR(2),
– CASCADE (also delete all
tuples that refer to deleted PRIMARY KEY (sid,cid),
FOREIGN KEY (sid)
tuple)
– SET DEFAULT
REFERENCES Students
Set E.sid to S.sid of the
“default” student if a
student is deleted from
the Student relation
SET NULL
• Sets foreign key value of
referencing tuple to null
•
–
Dr. Nan Wang
ON DELETE NO CASCADE
ON UPDATE CASCADE )
CSC411_L3_Relational Model
25
Where do ICs Come From?
• ICs are based upon the semantics of the realworld enterprise that is being described in the
database relations.
• We can check a database instance to see if
an IC is violated, but we can NEVER infer
that an IC is true by looking at an instance.
–
–
An IC is a statement about all possible instances!
From example, we know name is not a key, but the
assertion that sid is a key is given to us.
• Key and foreign key ICs are the most
common; more general ICs supported too.
Dr. Nan Wang
CSC411_L3_Relational Model
26
Install MySQL on your laptop
• MySQL
• MySQL workbench
Dr. Nan Wang
CSC411_L3_Relational Model
27
Logical DB Design: ER to Relational
• Entity sets to tables:
ssn
name
Employees
Dr. Nan Wang
lot
CREATE TABLE Employees
(ssn CHAR(11),
name CHAR(20),
lot INTEGER,
PRIMARY KEY (ssn))
CSC411_L3_Relational Model
28
Relationship Sets to Tables
• In translating a
relationship set to a
relation, attributes of the
relation must include:
– Keys for each
participating entity set
(as foreign keys).
• This set of
attributes forms a
superkey for the
relation.
– All descriptive attributes.
Dr. Nan Wang
CREATE TABLE Works_In(
ssn CHAR(11),
did INTEGER,
since DATE,
PRIMARY KEY (ssn, did),
FOREIGN KEY (ssn)
REFERENCES Employees,
FOREIGN KEY (did)
REFERENCES Departments)
CSC411_L3_Relational Model
29
since
name
ssn
dname
lot
Employees
Dr. Nan Wang
did
Works_In
budget
Departments
CREATE TABLE Works_In(
ssn CHAR(11),
did INTEGER,
since DATE,
PRIMARY KEY (ssn, did),
FOREIGN KEY (ssn)
REFERENCES Employees,
FOREIGN KEY (did)
REFERENCES Departments)
CSC411_L3_Relational Model
30
Relationship Sets to Tables
CREATE TABLE Report_To(
supervisor_ssn CHAR(11),
subordinate_ssn CHAR(11),
PRIMARY KEY (supervisor_ssn,
subordinate_ssn),
FOREIGN KEY (supervisor_ssn)
REFERENCES Employees(ssn),
FOREIGN KEY (subordinate_ssn)
REFERENCES Employees(ssn))
name
ssn
lot
Employees
supervisor
subordinate
Reports_To
We need to explicitly name the referenced field of
Employees because the field name differs from the
name(s) of the refering field(s).
Dr. Nan Wang
CSC411_L3_Relational Model
31
Review: Key Constraints
• Each dept has at
most one manager,
according to the
key constraint on
Manages.
since
name
ssn
dname
lot
Employees
did
Manages
budget
Departments
Translation to
relational model?
1-to-1
Dr. Nan Wang
1-to Many
Many-to-1
Many-to-Many
CSC411_L3_Relational Model
32
Translating ER Diagrams with Key
Constraints
• Map relationship to
a table:
– Note that did is
the key now!
– Separate tables for
Employees and
Departments.
• Since each
department has a
unique manager,
we could instead
combine Manages
and Departments.
Dr. Nan Wang
CREATE TABLE Manages(
ssn CHAR(11),
did INTEGER,
since DATE,
PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employees,
FOREIGN KEY (did) REFERENCES Departments)
CREATE TABLE Dept_Mgr(
did INTEGER,
dname CHAR(20),
budget REAL,
ssn CHAR(11),
since DATE,
PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employees)
CSC411_L3_Relational Model
33
Review: Participation Constraints
• Does every department have a manager?
–
If so, this is a participation constraint: the participation of
Departments in Manages is said to be total (vs. partial).
• Every did value in Departments table must appear
in a row of the Manages table (with a non-null ssn
value!)
since
name
ssn
dname
did
lot
Employees
Manages
budget
Departments
Works_In
since
Dr. Nan Wang
CSC411_L3_Relational Model
34
Participation Constraints in SQL
• We can capture participation constraints involving
one entity set in a binary relationship
CREATE TABLE Dept_Mgr(
did INTEGER,
dname CHAR(20),
budget REAL,
ssn CHAR(11) NOT NULL,
since DATE,
PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employees
ON DELETE NO ACTION)
Dr. Nan Wang
CSC411_L3_Relational Model
35
Review: Weak Entities
• A weak entity can be identified uniquely only by
considering the primary key of another (owner)
entity.
–
–
Owner entity set and weak entity set must participate in a
one-to-many relationship set (1 owner, many weak entities).
Weak entity set must have total participation in this
identifying relationship set.
name
ssn
lot
Employees
Dr. Nan Wang
cost
pname
Policy
CSC411_L3_Relational Model
age
Dependents
36
Translating Weak Entity Sets
• Weak entity set and identifying relationship set
are translated into a single table.
–
When the owner entity is deleted, all owned weak entities
must also be deleted.
CREATE TABLE Dep_Policy (
pname CHAR(20),
age INTEGER,
cost REAL,
ssn CHAR(11) NOT NULL,
PRIMARY KEY (pname, ssn),
FOREIGN KEY (ssn) REFERENCES Employees
ON DELETE CASCADE)
Dr. Nan Wang
CSC411_L3_Relational Model
37
Review: ISA Hierarchies
name
As in C++, or other
PLs, attributes are
inherited.
 If we declare A ISA
B, every A entity is
also considered to be a
B entity.
ssn

lot
Employees
hourly_wages
hours_worked
ISA
contractid
Hourly_Emps
Contract_Emps
• Overlap constraints: Can Joe be an Hourly_Emps as well
as a Contract_Emps entity? (Allowed/disallowed)
• Covering constraints: Does every Employees entity also
have to be an Hourly_Emps or a Contract_Emps entity?
(Yes/no)
Dr. Nan Wang
CSC411_L3_Relational Model
38
Translating ISA Hierarchies to
Relations
• General approach:
–
3 distinct relations: Employees, Hourly_Emps and Contract_Emps.
• Hourly_Emps: Every employee is recorded in Employees. For
hourly emps, extra info recorded in Hourly_Emps
(hourly_wages, hours_worked, ssn); must delete Hourly_Emps
tuple if referenced Employees tuple is deleted).
• Queries involving all employees easy, those involving just
Hourly_Emps require a join to get some attributes.
• Alternative
– Create two relations: Just Hourly_Emps and Contract_Emps.
•
•
Hourly_Emps: ssn, name, lot, hourly_wages, hours_worked.
Each employee must be in one of these two subclasses.
Dr. Nan Wang
CSC411_L3_Relational Model
39
Review: Binary vs. Ternary
Relationships
name
ssn
• What are the
additional
constraints in
the 2nd
diagram?
pname
lot
Employees
Dependents
Covers
Bad design
Policies
policyid
cost
name
pname
ssn
lot
age
Dependents
Employees
Purchaser
Better design
Dr. Nan Wang
age
policyid
Beneficiary
Policies
CSC411_L3_Relational Model
cost
40
Binary vs. Ternary Relationships
(Contd.)
• The key constraints CREATE TABLE Policies (
policyid INTEGER,
allow us to
combine Purchaser cost REAL,
with Policies and
ssn CHAR(11) NOT NULL,
Beneficiary with
PRIMARY KEY (policyid),
Dependents.
FOREIGN KEY (ssn) REFERENCES Employees
• Participation
ON DELETE CASCADE)
constraints lead to
NOT NULL
constraints.
Dr. Nan Wang
CREATE TABLE Dependents (
pname CHAR(20),
age INTEGER,
policyid INTEGER,
PRIMARY KEY (pname, policyid),
FOREIGN KEY (policyid) REFERENCES Policies
ON DELETE CASCADE)
CSC411_L3_Relational Model
41
Binary vs. Ternary Relationships
(Contd.)
• What if
Policies is a
weak entity
set?
CREATE TABLE Policies (
policyid INTEGER,
cost REAL,
ssn CHAR(11) NOT NULL,
PRIMARY KEY (policyid, ssn),
FOREIGN KEY (ssn) REFERENCES Employees,
ON DELETE CASCADE)
CREATE TABLE Dependents (
pname CHAR(20),
age INTEGER,
policyid INTEGER, NOT NULL
PRIMARY KEY (pname, policyid, ssn),
FOREIGN KEY (policyid, ssn) REFERENCES Policies,
ON DELETE CASCADE)
Dr. Nan Wang
CSC411_L3_Relational Model
42
name
ssn
lot
Employees
cost
pname
Policy
age
Dependents
CREATE TABLE IF NOT EXISTS dep_policy
(pname char(20),
age integer,
ssn char(11) not null,
cost real,
primary key(pname, ssn),
Foreign key (ssn) references Employees)
Dr. Nan Wang
CSC411_L3_Relational Model
43
Dr. Nan Wang
CSC411_L3_Relational Model
44
Dr. Nan Wang
CSC411_L3_Relational Model
45
Create table if not exists dept_runs
(dno integer,
Dname char(64),
Office char(10),
Manager_ssn char(10) not null,
Primary key (dno),
Foreign key (manager_ssn) references Professors(Prof_ssn))
Dr. Nan Wang
CSC411_L3_Relational Model
46
Dr. Nan Wang
CSC411_L3_Relational Model
47
Not null
Dr. Nan Wang
CSC411_L3_Relational Model
48
Dr. Nan Wang
CSC411_L3_Relational Model
49
Homework
• Write a script for creating database, tables
• Run your script in MySQL
• Insert some rows to each table
• Turn in your script, grader will run it and test your
homework
Dr. Nan Wang
CSC411_L3_Relational Model
50
Views
• A view
– is a table whose rows are not explicitly stored in the database but are
computed as needed from a view definition
– A view is a relation. We store a definition, rather than a set of tuples.
CREATE VIEW YoungActiveStudents (name, grade)
AS SELECT S.name, E.grade
FROM Students S, Enrolled E
WHERE S.sid = E.sid and S.age<21

Views can be dropped using the DROP VIEW
command.

How to handle DROP TABLE if there’s a view on the table?
• DROP TABLE command has options to let the user specify this.
Dr. Nan Wang
CSC411_L3_Relational Model
51
Views and Security
• Views can be used to present necessary
information (or a summary), while hiding
details in underlying relation(s).
–
Given YoungStudents, but not Students or Enrolled, we
can find students who have are enrolled, but not the
cid’s of the courses they are enrolled in.
Dr. Nan Wang
CSC411_L3_Relational Model
52
Relational Model: Summary
• Relational database
– A tabular representation of data.
– Simple and intuitive, currently the most widely used.
• SQL query languages
– Create and modifying relations (CREATE, DROP, ALTER)
– Manipulate data (INSERT, UPDATE, DELETE)
• Integrity constraints can be specified by the DBA, based on
application semantics. DBMS checks for violations.
–
–
Two important ICs: primary and foreign keys
In addition, we always have domain constraints.
• Rules to translate ER to relational model
Dr. Nan Wang
CSC411_L3_Relational Model
53
CSC 411/511:
DBMS Design
Questions?
Dr. Nan Wang
CSC411_L3_Relational Model
54