Avoiding Database Anomalies

Download Report

Transcript Avoiding Database Anomalies

44220: Database Design & Implementation
Avoiding Database Anomalies
Ian Perry
Room: C49
Tel Ext.: 7287
E-mail: [email protected]
http://itsy.co.uk/ac/0506/sem2/44220_DDI/
Avoiding Database Anomalies

This lecture concentrates upon building
a ‘robust’ Logical Data Model. i.e.:




Transforming a Conceptual Data Model into
a set of Relations.
Checking these Relations for any Anomalies.
Documenting them as a Database Schema.
Most Database books have a section
describing a mathematically-based
technique called Normalisation:

Ian Perry
I will show you a much easier way of
achieving the same result, i.e. a robust
database design.
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 2
What is an Anomaly?


Anything we try to do with a database that
leads to unexpected and/or unpredictable
results.
Three types of Anomaly to guard against:




insert
delete
update
Need to check your database design carefully:

Ian Perry
the only good database is an anomaly free
database.
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 3
Insert Anomaly

When we want to enter a value into a data cell
but the attempt is prevented, as another
value is not known.
CoNo Tutor Room RSize EnLimit
353
Smith A532
45
40
351
Smith C320
100
60
355
Clark H940
400
300
456 Turner H940
400
45
 e.g. We have built a new Room (e.g. B123), but
it has not yet been timetabled for any courses
or members of staff.
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 4
Delete Anomaly

When a value we want to delete also means we
will delete values we wish to keep.
CoNo Tutor Room RSize EnLimit
353
Smith A532
45
40
351
Smith C320
100
60
355
Clark H940
400
300
456 Turner H940
400
45
 e.g. CoNo 351 has ended, but Room C320 will
be used elsewhere.
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 5
Update Anomaly

When we want to change a single data item
value, but must update multiple entries
CoNo Tutor Room RSize EnLimit
353
Smith A532
45
40
351
Smith C320
100
60
355
Clark H940
400
300
456 Turner H940
400
45
 e.g. Room H940 has been improved, it is now of
RSize = 500.
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 6
A Conceptual Model

Consider the following ‘simple’ conceptual
data model:
Course
M
M
Staff
1
M
Student
Staff(Staff-ID, Name, Address, ScalePoint, RateOfPay, DOB, ...)
Student(Enrol-No, Name, Address, OLevelPoints, ...)
Course(CourseCode, Name, Duration, ...)
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 7
The ‘Translation’ Process




Entities become Relations
Attributes become Attributes(?)
Key Attribute(s) become Primary Key(s)
Relationships are represented by
additional Foreign Key Attributes:

Ian Perry
for those Relations that are at the ‘M’ end of
each 1:M Relationship.
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 8
The ‘Staff’ & ‘Student’ Relations
Staff(Staff-ID, Name, Address, ScalePoint,
RateOfPay, DOB, ...)
becomes:
Staff
Staff-ID
Name
Address
ScalePoint
RateOfPay DOB
Student(Enrol-No, Name, Address, OLevelPoints, ...)
becomes:
Student Enrol-No Name
Address OLevelPoints Tutor
NB. Foreign Key Tutor references Staff.Staff-ID
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 9
The ‘Staff’ & ‘Course’ Relations
Staff
Staff-ID
Name
Address
ScalePoint
RateOfPay DOB
Course(CourseCode, Name, Duration, ...)
becomes:
Course
CourseCode Name
Duration
NB. Can’t add a Foreign Key; as BOTH Relations
have a ‘M’ end:
 I warned you about leaving M:M relationships
in your Conceptual Data Model.
 Must create an ‘artificial’ linking Relation.
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 10
‘Staff’, ‘Course’ & ‘Team’ Relations
Staff
Team
Course
Staff-ID
Name
Address
ScalePoint
RateOfPay DOB
CourseCode Staff-ID
CourseCode Name
Duration
NB. In the ‘artificial’ Relation (i.e. Team):
The Primary Key is a composite of CourseCode & Staff-ID
Foreign Key CourseCode references Course.CourseCode
Foreign Key Staff-ID references Staff.Staff-ID
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 11
4 Relations from 3 Entities?
Student
Staff
Team
Course
Enrol-No Name
Staff-ID
Name
CourseCode
Address
OLevelPoints Tutor
Address
ScalePoint
RateOfPay DOB
Staff-ID
CourseCode Name
Duration
BUT - are they anomaly free?
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 12
Check Relations for Anomalies!




every Tuple unique?
no hidden meaning from location?
data cells atomic?
for Relations with single-attribute keys:


every Attribute depends upon the Primary
Key?
for Relations with composite keys:

Ian Perry
every Attribute depends upon all of the
Composite Key?
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 13
What if the checks fail?

If any Relation fails ‘checks’:


we MUST split that Relation into
multiple Relations:


especially those checking dependency.
until they pass the tests.
but MUST remember to leave behind a
Foreign Key:

Ian Perry
to ‘point’ forwards to the Primary Key of
the ‘new’ split-off Relation.
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 14
Are they Anomaly Free?
Student
Staff
Team
Course
Ian Perry
Enrol-No Name
Staff-ID
Name
CourseCode
Address
OLevelPoints Tutor
Address
ScalePoint
RateOfPay DOB
Staff-ID
NOT this one!
CourseCode Name
Duration
as RateOfPay
does NOT depend
upon Staff-ID
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 15
Fixing this ‘Problem’
Staff

Staff-ID
Name
Address
ScalePoint
RateOfPay DOB
The Attribute ‘RateOfPay’ depends upon
‘ScalePoint’ NOT ‘Staff-ID’.
 So, we need to split this Relation:
Staff
Staff-ID
Name
Pay
ScalePoint
RateOfPay
Address
ScalePoint DOB
NB. Foreign Key ScalePoint references Pay.ScalePoint
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 16
5 Relations from 3 Entities
Student
Staff
Course
Team
Pay
Enrol-No Name
Staff-ID
Name
Address
Address
CourseCode Name
CourseCode Staff-ID
ScalePoint
RateOfPay
OLevelPoints Tutor
ScalePoint DOB
Duration
an ‘artificial’ Relation
- to ‘solve’ a M:M ‘problem’
a ‘split-off’ Relation
- to ‘solve’ a Dependency
‘problem’
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 17
Don’t change Conceptual Model

Remember, we can chose from one of a
range of Database Theories with which
to build our Logical Data Model:




Hierarchical
Relational
Object
Each of these Database Theories may
require different compromises (i.e. at
the Logical Modelling stage);

Ian Perry
from the ‘pure’ meaning captured by your
Conceptual Model.
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 18
Document Relations as a Database Schema

A Database Schema:




We should have ‘captured’ the Business
situation (assumptions and constraints) in the
Conceptual Data Model, e.g:



defines all Relations,
lists all Attributes (with their Domains),
and identifies all Primary & Foreign Keys.
a College only delivers 10 Courses.
a Hospital only has 12 Wards.
These assumptions and constraints need to be
expressed as the Domains of the Database
Schema.
Ian Perry
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 19
Logical Schema 1 - Domains


Schema College
Domains










Ian Perry
StudentIdentifiers = 1 - 9999;
StaffIdentifiers = 1001 - 1199;
PersonNames = TextString (15 Characters);
Addresses = TextString (25 Characters);
CourseIdentifiers = 101 - 110;
CourseNames = Comp, IS, Law, Mkt, ...;
OLevelPoints = 0 - 100;
ScalePoints = 1 - 12;
PayRates = £14,005, £14,789, £15,407, ...;
StaffBirthDates = Date (dd/mm/yyyy), >21 Years
before Today;
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 20
Logical Schema 2 - Relations

Relation Student







Ian Perry
Enrol-No: StudentIdentifiers;
Name: PersonNames;
Address: Addresses;
OLevelPoints: OLevelPoints;
Tutor: StaffIdentifiers;
Primary Key: Enrol-No
Foreign Key Tutor references Staff.Staff-ID
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 21
Logical Schema 3 - Relations

Relation Staff







Ian Perry
Staff-ID: StaffIdentifiers;
Name: PersonNames;
Address: Addresses;
ScalePoint: ScalePoints;
DOB: StaffBirthDates;
Primary Key: Staff-ID
Foreign Key ScalePoint references Pay.ScalePoint
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 22
Logical Schema ...

Relation Course




Continue to define each of the Relations in a
similar manner.



CourseCode: CourseIdentifiers;
Name: CourseNames;
… etc.
All Relations MUST have a Primary Key.
Any Relation at the M-end of a 1:M Relationship
MUST have a Foreign Key.
Make sure that you define ALL of the
Relations, including:


Ian Perry
‘artificial’ ones (e.g. Team)
‘split-off’ ones (e.g. Pay)
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 23
This Week’s Workshop

In this Workshops session we will;
1.
2.
test a logical data model; to ensure that it is anomaly free
(i.e. robust),
practice documenting a Database Schema; based on a small
conceptual model (as represented by an ER Diagram).
1. Examine a table of data:


Explain the ‘potential’ for insert, delete & update anomalies
in a table of data.
Define what a ‘better’ set of tables (Relations?) to store
the data look like?
2. Examine an ER Diagram:


Ian Perry
Identify suitable Attributes for each Relation; as a
minimum those that will act as the Primary & Foreign Keys.
Document as a Database Schema; starting with the
Relations first, then coming back to document suitable
Domains.
44220: Database Design & Implementation: Avoiding Database Anomalies
Slide 24