Systems Analysis & Design with Databases

Download Report

Transcript Systems Analysis & Design with Databases

Information Retrieval and Use
Data Analysis & Data Modeling,
Relational Data Analysis and
Logical Data Modeling
Geoff Leese September 2009
1
Relational Data Analysis


2
Captures the detailed knowledge of the
meaning of the data.
Ensures that the data is logically easy to
maintain and extend.
 Data inter-dependencies have been
identified
 Ambiguities have been resolved.
 Eliminate unnecessary duplication of
data.
 Forms the data into optimum groups.
 Validates the Logical Data Model (LDM).
Logical Data Modelling

Basic Rules for converting 3NF to a LDM




3
Create an entity type for each data relation
Mark qualifying foreign keys
Check compound key relations
Make foreign/primary key relations
Guidelines for logical
modelling
Entity type names are singular
nouns, descriptive, concise and
organisation specific.
 Attribute names are unique
descriptive nouns of standard
format.
 Relationship names are descriptive,
precise verb phrases.

4
Simple Master-Detail
relationships


5
Where a single foreign key of a relation
corresponds to the primary key of another
relation
See next slide for example.
Simple Master-Detail
relationships
Shows SINGLE primary key at MASTER entity
(Organisation) connected to SINGLE foreign key at
DETAIL entity (Contact people)
6
Multiple level
Master-Detail Relationships
 Example:
7
five entities
Identifying Recursive (Unary)
Relationships
Is a relation where a foreign key
references the same relation.
 Example: Employee

Employee-number
Employee-name
Employee-manager-number
Employee
8
Relationships: Student/Module

At this point we need to identify the data
items that describe or identify each entity

Entity attributes are also known as data
items

What are the data items associated with
the following LDS diagram?
Is taken by
Student
Takes
9
Module
The Student
Entity Type
Attribute Name
Attribute
Student
Student Name
Street Address
Town
Post Code
Telephone
Jones
Leek Road
Stoke-on-Trent
ST4 2DE
294303
Is taken by
Student
Takes
10
Module
The Module
Entity Type
Attribute
Module
Module Number
Module Name
Module Leader
Level
Cats Points
Is taken by student
Student
Takes
11
Attribute Type
CM5111-1
SSAT
A Lecturer
1
10
Module
The Data Items
Is taken by student
Student
Module
Takes
Student Name
Street
Address Town
Post
Code ST4 2DE
Telephone
12
Module Number
Module Name
Module Leader
Level
Cats Points
Identifying occurrences of
entities





13
Each occurrence of an entity must be
uniquely identified in some way
Imagine the British Gas data base that
used only surnames to identify account
holders
There would be 100,000 account holders
called Jones in this country
Even if we used the given names there
would still be considerable duplication
It would be impossible to find the right
account by name alone
Adding a Primary Key
Is taken by student
Student
Module
Takes
Student Number
Student Name
Street
Address Town
Post
Code ST4 2DE
Telephone
Module Number
Module Name
Module Leader
Level
Cats Points
Primary key added
14
Relationships: Getting it right
Is this right?
Is taken by student
Student
Module
Takes
The real situation is surely
Is taken by student
Student
Takes
15
Module
Putting it right: Intersection
entity
We need a link entity - less ambiguity
Student
Student Number
Student Name
Street
Address Town
Post
Code ST4 2DE
Telephone
16
Module
Stud/Mod
Student Number
Module number
Module Number
Module Name
Module Leader
Level
Cats Points
Normalisation - steps

Start with a set of un-normalised
tables
 Entity/attribute
list
Step 1 - remove ambiguity and
repeating data
 Step 2 - remove shared data

17
Normalisation - step 1

Break down ALL attributes into smallest
meaningful parts


Remove REPEATED information to form a
new table

18
EG student name becomes student surname,
student firstname, student title
EG a course may be composed of MANY
modules (but assume that each module is only
on one course!) - so form a MODULE table
Normalisation - step 2

Remove SHARED data to form new
tables
 EG
modules may share tutors - so form
a TUTORS table.
19
Normalisation

20
FIRST NORMAL FORM - a relation
(table) is in 1NF if it contains atomic
values and all repeating groups have
been removed
Normalisation

21
SECOND NORMAL FORM - a
relation(table) is in 2NF if it is in 1NF
and every non-key attribute is fully
dependent on the primary key
Normalisation

22
THIRD NORMAL FORM - a
relation(table) is in 3NF if it is in 2NF
and every non-key attribute is not
dependent on any other non-key
attribute
Relational Data Analysis Form


Validates the LDM against the relations.
Consists of:




23
Unnormalised Form
– attributes
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
– Relations
– Attributes
RDA Form
Name
UNF
attributes
24
Date
1NF
2NF
3NF
Result
relation attribute
Data Dictionary

lists, for every field in every table
 Tablename
 Fieldname
 Field
Type
 Field size (if variable)
 Decimal places (if applicable)
 Description (if required)
 Other significant field properties
25
Data Dictionary example
Tablename
Fieldname
Fieldtype
Length
Students
Students
Students
Counter
Text
Text
N/A
20
5
Students
Students
Student ID
Student firstname
Student other
initials
Student Surname
Fee paid
Dec
Places
N/A
N/A
N/A
25
N/A
N/A
2
Students
Date of Birth
Text
Number
(currency)
Date/Time
N/A
N/A
Students
Etc
Full Time?
Yes/No
N/A
N/A
26
Description
Full firstname(capitalised)
Other initials, Capitals,
Space separated
Surname, Capitalised
Fee paid
Input mask Short date,
format Medium Date
The domain
Is the “set” of items, and the definition
thereof to which an attribute belongs
 Define domain once, saves time when
defining attributes belonging to it.
 For example - Date of Birth, Course
Start Date and Enrolment Date all
belong to the DATE domain - data type
is date/time, format dd/mm/yyyy, nonunique, non-null.

27
Further reading
Rolland chapters 3 and 4
 Hoffer chapters 10 and 12
 Kendall & Kendall chapter 17

28