An introduction to data modelling The purpose of data modelling Modelling data relationships.
Download
Report
Transcript An introduction to data modelling The purpose of data modelling Modelling data relationships.
1
An
introduction to data modelling
The
purpose of data modelling
Modelling
data relationships
2
A
database will contain “a logically
coherent collection of data with some
inherent meaning”[1]
In this module we are dealing with
modelling to support relational databases –
such as Access, MYSQL, Oracle
In a relational database, data is stored in
relations (tables)
The purpose of the data model is to show
how the data is related
3
The
process model provides information about
the processes you are dealing with – to help
you build the data model
The
data model provides information about the
data enabling you to get from this:
(student
enrolment form)
4
5
The
data model includes the ER diagram but
also includes other elements – covered in later
lectures
An
ERD on its own is not a complete data model
6
Student ref no
Award details
• Award title
• Award code
• Level
• Home faculty
• Intake month
• Award completion date
Personal Details
• Family name
• Forename(s)
• Title
• Gender
• DOB
• Previous Family name (if any0
• Work contact address
• Term time address
• Post code
• Home address
• Postcode
• Tel
• Mobile phone
• Postcode
• Tel
• Private email address
There are at least 84
possible items of
information on the
student enrolment form
Most of these items
would need to be
broken down further to
store in a database
Data modelling is the
tool that lets you order
this information
7
Yes
A
– but only if you do it properly
correct data model will give you:
• A diagrammatic representation of the way in which
entities are related
• Data dictionary information about the entities and
the attributes and the relationships
• Properly organised data
• A basis for design
8
10.41
is an item of data
‘The
time is 10:41’ is a piece of
information
Data
is raw, unstructured facts
Information is data organised to provide
meaning
9
An entity
• Is something about which you wish to store information
• STUDENT is an entity
An attribute is information about an entity
• STUDENT DOB is information about the entity STUDENT
It is better to talk about an occurrence of an entity.
We have an entity STUDENT. We then store information
about specific students – Mr John Doe and Ms Jane Doe.
Each of these is an occurrence of the entity STUDENT
One entity may have many occurrences of an entity – you
would expect this – one table may have many rows; there
are many students at the university
10
Data modelling starts by
identifying the entities
An entity is usually a noun –
Customer, Order, Delivery
would be entities
You can use the process
model as the starting point
for identifying entities – a
store on the DFD holds
information about
something
The DFD example
used in Dave’s lecture
include a data store
Since you want to
store information
about staff, reasonable
that this will be an
entity on the data
model
11
If
the descriptions for
the data flow have
been done correctly,
we will know what
‘staff details’ means
Assume it covers:
Name, Address, Age,
Position
12
How would you tell one
member of staff from
another?
How useful is ‘address’?
What is the problem with
‘age’?
Get in the habit now of
always putting data into
atomic form
Look at what the attributes
are really describing
Staff
Staff_name
Staff_Address
Staff_Age
13
Why
isn’t there a process model?
A good place to start is by identifying the
nouns:
• A Student enrols on a Module taught by a Tutor
• The convention is that an entity is always described
in the singular, and starts with a capital
• What does Module mean?
• (if you have done a process model, you will already
have the answer to this question)
14
Always state
assumptions
15
3
entities have been identified:
• STUDENT
• MODULE
• TUTOR
Start
by identifying the primary key for
each entity
16
Uniquely
identifies each occurrence of an
entity; is not null
Which means that:
• A primary key cannot be duplicated. You cannot
have 2 Tutors identified by the same PK
• Cannot be left blank
• Is essential - in a relational database, every
occurrence of an entity must have a PK
17
Last Name?
No, because
the PK must be unique not only
now but in the future.
Module?
OK as long as a tutor only teaches on one
module but –
• Tutors teach on many modules
• Module details and tutors change from time to time
The
obvious answer is a Tutor_ID
18
Must
uniquely identify the occurrence of
the entity
Must not be likely to change (another
reason for not using Name)
Does not usually contain meaningful data
so will tend to be numeric or
alphanumeric
19
Possible
relationships are:
• 1:M
• 1:1
Not
allowed:
• M:M also written as N:M
In
this system, a tutor may teach on many
modules, but a module is only ever
taught by 1 tutor
20
21
Think
of it as the primary key of another
entity that you use to link two entities
22
Mandatory
Optional
FK
PK
Cardinality
Attributes
23
What
happens if you do this?
24
Many
students study many modules
A M:M or N:M relationship cannot be
implemented in a relational database
Since you can’t change reality, you must
change the data model
25
Whenever
you have or appear to have a N:M
relationship, look carefully at the entities
Many students enrol on many modules
1 student can enrol on 1 or more or no modules
A module can be studied by 1 or more or no
students
Hidden in this scenario is the idea of enrolment
26
27
A Weak
(or Link or Intersection) entity
depends for its existence on other
entities.
A student can choose whether to enrol or
not, but an enrolment cannot exist
without the student
Participation is always mandatory on the
Weak entity side
28