An introduction to data modelling The purpose of data modelling Modelling data relationships.
Download ReportTranscript An introduction to data modelling The purpose of data modelling Modelling data relationships.
1 An introduction to data modelling The purpose of data modelling Modelling data relationships 2 A database will contain “a logically coherent collection of data with some inherent meaning”[1] In this module we are dealing with modelling to support relational databases – such as Access, MYSQL, Oracle In a relational database, data is stored in relations (tables) The purpose of the data model is to show how the data is related 3 The process model provides information about the processes you are dealing with – to help you build the data model The data model provides information about the data enabling you to get from this: (student enrolment form) 4 5 The data model includes the ER diagram but also includes other elements – covered in later lectures An ERD on its own is not a complete data model 6 Student ref no Award details • Award title • Award code • Level • Home faculty • Intake month • Award completion date Personal Details • Family name • Forename(s) • Title • Gender • DOB • Previous Family name (if any0 • Work contact address • Term time address • Post code • Home address • Postcode • Tel • Mobile phone • Postcode • Tel • Private email address There are at least 84 possible items of information on the student enrolment form Most of these items would need to be broken down further to store in a database Data modelling is the tool that lets you order this information 7 Yes A – but only if you do it properly correct data model will give you: • A diagrammatic representation of the way in which entities are related • Data dictionary information about the entities and the attributes and the relationships • Properly organised data • A basis for design 8 10.41 is an item of data ‘The time is 10:41’ is a piece of information Data is raw, unstructured facts Information is data organised to provide meaning 9 An entity • Is something about which you wish to store information • STUDENT is an entity An attribute is information about an entity • STUDENT DOB is information about the entity STUDENT It is better to talk about an occurrence of an entity. We have an entity STUDENT. We then store information about specific students – Mr John Doe and Ms Jane Doe. Each of these is an occurrence of the entity STUDENT One entity may have many occurrences of an entity – you would expect this – one table may have many rows; there are many students at the university 10 Data modelling starts by identifying the entities An entity is usually a noun – Customer, Order, Delivery would be entities You can use the process model as the starting point for identifying entities – a store on the DFD holds information about something The DFD example used in Dave’s lecture include a data store Since you want to store information about staff, reasonable that this will be an entity on the data model 11 If the descriptions for the data flow have been done correctly, we will know what ‘staff details’ means Assume it covers: Name, Address, Age, Position 12 How would you tell one member of staff from another? How useful is ‘address’? What is the problem with ‘age’? Get in the habit now of always putting data into atomic form Look at what the attributes are really describing Staff Staff_name Staff_Address Staff_Age 13 Why isn’t there a process model? A good place to start is by identifying the nouns: • A Student enrols on a Module taught by a Tutor • The convention is that an entity is always described in the singular, and starts with a capital • What does Module mean? • (if you have done a process model, you will already have the answer to this question) 14 Always state assumptions 15 3 entities have been identified: • STUDENT • MODULE • TUTOR Start by identifying the primary key for each entity 16 Uniquely identifies each occurrence of an entity; is not null Which means that: • A primary key cannot be duplicated. You cannot have 2 Tutors identified by the same PK • Cannot be left blank • Is essential - in a relational database, every occurrence of an entity must have a PK 17 Last Name? No, because the PK must be unique not only now but in the future. Module? OK as long as a tutor only teaches on one module but – • Tutors teach on many modules • Module details and tutors change from time to time The obvious answer is a Tutor_ID 18 Must uniquely identify the occurrence of the entity Must not be likely to change (another reason for not using Name) Does not usually contain meaningful data so will tend to be numeric or alphanumeric 19 Possible relationships are: • 1:M • 1:1 Not allowed: • M:M also written as N:M In this system, a tutor may teach on many modules, but a module is only ever taught by 1 tutor 20 21 Think of it as the primary key of another entity that you use to link two entities 22 Mandatory Optional FK PK Cardinality Attributes 23 What happens if you do this? 24 Many students study many modules A M:M or N:M relationship cannot be implemented in a relational database Since you can’t change reality, you must change the data model 25 Whenever you have or appear to have a N:M relationship, look carefully at the entities Many students enrol on many modules 1 student can enrol on 1 or more or no modules A module can be studied by 1 or more or no students Hidden in this scenario is the idea of enrolment 26 27 A Weak (or Link or Intersection) entity depends for its existence on other entities. A student can choose whether to enrol or not, but an enrolment cannot exist without the student Participation is always mandatory on the Weak entity side 28