Lecture 2 The Entity-Relationship (ER) Model INFS614, Fall 2008

Download Report

Transcript Lecture 2 The Entity-Relationship (ER) Model INFS614, Fall 2008

The Entity-Relationship (ER)
Model
Lecture 2
INFS614, Fall 2008
INFS614, GMU, Lecture 2
1
Purposes of DBMS

Provide support for “easy-to-use” data
– Data model (data)
– Transaction model (operation)

Provide efficient storage and access of the
data in terms of the data model and
transactional model.
INFS614, GMU, Lecture 2
2
Data Models

Tools to obtain data abstraction.

Necessary to be general and intuitive.

Data model: A class of mathematical
structures, with description and operations

Conceptual data model: Just structural
description
INFS614, GMU, Lecture 2
3
Overview of Database Design

Conceptual design
– Use ER Model: E- Entities and R-Relationships
– Decide the entities and relationships in the
enterprise.
– Decide what information about these entities and
relationships should we store in the database.
– Decide the integrity constraints or business rules.

Implementation
–
Map an ER model into a relational schema.
INFS614, GMU, Lecture 2
4
Entity Relationship Model

The E-R Model is used in the conceptual
design:
– An E-R Model can be represented pictorially: Result
is an ER Schema or an ER Diagram;
– An E-R Model can be mapped into a Relational
Database Schema.
INFS614, GMU, Lecture 2
5
ER Model Basics


Entity: A real-world object distinguishable from
other objects.
Distinguishable via its description (data).
– Example: specific student, faculty, department, event.


Attribute: a mapping that maps an object to a
value (called the attribute value). E.g.: Age is an
attribute of students objects.
An entity is described (in DB) using a set of
attributes values.
– Example: student-sid, name, address, phone, age, ssn.
INFS614, GMU, Lecture 2
6
ER Model Basics

Entity Set: A collection of similar entities.
E.g., all students, all faculties.
– Similar: All entities in an entity set have
the same set of attributes.
INFS614, GMU, Lecture 2
7
ER Diagram: Entity Set &
Example
ssn
name
lot
Employees
Diagram rule:
Entity set: Box (rectangle)
Attribute: “bubble” (oval)
Primary key: underlined
INFS614, GMU, Lecture 2
8
ER Model Basics
Entity Set : Properties


All entities in an entity set have the same
set of attributes
Each attribute has a Domain : a set of
possible values for the attribute :
Domain (age) : {0,1,..,100}
Domain (gender) : { female, male}
Domain (sname) : any string of at most 20
characters

Null versus non-null
INFS614, GMU, Lecture 2
9
Keys of Entity Sets
A superkey of an entity set is a (sub)set of
the attributes such that no two entities in
the set is allowed to have the same values
on all these (key) attributes.
 Candidate key =A superkey that does not
have a “redundant” attribute, i.e., if any
attribute is removed, the set is not a
superkey anymore.
 Primary key = One of the candidate keys
designated to be so.
 Every entity set must have a key.

INFS614, GMU, Lecture 2
10
Keys of Entity Sets
 Example:
– Candidate keys for Employees:
SSN
EmployeeID
– Primary key for Employees:
SSN
INFS614, GMU, Lecture 2
11
ER Model Basics (Contd.)






Relationship: Association among two or more
entities. E.g., Mary works in the Pharmacy
department.
Relationship Set: Collection of similar relationships.
Similarity: is in terms of entity sets where the
entities are from.
E.g.: A person (from Employees entity set) works in a
department (from Departments entity set).
An n-ary relationship set R relates n entity sets E1 ...
En; each relationship in R involves entities e1 in E1,...,en
in En
Same entity set could participate in different
relationship sets, or in different “roles” in same set.
INFS614, GMU, Lecture 2
12
Relationship Set Example
since
name
ssn
dname
lot
Employees
did
Works_In
budget
Departments
Relationship set: Works_In
INFS614, GMU, Lecture 2
13
Descriptive Attributes

Relationships can have attributes.

These attributes are called “descriptive”
attributes, because they only “describe”
relationships, but do not “distinguish”
relationships.

A relationship can only be distinguished by the
participating entities.
INFS614, GMU, Lecture 2
14
Relationship Set Example
since
name
ssn
dname
lot
Employees
did
Works_In
budget
Departments
Each Works_In relationship is uniquely identified by the
combination of employee ssn and department did.
Thus, for a given employee-department pair, we
cannot have more than one associated since value.
INFS614, GMU, Lecture 2
15
Another Relationship Set
role indicators
name
ssn
lot
Employees
supervisor
subordinate
Reports_To
since
Each Reports_To relationship is uniquely identified
by subordinate_ssn and supervisor_ssn.
INFS614, GMU, Lecture 2
16
Instance of a Relationship
Set

An instance of a relationship set is a specific set
of relationships.

Snapshot of the relationship set at some instant in
time.

E.g., Instance of relationship set Works_In:
{(123-22-3666,51), (213-55-1234,57), (232-32-6363,61)}

{(123-22-3666,51), (213-55-1234,57), (213-55-1234,57),
(232-32-6363,61)}
INFS614, GMU, Lecture 2
17
Relationship Sets
Example : Instance of relationship set Works_In :
– Phil Collins works for A-Dept since 3/3/96
– Brad Johnson works for B-Dept since 11/3/98
– Martha Stewart works for C-Dept since 11/3/98
– Will Smith works for C-Dept since 7/30/95
3/3/96
11/3/98
123-22-366
A
11/3/98
534-55-928
633-90-9767
B
11/3/98
C
231-89-6598
7/30/95
7/30/95
Employees
INFS614, GMU, Lecture 2
Works-In
Departments
18
Key Constraints
since
name
ssn
dname
lot
Employees
did
Manages
budget
Departments
Works_In
since


Consider Works_In: An employee can work in many
departments; a dept can have many employees.
In contrast, each dept has at most one manager,
according to the key constraint on Manages.
INFS614, GMU, Lecture 2
19
1:N Relationship Set
Manages : Each dept has at most one manager,
1:N
3/3/96
123-22-366
534-55-928
23
11/3/98
37
633-90-9767
7/30/95
51
231-89-6598
Employees


Manages
Departments
An entity in Departments is associated with at most one
entity in Employees via the relationship Manages
There is a Key Constraint on Dept w.r.t Manages
INFS614, GMU, Lecture 2
20
Key Constraints (Contd.)
 Works_In
relationship set: many-to-
many (M:N).
 Manages
relationship set: one-to-
many (1:N).
INFS614, GMU, Lecture 2
21
Key Constraints
since
name
ssn
dname
lot
Employees

did
Manages
budget
Departments
We add the restriction that each employee can
manage at most one department.
INFS614, GMU, Lecture 2
22
Key Constraints
since
name
ssn
dname
lot
Employees


did
Manages
budget
Departments
We add the restriction that each employee can
manage at most one department.
We obtain a one-to-one (1:1) relationship set.
INFS614, GMU, Lecture 2
23
1:1 Relationship Set
Manages : Each dept has at most one manager, and each
employee can manage at most one dept.
1 : 1
3/3/96
123-22-366
534-55-928
23
11/3/98
37
633-90-9767
7/30/95
51
231-89-6598
Employees
INFS614, GMU, Lecture 2
Manages
Departments
24
Types of Binary Relationship
Sets (Cardinality Constraints)
1-to-1
INFS614, GMU, Lecture 2
1-to Many
Many-to-1
Many-to-Many
25
Key Constraints for Ternary
Relationships
since
name
ssn
dname
lot
did
budget
Employees
Works_In2
Departments
addresses
Locations
capacity
Each employee works in at most one department
and at a single location.
INFS614, GMU, Lecture 2
26
Key Constraints
 An
entity set may participate in a
relationship set as a “key” participant.
 What it means is that each entity of
the “key” entity set can only
participate at most once in the
relationship set.
 More than one entity set can be key
participant (e.g. one-to-one
relationship set).
INFS614, GMU, Lecture 2
27
Participation Constraints

Does every department have a manager?
–
If so, this is a participation constraint: the participation of
Departments in Manages is said to be total (vs. partial).
 Every did value in Departments table must appear in a row
of the Manages table (with a non-null ssn value!)
since
name
ssn
dname
did
lot
Employees
Manages
budget
Departments
Works_In
since
INFS614, GMU, Lecture 2
28
Participation Constraints

Does every department have a manager?
–
If so, this is a participation constraint: the participation of
Departments in Manages is said to be total (vs. partial).
 Every did value in Departments table must appear in a row
of the Manages table (with a non-null ssn value!)
since
name
ssn
dname
did
lot
Employees
Manages
budget
Departments
Works_In
since
INFS614, GMU, Lecture 2
29
Participation Constraints
3/3/96
123-22-366
534-55-928
23
11/3/98
37
633-90-9767
7/30/95
51
231-89-6598
Employees
Partial Participation
INFS614, GMU, Lecture 2
Manages
Departments
Total Participation
30
Weak Entities

Consider the following situation: employees can purchase
insurance policies to cover their dependents.
name
ssn
pname
lot
Employees
Policy
age
Dependents
The attribute pname does not identify a dependent
uniquely.
INFS614, GMU, Lecture 2
31
Weak Entity Sets

A weak entity can be identified uniquely only by
considering the primary key of another (owner)
entity.
–
–
Owner entity set and weak entity set must participate in
a one-to-many relationship set (one owner, many weak
entities).
Weak entity set must have total participation in this
identifying relationship set.
partial key
name
ssn
lot
Employees
INFS614, GMU, Lecture 2
cost
Policy
pname
age
Dependents
32
Another example
Note: the primary key of the strong entity set is not
explicitly stored with the weak entity set, since it is
implicit in the identifying relationship.
INFS614, GMU, Lecture 2
33
Strong vs. Weak entity sets

Strong entity set:
– Has sufficient attributes to form a primary key

Weak entity set:
– Lacks sufficient attributes to form a primary key
– Hence, it lacks enough attributes to form any key.

But every entity needs a key: what do we do?
– Must import attributes from strong entity set(s)
– Importation is done via one-to-many relationships
– A weak entity is subordinate to the dominant entity(-ies)
from strong entity set(s) providing (reliably) attributes
to complete its key.
INFS614, GMU, Lecture 2
34
Class Hierarchies



We may want to classify the entities in an
entity set into subclasses.
E.g.: classify the entities in Employees as
Hourly_Emps or Contract_Emps.
Semantics: every entity in Hourly_Emps
and Contract_Emps is also an Employees.
Thus, must have all attributes of
Employees defined.
INFS614, GMU, Lecture 2
35
ISA (`is a’) Hierarchies
As
in C++, or other PLs,
attributes are inherited.
If we declare A ISA B, every
A entity is also considered to
be a B entity.
hourly_wages
name
ssn
lot
Employees
hours_worked
ISA
contractid
Hourly_Emps
INFS614, GMU, Lecture 2
Contract_Emps
36
Example
INFS614, GMU, Lecture 2
37
name
ISA (`is a’) Hierarchies
hourly_wages
ssn
lot
Employees
hours_worked
ISA
contractid
Hourly_Emps


Contract_Emps
Overlap constraints: Can Joe be an Hourly_Emps as
well as a Contract_Emps entity? (Allowed/disallowed)
Covering constraints: Does every Employees entity also
have to be an Hourly_Emps or a Contract_Emps entity?
(Yes/no)
INFS614, GMU, Lecture 2
38
ISA (`is a’) Hierarchies

Reasons for using ISA:
– To add descriptive attributes specific to a
subclass:
E.g., hourly_wages does not make sense for
Contract_Emps.
– To identify entities that participate in a
relationship:
Define Senior_Emps ISA Employees;
Define Manages relationship between entity sets
Senior_Emps and Departments to ensure that only
senior employees can be managers.
INFS614, GMU, Lecture 2
39
Aggregation


Entity set Projects.
Each Projects
entity is sponsored
by one or more
departments.
A department that sponsors
a project might assign
employees to monitor the
sponsorship.

HOW?
since
started_on
pid
pbudget
Projects
INFS614, GMU, Lecture 2
did
Sponsors
dname
budget
Departments
40
Aggregation

Used when we have
to model a
relationship
involving (entity
sets and) a
relationship set.
–
Aggregation allows
us to treat a
relationship set as
an entity set for
purposes of
participation in
(other)
relationships.
INFS614, GMU, Lecture 2
name
ssn
lot
Employees
Monitors
since
started_on
pid
pbudget
Projects
until
did
Sponsors
dname
budget
Departments
41
Aggregation
 Aggregation vs. ternary
ssn
relationship:
 Monitors is a distinct
relationship,
with a descriptive attribute.
 Also, can say that each
sponsorship
is monitored by at
started_on
most one employee.
pid
lot
Employees
Monitors
until
since
pbudget
Projects
INFS614, GMU, Lecture 2
name
did
Sponsors
dname
budget
Departments
42
Aggregation: another example
Suppose we want to record managers for tasks performed by an
employee at a branch
INFS614, GMU, Lecture 2
43
Example (cont.)

Relationship sets works-on and manages represent
overlapping information
– Every manages relationship corresponds to a works-on
relationship
– However, some works-on relationships may not correspond to
any manages relationships
 So we can’t discard the works-on relationship

Eliminate this redundancy via aggregation
– Treat relationship as an abstract entity
– Allows relationships between relationships
– Abstraction of relationship into new entity

Without introducing redundancy, the following diagram
represents:
– An employee works on a particular job at a particular branch
– An employee, branch, job combination may have an associated
manager
INFS614, GMU, Lecture 2
44
With aggregation
INFS614, GMU, Lecture 2
45

Conceptual Design Using the ER
Model
Design choices:
–
–
–

Should a concept be modeled as an entity or an
attribute?
Should a concept be modeled as an entity or a
relationship?
Identifying relationships: Binary or ternary?
Aggregation?
Constraints in the ER Model:
–
–
A lot of data semantics can (and should) be captured.
But some constraints cannot be captured in ER diagrams.
INFS614, GMU, Lecture 2
46
Entity vs. Attribute
Should address be an attribute of Employees or an
entity (connected to Employees by a relationship)?
 Depends upon the use we want to make of address
information, and the semantics of the data:

 If
we have several addresses per employee, address
must be an entity (since attributes cannot be setvalued).
 If the structure (city, street, etc.) is important, e.g.,
we want to retrieve employees in a given city, address
must be modeled as an entity (since attribute values
are atomic).
INFS614, GMU, Lecture 2
47
Entity vs. Attribute (Contd.)


Works_In2 does not
allow an employee to
work in a department
for two or more periods.
Similar to the problem
of wanting to record
several addresses for
an employee: we want
to record several values
of the descriptive
attributes for each
instance of this
relationship.
from
name
ssn
dname
lot
did
Works_In2
Employees
budget
Departments
name
dname
ssn
lot
Employees
from
INFS614, GMU, Lecture 2
to
did
Works_In3
Duration
budget
Departments
to
48
Entity vs. Relationship


First ER diagram OK if a
manager gets a separate
discretionary budget for
each dept.
What if a manager gets a
discretionary budget that
covers all managed
depts?
– Redundancy of
dbudget, which is
stored for each dept
managed by the
manager.
– Misleading: suggests
dbudget is tied to
managed dept.
since
name
ssn
lot
Employees
dname
did
budget
Departments
Manages2
since
name
ssn
dname
lot
Employees
did
Manages3
budget
Departments
ISA
Managers
INFS614, GMU, Lecture 2
dbudget
dbudget
49
Binary vs. Ternary Relationships
name
ssn

If each policy
is owned by
just 1
employee:
–

Key constraint
on Policies
would mean
policy can only
cover 1
dependent!
What are the
additional
constraints in
the 2nd
diagram?
INFS614, GMU, Lecture 2
pname
lot
Employees
Dependents
Covers
Bad design
age
Policies
policyid
cost
name
pname
ssn
lot
age
Dependents
Employees
Purchaser
Beneficiary
Better design
policyid
Policies
cost
50
Binary vs. Ternary Relationships
(Contd.)


Previous example illustrated a case when two
binary relationships were better than one
ternary relationship.
An example in the other direction: a ternary
relation Contracts relates entity sets Parts,
Departments and Suppliers, and has descriptive
attribute qty.
INFS614, GMU, Lecture 2
51
Binary vs. Ternary Relationships
(Contd.)
Parts
Can-Supply
Needs
Contracts
Departments
Deals-With
qty
Suppliers
No combination of binary relationships is an
adequate substitute:
S “can-supply” P, D “needs” P, and D “deals-with”
S does not imply that D has agreed to buy P from
S. How do we record qty?
INFS614, GMU, Lecture 2
52
Summary of Conceptual Design

Conceptual design follows requirements analysis,
–

Yields a high-level description of data to be stored
ER model popular for conceptual design
–
Constructs are expressive, close to the way people think
about their applications.
Basic constructs: entities, relationships, and
attributes (of entities and relationships).
 Some additional constructs: weak entities, ISA
hierarchies, and aggregation.
 Note: There are many variations on ER model.

INFS614, GMU, Lecture 2
53
Summary of ER (Contd.)

Several kinds of integrity constraints can be expressed in
the ER model: key constraints, participation constraints,
and overlap/covering constraints for ISA hierarchies.
– Some constraints (notably, functional dependencies)
cannot be expressed in the ER model:
 salary
depends on the rank of an employee
 The number of employees who works for a project
is never greater than 15
–
Constraints play an important role in determining the
best database design for an enterprise.
INFS614, GMU, Lecture 2
54
Summary of ER (Contd.)

ER design is subjective. There are often many
ways to model a given scenario! Analyzing
alternatives can be tricky, especially for a large
enterprise. Common choices include:
–

Entity vs. attribute, entity vs. relationship, binary or
n-ary relationship, whether or not to use ISA
hierarchies, and whether or not to use aggregation.
Ensuring good database design: resulting
relational schema should be analyzed and
refined further. FD information and
normalization techniques are especially useful.
INFS614, GMU, Lecture 2
55
Summary of symbols
INFS614, GMU, Lecture 2
56
Summary of symbols (cont.)
INFS614, GMU, Lecture 2
57
Alternative notations in E-R
INFS614, GMU, Lecture 2
58