Transcript Chapter 6

Data Modeling and Database Design

Chapter 6: The Relational Data Model

The Motivation for Logical Data Modeling

• Completion of conceptual modeling phase results in reasonably clear picture of data requirements for application system at high level of abstraction • Note: conceptual data model is technology-independent • During conceptual modeling, analysis and design activity is not constrained by boundaries of anticipated technology that will be used for implementation lest the richness of design will be compromised • Conceptual schema may contain constructs not directly compatible with technology intended for implementation

Chapter 6 – The Relational Data Model

2

The Motivation for Logical Data Modeling (continued)

• Sometimes further refinement may be required to eliminate data redundancy in design • Transforming conceptual schema to state better compatible with implementation technology of choice is achieved via logical data modeling • Logical data modeling phase serves as transition from technology-independent conceptual schema to technology dependent design

Chapter 6 – The Relational Data Model

3

The Relational Data Model

• • • • • • • • • • E. F. Codd in 1970 used the concept of mathematical relations to define the relational data model Database = collection of relations Relation = two-dimensional table Row in the table = related data values = tuple Column in the table = attribute Set of

all

tuples in the table = relation Relation consists of a heading and a body Heading = relation schema, schema, intension, relvar Body = extension Domain = set of possible atomic values of an attribute

Chapter 6 – The Relational Data Model

4

A Technical Definition of a Relational Data Model

• If r is relation whose structure is defined by set of attributes A1, A2, …, An, then R(A1, A2, …, An) is called

relation schema

of relation r • An attribute, A, is ordered pair (N, D) where N is the name of attribute and D is the domain that named attribute represents – Alternatively, an attribute represents a ‘named role’ of a domain • Domain of Ai (i = 1, 2, …, n) is often denoted as

Dom

(Ai) • • Relation schema, R, is named collection of attributes (R, {C}) where R is name of relation schema and {C} is set { (N1, D1), (N2, D2), . . . . . (Nn, Dn) } where N1, N2. …, Nn are distinct names

r

is relation (or relation state) over schema

R Chapter 6 – The Relational Data Model

5

A Technical Definition of a Relational Data Model (continued)

• Relation state r of relation schema R(A1, A2. …, An), also denoted as r(R), is set of n tuples {t1, t2. …, tm} • Each n-tuple tj (j = 1, 2, …, m) in r(R) is ordered list of n values where each vij (i = 1, 2, …, m) is element of

Dom

(Ai) (i = 1, 2, …, n) or when allowed, a missing value represented by special value called null • Number of attributes (n) in R is called

degree

(or

arity

) of R • Number of tuples,

m

, in relation state is called

cardinality

of relation

Chapter 6 – The Relational Data Model

6

A Technical Definition of a Relational Data Model (continued)

• • Relation schema is sometimes loosely (and incorrectly) referred to as a relation – C. J. Date (2004) has coined the term relvar (abbreviation for relation variable) to distinguish relation schema from relation

Relational schema

data model defines set of relation schemas in relational • Important to note difference between relation and relation schema as well as relation schema and relational schema (also known as database schema)

Chapter 6 – The Relational Data Model

7

Characteristics of a Relation

• A relation: – Is equivalent to a two-dimensional table – Has a heading and a body • Attributes of relation schema have unique names • Values of attribute in relation come from same domain that has no null values • Attributes of relation cannot have null values • Order of arrangement of tuples is immaterial

Chapter 6 – The Relational Data Model

8

Characteristics of a Relation (continued)

• Each attribute value in tuple is atomic; hence, composite and multi valued attributes are not allowed in relation • Order of arrangement of attributes in relation schema is immaterial as long as correspondence between attributes and their values in relation is maintained • Derived attributes are not captured in relation schema • All tuples in relation must be distinct (i.e., relation schema must have unique identifier)

Chapter 6 – The Relational Data Model

9

Characteristics of a Relation (continued) Example

Chapter 6 – The Relational Data Model

10

Universal Relation Schema Defined

Universal Relation Schema

(URS) assumption dictates that every attribute name must be unique because attributes have global meaning in database schema – Therefore, if attribute name appears in several relation schemas, all of these denote same meaning • In ER model, however, same attribute name is allowed to appear in different entity types since they imply different roles for the attribute name • Thus, mapping of attributes from ER model to logical schema requires careful attention in order to ensure unique attribute names in logical schema • Note: referencing foreign key and corresponding referenced primary (or alternate) key having the same attribute name in a logical schema does not violate the URS assumption

Chapter 6 – The Relational Data Model

11

Naming Conventions

• Since relational database theory stipulates that attribute names must be unique over entire relational schema, the following guidelines represent the approach used in this chapter for developing attribute names: – Each attribute name begins with up to a three-letter prefix that represents a meaningful abbreviation of the name of relation schema to which attribute belongs • This prefix is followed by an underscore character • Only first letter of prefix is capitalized – Following the underscore character is suffix that corresponds to attribute name itself • This suffix may contain only lowercase letters, a pound sign (#), and underscore characters, and corresponds to name of the attribute in conceptual data model – Examples: Pl_name, Pl_p#, Pl_budget

Chapter 6 – The Relational Data Model

12

Data Integrity Constraints

• Data integrity constraints: – Are rules that govern the behavior of data at all times in a database – Generally referred to as just

integrity constraints

– Are technical expressions of business rules that emerge from user requirement specifications for database application • The source of integrity constraints is the business rules – Prevail across all tiers of data modeling – conceptual, logical, and physical – Are considered to be part of the schema in that they are declared along with structural design of data model (conceptual, logical, and physical) and hold for all valid states of a database that correctly model an application (Ullman and Widom, 1997)

Chapter 6 – The Relational Data Model

13

Classification of Data Integrity Constraints

• Inherent Model-based Constraints: – Constraints driven by modeling grammar • Schema-based or Declarative Constraints: – Domain constraints – Key constraints – Relationship structural constraints – Entity integrity constraints – Referential integrity constraints – Functional dependency constraints • Semantic Integrity Constraints: – Application-based procedural constraints – DBMS-based procedural constraints (e.g., assertions, triggers, etc.)

Chapter 6 – The Relational Data Model

14

Types of Data Integrity Constraints

• State Constraints: – All declarative and procedural constraints that every valid state of a database must satisfy • Transition Constraints: – Procedural constraints that define legal transitions of state

Chapter 6 – The Relational Data Model

15

The Concept of Unique Identifiers

• Superkey: – A set of one or more attributes, which taken collectively, uniquely identifies a tuple of a relation {uniqueness property} • Candidate Key: – A superkey with no proper subset that uniquely identifies a tuple of a relation {uniqueness property + irreducibility} • Primary Key: – A candidate key with no missing values for the constituent attributes {uniqueness property + irreducibility + entity integrity constraint} • Alternate Key: – Any candidate key that is not serving the role of the primary key

Chapter 6 – The Relational Data Model

16

Graphically…

Primary Key Candidate Key Superkey

Chapter 6 – The Relational Data Model

17

Subsets and Proper Subsets

The Difference Between a Subset and a Proper Subset Attribute(s)

Pl_name Pl_p# Pl_budget (Pl_name, Pl_p#)

Subset

Pl_name Pl_p# Pl_budget (Pl_name, Pl_budget) (Pl_name, Pl_p#) Pl_name Pl_p# (Pl_name, Pl_budget) Pl_name Pl_budget (Pl_p#, Pl_budget) (Pl_p#, Pl_budget) Pl_p# Pl_budget (Pl_name, Pl_p#, Pl_budget) (Pl_name, Pl_p#, Pl_budget) Pl_name Pl_p# Pl_budget (Pl_name, Pl_p#) (Pl_name, Pl_budget) (Pl_p#, Pl_budget)

Proper Subset(s)

None None None Pl_name Pl_p# Pl_name Pl_budget Pl_p# Pl_budget Pl_name Pl_p# Pl_budget (Pl_p#, Pl_name) (Pl_name, Pl_budget) (Pl_p#, Pl_budget) Note: A set is a subset of itself, however, a set is not a proper subset of itself.

Chapter 6 – The Relational Data Model

18

A Sample Relation Instance Example

Chapter 6 – The Relational Data Model

19

The Concept of Key/NON-KEY Attribute

• Key attribute: – Any attribute that is a proper subset of a candidate key • Non-key attribute: – Any attribute that is not a subset of a candidate key • Note: An attribute is either a key attribute, a non-key attribute, or a candidate key!

• Example: – Rx_rx# is not a key attribute of PRESCRIPTION-A since it is not a proper subset of a candidate key.

– Rx_rx# is not a non-key attribute of PRESCRIPTION-A since it is a subset of a candidate key.

– Rx_rx# is a candidate key of PRESCRIPTION-A since it is an irreducible superkey of PRESCRIPTION-A.

Chapter 6 – The Relational Data Model

20

The Concept of Key/NON-KEY Attribute (continued)

Chapter 6 – The Relational Data Model

21

Superkey and Candidate Key: Worksheet

Chapter 6 – The Relational Data Model

22

Superkey and Candidate Key: Examples

Chapter 6 – The Relational Data Model

23

Referential Integrity Constraint

• While the key constraints (superkey and candidate key) and entity integrity constraint (primary key) pertain to individual relation schemas, a referential integrity constraint is specified between two relation schemas, R1 and R2.

• Specifically, a referential integrity constraint is specified between two relations in order to maintain consistency across tuples of the two relations.

• Informal definition: A tuple in one relation that refers to another relation must refer to an existing tuple in that relation.

• Foreign key constraint: A specific (special) form of referential integrity constraint specification

Chapter 6 – The Relational Data Model

24

Foreign Key Constraint

• Foreign Key Constraint Establishes an explicit association between two relation schemas and maintains the integrity of such an association • Foreign key: An attribute(s) set, A2, in a relation schema R2 that shares the same domain with a candidate key (A1) of another relation schema R1; A2 is said to reference or refer to the relation schema R1. Note: R2 is known as the referencing relation and R1 is called the referenced relation. The attribute(s) doing the referencing (A2 in R2) is the foreign key, while the candidate key being referenced (A1 in R1) is the referenced attribute(s).

• Referred to as Inclusion Dependency, this constraint is algebraically expressed as: R2.{ A2} C R1.{ A1}

Chapter 6 – The Relational Data Model

25

Foreign Key Constraint: Example (Source Schema)

Box 1

Bearcat Incorporated is a manufacturing company that has several plants in the northeastern part of the United States.

These plants are responsible for leading different projects that the company might undertake, depending on a plants’ function. A certain plant might even be associated with several projects but a project is always under the control of just one plant.

Some plants do not undertake any projects at all

. If a plant is closed down, the projects undertaken by that plant cannot be canceled. The project assignments from a closed plant must be temporarily removed in order to allow the project to be transferred to another plant. PLANT

1

Undertaken_by

n

PROJECT

Chapter 6 – The Relational Data Model

26

Example of a Foreign Key (Version 1)

Chapter 6 – The Relational Data Model

27

Example of a Foreign Key (Version 2)

Chapter 6 – The Relational Data Model

28

Naming Convention for Foreign Keys

• The name of a foreign key attribute in the referencing relation schema consists of: – The prefix used for the attribute names in the referencing relation schema, – An underscore, and – The referenced attribute name

Chapter 6 – The Relational Data Model

29

A Brief Introduction to Relational Algebra

Selection (σ) :

Selects all tuples that satisfy the selection condition from a relation R.

Projection (π) :

Produces a new relation with only some of the attributes of R, and removes duplicate tuples.

.

Union (U) :

Produces a relation that includes all the tuples in R1 or R2 or both R1 and R2; R1 and R2 must be union compatible.

Chapter 6 – The Relational Data Model

30

Introduction to Relational Algebra (continued)

Intersection (∩) :

Produces a relation that includes all the tuples in both R1 and R2; R1 and R2 must be union compatible.

Difference (-) :

Produces a relation that includes all the tuples in R1 that are not in R2; R1 and R2 must be union compatible.

.

Natural Join (*) :

Produces all the combinations of tuples from R1 and R2 that satisfy a join condition; R1 and R2 must be join compatible.

Chapter 6 – The Relational Data Model

31

Example Relational Schema

Chapter 6 – The Relational Data Model

32

Relation Instances of the Example Relational Schema

AW_PLANT TX_PLANT PROJECT

Aw_pl_name

Black Horse Mayde Creek Whitefield River Oaks King's Island Ashton

Tx_pl_name

Southern Oaks River Oaks Kingwood

Prj_name

Solar Heating Lunar Cooling Synthetic Fuel Nitro-Cooling Robot Sweeping Robot Painting Ozone Control

Aw_pl_p# Aw_pl_budget

11 13 12 17 19 15 1230000 1930000 2910000 1930000 2500000 2500000

Tx_pl_p#

16 17 18

Prj_n#

41 17 29 23 31 37 13

Tx_pl_budget Prj_location

Sealy Salem Parthi Parthi 1230000 1930000 1930000 Yoakum Ponca City Yoakum

Prj_aw_pl_p#

11 17 17 12 11 19 19

Chapter 6 – The Relational Data Model

33

Selection Operation

• • The SELECT operator is used to select a horizontal subset of the tuples that satisfy a selection condition from a relation.  (R) •

Example: Which award-winning plants have a budget that exceeds $2,000,000?

2000000> (AW_PLANT)

R_aw_pl_name

Whitefield King's Island Ashton

R_aw_pl_p#

12 19 15

R_aw_pl_budget

2910000 2500000 2500000

Chapter 6 – The Relational Data Model

34

Projection Operation

• • The PROJECT operator selects certain

attributes

from the relation and discards other attributes. In other words, a Projection operation forms a new relation by taking a vertical subset of an existing relation.  (R)

Chapter 6 – The Relational Data Model

35

Projection Operation (continued)

• Example:

What is the plant number and budget of each award-winning plant?

(AW_PLANT)

R_aw_pl_p#

11 13 12 17 19 15

R_aw_pl_budget

1230000 1930000 2910000 1930000 2500000 2500000

Chapter 6 – The Relational Data Model

36

Set-Theoretic Operators

Chapter 6 – The Relational Data Model

37

Union Compatibility in Set-Theoretic Operators

Two relations R(A1, A2, …, An) and S(B1, B2, …, Bn) are said to be union compatible: – If they have the same degree (i.e., have the same number of attributes) and – If the domain of Ai is equal to the domain of Bi for 1  i  n (i.e., corresponding attributes in R and S share the same domain)

Chapter 6 – The Relational Data Model

38

Set-Theoretic Operators Defined

Union

(R  S) Yields a relation that includes all tuples that are either exclusively in R or exclusively in S or in both R and S. That is, duplicate tuples are eliminated. •

Intersection

(R  S) Yields a relation that includes all tuples that are in both R and S. •

Difference

(R – S) Yields a relation that includes all tuples that are in R but not in S.

Chapter 6 – The Relational Data Model

39

Union Operation

• Example:

What plants are located either in Texas or are award winning plants?

AW_PLANT  TX_PLANT

R_aw_pl_name

Black Horse Mayde Creek Whitefield River Oaks King's Island Ashton

R_aw_pl_p#

11 13 12 17 19 15

R_aw_pl_budget

1230000 1930000 2910000 1930000 2500000 2500000 Southern Oaks Kingwood 16 18 1230000 1930000 Note: AW_PLANT and TX_PLANT are union-compatible

Chapter 6 – The Relational Data Model

40

Intersection Operation

Example:

Which award-winning plants are located in Texas?

AW_PLANT  TX_PLANT

R_aw_pl_name

River Oaks

R_aw_pl_p#

17

R_aw_pl_budget

1930000 Note: AW_PLANT and TX_PLANT are union-compatible

Chapter 6 – The Relational Data Model

41

Difference Operation

• Example:

Which Texas plants are not award-winning plants?

TX_PLANT - AW_PLANT

R_aw_pl_name

Southern Oaks Kingwood

R_aw_pl_p#

16 18

R_aw_pl_budget

1230000 1930000 Note: AW_PLANT and TX_PLANT are union-compatible

Chapter 6 – The Relational Data Model

42

Natural Join Operation

• • Example:

Perform a natural join of the award-winning plant and project relations. The plant attributes needed are only P# and P_name.

A natural join between

PROJECT

and a projection from

AW_PLANT PROJECT * (  AW_PLANT)

R_prj_name

Solar Heating Lunar Cooling Synthetic Fuel Nitro-Cooling Robot Sweeping Robot Painting Ozone Control

R_prj_n# R_prj_location R_prj_pl_p# R_aw_pl_name

41 Sealy 11 Black Horse 17 29 Yoakum Salem 17 17 River Oaks River Oaks 23 31 37 13 Parthi Ponca City Yoakum Parthi 12 11 19 19 Whitefield Black Horse King's Island King's Island

Chapter 6 – The Relational Data Model

43

Views

• A view is a named ‘virtual’ relation schema constructed from one or more relation schemas.

• Unlike a relation schema, a view does not store data.

• A view is just a logical window used to ‘view’ selected data (attributes and tuples) from one or a set of relations. • The value of a view at any given time is a ‘derived’ relation state resulting from the evaluation of a specified relational expression (e.g., join, project) at that time.

Chapter 6 – The Relational Data Model

44

Advantages of Views

• Views allow the same data to be seen by different users in different ways at the same time. • Views provide security by restricting user access to a predetermined set of tuples and attributes from predetermined relations.

• Views hide data complexity from the user.

Chapter 6 – The Relational Data Model

45

Mapping: ER Model → Logical Schema

• More specifically: Mapping a Fine-granular Design Specific ER model into a relational schema • The goal of any schema transformation method ought to be the preservation of the information capacity of the source data model (e.g., ER model) in the target data model (e.g., relational data model).

Chapter 6 – The Relational Data Model

46

Mapping Entity Types

• Create a relation schema for each base entity type in the ER diagram.

• Create an attribute for every stored attribute. This implies: – For composite attributes only their constituent atomic components are recorded – Derived attributes are not recorded – Multi-valued attributes do not exist in a Fine-granular Design Specific ER model • Choose a primary key from among the candidate keys by underlining the attribute(s) constituting the primary key.

• For each weak entity type, add the primary key of the identifying parent entity type as attribute(s) in the relation schema. – The attribute(s) thus added plus the partial key of the weak entity type form the primary key of the relation schema representing the weak entity type.

Chapter 6 – The Relational Data Model

47

Example

Emp_a [N,5] Emp_n [A,1] Minit [A,20] Fname Emp# [A,1] Gender [Dt,8] [A,20] Lname Name [N,1] Name_tag [X,50] Address EMPLOYEE (1, 1) [N,6] Salary [A,20] Building

- - - - -

[N,2] No_of_dependents BUILDING

Works_in

(1,1) [A,30] Pl_name [N,2] Pnumber (0, n)

R

[N,3] No_of_employees PLANT

R

(3, n) Budget

Houses

Figure 6.2 Excerpt from Fine-granular Design-Specific ER diagram of Figure 3.12

Chapter 6 – The Relational Data Model

48

Relation Schemas for the Entity Types in Figure 6.2

EMPLOYEE Emp_e#a Emp_e#n Emp_fname Emp_minit Emp_lname Emp_nametag Emp_gender Emp_address Emp_salary Emp_datehired PLANT PI_name Pl_p# Pl_budget BUILDING Bld_building Bld_pl_p# Note 1: Only the atomic attributes constituting Emp# and Name in Figure 6.2 are recorded in EMPLOYEE.

Note 2: The derived attributes, No_of_dependents in EMPLOYEE and No_of_employees in PLANT (Figure 6.2) are not captured here.

Chapter 6 – The Relational Data Model

49

Mapping Relationship Types

• Characteristic of the Design-Specific ER model: – Only binary or recursive relationships are present – Only 1:1 or 1:n cardinality constraints are present • Relationship type mapped by enforcing a foreign key constraint between the relation schemas participating in the relationship type.

• Remember: – A referential integrity constraint requires that the referenced attribute(s) exist in the referenced relation schema – The foreign key constraint requires that the referenced attribute(s) be a candidate key of the referenced relation.

• Note: Participation constraints are not mapped.

Chapter 6 – The Relational Data Model

50

Mapping Cardinality Ratio of 1:n

• Add a foreign key attribute to the referencing schema (Child in the relationship) – A foreign key must share the same domain with a candidate key in the referenced schema (Parent) – Best Practice: use primary key as the referenced attribute • Remember: – The entity type on the “many-side” of the relationship type is the referencing relation schema (“child”), – The entity type on the “one-side” is the referenced relation schema (“parent”).

Notation: Directed arc emanating from a foreign key points to a candidate key/the primary key)

Chapter 6 – The Relational Data Model

51

Mapping Relationship Types [Cardinality constraint of 1:n]

Note: The foreign key in EMPLOYEE (Emp_pl_name) referencing PLANT is referring to a candidate key (Pl_name) of PLANT – not its primary key, while the foreign key in BUILDING (Bld_pl_p#) referencing PLANT is referring to the primary key (Pl_p#) of PLANT.

Chapter 6 – The Relational Data Model

52

Alternative Notation for Mapping Relationships

Chapter 6 – The Relational Data Model

53

What About Participation Constraints?

• Suppose only some employees work in a plant – (0,1) instead of (1,1) [A,1] Emp_a [N,5] Emp_n [A,1] Minit [A,20] Fname Emp# [A,1] Gender [Dt,8] Date_hired [N,2] No_of_dependents [A,20] Lname Name [N,1] Name_tag [X,50] Address EMPLOYEE (0, 1) [N,6] Salary [A,30] Pl_name

Works_in

(0,n)

R

[N,3] No_of_employees PLANT [N,2] Pnumber [N,7] Budget

Chapter 6 – The Relational Data Model

54

Example: Source Schema for Mapping

[A,1] Emp_a [N,5] Emp_n [A,1] Minit [A,20] Fname Emp# [A,1] Gender [Dt,8] Date_hired [A,20] Lname Name [N,1] Name_tag [X,50] Address EMPLOYEE (0, 1) [N,6] Salary [A,20] Building

- - - - -

[N,2] No_of_dependents BUILDING

Works_in

(1,1) [A,30] Pl_name [N,2] Pnumber (0, n)

R

[N,3] No_of_employees PLANT

R

(3, n) [N,7] Budget

Houses

Figure 6.5: Reproduction of Figure 6.2 with a change in participation constraint of EMPLOYEE in

Works_in

Note: While the logical design in Figure 6.3 can be offered as a solution to this mapping, the partial participation of EMPLOYEE in

Works_in

entails possible null values for the foreign key Emp_pl_name in EMPLOYEE in some tuples. In a strict sense, this condition violates the definition of a relation schema.

Chapter 6 – The Relational Data Model

55

Solution A: Foreign Key Design (Figure 6.3) (Partial Participation of EMPLOYEE in

Works_in

)

EMPLOYEE Emp_e#a Emp_e#n Emp_fname Emp_minit Emp_lname Emp_nametag Emp_gender Emp_address Emp_salary Emp_datehired Emp_pl_name PLANT PI_name Pl_p# Pl_budget BUILDING Bld_building Bld_pl_p# Note: The foreign key in EMPLOYEE (Emp_pl_name) referencing PLANT may have null values in some tuples because of the partial participation of EMPLOYEE in

Works_in

. In a strict sense, this condition violates the definition of a relation schema.

Chapter 6 – The Relational Data Model

56

Solution A (Foreign Key Design) Demonstrated

• (1,1): Foreign key must not have null values.

PLANT Pl_number Pl_name

10 Aberdeen 11 Bellaire 12 Edloe 13 Underwood

EMPLOYEE Emp_n Emp_a

7 XY 5 YW 12 FX 22 CD 44 XP

Emp_name

Biggio Bagwell Kent Clemens Oswalt

Emp_pl_number

13 13 11 11 10 • (0,1): Foreign key can have null values.

PLANT Pl_number Pl_name

10 Aberdeen 11 Bellaire 12 Edloe 13 Underwood

EMPLOYEE Emp_n Emp_a

7 XY 5 YW 12 FX 22 CD 44 XP

Emp_name

Biggio Bagwell Kent Clemens Oswalt

Emp_pl_number

13 13 11 NULL 10

Chapter 6 – The Relational Data Model

57

Solution B (Cross-Referencing Design)

• Cross-referencing design: Create a separate relation schema representing the relationship type.

– Tuples in the relation “WORKS_IN” represent only employees who actually work in a plant.

– All tuples in “WORKS_IN” have corresponding employee tuples in the EMPLOYEE relation.

– The EMPLOYEE relation may also contain additional tuples i.e., tuples for employees who do not work in any plant.

Chapter 6 – The Relational Data Model

58

Solution B (Cross-Referencing Design) (Partial Participation of EMPLOYEE in

Works_in

)

Note: WORKS_IN contains only employees that work in a plant. The foreign key in WORKS_IN (Wrk_pl_p#) referencing PLANT will never have a null value in any tuple. The partial participation of EMPLOYEE in

Works_in

results from tuples in EMPLOYEE not referenced by WORKS_IN. The design, however requires addition of a relation schema.

Chapter 6 – The Relational Data Model

59

Solution B (Cross-Referencing Design) (Alternative Notation)

EMPLOYEE ( Emp_e#a, Emp_e#n , Emp_fname, Emp_minit, Emp_lname, Emp_nametag, Emp_gender, Emp_address, Emp.salary, Emp_datehired) WORKS_IN (Wrk_emp_e#a, Wrk_emp_e#n, Wrk_pl_p#) PLANT (PI_plname, Pl_p#, Pl_budget) BUILDING (Bld_building, Bld_pl_p#)

Solution B (Cross-referencing Design) Demonstrated

Foreign key design: Partial participation of EMPLOYEE in the relationship permits null value for foreign key violating the strict definition of a relation schema.

PLANT Pl_number Pl_name

10 Aberdeen 11 Bellaire 12 Edloe 13 Underwood

EMPLOYEE Emp_n Emp_a

7 XY 5 YW 12 FX 22 CD 44 XP

Emp_name

Biggio Bagwell Kent Clemens Oswalt

Emp_pl_number

13 13 11 NULL 10 Cross referencing design: Despite partial participation of EMPLOYEE in the relationship, foreign key will not have null value in any tuple, thus complying with the strict definition of a relation schema .

PLANT Pl_number Pl_name

10 Aberdeen 11 Bellaire 12 Edloe 13 Underwood

EMPLOYEE Emp_n Emp_a

7 XY 5 YW 12 FX 22 CD 44 XP

Emp_name

Biggio Bagwell Kent Clemens Oswalt

WORKS_IN

Emp_n Emp_a

7 XY 5 YW 12 FX 44 XP

Wk_pl_number

13 13 11 10

Chapter 6 – The Relational Data Model

61

Mapping Cardinality Ratio 1:1

• Case 1: The participation of one of the entity types in the relationship type is total.

• Case 2: The participation of both entity types in the relationship type is partial.

• Case 3: The participation of both entity types in the relationship type is total.

Chapter 6 – The Relational Data Model

62

Case 1 - The participation of one entity type participating in the relationship type is total

Chapter 6 – The Relational Data Model

63

Case 1 - The participation of one entity type participating in the relationship type is total

Steps: • Choose the entity type with the total participation in the relationship to be the referencing relation schema (i.e., child).

• Add the foreign key to the child relation schema referencing the primary key (or an alternate key) in the referenced relation schema.

• Foreign key must not have a null value.

[Specification of total participation of child in the relationship] • Foreign key must be unique.

[Specification of 1:1 cardinality constraint]

Chapter 6 – The Relational Data Model

64

Case 1 Solution: Foreign Key Design

Chapter 6 – The Relational Data Model

65

Case 2 - The participation of both entity types in the relationship type is partial

Chapter 6 – The Relational Data Model

66

Case 2 - The participation of both entity types in the relationship type is partial

Steps: • Choose any entity type to be the child.

• Add a foreign key attribute to the child relation schema referencing the primary key (or an alternate key) in the referenced relation schema.

– Foreign key is allowed to have a null value.

Note: Since from a strictly design perspective either one of the relation schemas in this (1:1) relationship with partial participation of both relation schemas can be chosen as the child (carrying the foreign key), often other semantic or operational consideration may suggest the best choice.

Chapter 6 – The Relational Data Model

67

Case 2 Solution: Foreign Key Design

PLANT Pl_number Pl_name

10 Aberdeen 11 Bellaire 12 Edloe 13 Underwood

EMPLOYEE Emp_n Emp_a

7 XY 5 YW 12 FX 22 CD 44 XP

Emp_name

Biggio Bagwell Kent Clemens Oswalt

Emp_pl_number

13 13 11 NULL 10 Note: In this design the foreign key may have null value in some tuples

Chapter 6 – The Relational Data Model

68

Case 2: An Alternative Solution [Cross-Referencing Design]

• • Create a separate relation schema representing the relationship type. Here, MANAGED_BY references PLANT as well as EMPLOYEE.

Note: Foreign keys in MANAGED_BY will never have a null value in any tuple

Chapter 6 – The Relational Data Model

69

Mutual Referencing Design

• Can be used for both case 1 and case 2 • Since this is a (1:1) relationship, both relation schemas can be parent as well as child. – Add foreign key to both relation schemas • Demerits: – Unnecessary to establish the relationship – Creates a cycle; so, enforcement of at least one of the referencing must be deferred until run time.

– Entails specification of additional constraints for consistency maintenance across the relations • Merit: Can facilitate optimal data access

Chapter 6 – The Relational Data Model

70

Case 3 - The participation of both entity types in the relationship type is total

Chapter 6 – The Relational Data Model

71

Case 3 - The participation of both entity types in the relationship type is total

Steps • Use mutual referencing – i.e., add a foreign key in both relation schemas.

• Issues: {Require procedural intervention} – Consistency maintenance across the relation schemas – Mutual referencing creates a cycle requiring run-time resolution • Alternative Solution: Use a single-schema design (i.e., collapse the two relation schemas into one)

Chapter 6 – The Relational Data Model

72

Case 3 Solution: Mutual Referencing Design

Note 1: Mutual referencing between PLANT and MANAGER Note 2: Foreign key design and cross-referencing design will not yield correct solution.

Chapter 6 – The Relational Data Model

73

Information Lost in the Mapping Process

• Both methods (Directed arc and Inclusion dependency) are incapable of distinguishing between 1:1 and 1:n cardinality ratios • Both methods do not map the participation constraints of a relationship type • The optional/mandatory property of an attribute is not retained in the transformation • Alternate keys (candidate keys not chosen as the primary key) can no longer be identified

Chapter 6 – The Relational Data Model

74

Information Lost in the Mapping Process (continued)

• The composite nature of some collection of atomic attributes is ignored in the mapping process • Derived attributes specified in the ER diagram are not carried forward • Deletion rules are not mapped to the logical schema • Attribute type and size specified in the ER diagram are not carried forward

Chapter 6 – The Relational Data Model

75

Alternatively, Use Another Notation…

Information-preserving grammar for the logical schema

Chapter 6 – The Relational Data Model

76

Source Schema for Information-preserving Mapping

[A,5] Suffix [A,1] Prefix [A,1] Minit [A,20] Fname [A,20] Lname [N,1] Name_tag [X,50] Address [N,6] Salary [A,30] Pl_name [N,2] Pnumber [N,7] Budget [Dt,8] Mgr_start_dt [N,3] No_of_employees [A,1] Gender [Dt,8] Date_hired [N,2] No_of_dependents [N,2] Pnumber [A,20] Pr_name [A,15] Plocation [A,15] Dname [A,12] Related_how [N,3] Hours [A,20] Building [A,1] Acct_type [X,6] Account# [Dt,8] Birthdate [A,1] Gender [N,6] Annual_cost [A,20] Hb_name (1, 1) [N,2.1] Hrs_per_wk [N,8.2] Balance Io_activity [A,1] Gi_activity [A,1] Figure 6.20 Fine-granular Design-Specific ER diagram for Bearcat Incorporated

Chapter 6 – The Relational Data Model

77

Solution 1: Directed Arc Method (Information-reducing)

EMPLOYEE Emp_fname Emp_minit Emp_lname Emp_nametag Emp_gender Emp_address Emp_salary Emp_datehired Emp_pl_name Emp-emp_e#a Emp_emp_e#n Emp_e#a Emp_e#n PLANT Pl_name Pl_p# Pl_budget Pl_emp_e#a Pl_emp_e#n Pl_mgrstdte BUILDING Bld_building Bld_pl_p# PROJECT Prj_name Prj_location Prj_p# Prj_pl_p# ASSIGNMENT Asg_prj_p# Asg_emp_e#a Asg_emp_e#n Asg_hrs DEPENDENT Dep_sex Dep_brthdte Dep_name Dep_relhow Dep_emp_e#a Dep_emp_e#n BCU_ACCOUNT Bcu_dep_name Bcu_dep_relhow Bcu_dep_emp_e#a Bcu_dep_emp_e#n Bcu_acct_type Bcu_acct# Bcu_balance Bcu_emp_e#a Bcu_emp_e#n PARTICIPATION Par_dep_name Par_dep_relhow Par_dep_emp_e#a Par_dep_emp_e#n Par_hob_name Par_anncost Par_hrsweek HOBBY Hob_name Hob_ioact Hob_giact Figure 6.21 Logical schema for the Bearcat Incorporated: Foreign key design using directed arcs

Chapter 6 – The Relational Data Model

78

Expression of Solution 1 using Inclusion Dependencies

Figure 6.22 Logical schema for the Bearcat Incorporated: Foreign key design using inclusion dependencies

Chapter 6 – The Relational Data Model

79

Expression of Solution 1 using Inclusion Dependencies

L1: EMPLOYEE (Emp_fname, Emp_minit, Emp_lname, Emp_nametag, Emp_emp_e#a, Emp_emp_e#n, Emp_address, Emp_salary, Emp_pl_name, Emp_gender, Emp_datehired, Emp_e#a, Emp_e#n) # EMPLOYEE.{Emp_emp_e#a, Emp_emp_e#n}  EMPLOYEE.{Emp_e#a, Emp_e#n} or

null

EMPLOYEE.{Emp_pl_name}  PLANT.{Pl_name} L2: PLANT (Pl_p#, Pl_budget, Pl_name, Pl_emp_e#a, Pl_emp_e#n, Pl_mgrstdte) # PLANT.{Pl_emp_e#a, Pl_emp_e#n}  EMPLOYEE.{Emp_e#a, Emp_e#n} L3: BUILDING (Bld_building, Bld_pl_p#) # BUILDING.{Bld_pl_p#}  PLANT.{Pl_p#}

Chapter 6 – The Relational Data Model

80

Solution 1 using Inclusion Dependencies (continued)

L4: PROJECT (Prj_name, Prj_location, Prj_p#, Prj_pl_p#) # PROJECT.{Prj_pl_p#}  PLANT.{Pl_p#} or

null

L5: ASSIGNMENT (Asg_prj_p#, Asg_emp_e#a, Asg_emp_e#n, Asg_hrs) # ASSIGNMENT.{Asg_prj_p#}  PROJECT.{Prj_p#} ASSIGNMENT.{Asg_emp_e#a, Asg_emp_e#n}  EMPLOYEE.{Emp_e#a, Emp_e#n} L6: DEPENDENT (Dep_sex, Dep_brthdte, Dep_name, Dep_relhow, Dep_emp_e#a, Dep_emp_e#n) # DEPENDENT.{Dep_emp_e#a, Dep_emp_e#n}  EMPLOYEE.{Emp_e#a, Emp_e#n}

Chapter 6 – The Relational Data Model

81

Solution 1 using Inclusion Dependencies (continued)

L7: BCU_ACCOUNT (Bcu_dep_name, Bcu_dep_relhow, Bcu_dep_emp_e#a, Bcu_dep_emp_e#n, Bcu_acct_type, Bcu_acct#, Bcu-balance, Bcu_emp_e#a, Bcu_emp_e#n) # BCU_ACCOUNT.{Bcu_emp_e#a, Bcu_emp_e#n}  EMPLOYEE.{Emp_e#a, Emp_e#n} or

null

BCU_ACCOUNT.{Bcu_dep_name, Bcu_dep_relhow, Bcu_dep_emp_e#a, Bcu_dep_emp_e#n}  DEPENDENT.{ Dep_name, Dep_relhow, Dep_emp_e#a, Dep_emp_e#n} or

null

L8: PARTICIPATION (Par_dep_name, Par_dep_relhow, Par_dep_emp_e#a, Par_dep_emp_e#n, Par_hob_name, Par_anncost, Par_hrsweek) # PARTICIPATION.{Par_hob_name}  HOBBY.{Hob_name} PARTICIPATION.{Par_dep_name, Par_dep_relhow, Par_dep_emp_e#a, Par_dep_emp_e#n}  DEPENDENT.{ Dep_name, Dep_relhow, Dep_emp_e#a, Dep_emp_e#n} L9: HOBBY (Hob_name, Hob_Ioact, Hob_giact)

Chapter 6 – The Relational Data Model

82

Solution 2: Information-preserving Mapping

L1: EMPLOYEE

[

Emp_e#a (A,1) Emp_e#n

]

(N,5)

Q[

Emp_fname  Emp_minit (A,20) (A,1) (A,20) Emp_nametag

]

 Emp_gender  Emp_address  Emp_salary (N 1) (A,1) (X,50) (N,6) 100 L2 n Emp_datehired 

Emp_pl_name

(Dt,8)

(A,30)

1 R 1 0 <- - - - - L1 - - - - -> 20 Emp_emp_e#a (A,1) Emp_emp_e#n (N,5) No_of_dependends - - - - - - - - - (N 2) 0 <- - - - - D - - - - -> 1 L2:: PLANT

Q[

Pl_name

]

(A,30) Pl_p# (N,2) Pl_budget (N,7) 0 <- - - - - L1 - - - -> 1

Pl_emp_e#a

(A,1)

Pl_emp_e#n

(N,5) Pl_nemps - - - - - - - - - - (N,3) 1 < - - - - R - - - - > 1 Pl_mgrstdte (Dt,8) L3: BUILDING 3 L2 n Bld_building  (A,20)

Bld_pl_p#

(N,2) 1 R 1 L4: PROJECT

Q[

Prj_name

]

(A,20) Prj_location  (A,15) Prj_p# (N,2) 0 L2 n

Prj_pl_p#

(N,2) 0 N 1 L5: 1 L4 m ASSIGNMENT

Asg_prj_p#

(N,2) 1 C 1 1 <- - - - - L1 - - - - -> n

Asg_emp_e#a

(A,1)

Asg_emp_e#n

(N,5) 1 <- - - - - R - - - - -> 1 Asg_hrs (N,3)  L6: DEPENDENT Dep_sex (A, 1) Dep_brthdte (Dt, 8) [ Dep_name  (A, 15) Dep_relhow (A, 12) 0 <- - - - - - - L1 - - - - - - -> n

Dep_emp_e#a

(A, 1)

Dep_emp_e#n

(N, 5) 1 <- - - - - - - C - - - - - - -> 1 L7: BCU_ACCOUNT 0 <- - - - - - - - - - - - - - - L6 - - - - - - - - - - - - - - -> n 1 <- - - - - - - L1- - - - - - -> m

Bcu_dep_name

(A, 15)

Bcu_dep_relhow

(A, 12)

Bcu_dep_emp_e#a

(A, 1)

Bcu_dep_emp_e#n

(N, 5)

[

Bcu_acct_type (A, 1) Bcu_acct#

]

(X, 6) Bcu_balance (N, 8.2)

Bcu_emp_e#a

(A, 1)

Bcu_emp_e#n

(N, 5) 0 <- - - - - - - - - - - - - - - C - - - - - - - - - - - - - - -> 1 0 <- - - - - - - C - - - - - - -> 1 L8: PARTICIPATION 0 <- - - - - - - - - - - - - - - - - - - - L6 - - - - - - - - - - - - - - - - - - - - -> n

Par_dep_name

(A, 15)

Par_dep_relhow

(A, 12)

Par_dep_emp_e#a

(A, 1)

Par_dep_emp_e#n

(N, 5) 1 <- - - - - - - - - - - - - - - - - - - - - C - - - - - - - - - - - - - - - - - - - - - - -> 1 1 L9 m

Par_hob_name

(A, 20) 1 R 1 Par_anncost (N, 6) Par_hrsweek (N, 2.1) L9: HOBBY Hob_name (A, 20) Hob_ioact (A, 1) Hob_giact (A, 1) Figure 6.24 Information-preserving logical schema for the Fine-granular Design-Specific ER diagram for Bearcat Incorporated in Figure 6.20

Chapter 6 – The Relational Data Model

83

Mapping EER Constructs

• Mapping a specialization/generalization hierarchy and lattice • Mapping a categorization • Mapping an aggregation Remember: • SC/sc relationships always have a cardinality ratio of 1:1 • The participation of a subclass in the relationship is always total

Chapter 6 – The Relational Data Model

84

Source Schema for Mapping a Specialization

Note: Disjoint specialization partial completeness

Chapter 6 – The Relational Data Model

85

Solution 1

• • • This solution fully supports disjointness (disjoint or overlap) and completeness (partial or total) constraints of a specialization construct However, the solution entails creation of four relation schemas and three foreign key references The most general solution

Chapter 6 – The Relational Data Model

86

Solution 2

• • • • This solution also fully supports disjointness (disjoint or overlap) and completeness (partial or total) constraints of a specialization construct The solution entails creation of just one relation schema There are no foreign key references However, disjoint subclass attributes will have null values in every tuple

Chapter 6 – The Relational Data Model

87

Solution 3

• • • • • • This solution fully supports disjointness (disjoint or overlap) constraint of a specialization construct The solution entails creation of three relation schemas There are no foreign key references The disjoint specialization will not cause null values in any tuple Caution: Partial completeness not supported – mapping will result in loss of tuples Note : Best solution when completeness constraint is ‘Total’

Chapter 6 – The Relational Data Model

88

Multiple Specializations and Specialization Hierarchy

Chapter 6 – The Relational Data Model

89

Foreign Key Design for Multiple Specializations and Specialization Hierarchy

Chapter 6 – The Relational Data Model

90

Source Schema for a Specialization Lattice and a Categorization

Chapter 6 – The Relational Data Model

91

Foreign Key Design for the Specialization Lattice and Categorization

Surrogate Key

Chapter 6 – The Relational Data Model

92

Source Schema for an Aggregation and a Categorization

Chapter 6 – The Relational Data Model

93

Foreign Key Design for the Aggregation and Categorization

Aggregation Aggregation Categorization Categorization Surrogate Key

Chapter 6 – The Relational Data Model

94

Information Lost in EER Mapping

• The type of relationship (e.g., specialization/generalization, categorization, aggregation) is not carried forward to the logical schema • SC/sc relationships (i.e., intra-entity class relationships) become indistinguishable from the regular (i.e., inter-entity class) relationships • The disjointness constraint of a specialization/generalization is lost during the conversion process

Chapter 6 – The Relational Data Model

95

Information Lost in EER Mapping (continued)

• Multiple specializations of the same superclass are not captured • Specialization lattices are not discernible • The number of subclasses participating in a specialization and the number of superclasses participating in a categorization and/or aggregation is lost in the mapping • The completeness constraint of a SC/sc relationship is not present in the logical schema

Chapter 6 – The Relational Data Model

96