Transcript Chapter 1

Today ’s Agenda

 Any questions about the assignment (due Mon)?

 Quiz  Quiz review  Homework for Friday:  Watch the two videos on the Coursera db website that deal with relational algebra  And/or read up on it from your favorite resource  Formalize concepts of database anomalies  If time, finish off our Transit Tables 4-1

Today ’s slides from: Chapter 4 The Database Management System Concept

Fundamentals of Database Management Systems

by Mark L. Gillenson, Ph.D.

University of Memphis John Wiley & Sons, Inc.

Objectives

 List the three problems created by data redundancy.  Describe the nature of data redundancy among many files.  Explain the relationship between data integration and data redundancy in one file.

4-3

The Database Concept

 Data Integration and Data Redundancy - The ability to achieve data integration while at the same time storing data in a nonredundant fashion. This, alone, is the central, defining feature of the database approach.

 Multiple Relationships - The ability to store data representing entities involved in multiple relationships without introducing data redundancy or other structural problems. 4-4

Data Integration and Data Redundancy

 Data integration - the ability to tie together pieces of related data within an information system.

 Data redundancy - the same fact about the business environment is stored more than once within an information system.

4-5

Data Redundancy - Problems

 Redundant data takes up a great deal of extra disk space.  If the redundant data has to be updated, it takes additional time to do so. This can be a major performance issue.

 There is the potential for data integrity problems.

4-6

Data Integrity

 Refers to the accuracy of the data.

 Inaccurate data leaves the whole information system of limited value.

4-7

Data Redundancy, Data Integrity

 When all copies of redundant data are not updated consistently, a data integrity problem exists.

4-8

Three Files with Redundant Data

Sales file

Customer Number

2746795

Customer Name

John Jones

Customer Address

123 Elm Street Accounts Receivable file

Customer Number

2746795

Customer Name

John Jones Credit file

Customer Number

2746795

Customer Name

John Jones

Customer Address

123 Elm Street

Customer Address

123 Elm Street 4-9

Three Files with a Data Integrity Problem

Sales file

Customer Number

2746795

Customer Name

John Jones

Customer Address

456 Oak Street Accounts Receivable file

Customer Number

2746795

Customer Name

John Jones Credit file

Customer Number

2746795

Customer Name

John Jones

Customer Address

456 Oak Street

Customer Address

123 Elm Street 4-10

Salesperson Number

137 186 204 361 (a) Salesperson file

Salesperson Name

Baker

Commission Percentage

10 Adams Dickens Carlyle 15 10 20

Year Of Hire

1995 2001 1998 2001

Customer Number

0121 0839 0933 1047 1525 1700 1826 2198 2267 (b) Customer file

Customer Name Salesperson Number

Main St.

Hardware Jane’s Stores 137 186 137 ABC Home Stores Acme Hardware Store Fred’s Tool Stores XYZ Stores City Hardware Western Hardware Central Stores 137 361 361 137 204 186

HQ City

New York Chicago Los Angeles Los Angeles Atlanta Washington New York New York New York  General Hardware Company Files 4-11

General Hardware Company Combined File

0121 Main St.

Hardware 0839 Jane’s Stores 0933 ABC Home Stores 1047 Acme Hardware Store 1525 Fred’s Tool Stores 1700 XYZ Stores 1826 City Hardware 2198 Western Hardware 2267 Central Stores 137 New York 137 Baker 186 Chicago 186 Adams 137 Los Angeles 137 Baker 137 Los Angeles 137 Baker 361 Atlanta 361 Washington 137 New York 204 New York 186 New York 361 Carlyle 10 1995 15 2001 10 1995 10 1995 20 2001 361 Carlyle 137 Baker 20 2001 10 1995 204 Dickens 10 1998 186 Adams 15 2001 4-12

Anomalies

 Typically occur in poorly structured files.

 Problems arise when two different kinds of data, like salesperson and customer data are merged into one file.

4-13

Anomalies

Deletion Anomaly

- e.g, if you delete a customer and that record was the only one for a salesperson, the salesperson ’ s data is gone.

Insertion Anomaly

- e.g., General Hardware cannot add data about a new salesperson the company just hired until she is assigned at least one customer.

Update Anomaly

- redundant data in the database file must be updated each place it exists when it changes 4-14

Database Management System

 A software utility for storing and retrieving data that gives the end-user the impression that the data is well integrated even though the data can be stored with no redundancy at all.

4-15

How would you represent Sales and Customers?

 What is the cardinality/modality of the relationship  How should we handle it?

4-16

In order to avoid anomalies, we normalize our tables

 But first… we will start a side trip into Relational Algebra first  Need to know the basic rules for how relational data is updated/deleted/combined/selected/projected in order to fully understand the implications of normalization  Do your homework for Friday – dive into relational algebra 4-17

“ Copyright 2004 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in Section 117 of the 1976 United States Copyright Act without express permission of the copyright owner is unlawful. Request for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages caused by the use of these programs or from the use of the information contained herein.

” 4-18