Chapter 9 Study Tool - Universiti Teknologi MARA
Download
Report
Transcript Chapter 9 Study Tool - Universiti Teknologi MARA
CHAPTER 6
DATA DESIGN
Chapter Objectives
Explain data design concepts and structures
Describe file processing systems
Explain database systems and define the
components of a database management system
(DBMS)
Describe Web-based data design
2
Chapter Objectives
Explain data design terminology, including entities,
fields, common fields, records, files, tables, and key
fields
Describe data relationships, draw an entity
relationship diagram, define cardinality, and use
cardinality notation
Explain the concept of normalization
Explain the importance of codes and describe
various coding schemes
3
Chapter Objectives
Describe relational and object-oriented database
models
Explain data warehousing and data mining
Differentiate between logical and physical storage
and records
Explain data control measures
4
Introduction
You will develop a physical plan for data
organization, storage, and retrieval
Begins with a review of data design concepts and
terminology, then discusses file-based systems and
database systems, including Web-based
databases
Concludes with a discussion of data storage and
access, including strategic tools such as data
warehousing and data mining, physical design
issues, logical and physical records, data storage
formats, and data controls
5
Data Design Concepts
Before constructing an information system, a systems
analyst must understand basic design concepts,
including data structures and the characteristics of
file processing and database systems, including
Web-based database design
6
Data Design Concepts
Data Structures
Each
file or table contains data about people, places,
things or events that interact with the information system
File-oriented system
File processing system
Database system
7
Data Design Concepts
Overview of File Processing
Some
companies still use file processing to handle large
volumes of structured data on a regular basis
Although much less common today, file processing can
be efficient and cost-effective in certain situations
8
Data Design Concepts
Overview of File Processing
Potential
problems
Data
redundancy
Data integrity
Rigid data structure
9
Data Design Concepts
Overview of File Processing
Various
types of files
Master
file
Table file
Transaction file
Work file
Security file
History file
10
Data Design Concepts
The Evolution from File Systems to Database
Systems
A
properly designed database system offers a solution
to the problems of file processing
Provides an overall framework that avoids data
redundancy and supports a real-time, dynamic
environment
11
Data Design Concepts
The Evolution from File Systems to Database
Systems
A
database management system (DBMS) is a collection
of tools, features, and interfaces that enables users to
add, update, manage, access, and analyze the contents
of a database
The main advantage of a DBMS is that it offers timely,
interactive, and flexible data access
12
Data Design Concepts
The Evolution from File Systems to Database
Systems
Advantages
Scalability
Better
support for client/server systems
Economy of scale
Flexible data sharing
Enterprise-wide application – database administrator (DBA)
Stronger standards
13
Data Design Concepts
The Evolution from File Systems to Database
Systems
Advantages
Controlled
redundancy
Better security
Increased programmer productivity
Data independence
14
Data Design Concepts
The Evolution from File Systems to Database
Systems
Although
the trend is toward enterprise-wide database
design, many companies still use a combination of
centralized DBMSs and smaller, department-level
database systems
The compromise, in many cases, is a client/server
design, where processing is shared among several
computers
15
DBMS Components
A DBMS provides an interface between a database
and users who need to access the data
In addition to interfaces for users, database
administrators, and related systems, a DBMS also
has a data manipulation language, a schema and
subschemas, and a physical data repository
16
DBMS Components
Interfaces for Users, Database Administrators, and
Related Systems
Users
Query
language
Query by example (QBE)
SQL (structured query language)
Database
A
Administrators
DBA is responsible for DBMS management and support
17
DBMS Components
Interfaces for Users, Database Administrators, and
Related Systems
Related
information systems
A
DBMS can support several related information systems
that provide input to, and require specific data from, the
DBMS
No human intervention is required for two-way
communication
18
DBMS Components
Data Manipulation Language
A
data manipulation language (DML) controls database
operations, including storing, retrieving, updating, and
deleting data
Some database products also provide an easy-to-use
graphical environment that enables users to control
operations with menu-driven commands.
19
DBMS Components
Schema
The
complete definition of a database, including
descriptions of all fields, tables, and relationships, is
called a schema
You also can define one or more subschemas
For example, specific users, systems, or locations might
be permitted to create, retrieve, update, or delete
data, depending on their needs and the company’s
security policies
20
DBMS Components
Physical Data Repository
The
data dictionary is transformed into a physical data
repository, which also contains the schema and
subschemas
The physical repository might be centralized, or
distributed at several locations
ODBC – open database connectivity
JDBC – Java database connectivity
21
Web-Based Database Design
The following sections discuss the characteristics of
Web-based design, Internet terminology, connecting
a database to the Web, and data security on the
Web
22
Web-Based Database Design
Characteristics of Web-Based Design
In
a Web-based design, the Internet serves as the
front end, or interface for the database management
system. Internet technology provides enormous power
and flexibility
Web-based systems are popular because they offer
ease of access, cost-effectiveness, and worldwide
connectivity
23
Web-Based Database Design
Internet Terminology
Web
browser
Web page
HTML (Hypertext Markup Language)
Tags
Web server
Web site
24
Web-Based Database Design
Internet Terminology
Intranet
Extranet
Protocols
Web-centric
Clients
Servers
25
Web-Based Database Design
Connecting a Database to the Web
Database
must be connected to the Internet or intranet
The database and the Internet speak two different
languages
Middleware
Adobe
ColdFusion
26
Web-Based Database Design
Data Security
Web-based
data must be secure, yet easily accessible
to authorized users
To achieve this goal, well-designed systems provide
security at three levels: the database itself, the Web
server, and the telecommunication links that connect the
components of the system
27
Data Design Terminology
Definitions
Entity
Table
or file
Field
Attribute
Common
field
Record
Tuple
28
Data Design Terminology
Key Fields
Primary
key
Combination
key
Composite key
Concatenated key
Multi-valued key
29
Data Design Terminology
Key Fields
Candidate
Nonkey
key
field
Foreign
key
Secondary key
30
Data Design Terminology
Referential Integrity
Validity
checks can help avoid data input errors
In a relational database, referential integrity means
that a foreign key value cannot be entered in one table
unless it matches an existing primary key in another
table
Orphan
31
Entity-Relationship Diagrams
An entity is a person, place, thing, or event for
which data is collected and maintained
Entity-relationship diagram (ERD)
An ERD provides an overall view of the system, and
a blueprint for creating the physical data structures
32
Entity-Relationship Diagrams
Drawing an ERD
The
first step is to list the entities that you identified
during the fact-finding process and to consider the
nature of the relationships that link them
A popular method is to represent entities as rectangles
and relationships as diamond shapes
33
Entity-Relationship Diagrams
Types of Relationships
Three
types of relationships can exist between entities
One-to-one relationship (1:1)
One-to-many relationship (1:M)
Many-to-many relationship (M:N)
Associative
entity
34
Entity-Relationship Diagrams
Cardinality
Cardinality
notation
Crow’s foot notation
Unified Modeling Language (UML)
Now that you understand database elements and their
relationships, you can start designing tables
35
Normalization
Normalization
Table design
Involves four stages: unnormalized design, first
normal form, second normal form, and third normal
form
Most business-related databases must be designed
in third normal form
36
Normalization
Standard Notation Format
Designing
tables is easier if you use a standard
notation format to show a table’s structure, fields, and
primary key
Example: NAME (FIELD 1, FIELD 2, FIELD 3)
37
Normalization
Repeating Groups and Unnormalized Design
Repeating
Often
group
occur in manual documents prepared by users
Unnormalized
Enclose
the repeating group of fields within a second
set of parentheses
38
Normalization
First Normal Form
A
table is in first normal form (1NF) if it does not
contain a repeating group
To convert, you must expand the table’s primary key to
include the primary key of the repeating group
39
Normalization
Second Normal Form
To
understand second normal form (2NF), you must
understand the concept of functional dependence
Field X is functionally dependent on field Y if the value
of field X depends on the value of field Y
40
Normalization
Second Normal Form
A standard process exists for converting a table
from 1NF to 2NF
1.
2.
3.
First, create and name a separate table for each field
in the existing primary key
Next, create a new table for each possible
combination of the original primary key fields
Finally, study the three tables and place each field
with its appropriate primary key
41
Normalization
Second Normal Form
Four
kinds of problems are found with 1NF description
that do not exist with 2NF
Consider
the work necessary to change a particular
product’s description
1NF tables can contain inconsistent data
Adding a new product is a problem
Deleting a product is a problem
42
Normalization
Third Normal Form
3NF
design avoids redundancy and data integrity
problems that still can exist in 2NF designs
A table design is in third normal form (3NF) if it is in
2NF and if no nonkey field is dependent on another
nonkey field
43
Normalization
Third Normal Form
To
convert the table to 3NF, you must remove all fields
from the 2NF table that depend on another nonkey
field and place them in a new table that uses the
nonkey field as a primary key
44
Normalization
A Normalization Example
To
show the normalization process, consider the
familiar situation in Figure 9-27 which might depict
several entities in a school advising system: ADVISOR,
COURSE, and STUDENT
45
Using Codes During System Design
Overview of Codes
A
code is a set of letters or numbers that represents a
data item. Codes can be used to simplify output, input,
and data formats.
Because codes often are used to represent data, you
encounter them constantly in your everyday life
They save storage space and costs, reduce data
transmission time, and decrease data entry time
Can reduce data input errors
46
Using Codes During System Design
Types of Codes
[1] Sequence codes
Numbers/letters assigned in a specific order
E.g. UiTM student matric number
[2] Block sequence codes
Block sequence codes use blocks of numbers for different
classifications.
E.g. course codes
1xx – 3xx : diploma courses
4xx – 6xx : bachelor courses
7xx – 8xx : master courses
47
Using Codes During System Design
Types of Codes
[3] Alphabetic codes
[a] Category codes – identify related items using
numbers/letters
E.g. ITS – system science courses, ITT – data comm. &
networking courses, ITC – comp. science courses, ACC
– accounting courses
[b] Abbreviation codes – mnemonic
codes/abbreviations
E.g MY = Malaysia, SG = Singapore, NZ = New
Zealand
48
Using Codes During System Design
Types of codes
[4] Significant digit codes - Distinguish items by using
a series of subgroups of digits
[5] Derivation codes - Combine data from different
item attributes/characteristics to build the code
E.g. Classroom number in UiTM Pahang: J1-01, J1-02, A101, A2-01
E.g. Magazine release num: 201109.Vol.18 (Vol. 18,
Released on September 2011)
Cipher codes
Action codes
49
Using Codes During System Design
Types of codes
[6] Cipher codes: use a keyword to encode a
number
E.g. E.g. IASETTO = 1453770
[7] Action codes - Indicate action to be executed
associated with item
E.g.
X – exit program, F – File menu
50
Using Codes During System Design
Developing a Code
1.
2.
3.
4.
5.
Keep codes concise
Allow for expansion
Keep codes stable
Make codes unique
Use sortable codes
51
Using Codes During System Design
Developing a Code
6.
7.
8.
9.
Avoid confusing codes
Make codes meaningful
Use a code for a single purpose
Keep codes consistent
52
Steps in Database Design
1.
2.
3.
4.
Create the initial ERD
Assign all data elements to entities
Create 3NF designs for all tables
Verify all data dictionary entries
After creating your final ERD and normalized table
designs, you can transform them into a database
53
Database Models
Relational Databases
The
relational model was introduced during the 1970s
and became popular because it was flexible and
powerful
Because all the tables are linked, a user can request
data that meets specific conditions
New entities and attributes can be added at any time
without restructuring the entire database
54
Database Models
Object-Oriented Databases
Many
systems developers are using object-oriented
database (OODB) design as a natural extension of
the object-oriented analysis process
Object
Management Group (OMG)
Each object has a unique object identifier
55
Data Storage and Access
Data storage and access involve strategic business
tools
Strategic tools for data storage and access
Data
warehouse – dimensions
Data mart
56
Data Storage and Access
Strategic tools for data storage and access
Data
Mining
Increase
average pages viewed per session.
Increase number of referred customers
Reduce clicks to close
Increase checkouts per visit
Increase average profit per checkout
Clickstream storage – market basket analysis
57
Data Storage and Access
Logical and Physical Storage
Logical
storage
Characters
Date
element or data item
Logical record
Physical
storage
Physical
record or block
Buffer
Blocking
factor
58
Data Storage and Access
Data Coding and Storage
Binary
digits
Bit
Byte
EBCDIC,
EBCDIC
ASCII, and Binary
and ASCII
59
Data Storage and Access
Data Coding and Storage
EBCDIC,
ASCII, and Binary
Binary
storage format
Integer format
Long integer format
Unicode
60
Data Storage and Access
Data Coding and Storage
Storing
dates
Y2K
Issue
Most date formats now are based on the model
established by the International Organization for
Standardization (ISO)
Absolute date
61
Data Control
File and database control must include all
measures necessary to ensure that data storage is
correct, complete, and secure
A well-designed DBMS must provide built-in
control and security features, including
subschemas, passwords, encryption, audit trail
files, and backup and recovery procedures to
maintain data
62
Data Control
User ID
Password
Permissions
Encryption
Backup
Recovery procedures
Audit log files
Audit fields
63
Chapter Summary
Files and tables contain data about people, places,
things, or events that affect the information system
DBMS designs are more powerful and flexible than
traditional file-oriented systems
64
Chapter Summary
An entity-relationship (ERD) is a graphic
representation of all system entities and the
relationships among them
A code is a set of letters or numbers used to
represent data in a system
The most common database models are relational
and object-oriented
65
Chapter Summary
Logical storage is information seen through a
user’s eyes, regardless of how or where that
information actually is organized or stored
Physical storage is hardware-related and involves
reading and writing blocks of binary data to
physical media
File and database control measures include
limiting access to the data, data encryption,
backup/recovery procedures, audit-trail files, and
internal audit fields
66