LIS 397.1 Introduction to Research in Library and

Download Report

Transcript LIS 397.1 Introduction to Research in Library and

LIS 384K.11
Database-Management
Principles and Applications
How to Design and Develop
a Database Application
R. E. Wyllys
Last revised 2002 Mar 25
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
What Is the Process of Developing a
Database Application?
• Begin with
– Analysis of the situation for which the database is to be
used
• Continue by identifying the
– Sources and destinations of information to be handled
– Major concepts, i.e., the entities, in the situation
– Relationships among the entities
• Explore in detail the
– Characteristics, i.e., the attributes, of the entities
– Connectivities and cardinalities of the relationships
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
What Is the Process of Developing a
Database Application? (cont'd)
• Design the means of
– Inputting information into the database, e.g.,
data-entry forms
– Providing outputs to users of the database, e.g.,
report forms, standard queries
• Finish by
– Putting all the pieces together
– Preparing sample data for tests
– Testing the application by inputting sample data
and outputting queries and reports
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Development Life Cycles:
What Are They?
• Processes that take place over periods of time
tend to be viewed in terms of sequential phases
that, taken together, form a “life cycle.”
• You have just seen an outline of the process of
developing a database application. This
process can be formalized as the Database
Development Life Cycle (DBDLC).
• The DBDLC can be compared with other similar
development life-cycles: e.g.,
– Software Development Life Cycle (SoftDLC
– System Development Life Cycle (SysDLC)
– Business-Process Improvement Life Cycle (BPILC)
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Software Development Life Cycle
• The development of computer software is typically
managed in terms of a Software Development Life
Cycle (SoftDLC).
• An excellent statement of the phases of the SoftDLC
has been given by Dale, Joyce, and Weems, as
follows*:
– “Problem analysis Understanding the nature of the problem
to be solved
– “Requirements elicitation Determining exactly what the
program must do
– “Software specification Specifying what the program must
do (the functional requirements) and the constraints on the
solution approach (nonfunctional requirements, such as
what language to use)
*From: Dale, N.; Joyce, D. T.; Weems, C. Object-Oriented Data Structures
Using Java. Sudbury, MA: Jones and Bartlett; 2002.
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Software Development Life Cycle
– “High- and low-level design Recording how the
program meets the requirements, from the “big picture”
overview to the detailed design
– “Implementation of the design Coding a program in a
computer language
– “Testing and verification Detecting and fixing errors and
demonstrating the correctness of the program
– “Delivery Turning over the tested program to the
customer or user . . .
– “Operation Actually using the program
– “Maintenance Making changes to fix operational errors
and to add to or modify the functions of the program”
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
System Development Life Cycle
• The development of complex systems of
people and equipment to carry out
various functions in organizations is
usually managed in terms of a System
Development Life Cycle (SysDLC).
• Typical phases of the SysDLC are:
– Analysis
– Design
– Production
– Implementation
– Operation
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
System Development Life Cycle
• The analysis phase of the SysDLC deals with
such questions as:
– What is the mission, or purpose, of the organization?
– How do the functions that the system in question is
supposed to handle relate to the mission?
– How are those functions currently handled?
– In what ways is the current handling unsatisfactory?
– What technology is potentially available to assist in
accomplishing the desired functions?
– Which aspects of the system's environment can be
determined within the organization, and which are
determined by factors not under the organization's
control?
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
System Development Life Cycle
• The design phase of the SysDLC deals with such
questions as:
– Which of the possible overall designs for the system best suits
both the functions to be accomplished and also the organization?
– What is the best way of accomplishing those functions that deal
with the strictly internal aspects of the system (i.e., those aspects
wholly controllable by the organization)?
– What is the best way of accomplishing those functions that deal
with a mixture of internal and external aspects of the systems
(external aspects being those not controllable by the
organization)?
– How are the various components of the system to be tied
together?
– How will the system be tested while its pieces are being put
together?
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
System Development Life Cycle
• The production phase of the SysDLC deals
with such matters as:
– Ordering equipment
– Ordering off-the-shelf software
– For custom software, designing the computer
programs and beginning the programming
– Planning and beginning the writing of manuals
of procedures for staff members and for users
of the system
– Beginning the training of staff members
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
System Development Life Cycle
• The implementation phase of the SysDLC deals
with such matters as:
– Receiving, installing, and testing equipment
– Receiving, installing, tailoring, and testing off-the-shelf
software
– Completing the programming of custom software, and
testing it
– Completing the writing of manuals of procedures for
staff members and system users
– Completing the training of staff members
– Integrating the complete system and testing all its
aspects for satisfactory accomplishment of its functions
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
System Development Life Cycle
• The operation phase of the SysDLC deals
with such matters as:
– Running the completed system
– Evaluating its continuing operations in terms of
• How well it performs the functions it was intended to
accomplish
• How well it copes with the inevitable changes in the
environment
– Initiating a new round of system development if
and when needed
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Business-Process Improvement
Life Cycle
• In business-process improvement (BPI), the emphasis is
on systems rather than lower-level processes and subprocesses
• Systems are viewed as sets of interconnected processes
that must be treated as wholes with respect to
improvement (i.e., sub-optimization is to be avoided)
• Continuous improvement, rather than one-time projectoriented improvement, is often the goal
• The overall goal of the BPI effort must be to heighten
customer (i.e., user) satisfaction with the quality of the
products and services offered by the organization. BPI is
a result of the quality-management revolution sparked by
Dr. W. Edwards Deming.
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Business-Process Improvement
Life Cycle
• The Business-Process Improvement Life Cycle
(BPILC) consists of the following phases:
– Defining the problem, in conjunction with management
– Initiating work, by organizing work teams and setting
initial tasks and schedules
– Analyzing the situation in detail
– Evaluating alternative solutions
– Developing improved systems (sets of processes) and
changing organization where necessary
– Evaluating the results
– If necessary, go back to analyzing the situation again
and proceed from there
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Database Development Life Cycle
• The development of databases is typically
managed in terms of the Database
Development Life Cycle (DBDLC). The
phases of the DBDLC can be defined as
follows*:
– “Database initial study
•
•
•
•
Analyze the company situation
Define problems and constraints
Define objectives
Define scope and boundaries
*From Chapter 6 of: Rob, P.; Coronel, C. Database Systems: Design,
Implementation, and Management. 4th ed. Cambridge, MA: Course
Technology; 2000.
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Database Development Life Cycle
– “Database Design
• Create the conceptual design [i.e., model the realworld situation]
• DBMS software selection
• Create the logical design [i.e., express the model in
terms of the selected DBMS]
• Create the physical design [i.e., deal with the
physical storage and access of the data]
– “Implementation and loading
• Install the DBMS
• Create the database(s)
• Load or convert the data
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Database Development Life Cycle
– "Testing and evaluation
• Test the database
• Fine-tune the database
• Evaluate the database and its application
programs
– "Operation
• Produce the required information flow
– "Maintenance and evolution
• Introduce changes
• Make enhancements"
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Summary of Life Cycles
• We can summarize the foregoing discussion of life
cycles as follows: The development of complex,
interrelated processes aimed at solving a problem tends
to be viewed in terms of life cycles that
– Begin with an analysis of the problem
– Continue with careful planning and designing of a solution to the
problem
– Proceed further with the carrying out of the practical steps
involved in achieving the solution, including the testing of pieces
of the solution and of the complete, integrated solution
– Conclude with the solution being put into full operation
– May lead eventually to recognition of new difficulties and the
initiation of a new life cycle of analysis, design, and
implementation.
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
A Further Look at
Database Application Development
• In addition to their view of the DBDLC, Rob and
Coronel also interpret the process of developing a
database application as a sequence of
–
–
–
–
Conceptual Design
DBMS Software Selection
Logical Design
Physical Design
• The following ten slides examine this interpretation*
more closely
As presented in Chapter 6 of: Rob, P.; Coronel, C. Database
Systems: Design, Implementation, and Management. 4th ed.
Cambridge, MA: Course Technology; 2000.
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Conceptual Design
• Rob and Coronel view conceptual design
as the step in which "data modeling is
used to create an abstract database
structure that represents real-world
objects in the most realistic way possible."
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Conceptual Design
• To do a good job of conceptual design, the designer
must consider
–
–
–
–
"Information needs"
"Information users"
"Information sources"
"Information constitution," including
•
•
•
•
Data elements and data attributes
Relationships among the data
Data volume and frequency of use
Data transformations needed, if any
• Answers to questions that arise in the above
considerations come from
– "Developing and gathering end-user data views"
– "Direct observation of the current system: existing and desired
output"
– "Interface with the systems-design group"
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Conceptual Design
• Entity-Relationship modeling is essential in developing
the conceptual design of a database application. Rob
and Coronel outline the E-R modeling process as:
– "1. Identify, analyze, and refine the business rules.
– "2. Identify the main entities, based on Step 1.
– "3. Define the relationships among the entities, based on Steps
1 and 2.
– "4. Define the attributes, primary keys, and foreign keys for each
of the entities.
– "5. Normalize the entities [i.e., develop a set of tables, each in at
least Boyce-Codd Normal Form, that represents each entity].
– "6. Complete the initial E-R diagram.
– "7. Have the main end users verify the model in Step 6 against
the data, information, and processing requirements.
– "8. Modify the E-R diagram, based on the results of Step 7."
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Conceptual Design
• Rob and Coronel invite special attention to certain
details of the E-R modeling process, saying, "All objects
(entities, attributes, relations, views, and so on) are
defined in a data dictionary, which is used in tandem
with the normalization process to help eliminate data
anomalies and redundancy problems. During this
process the designer must:
– "Define entities, attributes, primary keys, and foreign keys. . . .
– "Make decisions about adding new primary key attributes in
order to satisfy end user and/or processing requirements.
– "Make decisions about the treatment of multivalued attributes.
– "Make decisions about adding derived attributes to satisfy
processing requirements.
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Conceptual Design
– "Make decisions about the placement of foreign keys in 1:1
relationships.
– "Avoid unnecessary ternary relationships.
– "Draw the corresponding E-R diagram.
– "Normalize the data model.
– "Include all the data element definitions in the data dictionary.
– "Make decisions about standard naming conventions."
• Rob and Coronel caution that the "naming conventions
requirement is important, yet it is frequently ignored at
the designer's peril. Real database design is generally
accomplished by teams. Therefore, it is important to
ensure that the team members work in an environment
in which naming standards are defined and enforced."
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Conceptual Design
• Data Model Verification
– Rob and Coronel note that an important step in the conceptualdesign process is to verify the E-R model "against the proposed
system processes in order to corroborate that the intended
processes can be supported by the database model."
• Verification involves testing the model against
– "End user data views and their required transactions: SELECT,
INSERT, UPDATE, and DELETE operations and queries and
reports.
– "Access paths, security, and concurrency control.
• "Concurrency control is a feature that allows simultaneous access
to a database [by multiple users], while preserving data integrity."
– "Business-imposed data requirements and constraints."
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Conceptual Design
• Having also recommended that the overall design be
broken up into modules, so that the work can better be
handled by different teams or in different stages, Rob
and Coronel outline the following steps in the E-R Model
Verification Process:
– "1. Identify the E-R model's central entity.
– "2. Identify each module and its components.
– "3. Identify each module's transaction requirements:
• "Internal: Updates/Inserts/Deletes/Queries/Reports
• "External: Module interfaces
– "4. Verify all processes against the E-R model
– "5. Make all necessary changes suggested in Step 4.
– "6. Repeat Steps 2 through 5 for all modules."
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
DBMS Software Selection
• Rob and Coronel outline the major considerations in
selecting DBMS software as follows:
– "Cost. Purchase, maintenance, operational, license, installation,
training, and conversion costs.
– "DBMS features and tools." Some of the possibilities are:
• "Query-by-example
• "Screen painters
• "Report generators"
– "Underlying model. Hierarchical, network, relational,
object/relational, or object.
– "Portability. Across platforms, systems, and languages.
– "DBMS hardware requirements. Processor(s), RAM, disk space,
etc."
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Logical Design
• Rob and Coronel say:
– "Logical design follows the decision to use a specific database
model (hierarchical, network, or relational). Once the database
model is identified, we can map the conceptual design onto a
logical design that is tailored to the selected database model.
– "Logical design is used to translate the conceptual design into
the internal model for a selected [DBMS], such DB2, SQL
Server, Oracle, IMS, Informix, Access, Ingress, and so on. This
includes mapping all objects in the model to the specific
constructs used by the selected [DBMS]. For a relational DBMS,
the logical design includes the design of" such features as
•
•
•
•
•
Tables
Indexes
Views
Transactions
Security restrictions
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Physical Design
• Rob and Coronel say:
– "Physical design is the process of selecting the data storage
and data access characteristics of the database. . . .
– "Physical design is particularly important in the older
hierarchical and network models. . . . Relational database are
more isolated from physical layer details" than are the older
models.
– But even with RDBMSs, "performance can be affected by the
characteristics of the storage media, such as seek time, sector
and block (page) size, buffer pool size, and number of disk
platters and read/write heads. In addition, such factors as the
creation of an index can have a considerable performance effect
on the relational database's data access speed and efficiency."
– "Physical design becomes more complex when data are
distributed at different locations, because the performance is
affected by the communication media's throughput."
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Top-Down vs. Bottom-Up Design
• Database applications, like other complex systems, can
be designed by either of two strategies, or approaches,
called the "top-down" strategy and the "bottom-up"
strategy.
• In practice, a designer or design team rarely uses solely
one or the other of these strategies. Nevertheless, their
differences are worth noting.
• Top-Down Design
– This approach starts by identifying the entities in the problem
and their relationships, and then working down to the attributes
and their details.
• Bottom-Up Design
– This approach starts with the attributes in the problem, and then
works up by identifying the entities to which the attributes need
to be linked and, in turn, the relationships among the entities.
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Top-Down vs. Bottom-Up Design
• Rob and Coronel comment quite carefully about these
two somewhat different strategies, saying:
– "Although these two methodologies are complementary rather
than mutually exclusive, a primary emphasis on a bottom-up
approach may be [emphasis added] more productive for small
databases with few entities, attributes, relations, and
transactions. For situations in which the number, variety, and
complexity of entities, relations, and transactions is
overwhelming, a primarily top-down approach may be
[emphasis added] more easily managed."
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Centralized vs. Decentralized Design
• In a similar way, Rob and Coronel comment carefully
about centralized vs. decentralized design, which they
describe as contrasting "philosophies" of design. They
say:
– "Centralized design is typical of relatively simple and/or small
databases and can be successfully done by a single person . . .
or by a small, informal design team."
– "Decentralized design might be used when the data component
of the system has a considerable number of entities and
complex relations on which very complex operations are
performed. Decentralized design is also likely to be employed
when the problem itself is spread across several operational
sites and each element is a subset of the entire data set. . . ."
• As with top-down vs. bottom-up design strategies, I think
that in practice it is rare for either strictly centralized or
strictly decentralized design philosophies to be used.
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Other Important Matters
• The design process needs to
include consideration of
– Performance
– Security
– Data Integrity
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Other Important Matters
• Performance includes
– Speed of response to users
– Adequacy of response to users
– Ease of maintenance
– Time required for maintenance (to be as low
as possible consistent with safety and proper
functioning)
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Other Important Matters
• Security includes
– Means of controlling physical access: e.g.,
locks, hard-wired terminals
– Systems for providing and controlling password
access to various parts of the database
– Provision for audit trails of transactions and
changes to the databases themselves
– Provision for encryption of data transmitted over
easily accessible communications channels
– Establishing and employing regular back-ups
– Establishing, and keeping up to date, plans for
recovery from disasters
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Other Important Matters
• Data Integrity Procedures include
– Rules for handling attributes that are
primary or secondary keys, to provide extra
caution for correctness of data entry
– Careful analysis and design of update and
deletion cascades and restricts
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications
Designing a Database Application Can Seem
Like Finding Your Way Out Of a Labyrinth,
But Perseverance Will Bring Success
GSLIS - The University of Texas at Austin
LIS 384K.11, Database-Management Principles and Applications