Introduction to Databases “When I use a word,” Humpty Dumpty said in rather a scornful tone, “it means just what I choose it.
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
                        Word processing   Spreadsheet   Database
Text handling           excellent         fair          poor
Mathematical functions  poor              excellent     very good
Ease of Use             excellent         good          fair
Training Cost           low               moderate      high
Software Cost           low               moderate      high
Volume of data          low               moderate      very high
Multiuser Access        low               moderate      very high
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As the scale of an organization grows, it becomes increasingly difficult
to manage by observation, so managers rely more on conceptual resources
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through organizational commitment
Adapted from McFadden, F.R. & Hoffer, J.A. (1994). Modern Database Management. Redwood City, CA: Benjamin/Cummings Publishing (p. 6)
Information is a major organizational resource
Action: Survey customers; invest in advertising; cut costs; expand product line
Knowledge: Sales have dropped between July and August
Information (organized data): Average for July is 40; average for Aug is 15
Data (isolated facts): John bought 50 in July; John bought 10 in Aug; Jane bought 30 in July; Jane bought 20 in Aug
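The step from data to information to knowledge can be traced with the slide's own figures. The following is only a minimal sketch: the tuple layout and the function name `monthly_average` are assumptions made for illustration, not part of the course material.

```python
from statistics import mean

# Data: isolated facts (units bought per customer per month), as on the slide.
purchases = [
    ("John", "July", 50),
    ("John", "Aug", 10),
    ("Jane", "July", 30),
    ("Jane", "Aug", 20),
]


def monthly_average(rows, month):
    """Information: organize the raw facts into one average per month."""
    return mean(qty for _customer, m, qty in rows if m == month)


july_avg = monthly_average(purchases, "July")
aug_avg = monthly_average(purchases, "Aug")

# Knowledge: an interpretation of the information.
sales_dropped = aug_avg < july_avg
print(july_avg, aug_avg, sales_dropped)
```

The action (survey customers, cut costs, and so on) is a human decision based on that knowledge and is deliberately left out of the code.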
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum  InvoiceAmt  CustomerName               OwnerName             Phone
201         854.00    Cottage Grill              Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant  Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant       Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant          Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                 Ms. Curtis Haiar      (616) 762-9144
[Figure: Outstanding Invoice Amounts By Order (orders 201, 209, 214, 221, 235, 239), with labels "data" and "information"]
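To make the idea of an organized, queryable collection concrete, the order table from this slide can be loaded into an in-memory SQLite database. This is a sketch under assumptions: the table name `orders`, the column types, and the choice of `OrderNum` as primary key are illustrative and are not stated on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        OrderNum     INTEGER PRIMARY KEY,  -- assumed key
        InvoiceAmt   REAL,
        CustomerName TEXT,
        OwnerName    TEXT,
        Phone        TEXT
    )
    """
)

# Rows copied from the slide.
rows = [
    (201, 854.00, "Cottage Grill", "Ms. Doris Reaume", "(616) 643-8821"),
    (209, 1106.00, "Cleo's Downtown Restaurant", "Ms. Joan Hoffman", "(616) 888-2046"),
    (214, 1070.50, "Jean's Country Restaurant", "Ms. Jean Brooks", "(517) 620-4431"),
    (221, 1607.00, "Maxwell's Restaurant", "Ms. Barbara Feldon", "(219) 333-0000"),
    (235, 1004.50, "Embers Restaurant", "Mr. Clifford Merritt", "(219) 816-2456"),
    (239, 1426.50, "The Empire", "Ms. Curtis Haiar", "(616) 762-9144"),
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)

# Information derived from the stored data: outstanding amounts by order.
by_order = conn.execute(
    "SELECT OrderNum, InvoiceAmt FROM orders ORDER BY OrderNum"
).fetchall()
total = conn.execute("SELECT SUM(InvoiceAmt) FROM orders").fetchone()[0]
```

Here `by_order` holds the per-order amounts that the figure summarizes, and `total` is one further piece of information derived from the same stored data.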
What is a Database Management
System (DBMS)?
“A set of programs used to define, administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
[Diagram: Database <-> DBMS (e.g., Oracle, dBase, Access, Paradox) <-> Database Application <-> user]
What is a Database Application?
[Diagram: Database <-> DBMS <-> Database application]
A computer program that performs a specific task of practical value in a business situation
An interface that allows the user to enter and manipulate data; the user can request abstract views of data
Created by database designers and developers using a DBMS program or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
[Diagram, reconstructed as a list]
Database
• user data
• metadata
• indexes
• application metadata
DBMS
Design Tools Subsystem
• Table Creation Tool
• Form Creation Tool
• Query Creation Tool
• Report Creation Tool
• Procedural Language Compiler
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language RunTime
DBMS Engine
developer - Application program
users - Application program
Types of Database Systems
Centralized (single site)
• microcomputer (desktop)
• legacy mainframe/mini computer (1 CPU)
• client/server architecture (>1 CPU)
Distributed
• >1 site, requires network
• not widely adopted yet due to many problems
Type                              Example                                # of concurrent users  Typical size of database
Personal                          Joe's House Painting Service           1                      < 10 Megabytes
Workgroup                         Video rental store                     < 25                   < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government      hundreds               > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)  possibly hundreds      Any
Our focus: centralized, microcomputer database
Three levels of Database Representation
Conceptual level: database design; logical, abstract description of data elements & their relationships
Internal level: physical implementation - access methods, index construction, data structure; database exists in reality only here
External level: each user group will have its own view of the database; database is accessed from here
The primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
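The three levels can be loosely mapped onto SQL objects. This is a sketch, not a definitive mapping: it assumes a table definition stands in for the conceptual level, an index for the internal level, and a view for the external level. The names `salesperson`, `idx_office`, and `tokyo_staff` are hypothetical; the rows come from the Salesperson example used elsewhere in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: logical description of the data elements.
conn.execute(
    "CREATE TABLE salesperson ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, office TEXT)"
)
conn.executemany(
    "INSERT INTO salesperson VALUES (?, ?, ?)",
    [
        (27, "Rodney Jones", "Toronto"),
        (44, "Goro Azuma", "Tokyo"),
        (35, "Francine Moire", "Brussels"),
        (37, "Anne Abel", "Tokyo"),
    ],
)

# Internal level: a physical access structure, managed by the DBMS.
conn.execute("CREATE INDEX idx_office ON salesperson(office)")

# External level: one user group's view of the database.
conn.execute(
    "CREATE VIEW tokyo_staff AS "
    "SELECT name FROM salesperson WHERE office = 'Tokyo'"
)
tokyo = [row[0] for row in conn.execute("SELECT name FROM tokyo_staff ORDER BY name")]
```

Users of the `tokyo_staff` view never see the index or the full table; dropping or adding the index would change performance but not the result of the query.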
Focus of this course
Lectures: Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables, and identifying their relationships to one another.
Laboratories: Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
  computer, peripherals, network devices
Software
  DBMS (manages the database)
  operating systems software (manages hardware & software)
  application programs (users access and manipulate the database)
People
  system administrators (manage general operations)
  database designers (architects of database structure)
  database administrators (ensure the database is functioning)
  systems analysts & programmers (design & implement database)
  end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Example patient record
  Name: Jane Doe
  Address: 123 Easy St.
  City: London
  Phone: 455-0897
Example appointment record
  Date: Sept 14, 1955
  Time: 2:00 p.m.
  Patient: Jane Doe, 455-0897
  OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application <-> Customer file; Order processing Application <-> Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program and therefore often omitted
Difficulty of representing data from the users’ perspective
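The data redundancy problem can be illustrated with two separate "files", each owned by a different application. This is a sketch under assumptions: the record layouts, the helper name, and the replacement phone number are hypothetical and serve only to show how duplicated data drifts out of sync.

```python
# Two isolated files that both store the customer's phone number.
customer_file = [{"name": "Cottage Grill", "phone": "(616) 643-8821"}]
order_file = [
    {"order_num": 201, "customer": "Cottage Grill", "phone": "(616) 643-8821"}
]


def update_phone_in_customer_file(name, new_phone):
    """The customer application only knows about its own file."""
    for record in customer_file:
        if record["name"] == name:
            record["phone"] = new_phone


# Hypothetical new number; the order file is never told about the change.
update_phone_in_customer_file("Cottage Grill", "(616) 555-0100")

# Orders whose duplicated phone number now disagrees with the customer file.
inconsistent = [
    order["order_num"]
    for order in order_file
    for customer in customer_file
    if order["customer"] == customer["name"]
    and order["phone"] != customer["phone"]
]
```

After the update, order 201 still carries the old number. A DBMS avoids this by storing the phone number once and letting both applications read it from the same place.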
Historical Roots of Database Systems
[Diagram: Customer processing Application, Order processing Application, and Employee processing Application all access one Database through the DBMS]
Developed to overcome limitations of file systems; developed initially on mainframe computers in the late 1960s and early 1970s - a typical early DBMS cost $100,000 (many are still in use)
First general databases were created for General Electric Company (GE) - Integrated Data Store (IDS), designed to run on GE machines; B.F. Goodrich ported IDS to the IBM 360 - became dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum  InvoiceAmt  CustomerName               OwnerName             Phone
201         854.00    Cottage Grill              Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant  Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant       Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant          Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                 Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata, system catalogues, or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
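A quick way to see self-description in practice is to ask a database for its own structure. The sketch below uses SQLite, whose catalogue is the `sqlite_master` table and whose column metadata is exposed by `PRAGMA table_info`; other DBMSs expose their catalogues differently. The `publisher` table mirrors the PUBLISHER table from the schema slide, with column types assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE publisher ("
    "PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT)"
)

# Names of all tables, read from the database's own catalogue.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Name and declared type of each column; each table_info row is
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(publisher)")]
```

No application code had to remember the structure: both `tables` and `columns` are answered from metadata stored inside the database itself.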
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching data in a source table is time-consuming without a “pointer” system; an index provides one, improving the performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, the indexes on it must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
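The Office Index above is essentially a lookup from an office to the employee IDs found there. A minimal sketch in plain Python follows; the function names are hypothetical, and a real DBMS would typically use an on-disk structure such as a B-tree rather than an in-memory dictionary.

```python
from collections import defaultdict

# Salesperson table from the slide: (employee ID, name, office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the employee IDs stored there."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)


def add_salesperson(rows, index, employee_id, name, office):
    """Overhead cost: every insert must update the index as well."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)


office_index = build_office_index(salespeople)
```

Looking up `office_index["Tokyo"]` avoids scanning the whole table, while `add_salesperson` shows the extra write that every indexed update incurs.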
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this; therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented: very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
Example tables: Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
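The point that tables are related by matching values rather than by physical pointers can be sketched with two of the tables named above. The column names and sample rows are hypothetical (the slide gives only table names), and SQLite stands in for a typical RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entertainers (
        entertainer_id INTEGER PRIMARY KEY,
        stage_name     TEXT
    );
    CREATE TABLE engagements (
        engagement_id  INTEGER PRIMARY KEY,
        entertainer_id INTEGER REFERENCES entertainers(entertainer_id),
        venue          TEXT
    );
    -- Hypothetical sample rows.
    INSERT INTO entertainers VALUES (1, 'Act A'), (2, 'Act B');
    INSERT INTO engagements VALUES
        (10, 1, 'Hall X'), (11, 1, 'Hall Y'), (12, 2, 'Hall X');
    """
)

# The one-to-many relationship is resolved at query time by comparing
# the entertainer_id values in both tables.
bookings = conn.execute(
    """
    SELECT e.stage_name, g.venue
    FROM entertainers AS e
    JOIN engagements AS g ON g.entertainer_id = e.entertainer_id
    ORDER BY g.engagement_id
    """
).fetchall()
```

Because the link is just a shared value, either table can be reorganized physically without breaking the relationship, which is the contrast with the earlier navigational systems.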
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use databases without considering their design, so the databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
A database schema defines the database’s structure: tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
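The schema above could be declared roughly as follows. This is a sketch under stated assumptions: the slide does not mark keys or data types, so the primary keys, the column types, and all sample values are hypothetical. "One and only one publisher" is approximated with a NOT NULL foreign key; the rule that each publisher publishes at least one book is not enforced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE publisher (
        PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT
    );
    CREATE TABLE author (
        AuthID INTEGER PRIMARY KEY, AuthName TEXT,
        AuthPhone TEXT NOT NULL              -- phone cannot be left blank
    );
    CREATE TABLE book (
        ISBN   TEXT PRIMARY KEY,
        Title  TEXT,
        AuthID INTEGER REFERENCES author(AuthID),
        PubID  INTEGER NOT NULL REFERENCES publisher(PubID),
        Price  REAL CHECK (Price >= 0)       -- price cannot be less than zero
    );
    """
)


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical sample rows.
ok_pub = try_insert("INSERT INTO publisher VALUES (?, ?, ?)", (1, "Pub One", "555-0101"))
ok_auth = try_insert("INSERT INTO author VALUES (?, ?, ?)", (1, "A. Writer", "555-0102"))
no_phone = try_insert("INSERT INTO author VALUES (?, ?, ?)", (2, "B. Writer", None))
ok_book = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("isbn-1", "Title", 1, 1, 19.95))
bad_price = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("isbn-2", "Title", 1, 1, -5.0))
bad_pub = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("isbn-3", "Title", 1, 99, 5.0))
```

The rows that violate a business rule (a missing author phone, a negative price, an unknown publisher) are rejected by the DBMS itself rather than by application code.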
Slide 2
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically
Name:
Address:
City:
Phone:
Date:
Time:
Patient:
OHIP:
Jane Doe
123 Easy St.
London
455-0897
Sept 14, 1955
2:00 p.m.
Jane Doe, 455-0897
123456789
organized by function (use)
The first data management
systems performed clerical
tasks (transactional processing)
such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file
folder analogy); each record for
a single patient; another file for
appointment/ billing
information
Limitations of Data File Systems
Customer
processing
Application
Customer
file
Order
processing
Application
Order
file
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grow in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across organization files are often incompatible with one another (separate,
isolated data)
Data redundancy - duplicate information in two or more
files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - timeconsuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
Historical Roots of Database Systems
Customer
processing
Application
Order
processing
Application
DBMS
Database
Employee
processing
Application
Developed to overcome limitations of file systems, developed initially on
mainframe computers in late 60s and early 70s - a typical early DBMS
cost $100,000 (many are still in use)
First general databases were created for General Electric Company
(GEC) - Integrated Data Store (IDS), designed to run on GEC machines;
B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is
self-describing
(metadata or system
catalogues or data
dictionary)
A database contains
a description of its
own structure (e.g.,
the names of all the
tables, the names
and types of data in
each column in all
the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Data accessed from a source table for sorting and searching is
time-consuming without a “pointer” system, which improves
performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated,
all indexes must also be updated, therefore, reserve index for
cases in which they are needed
Salesperson
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma Tokyo
35
Francine Moire Brussels
37
Anne Abel
Tokyo
Office Index
Office
Toronto
Tokyo
Brussels
Employee ID
27
44, 37
35
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical
Network
Relational
still in use in many older (1970s) legacy
systems; very few new databases;
referred to “navigational systems”
the vast majority currently use this,
therefore, our course’s focus is here
Semantic
ObjectRelational
ObjectOriented
Very few new databases are
being created using ObjectOriented Programming (not
many ODBMS for businesses to
implement this model)
The Relational Database Model
Agents
Clients
Entertainers
Engagements
Instruments
Entertainer styles
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships can be
represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
Evaluation of the Relational database model
Advantages
But #1 problem still is
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst
organizations
Disadvantages
ease of use - many untrained people create and use databases
without considering its design - usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table =
file
=
relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column
=
field
=
attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row
=
record
=
tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
Slide 3
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
• microcomputer (desktop)
• legacy mainframe/minicomputer (1 CPU)
• client/server architecture (>1 CPU)
Distributed
• >1 site, requires network
• not widely adopted yet due to many problems

Type                              Example                                 # of concurrent users   Typical size of database
Personal                          Joe's House Painting Service            1                       < 10 Megabytes
Workgroup                         Video rental store                      < 25                    < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government       hundreds                > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)   possibly hundreds       Any

Our focus: centralized, microcomputer database
Three levels of Database Representation
External level: each user group will have its own view of the database; database is accessed from here
Conceptual level: database design, logical, abstract description of data elements & their relationships
Internal level: physical implementation - access methods, index construction, data structure; database exists in reality only here
The primary focus of the lectures in this course is the conceptual level, because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
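As a rough illustration of the three levels, the sketch below uses Python's sqlite3 module. It is a sketch under assumptions, not the course's own example: the table, view and index names are made up, and SQLite stands in for the RDBMS. The table definition plays the role of the conceptual level, the two views are external-level views for two hypothetical user groups, and the index is an internal-level access path that the DBMS maintains.

```python
import sqlite3

# Internal level: SQLite decides how rows and indexes are laid out in storage.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the data elements and their relationships.
    CREATE TABLE customer_order (
        order_num     INTEGER PRIMARY KEY,
        invoice_amt   REAL NOT NULL,
        customer_name TEXT NOT NULL,
        phone         TEXT
    );

    -- External level: each user group gets its own view of the database.
    CREATE VIEW billing_view AS
        SELECT order_num, invoice_amt FROM customer_order;
    CREATE VIEW sales_contact_view AS
        SELECT customer_name, phone FROM customer_order;

    -- Internal level: an access path; users of the views never mention it.
    CREATE INDEX idx_order_customer ON customer_order (customer_name);
""")
conn.execute(
    "INSERT INTO customer_order VALUES (201, 854.00, 'Cottage Grill', '(616) 643-8821')"
)

billing_rows = conn.execute("SELECT * FROM billing_view").fetchall()
contact_rows = conn.execute("SELECT * FROM sales_contact_view").fetchall()
print(billing_rows)
print(contact_rows)
```

Dropping or rebuilding the index changes nothing in either view, which is the separation between levels that the slide describes.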
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (users access and manipulate the database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning… (in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
[Example records shown on the slide]
Patient record - Name: Jane Doe; Address: 123 Easy St.; City: London; Phone: 455-0897
Appointment record - Date: Sept 14, 1955; Time: 2:00 p.m.; Patient: Jane Doe, 455-0897; OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application - Customer file; Order processing Application - Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
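Program/data dependence can be made concrete with a small Python sketch. Everything here is hypothetical: the fixed-width layouts, the field widths and the function names are invented for illustration, and the patient values are the ones from the sample record earlier in these slides. A program that hard-codes column positions silently returns the wrong field once the file structure changes, while a program that reads the layout from one shared description keeps working.

```python
# Hypothetical fixed-width layouts: (field name, width in characters).
OLD_LAYOUT = [("name", 10), ("phone", 8)]
NEW_LAYOUT = [("name", 10), ("city", 8), ("phone", 8)]  # a field was added

def format_record(values, layout):
    """Write one record as a fixed-width line, as a 1950s-style file would."""
    return "".join(str(values[field]).ljust(width) for field, width in layout)

def read_phone_hardcoded(line):
    """A program with the old file structure baked in: phone is columns 10-17."""
    return line[10:18].strip()

def parse_record(line, layout):
    """A program that takes the structure from a shared description instead."""
    fields, pos = {}, 0
    for field, width in layout:
        fields[field] = line[pos:pos + width].strip()
        pos += width
    return fields

patient = {"name": "Jane Doe", "city": "London", "phone": "455-0897"}
old_line = format_record(patient, OLD_LAYOUT)
new_line = format_record(patient, NEW_LAYOUT)

phone_before = read_phone_hardcoded(old_line)   # correct with the old structure
phone_after = read_phone_hardcoded(new_line)    # wrong field after the change
phone_via_layout = parse_record(new_line, NEW_LAYOUT)["phone"]
print(phone_before, phone_after, phone_via_layout)
```

In a file system every program like read_phone_hardcoded has to be found and rewritten; keeping the structure description in one place is the idea a DBMS builds on.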
Historical Roots of Database Systems
[Diagram: Customer processing Application, Order processing Application and Employee processing Application all go through the DBMS to a single Database]
Database systems were developed to overcome the limitations of file systems, initially on
mainframe computers in the late 1960s and early 1970s - a typical early DBMS
cost $100,000 (many are still in use)
The first general-purpose database system was created at General Electric
(GE) - Integrated Data Store (IDS), designed to run on GE machines;
B.F. Goodrich ported IDS to the IBM 360, where it became dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201         854.00    Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
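The self-describing property can be observed directly in most relational systems. The sketch below uses Python's sqlite3 module as one example; the two table definitions are assumptions made for illustration, and other DBMS products expose their catalogue differently. SQLite keeps its catalogue in a table called sqlite_master and reports column details through PRAGMA table_info.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salesperson (employee_id INTEGER PRIMARY KEY, name TEXT, office TEXT)"
)
conn.execute(
    "CREATE TABLE customer_order (order_num INTEGER PRIMARY KEY, invoice_amt REAL)"
)

# Metadata: the names of all the tables, read from the database itself.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Metadata: the name and declared type of each column in one table.
# Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(salesperson)")]

print(table_names)
print(columns)
```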
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching the data in a source table is
time-consuming without a “pointer” system (an index), which improves
performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated,
all indexes must also be updated; therefore, reserve indexes for
cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo

Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
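The Office Index above can be mimicked with a plain Python dictionary. This is a simplified sketch, not how a DBMS stores indexes internally (real systems typically use structures such as B-trees); the function names and the added employee 52, "Pat Example", are hypothetical. It shows both the benefit (a lookup by office without scanning every row) and the overhead (every insert must also update the index).

```python
from collections import defaultdict

# The Salesperson table from the slide: (employee ID, name, office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]

def build_office_index(rows):
    """Map each office to the employee IDs located there."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)

def add_salesperson(rows, index, employee_id, name, office):
    """Insert a row; the extra index update is the 'overhead cost'."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)

office_index = build_office_index(salespeople)
tokyo_ids = list(office_index["Tokyo"])      # found without scanning the table

add_salesperson(salespeople, office_index, 52, "Pat Example", "Toronto")
print(tokyo_ids, office_index["Toronto"])
```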
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this, therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented: very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
[Diagram: example tables - Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be
represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
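Because tables are not linked with physical pointers, rows in different tables are related only by matching values. The sketch below, again using Python's sqlite3 module, borrows three of the table names from the diagram; the column names and the sample rows ("Agent A", "Act X" and so on) are invented for illustration. The engagements table relates agents to entertainers many-to-many purely through the ID values it stores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agents (agent_id INTEGER PRIMARY KEY, agent_name TEXT);
    CREATE TABLE entertainers (entertainer_id INTEGER PRIMARY KEY, stage_name TEXT);
    -- Each engagement row stores the key values of the rows it relates.
    CREATE TABLE engagements (
        engagement_id  INTEGER PRIMARY KEY,
        agent_id       INTEGER REFERENCES agents (agent_id),
        entertainer_id INTEGER REFERENCES entertainers (entertainer_id)
    );
    INSERT INTO agents VALUES (1, 'Agent A'), (2, 'Agent B');
    INSERT INTO entertainers VALUES (10, 'Act X'), (11, 'Act Y');
    INSERT INTO engagements VALUES (100, 1, 10), (101, 1, 11), (102, 2, 10);
""")

# The join matches rows by comparing values; no stored pointer is followed.
bookings = conn.execute("""
    SELECT a.agent_name, e.stage_name
    FROM engagements AS g
    JOIN agents AS a ON a.agent_id = g.agent_id
    JOIN entertainers AS e ON e.entertainer_id = g.entertainer_id
    ORDER BY g.engagement_id
""").fetchall()
print(bookings)
```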
Evaluation of the Relational database model
Advantages
But #1 problem still is
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
ease of use - many untrained people create and use databases
without considering their design, and so usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
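The BOOK/PUBLISHER/AUTHOR schema above might be written down as follows, using Python's sqlite3 module. This is one possible rendering, not the only one: the column types, the lowercase names, the made-up sample rows and ISBN strings, and the choice to apply the example domain 0 < x < 99999 to PubID are all assumptions. The rule that each publisher publishes at least one book is not enforced here, since a plain column constraint cannot express it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
    CREATE TABLE publisher (
        pub_id    INTEGER PRIMARY KEY CHECK (pub_id > 0 AND pub_id < 99999),  -- domain
        pub_name  TEXT NOT NULL,
        pub_phone TEXT
    );
    CREATE TABLE author (
        auth_id    INTEGER PRIMARY KEY,
        auth_name  TEXT NOT NULL,
        auth_phone TEXT NOT NULL CHECK (auth_phone <> '')   -- cannot be left blank
    );
    CREATE TABLE book (
        isbn    TEXT PRIMARY KEY,
        title   TEXT NOT NULL,
        auth_id INTEGER NOT NULL REFERENCES author (auth_id),
        pub_id  INTEGER NOT NULL REFERENCES publisher (pub_id),  -- exactly one publisher
        price   REAL NOT NULL CHECK (price >= 0)                 -- business rule
    );
    INSERT INTO publisher VALUES (1, 'Example Press', '555-0100');
    INSERT INTO author VALUES (1, 'A. Writer', '555-0101');
    INSERT INTO book VALUES ('0-00-000000-0', 'Sample Title', 1, 1, 19.95);
""")

def try_insert(sql, params):
    """Attempt one insert and report whether the schema accepted it."""
    try:
        with conn:
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

negative_price = try_insert(
    "INSERT INTO book VALUES (?, ?, ?, ?, ?)",
    ("0-00-000000-1", "Bad Price", 1, 1, -5.0),
)
blank_phone = try_insert("INSERT INTO author VALUES (?, ?, ?)", (2, "No Phone", None))
unknown_publisher = try_insert(
    "INSERT INTO book VALUES (?, ?, ?, ?, ?)",
    ("0-00-000000-2", "Orphan", 1, 999, 10.0),
)
print(negative_price, blank_phone, unknown_publisher)
```

All three inserts are refused by the DBMS itself, so the rules hold no matter which application program tries to write the data.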
Slide 5
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically
Name:
Address:
City:
Phone:
Date:
Time:
Patient:
OHIP:
Jane Doe
123 Easy St.
London
455-0897
Sept 14, 1955
2:00 p.m.
Jane Doe, 455-0897
123456789
organized by function (use)
The first data management
systems performed clerical
tasks (transactional processing)
such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file
folder analogy); each record for
a single patient; another file for
appointment/ billing
information
Limitations of Data File Systems
Customer
processing
Application
Customer
file
Order
processing
Application
Order
file
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grow in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across organization files are often incompatible with one another (separate,
isolated data)
Data redundancy - duplicate information in two or more
files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - timeconsuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
Historical Roots of Database Systems
Customer
processing
Application
Order
processing
Application
DBMS
Database
Employee
processing
Application
Developed to overcome limitations of file systems, developed initially on
mainframe computers in late 60s and early 70s - a typical early DBMS
cost $100,000 (many are still in use)
First general databases were created for General Electric Company
(GEC) - Integrated Data Store (IDS), designed to run on GEC machines;
B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata, system catalogues, or data dictionary)
A database contains a description of its own structure (e.g., the names of all the
tables, and the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
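As a rough illustration of a self-describing database, the sketch below uses Python's built-in sqlite3 module (SQLite is not the DBMS used in this course; it is chosen here only because it needs no installation). The orders table and its columns are assumptions modelled on the sample table shown earlier.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " OrderNum INTEGER PRIMARY KEY,"
    " InvoiceAmt REAL,"
    " CustomerName TEXT)"
)

# SQLite keeps its catalogue in a built-in table named sqlite_master.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default value, primary-key flag).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(orders)")]

print(tables)   # names of all the tables
print(columns)  # name and declared type of each column
```

The structure is read back with ordinary queries, which is what "self-describing" means in practice: the metadata lives in the database alongside the user data.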
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching data directly in a source table is time-consuming;
an index acts as a “pointer” system that improves the
performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated,
the indexes on that data must also be updated, so reserve indexes for
cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
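The Office Index above can be sketched in a few lines of Python. This is only an illustration of the idea; a real DBMS stores its indexes in more elaborate on-disk structures, and the function names here are hypothetical.

```python
from collections import defaultdict

# The Salesperson table from the slide: (employee ID, name, office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the employee IDs found there."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return index


def add_salesperson(rows, index, employee_id, name, office):
    """Insert a row; the index must be updated too (the overhead cost)."""
    rows.append((employee_id, name, office))
    index[office].append(employee_id)


office_index = build_office_index(salespeople)
print(office_index["Tokyo"])  # found without scanning the whole table
```

A lookup by office no longer scans every row, but every insert now does two writes instead of one, which is why indexes are kept only where they are needed.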
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical, Network - still in use in many older (1970s) legacy systems; very few
new databases; referred to as “navigational systems”
Relational - the vast majority of databases currently use this model, so it is the
focus of this course
Semantic
Object-Relational
Object-Oriented - very few new databases are being created using object-oriented
programming (not many ODBMSs are available for businesses to implement this model)
The Relational Database Model
[Diagram: example tables - Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
data is represented by tables (like spreadsheets)
tables are NOT linked with physical pointers; they are related through the values they share
unlike earlier systems, all three types of relationships (one-to-one, one-to-many,
many-to-many) can be represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
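A minimal sketch of tables related by shared values rather than by physical pointers, using Python's built-in sqlite3 module. The column names and the sample rows are hypothetical; only the table names Entertainers and Engagements come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entertainers (
        entertainer_id INTEGER PRIMARY KEY,
        stage_name     TEXT
    );
    CREATE TABLE engagements (
        engagement_id  INTEGER PRIMARY KEY,
        entertainer_id INTEGER REFERENCES entertainers(entertainer_id),
        venue          TEXT
    );
    -- Hypothetical sample data: one entertainer can have many engagements.
    INSERT INTO entertainers VALUES (1, 'Example Trio'), (2, 'Sample Singer');
    INSERT INTO engagements VALUES (10, 1, 'Hall A'), (11, 1, 'Hall B'), (12, 2, 'Hall C');
""")

# The two tables are matched at query time on the value of entertainer_id.
rows = conn.execute("""
    SELECT e.stage_name, g.venue
    FROM entertainers AS e
    JOIN engagements AS g ON g.entertainer_id = e.entertainer_id
    ORDER BY g.engagement_id
""").fetchall()

for stage_name, venue in rows:
    print(stage_name, "-", venue)
```

Nothing in either table records where the other table's rows are stored; the link exists only because the same entertainer_id value appears in both.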
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst
organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use
databases without considering their design, so the databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
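The terms above can be made concrete with a small Python sketch. It models a table as a set of records, which is one way (an assumption of this sketch, not a description of how a DBMS stores data) to show that the order of rows carries no meaning. The Salesperson rows are taken from the index example.

```python
from typing import NamedTuple


class Salesperson(NamedTuple):
    """One record (row, tuple); each attribute is a field (column)."""
    employee_id: int
    name: str
    office: str


# Two tables holding the same records, listed in a different order.
table_a = {
    Salesperson(27, "Rodney Jones", "Toronto"),
    Salesperson(44, "Goro Azuma", "Tokyo"),
}
table_b = {
    Salesperson(44, "Goro Azuma", "Tokyo"),
    Salesperson(27, "Rodney Jones", "Toronto"),
}

same_table = table_a == table_b      # row order is unimportant
field_names = Salesperson._fields    # each field has a unique name
empty_table: set[Salesperson] = set()  # a table may hold zero records

print(same_table, field_names, len(empty_table))
```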
Database Schema
A database schema defines the database’s structure: its tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
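The schema above could be written down roughly as follows, using Python's built-in sqlite3 module. The column types, the choice of PubID as the column carrying the 0 < x < 99999 domain, and all sample values are assumptions made for this sketch. The rule that each publisher publishes at least one book is not enforced here, since a simple column constraint cannot express it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE PUBLISHER (
        PubID    INTEGER PRIMARY KEY CHECK (PubID > 0 AND PubID < 99999),  -- domain
        PubName  TEXT NOT NULL,
        PubPhone TEXT
    );
    CREATE TABLE AUTHOR (
        AuthID    INTEGER PRIMARY KEY,
        AuthName  TEXT NOT NULL,
        AuthPhone TEXT NOT NULL CHECK (AuthPhone <> '')  -- cannot be left blank
    );
    CREATE TABLE BOOK (
        ISBN   TEXT PRIMARY KEY,
        Title  TEXT NOT NULL,
        AuthID INTEGER REFERENCES AUTHOR(AuthID),
        PubID  INTEGER NOT NULL REFERENCES PUBLISHER(PubID),  -- one and only one publisher
        Price  REAL CHECK (Price >= 0)                        -- cannot be less than zero
    );
""")


def try_insert(sql: str, params: tuple) -> bool:
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical sample rows.
try_insert("INSERT INTO PUBLISHER VALUES (?, ?, ?)", (1, "Hypothetical Press", "555-0100"))
try_insert("INSERT INTO AUTHOR VALUES (?, ?, ?)", (1, "A. Writer", "555-0101"))

book_sql = "INSERT INTO BOOK VALUES (?, ?, ?, ?, ?)"
ok_book = try_insert(book_sql, ("isbn-1", "Sample Title", 1, 1, 19.95))
negative_price = try_insert(book_sql, ("isbn-2", "Bad Price", 1, 1, -5.0))
unknown_publisher = try_insert(book_sql, ("isbn-3", "No Publisher", 1, 42, 10.0))
blank_phone = try_insert("INSERT INTO AUTHOR VALUES (?, ?, ?)", (2, "B. Writer", None))

print(ok_book, negative_price, unknown_publisher, blank_phone)
```

Once the rules are declared in the schema, the DBMS rejects rows that break them, so individual application programs do not each have to repeat the checks.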
Slide 8
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: physical resources and conceptual resources (data & information)
As the scale of an organization grows, it becomes increasingly difficult to manage by observation, so reliance on conceptual resources increases
Conceptual resources can be managed just like physical resources or assets (e.g., employees, $$, equipment, widgets, etc.)
Management of data & information means getting it before it’s needed, protecting it, assuring quality, and getting rid of it when no longer required
Management of data & information can be achieved only through organizational commitment
Adapted from McFadden, F.R. & Hoffer, J.A. (1994). Modern Database Management. Redwood City, CA: Benjamin/Cummings Publishing (p. 6)
Information is a major organizational resource
Action: survey customers; invest in advertising; cut costs; expand product line
Knowledge: sales have dropped between July and August
Information (organized data): average for July is 40; average for Aug is 15
Data (isolated facts): John bought 50 in July; John bought 10 in Aug; Jane bought 30 in July; Jane bought 20 in Aug
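The step from data to information in this example is a simple aggregation, and the step to knowledge is a comparison. A minimal sketch in Python, assuming the four purchase facts are held as (customer, month, quantity) tuples; the names `purchases` and `average_by_month` are illustrative and not part of the lecture:

```python
from collections import defaultdict

# Data: isolated facts, as listed on the slide.
purchases = [
    ("John", "July", 50),
    ("John", "Aug", 10),
    ("Jane", "July", 30),
    ("Jane", "Aug", 20),
]


def average_by_month(rows):
    """Organize raw purchase facts into one average per month."""
    quantities = defaultdict(list)
    for _customer, month, quantity in rows:
        quantities[month].append(quantity)
    return {month: sum(q) / len(q) for month, q in quantities.items()}


# Information: organized data.
averages = average_by_month(purchases)
print(averages)  # {'July': 40.0, 'Aug': 15.0}

# Knowledge: an interpretation of the information.
sales_dropped = averages["Aug"] < averages["July"]
print(sales_dropped)  # True
```

The decision about what to do next (the "action" level) stays with people; the code only gets as far as the fact that sales dropped.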
What is a Database?
Organized collection of related information or data stored on a computer disk for easy, efficient use
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
[Figure: chart "Outstanding Invoice Amounts By Order" for orders 201, 209, 214, 221, 235, 239; labels: data, information]
What is a Database Management System (DBMS)?
“A set of programs used to define, administer, and process the database and its applications conveniently and efficiently”
Program (or collection of programs) that enables users to create the database. The DBMS manages the storage and retrieval of data, and provides the user with certain functionalities to guarantee that the data will be logically organized and consistently applied.
[Diagram: Database - DBMS (e.g., Oracle, dBase, Access, Paradox) - Database Application - user]
What is a Database Application?
[Diagram: Database - DBMS - Database application]
A computer program that performs a specific task of practical value in a business situation
An interface that allows the user to enter and manipulate data; user can request abstract views of data
Created by database designers and developers using a DBMS program or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
Database: user data; metadata; indexes; application metadata
DBMS:
Design Tools Subsystem: Table Creation Tool; Form Creation Tool; Query Creation Tool; Report Creation Tool; Procedural Language Compiler
DBMS Engine
Run Time Subsystem: Form Processor; Query Processor; Report Writer; Procedural Language RunTime
Diagram labels: developer; application program; users
Types of Database Systems
Centralized (single site): microcomputer (desktop); legacy mainframe/minicomputer (1 CPU); client/server architecture (>1 CPU)
Distributed: >1 site, requires network; not widely adopted yet due to many problems
Type                              Example                                # of concurrent users  Typical size of database
Personal                          Joe's House Painting Service           1                      < 10 Megabytes
Workgroup                         Video rental store                     < 25                   < 100 Megabytes
Organizational (enterprise)       Larger corporations or government      hundreds               > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)  possibly hundreds      Any
Our focus: centralized, microcomputer database
Three levels of Database Representation
Conceptual level: database design; logical, abstract description of data elements & their relationships
Internal level: physical implementation - access methods, index construction, data structure; database exists in reality only here
External level: each user group will have its own view of the database; database is accessed from here
The primary focus of the lectures in this course is the conceptual level, because the creation of a database begins with its design; the focus of the laboratories is the external level, using an RDBMS, which manages the internal level.
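The three levels can be made concrete with a few SQL statements. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for the RDBMS used in the laboratories; the table, index, and view names are illustrative, with two rows borrowed from the invoice table shown earlier:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical design (a table and its columns).
conn.execute(
    "CREATE TABLE orders (order_num INTEGER PRIMARY KEY, "
    "customer_name TEXT, invoice_amt REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(201, "Cottage Grill", 854.00), (239, "The Empire", 1426.50)],
)

# Internal level: a physical access structure, managed by the DBMS.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_name)")

# External level: one user group's view of the data.
conn.execute(
    "CREATE VIEW large_invoices AS "
    "SELECT order_num, invoice_amt FROM orders WHERE invoice_amt > 1000"
)

large = conn.execute("SELECT * FROM large_invoices").fetchall()
print(large)  # [(239, 1426.5)]
```

The user of `large_invoices` never sees the index or the full table, which is the separation the three levels describe.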
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices: computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (users access and manipulate the database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Patient record - Name: Jane Doe; Address: 123 Easy St.; City: London; Phone: 455-0897
Appointment record - Date: Sept 14, 1955; Time: 2:00 p.m.; Patient: Jane Doe, 455-0897; OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application - Customer file; Order processing Application - Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
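Program/data dependence is easiest to see in a small example. The sketch below is an illustration only, assuming a hypothetical fixed-width customer file (the column positions and the `layout` dictionary are invented for this example): a program that hard-codes the layout breaks when a field is added, while a program that looks fields up by name in a single layout description does not.

```python
# A record from a hypothetical fixed-width customer file:
# name in columns 0-19, phone in columns 20-27.
old_record = "Jane Doe".ljust(20) + "455-0897"


def read_phone_by_position(record):
    """File-system style: the program hard-codes the file layout."""
    return record[20:28].strip()


print(read_phone_by_position(old_record))  # 455-0897

# The file structure changes: a 10-character city field is inserted
# after the name. The program above now reads the wrong columns.
new_record = "Jane Doe".ljust(20) + "London".ljust(10) + "455-0897"
print(read_phone_by_position(new_record))  # London

# DBMS style: the program asks for a field by name, and the layout is
# described once, outside the program.
layout = {"name": (0, 20), "city": (20, 30), "phone": (30, 38)}


def read_field(record, field, layout):
    """Look up a field by name using a shared layout description."""
    start, end = layout[field]
    return record[start:end].strip()


print(read_field(new_record, "phone", layout))  # 455-0897
```

In a file system every program carries its own copy of the layout, which is why a structure change forced all of them to be modified.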
Historical Roots of Database Systems
[Diagram: Customer processing Application, Order processing Application, and Employee processing Application - DBMS - Database]
Developed to overcome limitations of file systems; developed initially on mainframe computers in the late 60s and early 70s - a typical early DBMS cost $100,000 (many are still in use)
First general databases were created for General Electric Company (GE) - Integrated Data Store (IDS), designed to run on GE machines; B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
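A self-describing database can be asked about its own structure. As a minimal sketch, assuming SQLite (through Python's built-in sqlite3 module) rather than the DBMS used in the course, the catalogue table `sqlite_master` and the `PRAGMA table_info` statement return the table names and the column names and types; the `salesperson` table here is only an example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salesperson (employee_id INTEGER, name TEXT, office TEXT)"
)

# Table names come from SQLite's own catalogue table, sqlite_master.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]
print(tables)  # ['salesperson']

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute(
    "PRAGMA table_info(salesperson)"
)]
print(columns)
# [('employee_id', 'INTEGER'), ('name', 'TEXT'), ('office', 'TEXT')]
```

Other DBMS products expose the same kind of metadata under different names, so the queries above should be read as one product's spelling of the idea.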
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching data in a source table is time-consuming without a “pointer” system (an index), which improves performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, the affected indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
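The Office Index above is a lookup from an office to the employee IDs found there. A minimal sketch in Python, assuming the Salesperson rows are held as tuples; the function name `build_office_index` and the added row are hypothetical, and a real DBMS maintains its indexes incrementally rather than rebuilding them:

```python
from collections import defaultdict

# The Salesperson table from the slide: (employee_id, name, office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the employee IDs located there."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)


office_index = build_office_index(salespeople)
print(office_index["Tokyo"])  # [44, 37]

# The overhead: a change to the table must also be reflected in the index.
salespeople.append((52, "New Hire", "Toronto"))  # hypothetical row
office_index = build_office_index(salespeople)
print(office_index["Toronto"])  # [27, 52]
```

Finding everyone in Tokyo becomes a single lookup instead of a scan of the whole table, at the price of extra work on every update.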
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this, therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented: very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
Example tables: Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles
data are represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
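Because tables are not linked with physical pointers, a relationship is expressed through matching values and resolved when a query runs. The sketch below assumes SQLite through Python's sqlite3 module; only the table names (agents, engagements) come from the slide, while every column name and row is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names and rows are hypothetical; only the table names
# (agents, engagements) come from the slide.
conn.executescript("""
    CREATE TABLE agents (agent_id INTEGER PRIMARY KEY, agent_name TEXT);
    CREATE TABLE engagements (
        engagement_id INTEGER PRIMARY KEY,
        agent_id INTEGER REFERENCES agents (agent_id),
        venue TEXT
    );
    INSERT INTO agents VALUES (1, 'Agent A'), (2, 'Agent B');
    INSERT INTO engagements VALUES
        (10, 1, 'Venue X'), (11, 1, 'Venue Y'), (12, 2, 'Venue Z');
""")

# The link is the shared agent_id value, matched at query time.
rows = conn.execute(
    "SELECT a.agent_name, e.venue "
    "FROM agents AS a JOIN engagements AS e ON e.agent_id = a.agent_id "
    "ORDER BY e.engagement_id"
).fetchall()
print(rows)
# [('Agent A', 'Venue X'), ('Agent A', 'Venue Y'), ('Agent B', 'Venue Z')]
```

This is a one-to-many relationship: one agent row matches several engagement rows, and nothing in either table stores a physical address of the other.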
Evaluation of the Relational database model
Advantages
But #1 problem still is
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
ease of use - many untrained people create and use databases without considering their design, and the results usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
A database schema defines the database’s structure: tables, relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
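The schema above can be written down so that the DBMS itself enforces the rules. The following is a sketch under stated assumptions, not the course's implementation: it uses SQLite through Python's sqlite3 module, lower-case column names adapted from the slide, invented sample rows, and a hypothetical helper `try_insert`. "Cannot be left blank" is modelled as NOT NULL, which rejects a missing value but not an empty string, and the rule that each publisher publishes one or more books is not enforced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE publisher (
        pub_id INTEGER PRIMARY KEY, pub_name TEXT, pub_phone TEXT
    );
    CREATE TABLE author (
        auth_id INTEGER PRIMARY KEY,
        auth_name TEXT,
        auth_phone TEXT NOT NULL           -- author phone cannot be missing
    );
    CREATE TABLE book (
        isbn TEXT PRIMARY KEY,
        title TEXT,
        auth_id INTEGER REFERENCES author (auth_id),
        pub_id INTEGER NOT NULL REFERENCES publisher (pub_id),  -- one publisher per book
        price REAL CHECK (price >= 0)      -- price cannot be less than zero
    );
    INSERT INTO publisher VALUES (1, 'Example Press', '555-0100');
    INSERT INTO author VALUES (1, 'A. Writer', '555-0101');
""")


def try_insert(sql, params):
    """Return True if the DBMS accepts the row, False if a rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)",
                ("0-000-00000-0", "Sample Title", 1, 1, 19.95))
negative_price = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)",
                            ("0-000-00000-1", "Bad Price", 1, 1, -5.0))
missing_phone = try_insert("INSERT INTO author VALUES (?, ?, ?)",
                           (2, "No Phone", None))
print(ok, negative_price, missing_phone)  # True False False
```

Putting the business rules in the schema means every application that uses the database gets the same checks without repeating them in its own code.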
Slide 10
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
• computer, peripherals, network devices
Software
• DBMS (manages the database)
• operating systems software (manages hardware & software)
• application programs (users access and manipulate the database)
People
• system administrators (manage general operations)
• database designers (architects of database structure)
• database administrators (ensure the database is functioning)
• systems analysts & programmers (design & implement database)
• end users (use application programs)
Procedures - rules of the company governing use of data
Data
[Slide callout: you are here]
In the beginning… (in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Patient record - Name: Jane Doe; Address: 123 Easy St.; City: London; Phone: 455-0897
Appointment/billing record - Date: Sept 14, 1955; Time: 2:00 p.m.; Patient: Jane Doe, 455-0897; OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing application -> Customer file; Order processing application -> Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
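Program/data dependence can be seen in a short sketch. The fixed-width record layout below is hypothetical (it is not taken from any real file system); the point is only that the layout is hard-coded in the program, so a change to the file structure silently breaks it.

```python
# Hypothetical fixed-width layout: name in columns 0-19, phone in columns 20-27.
OLD_RECORD = "Jane Doe".ljust(20) + "455-0897"


def read_phone(record: str) -> str:
    """A 'program' with the file layout hard-coded into it."""
    return record[20:28]


# The file structure changes: a 15-character city field is inserted after the name.
NEW_RECORD = "Jane Doe".ljust(20) + "London".ljust(15) + "455-0897"

old_phone = read_phone(OLD_RECORD)  # layout matches the program
new_phone = read_phone(NEW_RECORD)  # same program, new layout: reads the wrong columns
print(repr(old_phone), repr(new_phone))
```

Every program that reads this file would need the same correction, which is the maintenance burden the slide describes.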
Historical Roots of Database Systems
[Diagram: Customer processing, Order processing, and Employee processing applications all reach one shared Database through the DBMS]
DBMSs were developed to overcome the limitations of file systems, initially on mainframe computers in the late 1960s and early 1970s - a typical early DBMS cost $100,000 (many are still in use)
The first general-purpose database system, the Integrated Data Store (IDS), was created for General Electric (GE) and designed to run on GE machines; B.F. Goodrich ported IDS to the IBM 360, which became dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum | InvoiceAmt | CustomerName | OwnerName | Phone
201 | 854.00 | Cottage Grill | Ms. Doris Reaume | (616) 643-8821
209 | 1,106.00 | Cleo's Downtown Restaurant | Ms. Joan Hoffman | (616) 888-2046
214 | 1,070.50 | Jean's Country Restaurant | Ms. Jean Brooks | (517) 620-4431
221 | 1,607.00 | Maxwell's Restaurant | Ms. Barbara Feldon | (219) 333-0000
235 | 1,004.50 | Embers Restaurant | Mr. Clifford Merritt | (219) 816-2456
239 | 1,426.50 | The Empire | Ms. Curtis Haiar | (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
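As a minimal sketch of a self-describing database, the example below assumes Python's built-in sqlite3 module (not the DBMS used in the course). SQLite keeps its catalogue in a table named sqlite_master, and PRAGMA table_info reports the columns of a table; the orders table and its column types are an illustrative assumption based on the order table shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        OrderNum     INTEGER PRIMARY KEY,
        InvoiceAmt   REAL,
        CustomerName TEXT
    )
""")

# The catalogue table sqlite_master lists the objects the database contains.
table_names = [row[0] for row in
               conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(orders)")]

print(table_names)
print(columns)
```

The structure is read back with ordinary queries, without consulting the program that created the table.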
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching a source table is time-consuming without a “pointer” system (an index), which improves performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes on that data must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID | Name | Office
27 | Rodney Jones | Toronto
44 | Goro Azuma | Tokyo
35 | Francine Moire | Brussels
37 | Anne Abel | Tokyo
Office Index
Office | Employee ID
Toronto | 27
Tokyo | 44, 37
Brussels | 35
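The Office Index above can be sketched as a plain Python dictionary. This is an illustration of the idea only, not how any particular DBMS stores its indexes; the added employee 52 is hypothetical.

```python
from collections import defaultdict

# Salesperson table from the slide: (employee_id, name, office)
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the employee IDs found there."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)


office_index = build_office_index(salespeople)
tokyo_ids = office_index["Tokyo"]  # direct lookup, no scan of the table

# Overhead: adding a row means the index must be updated as well.
salespeople.append((52, "Hypothetical Hire", "Brussels"))
office_index.setdefault("Brussels", []).append(52)
print(office_index)
```

The lookup by office avoids reading every row, and the last two lines show the extra write that each update to the table costs.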
(d) Application metadata - stores the structure and format of
application components; not all DBMSs support this feature
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this, therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented: very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
[Diagram: example tables - Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
data is represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
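To show what "not linked with physical pointers" means in practice, here is a minimal sketch assuming Python's built-in sqlite3 module. The table names come from the slide, but the columns and rows are hypothetical. Rows in the two tables are related only because they hold matching agent_id values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agents (
        agent_id   INTEGER PRIMARY KEY,
        agent_name TEXT NOT NULL
    );
    CREATE TABLE engagements (
        engagement_id INTEGER PRIMARY KEY,
        venue         TEXT NOT NULL,
        agent_id      INTEGER NOT NULL REFERENCES agents (agent_id)
    );
    INSERT INTO agents VALUES (1, 'Agent A'), (2, 'Agent B');
    INSERT INTO engagements VALUES
        (10, 'Venue X', 1), (11, 'Venue Y', 1), (12, 'Venue Z', 2);
""")

# One-to-many: one agent books many engagements, matched by value at query time.
bookings = conn.execute("""
    SELECT a.agent_name, e.venue
    FROM agents AS a
    JOIN engagements AS e ON e.agent_id = a.agent_id
    ORDER BY e.engagement_id
""").fetchall()
print(bookings)
```

A many-to-many relationship would be handled the same way, with a third table holding pairs of matching key values.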
Evaluation of the Relational database model
Advantages
But #1 problem still is
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst
organizations
Disadvantages
ease of use - many untrained people create and use databases
without considering their design, and the results usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
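The three terms can be mirrored in a short Python sketch. This is an analogy rather than how a DBMS stores data: a record is modelled as a named tuple, and a table as a set of such tuples, so that row order carries no meaning.

```python
from typing import NamedTuple


class Salesperson(NamedTuple):
    """One record (row, tuple); each field (column, attribute) has a unique name."""
    employee_id: int
    name: str
    office: str


# A table (relation) modelled as a set of records: sets have no row order.
table_a = {
    Salesperson(27, "Rodney Jones", "Toronto"),
    Salesperson(44, "Goro Azuma", "Tokyo"),
}
table_b = {
    Salesperson(44, "Goro Azuma", "Tokyo"),
    Salesperson(27, "Rodney Jones", "Toronto"),
}
empty_table = set()  # a table may hold zero records

same_table = table_a == table_b
# Fields are read by name, so their position within the record does not matter.
offices = sorted(record.office for record in table_a)
print(same_table, offices)
```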
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
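The BOOK/PUBLISHER/AUTHOR schema above might be declared as follows. This is a sketch assuming Python's built-in sqlite3 module; the column types, the sample rows, and the helper try_insert are assumptions added for illustration. NOT NULL is used as an approximation of "cannot be left blank", and the rule that each publisher publishes at least one book is not enforced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE publisher (
        PubID    INTEGER PRIMARY KEY,
        PubName  TEXT NOT NULL,
        PubPhone TEXT
    );
    CREATE TABLE author (
        AuthID    INTEGER PRIMARY KEY,
        AuthName  TEXT NOT NULL,
        AuthPhone TEXT NOT NULL                 -- constraint: phone cannot be left blank
    );
    CREATE TABLE book (
        ISBN   TEXT PRIMARY KEY,
        Title  TEXT NOT NULL,
        AuthID INTEGER NOT NULL REFERENCES author (AuthID),
        PubID  INTEGER NOT NULL REFERENCES publisher (PubID),  -- exactly one publisher
        Price  REAL NOT NULL CHECK (Price >= 0) -- constraint: price not below zero
    );
    INSERT INTO publisher VALUES (1, 'Example Press', '555-0100');
    INSERT INTO author VALUES (1, 'A. Writer', '555-0101');
""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ok_book = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)",
                     ("0-00-000000-0", "Sample Title", 1, 1, 19.95))
negative_price = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)",
                            ("0-00-000000-1", "Bad Price", 1, 1, -5.0))
no_phone = try_insert("INSERT INTO author VALUES (?, ?, ?)", (2, "B. Writer", None))
unknown_publisher = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)",
                               ("0-00-000000-2", "Orphan", 1, 99, 10.0))
print(ok_book, negative_price, no_phone, unknown_publisher)
```

Because the rules live in the schema, the DBMS rejects bad rows no matter which application submits them.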
Slide 12
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically
Name:
Address:
City:
Phone:
Date:
Time:
Patient:
OHIP:
Jane Doe
123 Easy St.
London
455-0897
Sept 14, 1955
2:00 p.m.
Jane Doe, 455-0897
123456789
organized by function (use)
The first data management
systems performed clerical
tasks (transactional processing)
such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file
folder analogy); each record for
a single patient; another file for
appointment/ billing
information
Limitations of Data File Systems
Customer
processing
Application
Customer
file
Order
processing
Application
Order
file
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grow in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across organization files are often incompatible with one another (separate,
isolated data)
Data redundancy - duplicate information in two or more
files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - timeconsuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
Historical Roots of Database Systems
Customer
processing
Application
Order
processing
Application
DBMS
Database
Employee
processing
Application
Developed to overcome limitations of file systems, developed initially on
mainframe computers in late 60s and early 70s - a typical early DBMS
cost $100,000 (many are still in use)
First general databases were created for General Electric Company
(GEC) - Integrated Data Store (IDS), designed to run on GEC machines;
B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is
self-describing
(metadata or system
catalogues or data
dictionary)
A database contains
a description of its
own structure (e.g.,
the names of all the
tables, the names
and types of data in
each column in all
the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Data accessed from a source table for sorting and searching is
time-consuming without a “pointer” system, which improves
performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated,
all indexes must also be updated, therefore, reserve index for
cases in which they are needed
Salesperson
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma Tokyo
35
Francine Moire Brussels
37
Anne Abel
Tokyo
Office Index
Office
Toronto
Tokyo
Brussels
Employee ID
27
44, 37
35
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical
Network
Relational
still in use in many older (1970s) legacy
systems; very few new databases;
referred to “navigational systems”
the vast majority currently use this,
therefore, our course’s focus is here
Semantic
ObjectRelational
ObjectOriented
Very few new databases are
being created using ObjectOriented Programming (not
many ODBMS for businesses to
implement this model)
The Relational Database Model
[Figure: example tables - Agents, Clients, Entertainers, Engagements,
Instruments, Entertainer styles]
data is represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one,
one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
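To make "NOT linked with physical pointers" concrete, here is a minimal sketch using Python's sqlite3 module. Only the table names Entertainers, Clients and Engagements come from the slide; every column name and data value is invented for illustration. The tables are related purely by matching key values, and the Engagements table carries a many-to-many relationship between the other two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Entertainers (EntertainerID INTEGER PRIMARY KEY, StageName TEXT);
    CREATE TABLE Clients (ClientID INTEGER PRIMARY KEY, ClientName TEXT);
    -- Engagements relates the two tables by storing matching key values.
    CREATE TABLE Engagements (
        EngagementID  INTEGER PRIMARY KEY,
        EntertainerID INTEGER REFERENCES Entertainers(EntertainerID),
        ClientID      INTEGER REFERENCES Clients(ClientID)
    );
    INSERT INTO Entertainers VALUES (1, 'Jazz Trio'), (2, 'String Quartet');
    INSERT INTO Clients VALUES (10, 'Hotel A'), (11, 'Hotel B');
    INSERT INTO Engagements VALUES (100, 1, 10), (101, 1, 11), (102, 2, 10);
""")

# The join follows values, not stored addresses: rows match when keys are equal.
rows = conn.execute("""
    SELECT e.StageName, c.ClientName
    FROM Engagements AS g
    JOIN Entertainers AS e ON e.EntertainerID = g.EntertainerID
    JOIN Clients AS c ON c.ClientID = g.ClientID
    ORDER BY g.EngagementID
""").fetchall()

print(rows)
```

Because the link is just a shared value, the same tables can be combined in other ways later without restructuring the stored data.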
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst
organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and
use databases without considering their design, and so usually
incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
A database schema defines the database’s structure: its tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
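The schema above could be declared roughly as follows with Python's sqlite3 module. The table and column names are the slide's. The column types, the sample rows, and the helper try_insert are assumptions made for this sketch. The two business rules and the "one and only one publisher" rule become constraints that the DBMS itself enforces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE PUBLISHER (PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT);
    CREATE TABLE AUTHOR (
        AuthID    INTEGER PRIMARY KEY,
        AuthName  TEXT,
        AuthPhone TEXT NOT NULL              -- author phone cannot be left blank
    );
    CREATE TABLE BOOK (
        ISBN   TEXT PRIMARY KEY,
        Title  TEXT,
        AuthID INTEGER REFERENCES AUTHOR(AuthID),
        PubID  INTEGER NOT NULL REFERENCES PUBLISHER(PubID),  -- exactly one publisher
        Price  REAL CHECK (Price >= 0)       -- price cannot be less than zero
    );
""")

# Made-up sample rows that satisfy every rule.
conn.execute("INSERT INTO PUBLISHER VALUES (1, 'Example Press', '555-0100')")
conn.execute("INSERT INTO AUTHOR VALUES (1, 'A. Writer', '555-0101')")
conn.execute("INSERT INTO BOOK VALUES ('0-00-000000-0', 'Sample Title', 1, 1, 19.95)")


def try_insert(sql):
    """Run an INSERT and report whether the schema's rules rejected it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


negative_price = try_insert("INSERT INTO BOOK VALUES ('0-00-000000-1', 'Bad Price', 1, 1, -5)")
blank_phone = try_insert("INSERT INTO AUTHOR VALUES (2, 'No Phone', NULL)")
unknown_publisher = try_insert("INSERT INTO BOOK VALUES ('0-00-000000-2', 'Orphan', 1, 99, 10)")

print(negative_price, blank_phone, unknown_publisher)
```

The rule "each publisher publishes one or more books" is not captured here; a constraint of that kind usually has to be checked by the application or by a trigger.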
Slide 15
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum  InvoiceAmt  CustomerName                 OwnerName              Phone
201         854.00    Cottage Grill                Ms. Doris Reaume       (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant   Ms. Joan Hoffman       (616) 888-2046
214       1,070.50    Jean's Country Restaurant    Ms. Jean Brooks        (517) 620-4431
221       1,607.00    Maxwell's Restaurant         Ms. Barbara Feldon     (219) 333-0000
235       1,004.50    Embers Restaurant            Mr. Clifford Merritt   (219) 816-2456
239       1,426.50    The Empire                   Ms. Curtis Haiar       (616) 762-9144
[Figure: chart "Outstanding Invoice Amounts By Order" for orders 201 to 239; the figure carries the labels "data" and "information"]
What is a Database Management
System (DBMS)?
“A set of programs used to define, administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
[Diagram: user <-> Database Application <-> DBMS (e.g., Oracle, dBase, Access, Paradox) <-> Database]
What is a Database Application?
[Diagram: Database <-> DBMS <-> Database application]
A computer program that performs a specific task of practical value in a business situation
An interface that allows the user to enter and manipulate data; the user can request abstract views of data
Created by database designers and developers using a DBMS program or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
[Diagram: components of a DBMS]
Database: user data; metadata; indexes; application metadata
Design Tools Subsystem: Table Creation Tool; Form Creation Tool; Query Creation Tool; Report Creation Tool; Procedural Language Compiler
DBMS Engine
Run Time Subsystem: Form Processor; Query Processor; Report Writer; Procedural Language RunTime
developer -> Application program
users -> Application program
Types of Database Systems
Centralized (single site)
  microcomputer (desktop)
  legacy mainframe/minicomputer (1 CPU)
  client/server architecture (>1 CPU)
Distributed
  >1 site, requires network
  not widely adopted yet due to many problems

Type                              Example                                 # of concurrent users   Typical size of database
Personal                          Joe's House Painting Service            1                       < 10 Megabytes
Workgroup                         Video rental store                      < 25                    < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government       hundreds                > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)   possibly hundreds       Any
Our focus: centralized, microcomputer database
Three levels of Database Representation
Conceptual level: database design; logical, abstract description of data elements & their relationships
Internal level: physical implementation - access methods, index construction, data structure; database exists in reality only here
External level: each user group will have its own view of the database; database is accessed from here
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
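The three levels can be loosely illustrated with Python's built-in sqlite3 module. This is a sketch under stated assumptions, not the course's MS Access setup: the table name orders, the view name billing_view and the column names are hypothetical; only the two sample rows come from the order table shown earlier.

```python
import sqlite3

# Internal level: SQLite decides how rows are physically stored (here, in memory).
conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Conceptual level: the logical description of the data elements.
CREATE TABLE orders (
    order_num     INTEGER PRIMARY KEY,
    invoice_amt   REAL NOT NULL,
    customer_name TEXT NOT NULL,
    phone         TEXT
);
INSERT INTO orders VALUES (201,  854.00, 'Cottage Grill', '(616) 643-8821');
INSERT INTO orders VALUES (209, 1106.00, 'Cleo''s Downtown Restaurant', '(616) 888-2046');

-- External level: a hypothetical view for a billing group that
-- sees order numbers and amounts but not customer contact details.
CREATE VIEW billing_view AS
    SELECT order_num, invoice_amt FROM orders;
""")

billing_rows = conn.execute(
    "SELECT * FROM billing_view ORDER BY order_num").fetchall()
print(billing_rows)
```

The billing group queries billing_view without knowing how, or even in which table, the rows are stored.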
Focus of this course
Lectures: Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationships to one another.
Laboratories: Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
  computer, peripherals, network devices
Software
  DBMS (manages the database)
  operating system software (manages hardware & software)
  application programs (users access and manipulate the database)
People
  system administrators (manage general operations)
  database designers (architects of database structure)
  database administrators (ensure the database is functioning)
  systems analysts & programmers (design & implement database)
  end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transaction processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Example patient record - Name: Jane Doe; Address: 123 Easy St.; City: London; Phone: 455-0897
Example appointment/billing record - Date: Sept 14, 1955; Time: 2:00 p.m.; Patient: Jane Doe, 455-0897; OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application -> Customer file; Order processing Application -> Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity, due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program and therefore often omitted
Difficulty of representing data from the users’ perspective
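Program/data dependence can be sketched in a few lines of Python. The fixed-width layout below (a 20-character name followed by an 8-character phone number) is a hypothetical file format invented for this example; it stands in for the COBOL-era record layouts the slide refers to.

```python
# Hypothetical fixed-width layout: name in columns 0-19, phone in columns 20-27.
# The offsets are hard-coded into the program, as in early file systems.
PHONE = slice(20, 28)

def read_phone(record: str) -> str:
    """Extract the phone field using offsets built into the program."""
    return record[PHONE].strip()

old_record = "Jane Doe".ljust(20) + "455-0897"
print(repr(read_phone(old_record)))

# The file structure changes: the name field grows to 30 characters.
# The unchanged program now reads padding instead of the phone number.
new_record = "Jane Doe".ljust(30) + "455-0897"
broken = read_phone(new_record)
print(repr(broken))
```

Every program that shares these offsets has to be found and edited after such a change, which is the maintenance cost described above.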
Historical Roots of Database Systems
[Diagram: Customer processing Application, Order processing Application and Employee processing Application all access one Database through the DBMS]
Developed to overcome the limitations of file systems; developed initially on mainframe computers in the late 1960s and early 1970s - a typical early DBMS cost $100,000 (many are still in use)
The first general-purpose database system, Integrated Data Store (IDS), was created at General Electric (GE) and designed to run on GE machines; B.F. Goodrich ported IDS to the IBM 360, where it became dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) An organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum  InvoiceAmt  CustomerName                 OwnerName              Phone
201         854.00    Cottage Grill                Ms. Doris Reaume       (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant   Ms. Joan Hoffman       (616) 888-2046
214       1,070.50    Jean's Country Restaurant    Ms. Jean Brooks        (517) 620-4431
221       1,607.00    Maxwell's Restaurant         Ms. Barbara Feldon     (219) 333-0000
235       1,004.50    Embers Restaurant            Mr. Clifford Merritt   (219) 816-2456
239       1,426.50    The Empire                   Ms. Curtis Haiar       (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata, system catalogues, or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
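Self-description can be seen directly in SQLite, used here through Python's sqlite3 module as a stand-in for the course's DBMS. The salesperson table and its column names are assumptions for the example; sqlite_master and PRAGMA table_info are SQLite's own catalogue facilities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salesperson ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, office TEXT)"
)

# Names of all tables, read from SQLite's catalogue table.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# Column names and declared types; each row of table_info is
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(salesperson)")]

print(table_names)
print(columns)
```

The program never stated the structure a second time; it asked the database to describe itself.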
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching a source table directly is time-consuming; an index acts as a “pointer” system that improves the performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID   Name             Office
27            Rodney Jones     Toronto
44            Goro Azuma       Tokyo
35            Francine Moire   Brussels
37            Anne Abel        Tokyo
Office Index
Office     Employee ID
Toronto    27
Tokyo      44, 37
Brussels   35
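The Office Index above can be reproduced with a plain Python dictionary. This is a conceptual sketch only: a real DBMS typically keeps its indexes in on-disk structures such as B-trees, and the function name build_office_index is an assumption for the example.

```python
from collections import defaultdict

# Salesperson rows from the slide: (employee_id, name, office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]

def build_office_index(rows):
    """Map each office to the employee IDs located there, in table order."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)

office_index = build_office_index(salespeople)

# One lookup replaces a scan of every row in the table.
print(office_index["Tokyo"])
```

The overhead cost is visible here as well: whenever a salesperson is added, removed or moved to another office, office_index has to be updated along with the table.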
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority of databases currently use this model; therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented: very few new databases are being created using Object-Oriented Programming (not many ODBMS are available for businesses to implement this model)
The Relational Database Model
[Diagram: example tables - Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
data is represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
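The point that tables are related by matching values rather than by physical pointers can be sketched with sqlite3. The Agents and Engagements table names come from the slide; every column name and sample row below is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agents (
    agent_id   INTEGER PRIMARY KEY,
    agent_name TEXT
);
CREATE TABLE engagements (
    engagement_id INTEGER PRIMARY KEY,
    agent_id      INTEGER REFERENCES agents(agent_id),
    venue         TEXT
);
INSERT INTO agents VALUES (1, 'Agent A'), (2, 'Agent B');
INSERT INTO engagements VALUES
    (10, 1, 'Venue X'), (11, 1, 'Venue Y'), (12, 2, 'Venue Z');
""")

# One-to-many relationship: an engagement row refers to its agent only by
# carrying the same agent_id value; the join matches rows on that value.
rows = conn.execute("""
    SELECT a.agent_name, COUNT(*)
    FROM agents AS a
    JOIN engagements AS e ON e.agent_id = a.agent_id
    GROUP BY a.agent_id
    ORDER BY a.agent_id
""").fetchall()
print(rows)
```

Because the link is just a shared value, the same two tables can be combined in other ways by later queries without restructuring the stored data.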
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use databases without considering their design, and so usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
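The BOOK / PUBLISHER / AUTHOR schema above could be declared roughly as follows in SQLite, via Python's sqlite3 module. This is one possible rendering, not the course's Access implementation: the choice of primary keys, the column types, the sample rows and the helper try_insert_book are all assumptions, and NOT NULL only approximates "cannot be left blank".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publisher (
    pub_id    INTEGER PRIMARY KEY,
    pub_name  TEXT NOT NULL,
    pub_phone TEXT
);
CREATE TABLE author (
    auth_id    INTEGER PRIMARY KEY,
    auth_name  TEXT NOT NULL,
    auth_phone TEXT NOT NULL   -- constraint: author phone cannot be left blank
);
CREATE TABLE book (
    isbn    TEXT PRIMARY KEY,
    title   TEXT NOT NULL,
    auth_id INTEGER NOT NULL REFERENCES author(auth_id),
    pub_id  INTEGER NOT NULL REFERENCES publisher(pub_id),  -- one and only one publisher
    price   REAL NOT NULL CHECK (price >= 0)                -- constraint: price not below zero
);
INSERT INTO publisher VALUES (1, 'Example Press', '555-0100');
INSERT INTO author VALUES (1, 'A. Writer', '555-0101');
""")

def try_insert_book(price):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO book VALUES (?, 'Sample Title', 1, 1, ?)",
            (f"isbn-{price}", price),
        )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert_book(19.95)
rejected = try_insert_book(-1)
print(accepted, rejected)
```

The business rules live in the schema itself, so every application that uses the database is held to them.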
Slide 17
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically
Name:
Address:
City:
Phone:
Date:
Time:
Patient:
OHIP:
Jane Doe
123 Easy St.
London
455-0897
Sept 14, 1955
2:00 p.m.
Jane Doe, 455-0897
123456789
organized by function (use)
The first data management
systems performed clerical
tasks (transactional processing)
such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file
folder analogy); each record for
a single patient; another file for
appointment/ billing
information
Limitations of Data File Systems
Customer
processing
Application
Customer
file
Order
processing
Application
Order
file
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grow in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
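The program/data dependence problem above can be sketched in a few lines. This is only an illustration: the pipe-delimited record layout, the added "Sex" field, and the table name "patient" are assumptions made for the example, not part of the original slides. SQLite stands in for a DBMS.

```python
import sqlite3

# File-system style: every program hard-codes the record layout.
# The pipe-delimited layout is a hypothetical stand-in for a 1950s data file.
OLD_RECORD = "Jane Doe|123 Easy St.|London|455-0897"
NEW_RECORD = "Jane Doe|F|123 Easy St.|London|455-0897"  # a field was added


def phone_from_file_record(record):
    """The program must know HOW the data is stored: phone is the 4th field."""
    return record.split("|")[3]


def phone_from_database(conn, name):
    """The program states WHAT it wants; the DBMS works out how to find it."""
    row = conn.execute(
        "SELECT Phone FROM patient WHERE Name = ?", (name,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (Name TEXT, Address TEXT, City TEXT, Phone TEXT)"
)
conn.execute(
    "INSERT INTO patient VALUES ('Jane Doe', '123 Easy St.', 'London', '455-0897')"
)

phone_before = phone_from_database(conn, "Jane Doe")
conn.execute("ALTER TABLE patient ADD COLUMN Sex TEXT")  # the structure changes
phone_after = phone_from_database(conn, "Jane Doe")      # the query is unchanged
```

After the layout change, the file-based function silently returns the city instead of the phone number, while the database query keeps working because it refers to the column by name.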
Historical Roots of Database Systems
[Diagram: the Customer processing, Order processing, and Employee processing applications all reach a single Database through the DBMS]
Developed to overcome limitations of file systems; developed initially on mainframe computers in the late 1960s and early 1970s - a typical early DBMS cost $100,000 (many are still in use)
First general databases were created for General Electric Company (GE) - Integrated Data Store (IDS), designed to run on GE machines; B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201         854.00    Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
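A quick way to see self-description in practice is to ask a DBMS for its own catalogue. The sketch below uses SQLite (not the Access 97 system used in the laboratories) and a hypothetical "salesperson" table; `sqlite_master` and `PRAGMA table_info` are SQLite's catalogue mechanisms, and other systems expose theirs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, created only so the catalogue has something to describe.
conn.execute(
    "CREATE TABLE salesperson ("
    "EmployeeID INTEGER PRIMARY KEY, Name TEXT, Office TEXT)"
)


def describe(conn):
    """Return {table name: [(column name, declared type), ...]}."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    catalogue = {}
    for table in tables:
        # table_info rows are (cid, name, type, notnull, dflt_value, pk).
        # The table name comes from the catalogue itself, not from user input.
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        catalogue[table] = [(col[1], col[2]) for col in columns]
    return catalogue


catalogue = describe(conn)
```

No separate documentation file was consulted: the table names, column names, and data types all came from the database itself.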
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching data directly in a source table is time-consuming without a “pointer” system; an index provides those pointers, which improves performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
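The Office Index can be mimicked with an ordinary dictionary. This is a simplified sketch of the idea, not how a real DBMS stores its indexes; the function names are invented for the example.

```python
from collections import defaultdict

# The Salesperson table from the slide: (Employee ID, Name, Office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the Employee IDs found there, in table order."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)


def add_salesperson(rows, index, employee_id, name, office):
    """Insert a row AND update the index: this extra step is the overhead."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)


office_index = build_office_index(salespeople)
```

Looking up `office_index["Tokyo"]` avoids scanning every row, but every insert now has to touch both the table and the index, which is why indexes are reserved for columns that are actually searched or sorted.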
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical, Network - still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational - the vast majority currently use this; therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented - very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
[Diagram: example tables - Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
data is represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
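Because tables are not linked with physical pointers, rows are related only through matching values. The sketch below uses two of the table names from the diagram; every column name and data value is hypothetical, and SQLite stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE agents (
        AgentID   INTEGER PRIMARY KEY,
        AgentName TEXT
    );
    CREATE TABLE engagements (
        EngagementID INTEGER PRIMARY KEY,
        AgentID      INTEGER REFERENCES agents(AgentID),
        Venue        TEXT
    );
    INSERT INTO agents VALUES (1, 'Agent A'), (2, 'Agent B');
    INSERT INTO engagements VALUES
        (10, 1, 'Hall X'), (11, 1, 'Hall Y'), (12, 2, 'Hall Z');
    """
)

# The only "link" is that engagements.AgentID holds the same value as
# agents.AgentID; the join matches rows on that shared value.
booked = conn.execute(
    """
    SELECT a.AgentName, e.Venue
    FROM agents AS a
    JOIN engagements AS e ON e.AgentID = a.AgentID
    ORDER BY e.EngagementID
    """
).fetchall()
```

One agent appearing against several engagements is a one-to-many relationship expressed purely through the repeated AgentID value.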
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use databases without considering their design, so the databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
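The schema above can be written down so that the DBMS enforces it. This is a minimal sketch in SQLite rather than Access 97; the choice of primary keys, the column types, and the sample publisher and author are assumptions, since the slide lists only table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only if asked
conn.executescript(
    """
    CREATE TABLE PUBLISHER (
        PubID    INTEGER PRIMARY KEY,
        PubName  TEXT,
        PubPhone TEXT
    );
    CREATE TABLE AUTHOR (
        AuthID    INTEGER PRIMARY KEY,
        AuthName  TEXT,
        AuthPhone TEXT NOT NULL              -- phone cannot be left blank
    );
    CREATE TABLE BOOK (
        ISBN   TEXT PRIMARY KEY,
        Title  TEXT,
        AuthID INTEGER REFERENCES AUTHOR(AuthID),
        -- each book has one and only one publisher
        PubID  INTEGER NOT NULL REFERENCES PUBLISHER(PubID),
        Price  REAL CHECK (Price >= 0)       -- price cannot be less than zero
    );
    INSERT INTO PUBLISHER VALUES (1, 'Example Press', '555-0100');
    INSERT INTO AUTHOR VALUES (1, 'Example Author', '555-0101');
    """
)


def try_insert(sql, params):
    """Return True if the DBMS accepts the row, False if a rule rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


BOOK_SQL = "INSERT INTO BOOK VALUES (?, ?, ?, ?, ?)"
AUTHOR_SQL = "INSERT INTO AUTHOR VALUES (?, ?, ?)"
```

For instance, `try_insert(BOOK_SQL, ("isbn-2", "Bad Price", 1, 1, -5.0))` is rejected by the CHECK constraint, so the business rule lives in the schema instead of in every application program.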
Slide 18
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
                        Word processing  Spreadsheet  Database
Text handling           excellent        fair         poor
Mathematical functions  poor             excellent    very good
Ease of Use             excellent        good         fair
Training Cost           low              moderate     high
Software Cost           low              moderate     high
Volume of data          low              moderate     very high
Multiuser Access        low              moderate     very high
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As the scale of an organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources increases)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through organizational commitment
Adapted from McFadden, F.R. & Hoffer, J.A. (1994). Modern Database Management. Redwood City, CA: Benjamin/Cummings Publishing (p. 6)
Information is a major organizational resource
[Diagram: processing turns data into information, knowledge, and action]
Data (isolated facts): John bought 50 in July; John bought 10 in Aug; Jane bought 30 in July; Jane bought 20 in Aug
Information (organized data): Average for July is 40; average for Aug is 15
Knowledge: Sales have dropped between July and August
Action: Survey customers; invest in advertising; cut costs; expand product line
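The step from data to information in this example is plain arithmetic, and it can be sketched as follows. The purchase figures come from the slide; the function and variable names are invented for the illustration.

```python
from statistics import mean

# Data: isolated facts, one tuple per purchase (customer, month, units).
purchases = [
    ("John", "July", 50),
    ("John", "Aug", 10),
    ("Jane", "July", 30),
    ("Jane", "Aug", 20),
]


def average_by_month(rows):
    """Group the units by month and return the average purchase per month."""
    units_by_month = {}
    for _customer, month, units in rows:
        units_by_month.setdefault(month, []).append(units)
    return {month: mean(units) for month, units in units_by_month.items()}


# Information: the same facts, organized.
averages = average_by_month(purchases)

# Knowledge: an interpretation of the information.
sales_dropped = averages["Aug"] < averages["July"]
```

The action (survey customers, invest in advertising, and so on) is the part that still needs a person.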
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201         854.00    Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
[Chart: Outstanding Invoice Amounts By Order, plotted for orders 201, 209, 214, 221, 235, 239, with the labels "data" and "information"]
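To make the table concrete, the same rows can be loaded into a small database and queried. This sketch uses Python's built-in SQLite module rather than the Access 97 system used in the laboratories; the table name "orders" and the 1,100 threshold are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        OrderNum     INTEGER PRIMARY KEY,
        InvoiceAmt   REAL,
        CustomerName TEXT,
        OwnerName    TEXT,
        Phone        TEXT
    )
    """
)
rows = [
    (201, 854.00, "Cottage Grill", "Ms. Doris Reaume", "(616) 643-8821"),
    (209, 1106.00, "Cleo's Downtown Restaurant", "Ms. Joan Hoffman", "(616) 888-2046"),
    (214, 1070.50, "Jean's Country Restaurant", "Ms. Jean Brooks", "(517) 620-4431"),
    (221, 1607.00, "Maxwell's Restaurant", "Ms. Barbara Feldon", "(219) 333-0000"),
    (235, 1004.50, "Embers Restaurant", "Mr. Clifford Merritt", "(219) 816-2456"),
    (239, 1426.50, "The Empire", "Ms. Curtis Haiar", "(616) 762-9144"),
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)

# Two questions asked of the stored data.
total_outstanding = conn.execute(
    "SELECT SUM(InvoiceAmt) FROM orders"
).fetchone()[0]
large_invoices = conn.execute(
    "SELECT OrderNum, InvoiceAmt FROM orders "
    "WHERE InvoiceAmt > ? ORDER BY InvoiceAmt DESC",
    (1100,),
).fetchall()
```

The stored rows are the data; the total and the ranked list of large invoices are information derived from them on request.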
What is a Database Management
System (DBMS)?
“A set of programs used to define, administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
[Diagram: user - Database Application - DBMS (e.g., Oracle, dBase, Access, Paradox) - Database]
What is a Database Application?
[Diagram: Database - DBMS - Database application]
A computer program that performs a specific task of practical value in a business situation
An interface that allows the user to enter and manipulate data; the user can request abstract views of data
Created by database designers and developers using a DBMS program or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
[Diagram: the DBMS sits between the database and the people who use it]
Database: user data; metadata; indexes; application metadata
DBMS Engine
Design Tools Subsystem: Table Creation Tool; Form Creation Tool; Query Creation Tool; Report Creation Tool; Procedural Language Compiler
Run Time Subsystem: Form Processor; Query Processor; Report Writer; Procedural Language RunTime
Other labels in the diagram: developer; users; application programs
Types of Database Systems
Centralized (single site)
microcomputer (desktop)
legacy mainframe/mini computer (1 CPU)
client/server architecture (>1 CPU)
Distributed
>1 site, requires network
not widely adopted yet due to many problems
Type                              Example                                # of concurrent users  Typical size of database
Personal                          Joe's House Painting Service           1                      < 10 Megabytes
Workgroup                         Video rental store                     < 25                   < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government      hundreds               > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)  possibly hundreds      Any
our focus: centralized, microcomputer database
Three levels of Database Representation
External level - each user group will have its own view of the database; database is accessed from here
Conceptual level - database design; logical, abstract description of data elements & their relationships
Internal level - physical implementation - access methods, index construction, data structure; database exists in reality only here
The primary focus of the lectures in this course is the conceptual level, because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
Slide 20
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
                         Word processing   Spreadsheet   Database
Text handling            excellent         fair          poor
Mathematical functions   poor              excellent     very good
Ease of Use              excellent         good          fair
Training Cost            low               moderate      high
Software Cost            low               moderate      high
Volume of data           low               moderate      very high
Multiuser Access         low               moderate      very high
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: physical resources and
conceptual resources (data & information)
As the scale of an organization grows, it becomes increasingly difficult
to manage by observation, so managers rely increasingly on conceptual resources
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
organizational commitment
Adapted from McFadden, F.R. & Hoffer, J.A. (1994). Modern Database
Management. Redwood City, CA: Benjamin/Cummings Publishing (p. 6)
Information is a major organizational resource
Action: Survey customers; invest in advertising; cut costs; expand product line
Knowledge: Sales have dropped between July and August
Information (organized data): Average for July is 40; average for Aug is 15
Data (isolated facts): John bought 50 in July; John bought 10 in Aug;
Jane bought 30 in July; Jane bought 20 in Aug
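The step from data to information to knowledge on this slide can be sketched in a few lines of Python. This is only an illustration using the four purchase facts from the slide; the names `purchases`, `monthly_average`, and `sales_dropped` are made up for the example.

```python
from statistics import mean

# Data: isolated facts taken from the slide (quantity bought per person per month).
purchases = [
    ("John", "July", 50),
    ("John", "Aug", 10),
    ("Jane", "July", 30),
    ("Jane", "Aug", 20),
]


def monthly_average(rows, month):
    """Information: organize the raw facts into an average for one month."""
    return mean(qty for _buyer, m, qty in rows if m == month)


july_avg = monthly_average(purchases, "July")
aug_avg = monthly_average(purchases, "Aug")

# Knowledge: an interpretation of the organized data.
sales_dropped = aug_avg < july_avg

print(f"July average: {july_avg}, Aug average: {aug_avg}, dropped: {sales_dropped}")
```

The action level (survey customers, cut costs, and so on) is a human decision taken on the basis of that knowledge, so it is not modeled here.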
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum   InvoiceAmt   CustomerName                 OwnerName              Phone
201          854.00     Cottage Grill                Ms. Doris Reaume       (616) 643-8821
209        1,106.00     Cleo's Downtown Restaurant   Ms. Joan Hoffman       (616) 888-2046
214        1,070.50     Jean's Country Restaurant    Ms. Jean Brooks        (517) 620-4431
221        1,607.00     Maxwell's Restaurant         Ms. Barbara Feldon     (219) 333-0000
235        1,004.50     Embers Restaurant            Mr. Clifford Merritt   (219) 816-2456
239        1,426.50     The Empire                   Ms. Curtis Haiar       (616) 762-9144
[Figure: Outstanding Invoice Amounts By Order - a chart of the order numbers 201 to 239 that presents the stored data as information]
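As a rough sketch of what "an organized collection of related data stored for easy, efficient use" can look like in practice, the order rows from this slide can be loaded into an in-memory SQLite table and queried. The table name `orders` and the snake_case column names are assumptions made for the example; the course itself uses MS Access, not SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database, nothing is written to disk
conn.execute(
    """CREATE TABLE orders (
           order_num     INTEGER PRIMARY KEY,
           invoice_amt   REAL,
           customer_name TEXT,
           owner_name    TEXT,
           phone         TEXT)"""
)

rows = [
    (201, 854.00, "Cottage Grill", "Ms. Doris Reaume", "(616) 643-8821"),
    (209, 1106.00, "Cleo's Downtown Restaurant", "Ms. Joan Hoffman", "(616) 888-2046"),
    (214, 1070.50, "Jean's Country Restaurant", "Ms. Jean Brooks", "(517) 620-4431"),
    (221, 1607.00, "Maxwell's Restaurant", "Ms. Barbara Feldon", "(219) 333-0000"),
    (235, 1004.50, "Embers Restaurant", "Mr. Clifford Merritt", "(219) 816-2456"),
    (239, 1426.50, "The Empire", "Ms. Curtis Haiar", "(616) 762-9144"),
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)

# Information derived from the stored data.
total_outstanding = conn.execute("SELECT SUM(invoice_amt) FROM orders").fetchone()[0]
largest = conn.execute(
    "SELECT order_num, customer_name FROM orders ORDER BY invoice_amt DESC LIMIT 1"
).fetchone()

print(total_outstanding, largest)
```

The same rows could be kept in a spreadsheet; the point of the sketch is only that a database lets the questions be asked declaratively, without writing a loop per report.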
What is a Database Management System (DBMS)?
“A set of programs used to define, administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
[Diagram: user <-> Database Application <-> DBMS (e.g., Oracle, dBase, Access, Paradox) <-> Database]
What is a Database Application?
[Diagram: Database, DBMS, Database application]
A computer program that performs a specific task of practical value in a
business situation
An interface that allows the user to enter and manipulate data; the user
can request abstract views of data
Created by database designers and developers using a DBMS program or a
programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
Database
• user data
• metadata
• indexes
• application metadata
DBMS Engine
Design Tools Subsystem
• Table Creation Tool
• Form Creation Tool
• Query Creation Tool
• Report Creation Tool
• Procedural Language Compiler
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language RunTime
[Diagram labels: the developer and the users each reach the DBMS through an application program]
Types of Database Systems
Centralized (single site)
• microcomputer (desktop)
• legacy mainframe/mini computer (1 CPU)
• client/server architecture (>1 CPU)
Distributed
• >1 site, requires network
• not widely adopted yet due to many problems
Type                               Example                                 # of concurrent users   Typical size of database
Personal                           Joe's House Painting Service            1                       < 10 Megabytes
Workgroup                          Video rental store                      < 25                    < 100 Megabytes
Organizational (enterprise)        Larger Corporations or Government       hundreds                > 1 Trillion bytes
Multimedia (Internet technology)   Holiday resort bookings (with photos)   possibly hundreds       Any
Our focus: centralized, microcomputer database
Three levels of Database Representation
Conceptual level: database design; logical, abstract description of data
elements & their relationships
Internal level: physical implementation - access methods, index
construction, data structure; database exists in reality only here
External level: each user group will have its own view of the database;
database is accessed from here
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
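One way to picture the three levels is the following sketch in SQLite, which is an assumption of this example (the labs use MS Access). The `salesperson` table, its `salary` column, the index name, and the view name are all hypothetical. The table definition stands for the conceptual level, the index is one of the few internal-level decisions exposed to the designer, and the view is an external-level presentation for one user group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Conceptual level: the logical description of the data elements.
    CREATE TABLE salesperson (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        office      TEXT,
        salary      REAL   -- hypothetical column, not on the slides
    );
    INSERT INTO salesperson VALUES
        (27, 'Rodney Jones', 'Toronto', 50000),
        (44, 'Goro Azuma',   'Tokyo',   52000);

    -- Internal level: physical access structures, managed by the DBMS.
    CREATE INDEX idx_salesperson_office ON salesperson (office);

    -- External level: one user group's view, which hides the salary column.
    CREATE VIEW office_directory AS
        SELECT name, office FROM salesperson;
    """
)

directory = conn.execute(
    "SELECT name, office FROM office_directory ORDER BY name"
).fetchall()
print(directory)
```

Users of `office_directory` never see how the rows are stored or indexed, which is the separation the three-level picture is meant to convey.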
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing
a model, identifying the tables that are required, designing normalized
tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables)
and database applications (queries, forms, reports, programs) using a
typical microcomputer relational database management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (users access and manipulate the database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
Patient file: Name: Jane Doe; Address: 123 Easy St.; City: London; Phone: 455-0897
Appointment file: Date: Sept 14, 1955; Time: 2:00 p.m.; Patient: Jane Doe, 455-0897; OHIP: 123456789
The first data management systems performed clerical tasks
(transactional processing) such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for
a single patient; another file for appointment/billing information
Limitations of Data File Systems
[Diagram: the Customer processing Application uses its own Customer file; the Order processing Application uses its own Order file]
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is to be done as well as how it is to be done
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are
often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more
files
Program/data dependence - if the file structure changed,
ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
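The redundancy problem can be made concrete with a small sketch. It models the patient and appointment files from the earlier slide as two separate Python lists, each owned by a different program; the new phone number `455-1234` and the function name are invented for the illustration.

```python
# Two separate "files", each maintained by its own application.
patient_file = [
    {"name": "Jane Doe", "address": "123 Easy St.", "city": "London", "phone": "455-0897"},
]
appointment_file = [
    {"date": "Sept 14, 1955", "time": "2:00 p.m.", "patient": "Jane Doe", "phone": "455-0897"},
]


def update_patient_phone(name, new_phone):
    """The patient application only knows about its own file."""
    for record in patient_file:
        if record["name"] == name:
            record["phone"] = new_phone


# The phone number changes, but only one copy of it gets updated.
update_patient_phone("Jane Doe", "455-1234")  # hypothetical new number

# Find patients whose phone number now disagrees between the two files.
inconsistent = [
    (p["name"], p["phone"], a["phone"])
    for p in patient_file
    for a in appointment_file
    if a["patient"] == p["name"] and a["phone"] != p["phone"]
]
print(inconsistent)
```

Because the phone number is stored twice, the two files disagree after a single update. Storing each fact once and letting every application go through a DBMS is what removes this class of error.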
Historical Roots of Database Systems
[Diagram: the Customer processing, Order processing, and Employee processing Applications all access one Database through the DBMS]
Developed to overcome limitations of file systems; developed initially on
mainframe computers in the late 1960s and early 1970s - a typical early DBMS
cost $100,000 (many are still in use)
First general databases were created for General Electric (GE) -
Integrated Data Store (IDS), designed to run on GE machines;
B.F. Goodrich ported IDS to the IBM 360 - it was dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum   InvoiceAmt   CustomerName                 OwnerName              Phone
201          854.00     Cottage Grill                Ms. Doris Reaume       (616) 643-8821
209        1,106.00     Cleo's Downtown Restaurant   Ms. Joan Hoffman       (616) 888-2046
214        1,070.50     Jean's Country Restaurant    Ms. Jean Brooks        (517) 620-4431
221        1,607.00     Maxwell's Restaurant         Ms. Barbara Feldon     (219) 333-0000
235        1,004.50     Embers Restaurant            Mr. Clifford Merritt   (219) 816-2456
239        1,426.50     The Empire                   Ms. Curtis Haiar       (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data
dictionary)
A database contains a description of its own structure (e.g., the names of
all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
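To see what "self-describing" means in a running system, here is a minimal sketch using SQLite, which is an assumption of this example rather than the DBMS used in the course. SQLite keeps its catalogue in a table called `sqlite_master` and reports column details through `PRAGMA table_info`; the `salesperson` table is only a stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salesperson (employee_id INTEGER PRIMARY KEY, name TEXT, office TEXT)"
)

# Metadata: ask the database which tables it contains.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Metadata: ask for the name and declared type of every column in one table.
# Each PRAGMA row is (cid, name, type, notnull, dflt_value, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(salesperson)")
]

print(table_names)
print(columns)
```

Other products expose the same idea under different names (system catalogue, data dictionary), so the exact queries differ from one DBMS to another.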
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching data directly in a source table is time-consuming;
an index acts as a “pointer” system that improves the performance and
accessibility of the database
The “overhead cost” of indexing is that each time data is updated,
all indexes must also be updated; therefore, reserve indexes for
cases in which they are needed
Salesperson
Employee ID   Name             Office
27            Rodney Jones     Toronto
44            Goro Azuma       Tokyo
35            Francine Moire   Brussels
37            Anne Abel        Tokyo
Office Index
Office     Employee ID
Toronto    27
Tokyo      44, 37
Brussels   35
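The Office Index on this slide can be reproduced with a short sketch. It is a simplified, in-memory illustration of the idea and not how any particular DBMS stores its indexes; `build_office_index` and `add_salesperson` are names invented here.

```python
from collections import defaultdict

# The Salesperson table from the slide: (employee_id, name, office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the employee IDs located there."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return index


def add_salesperson(rows, index, employee_id, name, office):
    """Every insert touches the table AND the index: the overhead cost."""
    rows.append((employee_id, name, office))
    index[office].append(employee_id)


office_index = build_office_index(salespeople)

# Looking up an office no longer requires scanning the whole table.
tokyo_staff = list(office_index["Tokyo"])
print(tokyo_staff)
```

`add_salesperson` shows why indexes are not free: each change to the table has to be mirrored in every index built on it.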
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy
systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this,
therefore, our course’s focus is here
Semantic, Object-Relational, Object-Oriented: very few new databases are
being created using Object-Oriented Programming (not many ODBMS for
businesses to implement this model)
The Relational Database Model
[Diagram: example tables Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
data is represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one,
one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
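A small sketch may help show what "not linked with physical pointers" means: rows in different tables are related only by matching values. The example uses SQLite and three of the table names from the slide; every column name and every row is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entertainers (entertainer_id INTEGER PRIMARY KEY, stage_name TEXT);
    CREATE TABLE clients      (client_id      INTEGER PRIMARY KEY, client_name TEXT);

    -- Engagements links the other two tables purely by stored ID values,
    -- which lets it represent a many-to-many relationship.
    CREATE TABLE engagements (
        engagement_id  INTEGER PRIMARY KEY,
        entertainer_id INTEGER REFERENCES entertainers(entertainer_id),
        client_id      INTEGER REFERENCES clients(client_id)
    );

    INSERT INTO entertainers VALUES (1, 'Jazz Trio'), (2, 'String Quartet');
    INSERT INTO clients      VALUES (10, 'City Hall'), (11, 'Harbour Hotel');
    INSERT INTO engagements  VALUES (100, 1, 10), (101, 1, 11), (102, 2, 10);
    """
)

# The join matches rows by value at query time; no pointers are stored.
bookings = conn.execute(
    """
    SELECT e.stage_name, c.client_name
    FROM engagements AS g
    JOIN entertainers AS e ON e.entertainer_id = g.entertainer_id
    JOIN clients      AS c ON c.client_id      = g.client_id
    ORDER BY g.engagement_id
    """
).fetchall()
print(bookings)
```

One entertainer appears with two clients and one client with two entertainers, which is the many-to-many case that the earlier hierarchical model could not express directly.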
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst
organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and
use databases without considering their design, so the databases usually
incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
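The schema on this slide could be written down roughly as follows. This is a sketch in SQLite, not the Access design used in the labs: the snake_case column names, the data types, the sample rows, and the helper `try_insert` are assumptions, while the rules themselves (one publisher per book, price not below zero, author phone not blank) come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE publisher (
        pub_id    INTEGER PRIMARY KEY,
        pub_name  TEXT NOT NULL,
        pub_phone TEXT
    );
    CREATE TABLE author (
        auth_id    INTEGER PRIMARY KEY,
        auth_name  TEXT NOT NULL,
        auth_phone TEXT NOT NULL              -- author phone cannot be left blank
    );
    CREATE TABLE book (
        isbn    TEXT PRIMARY KEY,
        title   TEXT NOT NULL,
        auth_id INTEGER NOT NULL REFERENCES author(auth_id),
        pub_id  INTEGER NOT NULL REFERENCES publisher(pub_id),  -- exactly one publisher
        price   REAL CHECK (price >= 0)       -- price cannot be less than zero
    );
    INSERT INTO publisher VALUES (1, 'Example Press', '555-0100');
    INSERT INTO author    VALUES (1, 'A. Writer', '555-0101');
    INSERT INTO book      VALUES ('0-00-000000-0', 'Sample Title', 1, 1, 19.95);
    """
)


def try_insert(sql):
    """Return 'accepted' or 'rejected' depending on the schema's constraints."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


negative_price = try_insert(
    "INSERT INTO book VALUES ('0-00-000000-1', 'Bad Price', 1, 1, -5)"
)
blank_phone = try_insert("INSERT INTO author VALUES (2, 'No Phone', NULL)")
unknown_publisher = try_insert(
    "INSERT INTO book VALUES ('0-00-000000-2', 'Orphan', 1, 99, 10)"
)
print(negative_price, blank_phone, unknown_publisher)
```

Putting the business rules in the schema means the DBMS, rather than each application program, is responsible for rejecting invalid data.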
Slide 22
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
Database
• user data
• metadata
• indexes
• application metadata
DBMS
Design Tools Subsystem
• Table Creation Tool
• Form Creation Tool
• Query Creation Tool
• Report Creation Tool
• Procedural Language Compiler
DBMS Engine
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language RunTime
[Diagram labels: developer, users, application programs]
Types of Database Systems
Centralized (single site)
  microcomputer (desktop)
  legacy mainframe/minicomputer (1 CPU)
  client/server architecture (>1 CPU)
Distributed
  >1 site, requires network
  not widely adopted yet due to many problems
Type                              Example                                # of concurrent users  Typical size of database
Personal                          Joe's House Painting Service           1                      < 10 Megabytes
Workgroup                         Video rental store                     < 25                   < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government      hundreds               > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)  possibly hundreds      Any
Our focus: centralized, microcomputer database
Three levels of Database Representation
External level
  each user group will have its own view of the database; database is accessed from here
Conceptual level
  database design, logical, abstract description of data elements & their relationships
Internal level
  physical implementation - access methods, index construction, data structure; database exists in reality only here
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
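One way to picture the levels is a single logical table with a separate view for each user group. The sketch below assumes Python's sqlite3 module; the table name Orders and the view names BillingView and ContactView are hypothetical, and the internal level (files, access methods) is left entirely to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: one logical table (column names follow the earlier order table).
conn.execute(
    "CREATE TABLE Orders (OrderNum INTEGER PRIMARY KEY, InvoiceAmt REAL, "
    "CustomerName TEXT, OwnerName TEXT, Phone TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?, ?)",
    [
        (201, 854.00, "Cottage Grill", "Ms. Doris Reaume", "(616) 643-8821"),
        (209, 1106.00, "Cleo's Downtown Restaurant", "Ms. Joan Hoffman", "(616) 888-2046"),
    ],
)
# External level: each user group gets its own view (view names are hypothetical).
conn.execute("CREATE VIEW BillingView AS SELECT OrderNum, InvoiceAmt FROM Orders")
conn.execute(
    "CREATE VIEW ContactView AS SELECT CustomerName, OwnerName, Phone FROM Orders"
)

billing_rows = conn.execute("SELECT * FROM BillingView ORDER BY OrderNum").fetchall()
contact_columns = [d[0] for d in conn.execute("SELECT * FROM ContactView").description]
conn.close()
print(billing_rows)
print(contact_columns)
```

Neither view stores data of its own; both are windows onto the same conceptual table.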
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
  computer, peripherals, network devices
Software
  DBMS (manages the database)
  operating system software (manages hardware & software)
  application programs (users access and manipulate the database)
People
  system administrators (manage general operations)
  database designers (architects of database structure)
  database administrators (ensure the database is functioning)
  systems analysts & programmers (design & implement database)
  end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning… (in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
[Example patient record]
  Name: Jane Doe
  Address: 123 Easy St.
  City: London
  Phone: 455-0897
[Example appointment record]
  Date: Sept 14, 1955
  Time: 2:00 p.m.
  Patient: Jane Doe, 455-0897
  OHIP: 123456789
Limitations of Data File Systems
[Diagram: customer processing application with its customer file; order processing application with its order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program and therefore often omitted
Difficulty of representing data from the users’ perspective
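Program/data dependence can be illustrated with a small contrast. This is a rough sketch under stated assumptions: the fixed-width layout (name in the first 10 characters, then the phone number) and the Patient table are hypothetical, and Python's sqlite3 module stands in for a DBMS.

```python
import sqlite3


# File-system style: the program hard-codes the record layout
# (hypothetical layout: name in characters 0-9, phone in characters 10-17).
def read_phone_fixed_width(record):
    return record[10:18].strip()


old_record = "Jane Doe".ljust(10) + "455-0897"
# The file structure changes: the name field is widened to 15 characters.
new_record = "Jane Doe".ljust(15) + "455-0897"


# Database style: the program asks for a column by name.
def read_phone_by_name(conn, name):
    row = conn.execute(
        "SELECT Phone FROM Patient WHERE Name = ?", (name,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patient (Name TEXT, Phone TEXT)")
conn.execute("INSERT INTO Patient VALUES ('Jane Doe', '455-0897')")
before = read_phone_by_name(conn, "Jane Doe")
# The table structure changes: a new column is added.
conn.execute("ALTER TABLE Patient ADD COLUMN City TEXT")
after = read_phone_by_name(conn, "Jane Doe")
conn.close()

print(read_phone_fixed_width(old_record))  # matches the old layout
print(read_phone_fixed_width(new_record))  # slices the wrong characters
print(before == after)
```

The fixed-width reader has to be rewritten when the layout changes, while the query by column name keeps working after the table gains a column.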
Historical Roots of Database Systems
[Diagram: customer, order, and employee processing applications all access one database through the DBMS]
DBMSs were developed to overcome the limitations of file systems, initially on mainframe computers in the late 1960s and early 1970s - a typical early DBMS cost $100,000 (many are still in use)
The first general databases were created for the General Electric Company (GE) - Integrated Data Store (IDS), designed to run on GE machines; B.F. Goodrich ported IDS to the IBM 360 - became dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
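The self-describing property can be seen by asking a database about its own structure. The sketch below assumes Python's sqlite3 module; SQLite keeps its catalogue in the built-in sqlite_master table, and other DBMSs expose their catalogues differently. The Salesperson table is created only so that there is something to describe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A small hypothetical table so the catalogue has something to describe.
conn.execute("CREATE TABLE Salesperson (EmployeeID INTEGER, Name TEXT, Office TEXT)")

# Names of all tables, read from SQLite's own catalogue.
table_names = [
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Salesperson)")]
conn.close()

print(table_names)
print(columns)
```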
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching data in a source table is time-consuming without a “pointer” system; an index improves the performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
(d) Application Metadata - stores structure and format of application components; not all DBMSs support this feature
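The Office Index above can be mimicked with a dictionary. This is a simplified stand-in, assuming plain Python rather than any DBMS's real index structure; the function names and the extra employee (ID 51) are hypothetical.

```python
from collections import defaultdict

# Salesperson rows from the slide: (Employee ID, Name, Office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each Office value to the Employee IDs of the rows that hold it."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)


def add_salesperson(rows, index, employee_id, name, office):
    """Insert a row; the index must be updated too (the overhead cost)."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)


office_index = build_office_index(salespeople)
print(office_index["Tokyo"])  # found by office without scanning every row

# Hypothetical new employee, added to show the extra index maintenance.
add_salesperson(salespeople, office_index, 51, "Hypothetical Hire", "Toronto")
print(office_index["Toronto"])
```

Every insert now does two writes instead of one, which is why indexes are best reserved for columns that are searched or sorted often.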
Evolution of Database Models
Hierarchical, Network
  still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational
  the vast majority currently use this; therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented
  Very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
[Diagram: example tables - Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
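A relationship without physical pointers can be sketched with two of the tables named above. The example assumes Python's sqlite3 module; the table names Clients and Engagements come from the slide, but every column and every row is a hypothetical choice made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# Table names come from the slide; every column is a hypothetical choice.
conn.execute("CREATE TABLE Clients (ClientID INTEGER PRIMARY KEY, ClientName TEXT)")
conn.execute(
    "CREATE TABLE Engagements ("
    "EngagementID INTEGER PRIMARY KEY, "
    "ClientID INTEGER REFERENCES Clients(ClientID), "
    "EventDate TEXT)"
)
conn.executemany(
    "INSERT INTO Clients VALUES (?, ?)", [(1, "Client A"), (2, "Client B")]
)
conn.executemany(
    "INSERT INTO Engagements VALUES (?, ?, ?)",
    [(10, 1, "1998-03-01"), (11, 1, "1998-04-15"), (12, 2, "1998-05-20")],
)
# The one-to-many link is just a matching value, resolved at query time.
rows = conn.execute(
    "SELECT c.ClientName, COUNT(e.EngagementID) "
    "FROM Clients AS c JOIN Engagements AS e ON e.ClientID = c.ClientID "
    "GROUP BY c.ClientID ORDER BY c.ClientName"
).fetchall()
conn.close()
print(rows)
```

The two tables are connected only by the shared ClientID values, so either table can be reorganized physically without breaking the relationship.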
Evaluation of the Relational database model
Advantages
But #1 problem still is
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst
organizations
Disadvantages
ease of use - many untrained people create and use databases
without considering their design, and such databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
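The schema above might be declared roughly as follows. This is a sketch assuming Python's sqlite3 module; the choice of primary keys, the column types, the sample rows, and the helper try_insert are assumptions, since the slide lists only table and column names. The rule that each publisher publishes at least one book is not enforced here.

```python
import sqlite3

SCHEMA = """
CREATE TABLE PUBLISHER (
    PubID    INTEGER PRIMARY KEY,
    PubName  TEXT,
    PubPhone TEXT
);
CREATE TABLE AUTHOR (
    AuthID    INTEGER PRIMARY KEY,
    AuthName  TEXT,
    AuthPhone TEXT NOT NULL              -- constraint: phone cannot be left blank
);
CREATE TABLE BOOK (
    ISBN   TEXT PRIMARY KEY,
    Title  TEXT,
    AuthID INTEGER REFERENCES AUTHOR(AuthID),
    PubID  INTEGER NOT NULL REFERENCES PUBLISHER(PubID),  -- exactly one publisher
    Price  REAL CHECK (Price >= 0)       -- constraint: price cannot be negative
);
"""


def try_insert(conn, sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(SCHEMA)
# Sample rows are invented for the sketch.
conn.execute("INSERT INTO PUBLISHER VALUES (1, 'Example Press', '555-0100')")
conn.execute("INSERT INTO AUTHOR VALUES (1, 'A. Writer', '555-0101')")

book_sql = "INSERT INTO BOOK VALUES (?, ?, ?, ?, ?)"
ok_book = try_insert(conn, book_sql, ("0-00-000000-0", "Sample Title", 1, 1, 19.95))
negative_price = try_insert(conn, book_sql, ("0-00-000000-1", "Bad Price", 1, 1, -5))
no_publisher = try_insert(conn, book_sql, ("0-00-000000-2", "Orphan", 1, 99, 10))
blank_phone = try_insert(
    conn, "INSERT INTO AUTHOR VALUES (?, ?, ?)", (2, "B. Writer", None)
)
conn.close()
print(ok_book, negative_price, no_publisher, blank_phone)
```

Only the first book is accepted; the other three inserts are rejected by the price, publisher, and phone rules respectively.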
Slide 24
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically
Name:
Address:
City:
Phone:
Date:
Time:
Patient:
OHIP:
Jane Doe
123 Easy St.
London
455-0897
Sept 14, 1955
2:00 p.m.
Jane Doe, 455-0897
123456789
organized by function (use)
The first data management
systems performed clerical
tasks (transactional processing)
such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file
folder analogy); each record for
a single patient; another file for
appointment/ billing
information
Limitations of Data File Systems
Customer
processing
Application
Customer
file
Order
processing
Application
Order
file
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grow in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across organization files are often incompatible with one another (separate,
isolated data)
Data redundancy - duplicate information in two or more
files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - timeconsuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
Historical Roots of Database Systems
Customer
processing
Application
Order
processing
Application
DBMS
Database
Employee
processing
Application
Developed to overcome limitations of file systems, developed initially on
mainframe computers in late 60s and early 70s - a typical early DBMS
cost $100,000 (many are still in use)
First general databases were created for General Electric Company
(GEC) - Integrated Data Store (IDS), designed to run on GEC machines;
B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
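As a rough illustration of self-description, the sketch below asks a database for its own table and column names. It assumes Python's built-in sqlite3 module rather than the DBMS used in this course, and the SALESPERSON table is created only for the example.

```python
import sqlite3

# In-memory database with one illustrative table (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALESPERSON (EmployeeID INTEGER, Name TEXT, Office TEXT)")

# sqlite_master is SQLite's system catalogue: one row per table, index, or view.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(SALESPERSON)")]

print(table_names)
print(columns)
```

Other systems expose the same idea through their own catalogue tables; only the query differs.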
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching a source table is time-consuming without a “pointer” system; an index provides one, which improves performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
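The Office Index above can be sketched in a few lines of Python. This is only a model of the idea, using a plain dictionary as the "pointer" structure; a real DBMS maintains its indexes internally, and the function names here are made up for the example.

```python
# Source table: the Salesperson rows from the slide.
salespeople = [
    {"employee_id": 27, "name": "Rodney Jones", "office": "Toronto"},
    {"employee_id": 44, "name": "Goro Azuma", "office": "Tokyo"},
    {"employee_id": 35, "name": "Francine Moire", "office": "Brussels"},
    {"employee_id": 37, "name": "Anne Abel", "office": "Tokyo"},
]


def build_office_index(rows):
    """Map each office to the list of employee IDs located there."""
    index = {}
    for row in rows:
        index.setdefault(row["office"], []).append(row["employee_id"])
    return index


def add_salesperson(rows, index, row):
    """Insert a row; the index must be updated too (the overhead cost)."""
    rows.append(row)
    index.setdefault(row["office"], []).append(row["employee_id"])


office_index = build_office_index(salespeople)

# Lookup by office no longer needs a scan of the whole table.
print(office_index["Tokyo"])
```

The second function shows where the overhead comes from: every change to the table is paired with a change to each index that covers it.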
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical, Network - still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational - the vast majority currently use this, therefore, our course’s focus is here
Semantic, Object-Relational, Object-Oriented - very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
[Diagram: example tables - Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages (but the #1 problem still is)
ease of use - many untrained people create and use databases without considering their design, so the databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
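The schema above could be declared roughly as follows. This is a sketch only, assuming Python's built-in sqlite3 module instead of the DBMS used in the laboratories; the column types, the sample rows, and the try_insert helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces relationships only when asked

conn.executescript("""
CREATE TABLE PUBLISHER (PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT);
CREATE TABLE AUTHOR (
    AuthID INTEGER PRIMARY KEY,
    AuthName TEXT,
    AuthPhone TEXT NOT NULL               -- author phone cannot be left blank
);
CREATE TABLE BOOK (
    ISBN TEXT PRIMARY KEY,
    Title TEXT,
    AuthID INTEGER REFERENCES AUTHOR(AuthID),
    PubID INTEGER NOT NULL REFERENCES PUBLISHER(PubID),  -- one and only one publisher
    Price REAL CHECK (Price >= 0)         -- price cannot be less than zero
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO PUBLISHER VALUES (1, 'Example Press', '555-0100')")
conn.execute("INSERT INTO AUTHOR VALUES (1, 'A. Writer', '555-0101')")
conn.execute("INSERT INTO BOOK VALUES ('0-00-000000-0', 'Sample Title', 1, 1, 19.95)")


def try_insert(sql):
    """Run an INSERT and report whether the schema accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


negative_price = try_insert("INSERT INTO BOOK VALUES ('0-00-000000-1', 'Bad Price', 1, 1, -5)")
blank_phone = try_insert("INSERT INTO AUTHOR VALUES (2, 'No Phone', NULL)")
unknown_publisher = try_insert("INSERT INTO BOOK VALUES ('0-00-000000-2', 'Orphan', 1, 99, 10)")
print(negative_price, blank_phone, unknown_publisher)
```

Each rejected row breaks one rule from the slide: a constraint on Price, a constraint on the author phone, and the book-to-publisher relationship.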
Introduction to Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
                        Word processing  Spreadsheet  Database
Text handling           excellent        fair         poor
Mathematical functions  poor             excellent    very good
Ease of Use             excellent        good         fair
Training Cost           low              moderate     high
Software Cost           low              moderate     high
Volume of data          low              moderate     very high
Multiuser Access        low              moderate     very high
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As the scale of the organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources increases)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through organizational commitment
Adapted from McFadden, F.R. & Hoffer, J.A. (1994). Modern Database Management. Redwood City, CA: Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action - Survey customers; invest in advertising; cut costs, expand product line
Knowledge - Sales have dropped between July and August
Information (organized data) - Average for July is 40; average for Aug is 15
Data (isolated facts) - John bought 50 in July; John bought 10 in Aug; Jane bought 30 in July; Jane bought 20 in Aug
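The step from data to information in this example is just an aggregation. A minimal sketch in Python, using the four purchase facts from the slide (the function and variable names are made up for the illustration):

```python
from collections import defaultdict

# Data: isolated facts (customer, month, quantity bought).
purchases = [
    ("John", "July", 50),
    ("John", "Aug", 10),
    ("Jane", "July", 30),
    ("Jane", "Aug", 20),
]


def average_by_month(rows):
    """Organize the facts by month and average the quantities."""
    quantities = defaultdict(list)
    for _customer, month, quantity in rows:
        quantities[month].append(quantity)
    return {month: sum(values) / len(values) for month, values in quantities.items()}


# Information: organized data.
averages = average_by_month(purchases)

# Knowledge: a conclusion drawn from the information.
sales_dropped = averages["Aug"] < averages["July"]

print(averages, sales_dropped)
```

Deciding what to do about the drop (the Action level) is left to people; the program only gets as far as the comparison.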
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
[Chart: Outstanding Invoice Amounts By Order (orders 201 to 239); labels: data, information]
What is a Database Management System (DBMS)?
“A set of programs used to define, administer, and process the database and its applications conveniently and efficiently”
Program (or collection of programs) that enables users to create the database. The DBMS manages the storage and retrieval of data, and provides the user with certain functionalities to guarantee that the data will be logically organized and consistently applied.
[Diagram: Database - DBMS (e.g., Oracle, dBase, Access, Paradox) - Database Application - user]
What is a Database Application?
[Diagram: Database - DBMS - Database application]
A computer program that performs a specific task of practical value in a business situation
An interface that allows the user to enter and manipulate data; user can request abstract views of data
Created by database designers and developers using a DBMS program or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
Database
• user data
• metadata
• indexes
• application metadata
DBMS Engine
Design Tools Subsystem
• Table Creation Tool
• Form Creation Tool
• Query Creation Tool
• Report Creation Tool
• Procedural Language Compiler
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language RunTime
[Diagram labels: developer, application program, users]
Types of Database Systems
Centralized (single site)
microcomputer (desktop)
legacy mainframe/mini computer (1 CPU)
client/server architecture (>1 CPU)
Distributed
>1 site, requires network
not widely adopted yet due to many problems
Type                              Example                                # of concurrent users  Typical size of database
Personal                          Joe's House Painting Service           1                      < 10 Megabytes
Workgroup                         Video rental store                     < 25                   < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government      hundreds               > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)  possibly hundreds      Any
Our focus: centralized, microcomputer database
Three levels of Database Representation
Conceptual level - database design; logical, abstract description of data elements & their relationships
Internal level - physical implementation - access methods, index construction, data structure; database exists in reality only here
External level - each user group will have its own view of the database; database is accessed from here
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
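One way to picture the external level is as a set of views defined over the same stored table, one per user group. The sketch below assumes Python's built-in sqlite3 module and two hypothetical user groups; the view names BILLING_VIEW and CONTACT_VIEW are invented for the example, and the rows come from the order table shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: one table describing orders.
CREATE TABLE ORDERS (
    OrderNum INTEGER PRIMARY KEY,
    InvoiceAmt REAL,
    CustomerName TEXT,
    Phone TEXT
);
INSERT INTO ORDERS VALUES (201, 854.00, 'Cottage Grill', '(616) 643-8821');
INSERT INTO ORDERS VALUES (209, 1106.00, 'Cleo''s Downtown Restaurant', '(616) 888-2046');

-- External level: each user group sees only the columns it needs.
CREATE VIEW BILLING_VIEW AS SELECT OrderNum, InvoiceAmt FROM ORDERS;
CREATE VIEW CONTACT_VIEW AS SELECT CustomerName, Phone FROM ORDERS;
""")

billing_rows = conn.execute("SELECT * FROM BILLING_VIEW ORDER BY OrderNum").fetchall()
contact_rows = conn.execute("SELECT * FROM CONTACT_VIEW ORDER BY CustomerName").fetchall()

print(billing_rows)
print(contact_rows)
```

How the rows are laid out on disk (the internal level) never appears in the code; the DBMS handles it.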
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (users access and manipulate the database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
[Diagram: patient file record - Name: Jane Doe; Address: 123 Easy St.; City: London; Phone: 455-0897. Appointment file record - Date: Sept 14, 1955; Time: 2:00 p.m.; Patient: Jane Doe, 455-0897; OHIP: 123456789]
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Slide 27
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
                        Word processing   Spreadsheet   Database
Text handling           excellent         fair          poor
Mathematical functions  poor              excellent     very good
Ease of Use             excellent         good          fair
Training Cost           low               moderate      high
Software Cost           low               moderate      high
Volume of data          low               moderate      very high
Multiuser Access        low               moderate      very high
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
organizational commitment
Adapted from McFadden, F.R. & Hoffer, J.A. (1994). Modern Database
Management. Redwood City, CA: Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action: Survey customers; invest in advertising; cut costs, expand product line
Knowledge: Sales have dropped between July and August
Information (organized data): Average/ July is 40; Average/ Aug is 15
Data (isolated facts): John bought 50 in July; John bought 10 in Aug; Jane bought 30 in July; Jane bought 20 in Aug
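The step from data to information can be sketched in a few lines of Python. This is only an illustration: the purchase figures are the ones on the slide, while the function name average_by_month and the tuple layout are assumptions made for the example.

```python
from statistics import mean

# Isolated facts (data) from the slide: (customer, month, units bought).
purchases = [
    ("John", "July", 50),
    ("John", "Aug", 10),
    ("Jane", "July", 30),
    ("Jane", "Aug", 20),
]

def average_by_month(rows):
    """Organize raw purchase facts into one average per month (information)."""
    by_month = {}
    for _customer, month, units in rows:
        by_month.setdefault(month, []).append(units)
    return {month: mean(units) for month, units in by_month.items()}

averages = average_by_month(purchases)

# Knowledge is an interpretation of the information: did sales drop?
sales_dropped = averages["Aug"] < averages["July"]
print(averages, sales_dropped)
```

Deciding what to do about the drop (the Action level) stays with people; the program only organizes the facts.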
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
[Chart: Outstanding Invoice Amounts By Order (orders 201, 209, 214, 221, 235, 239); labels: data, information]
What is a Database Management
System (DBMS)?
“A set of programs used to define, administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
[Diagram: Database - DBMS (e.g., Oracle, dBase, Access, Paradox) - Database Application - user]
What is a Database Application?
Database
DBMS
Database application
A computer program that performs a specific task of practical value in a business situation
An interface that allows the user to enter and manipulate data; the user can request abstract views of data
Created by database designers and developers using a DBMS program or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
Database
• user data
• metadata
• indexes
• application metadata
DBMS
Design Tools Subsystem
• Table Creation Tool
• Form Creation Tool
• Query Creation Tool
• Report Creation Tool
• Procedural Language Compiler
DBMS Engine
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language RunTime
[Diagram labels: developer; users; application programs]
Types of Database Systems
Centralized (single site)
• microcomputer (desktop)
• legacy mainframe/ mini computer (1 CPU)
• client/server architecture (>1 CPU)
Distributed
• >1 site, requires network
• not widely adopted yet due to many problems
Type                              Example                                # of concurrent users  Typical size of database
Personal                          Joe's House Painting Service           1                      < 10 Megabytes
Workgroup                         Video rental store                     < 25                   < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government      hundreds               > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)  possibly hundreds      Any
Our focus: centralized, microcomputer database
Three levels of Database Representation
Conceptual level: database design, logical, abstract description of data elements & their relationships
Internal level: physical implementation - access methods, index construction, data structure; database exists in reality only here
External level: each user group will have its own view of the database; database is accessed from here
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
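As a rough illustration of the three levels, the sketch below uses Python's built-in sqlite3 module rather than the MS Access environment used in the laboratories. The table, index, and view names are assumptions chosen for the example; the sample rows come from the Salesperson table used elsewhere in this deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the logical design (data elements and their relationships).
    CREATE TABLE salesperson (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        office      TEXT NOT NULL
    );
    -- Internal level: a physical access structure; the DBMS maintains it.
    CREATE INDEX idx_salesperson_office ON salesperson (office);
    -- External level: the view that one user group works with.
    CREATE VIEW tokyo_staff AS
        SELECT name FROM salesperson WHERE office = 'Tokyo';
""")
conn.executemany(
    "INSERT INTO salesperson VALUES (?, ?, ?)",
    [
        (27, "Rodney Jones", "Toronto"),
        (44, "Goro Azuma", "Tokyo"),
        (35, "Francine Moire", "Brussels"),
        (37, "Anne Abel", "Tokyo"),
    ],
)

# A user of the external view never mentions the index or the storage layout.
tokyo_names = sorted(row[0] for row in conn.execute("SELECT name FROM tokyo_staff"))
print(tokyo_names)
```

Dropping or adding the index changes only the internal level; the view and the queries against it stay the same.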
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (users access and manipulate the database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing)
such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient;
another file for appointment/ billing information
Patient file record
Name: Jane Doe
Address: 123 Easy St.
City: London
Phone: 455-0897
Appointment/ billing file record
Date: Sept 14, 1955
Time: 2:00 p.m.
Patient: Jane Doe, 455-0897
OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application - Customer file; Order processing Application - Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and
reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is to be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across the organization - files are
often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
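Program/ data dependence can be seen in a small sketch. The fixed-width record layouts and both function names below are hypothetical, invented only to show why a change in file structure breaks a program that hard-codes field positions.

```python
# Hypothetical fixed-width records: name (10 chars) followed by phone (8 chars).
OLD_RECORD = "Jane Doe".ljust(10) + "455-0897"
# The file structure changes: a 10-character city field is inserted before the phone.
NEW_RECORD = "Jane Doe".ljust(10) + "London".ljust(10) + "455-0897"

def phone_by_position(record):
    """File-system style: the program itself knows where the phone field sits."""
    return record[10:18]

# Layouts described outside the program logic, keyed by field name.
OLD_LAYOUT = {"name": (0, 10), "phone": (10, 18)}
NEW_LAYOUT = {"name": (0, 10), "city": (10, 20), "phone": (20, 28)}

def phone_by_name(record, layout):
    """Closer to the DBMS idea: look the field up by name in a stored description."""
    start, end = layout["phone"]
    return record[start:end]

print(repr(phone_by_position(OLD_RECORD)))          # correct for the old file
print(repr(phone_by_position(NEW_RECORD)))          # silently returns city text
print(repr(phone_by_name(NEW_RECORD, NEW_LAYOUT)))  # still correct
```

With positional access, every program reading the file must be edited after the change; with a stored description, only the description changes.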
Historical Roots of Database Systems
[Diagram: Customer, Order, and Employee processing Applications - DBMS - Database]
Developed to overcome limitations of file systems, developed initially on
mainframe computers in late 60s and early 70s - a typical early DBMS
cost $100,000 (many are still in use)
First general databases were created for General Electric Company
(GEC) - Integrated Data Store (IDS), designed to run on GEC machines;
B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the
tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
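Self-description can be observed directly in most relational systems. The sketch below uses SQLite through Python's sqlite3 module as a stand-in (an assumption; the course itself uses MS Access), and the salesperson table is only an example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salesperson (employee_id INTEGER, name TEXT, office TEXT)")

# sqlite_master is SQLite's catalogue table: the database describing itself.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(salesperson)")]

print(tables)
print(columns)
```

No application code stored these names and types; they are read back from the metadata the DBMS keeps alongside the user data.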
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Accessing data from a source table for sorting and searching is
time-consuming without a “pointer” system; an index improves
performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated,
all indexes must also be updated; therefore, reserve indexes for
cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
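For item (c), the Office Index can be imitated with a dictionary. This is a simplified sketch: a real DBMS uses on-disk structures rather than a Python dict, and the function names are assumptions made for the example.

```python
# The Salesperson table from the slide: (employee_id, name, office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]

def build_office_index(rows):
    """Map each office to the employee IDs found there, like the Office Index."""
    index = {}
    for employee_id, _name, office in rows:
        index.setdefault(office, []).append(employee_id)
    return index

def add_salesperson(rows, index, employee_id, name, office):
    """Every insert pays the overhead cost: the table and the index both change."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)

office_index = build_office_index(salespeople)
print(office_index["Tokyo"])  # found without scanning every row
```

The lookup avoids a full scan of the table, while add_salesperson shows why each extra index makes updates more expensive.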
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems;
very few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this, therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented: very few new databases are being created using Object-Oriented
Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
Example tables: Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one,
one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But #1 problem still is ease of use - many untrained people create and use databases
without considering their design - usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
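The schema above might be declared as follows. This is a sketch in SQLite via Python's sqlite3 module, not the MS Access definition used in the laboratories. Several details are assumptions that the slide does not state: the choice of primary keys, the column types, applying the integer domain 0 < x < 99999 to PubID, and the sample rows and the helper try_insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE PUBLISHER (
        PubID    INTEGER PRIMARY KEY CHECK (PubID > 0 AND PubID < 99999),  -- domain (assumed column)
        PubName  TEXT,
        PubPhone TEXT
    );
    CREATE TABLE AUTHOR (
        AuthID    INTEGER PRIMARY KEY,
        AuthName  TEXT,
        AuthPhone TEXT NOT NULL               -- constraint: phone must be supplied
    );
    CREATE TABLE BOOK (
        ISBN   TEXT PRIMARY KEY,
        Title  TEXT,
        AuthID INTEGER REFERENCES AUTHOR (AuthID),
        PubID  INTEGER NOT NULL REFERENCES PUBLISHER (PubID),  -- one and only one publisher
        Price  REAL CHECK (Price >= 0)        -- constraint: price cannot be negative
    );
""")

def try_insert(sql, params):
    """Return True if the row is accepted, False if a schema rule rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# Hypothetical sample rows, for illustration only.
conn.execute("INSERT INTO PUBLISHER VALUES (1, 'Example Press', '555-0100')")
conn.execute("INSERT INTO AUTHOR VALUES (1, 'A. Author', '555-0101')")
book_sql = "INSERT INTO BOOK VALUES (?, ?, ?, ?, ?)"
valid_book = try_insert(book_sql, ("000-0", "Sample Title", 1, 1, 19.95))
negative_price = try_insert(book_sql, ("000-1", "Bad Price", 1, 1, -5.0))
unknown_publisher = try_insert(book_sql, ("000-2", "Orphan", 1, 42, 10.0))
missing_phone = try_insert("INSERT INTO AUTHOR VALUES (?, ?, ?)", (2, "No Phone", None))

# Tables are related by matching PubID values, not by physical pointers.
books = conn.execute(
    "SELECT Title, PubName FROM BOOK JOIN PUBLISHER USING (PubID)").fetchall()
print(valid_book, negative_price, unknown_publisher, missing_phone, books)
```

Note that NOT NULL rejects a missing phone but still accepts an empty string; a stricter "cannot be left blank" rule would need an additional CHECK.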
Slide 29
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
• computer, peripherals, network devices
Software
• DBMS (manages the database)
• operating system software (manages hardware & software)
• application programs (users access and manipulate the database)
People
• system administrators (manage general operations)
• database designers (architects of database structure)
• database administrators (ensure the database is functioning)
• systems analysts & programmers (design & implement database)
• end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Patient file record
Name: Jane Doe
Address: 123 Easy St.
City: London
Phone: 455-0897
Appointment/billing file record
Date: Sept 14, 1955
Time: 2:00 p.m.
Patient: Jane Doe, 455-0897
OHIP: 123456789
Limitations of Data File Systems
[Figure: Customer processing Application with its Customer file; Order processing Application with its Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
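Program/data dependence from the list above can be illustrated with a small sketch. The fixed-width layout here (a 20-character name followed by an 8-character phone number) is an assumption made up for the example, not a format taken from the slides.

```python
# Hypothetical fixed-width patient file layout, hard-coded in the program:
# name occupies columns 0-19, phone occupies columns 20-27.
NAME_WIDTH = 20
PHONE_WIDTH = 8

def parse_patient(line: str) -> dict:
    """Slice one record using the layout baked into this program."""
    name = line[:NAME_WIDTH].rstrip()
    phone = line[NAME_WIDTH:NAME_WIDTH + PHONE_WIDTH].rstrip()
    return {"name": name, "phone": phone}

# A record written with the layout the program expects.
old_record = "Jane Doe".ljust(20) + "455-0897"

# The file is later restructured so the name field is 30 characters wide.
# The program was not changed, so it now reads the wrong columns.
new_record = "Jane Doe".ljust(30) + "455-0897"

old_parsed = parse_patient(old_record)
new_parsed = parse_patient(new_record)
print(old_parsed)
print(new_parsed)
```

Every program that reads the file carries its own copy of the layout, so each one has to be found and modified when the structure changes.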
Historical Roots of Database Systems
[Figure: Customer processing, Order processing, and Employee processing Applications all access one Database through the DBMS]
Developed to overcome the limitations of file systems, initially on mainframe computers in the late 60s and early 70s - a typical early DBMS cost $100,000 (many are still in use)
First general databases were created for General Electric Company (GEC) - Integrated Data Store (IDS), designed to run on GEC machines; B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
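The self-describing property in (b) can be seen by asking a database about its own structure. This is a minimal sketch using SQLite through Python's sqlite3 module, chosen only for illustration; other DBMS products expose their catalogues differently. The orders table and its column types are assumptions based on the sample table shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "OrderNum INTEGER PRIMARY KEY, InvoiceAmt REAL, CustomerName TEXT)"
)

# SQLite's catalogue table, sqlite_master, lists every table in the database.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(orders)")]

print(table_names)
print(columns)
```

No external documentation was consulted here: the table names and the column names and types all came from metadata stored inside the database itself.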
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Data accessed from a source table for sorting and searching is time-consuming without a “pointer” system, which improves performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
(d) Application Metadata - stores structure and format of application components; not all DBMS support this feature
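The Office Index in (c) can be sketched as a lookup from office to employee IDs. A real DBMS would typically use a more elaborate on-disk structure; the plain dictionary and the added "New Hire" row below are assumptions for illustration only.

```python
from collections import defaultdict

# Rows of the Salesperson table: (Employee ID, Name, Office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]

def build_office_index(rows):
    """Map each office to the list of employee IDs located there."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)

office_index = build_office_index(salespeople)

# Lookup by office no longer requires scanning every row of the table.
tokyo_ids = list(office_index["Tokyo"])

# The overhead cost: every update to the table must also update the index.
salespeople.append((51, "New Hire", "Toronto"))  # hypothetical new row
office_index.setdefault("Toronto", []).append(51)

print(tokyo_ids, office_index)
```

The last two statements show why indexes are reserved for cases where they are needed: each additional index adds one more structure to maintain on every insert, update, or delete.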
Evolution of Database Models
Hierarchical
Network
• still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational
• the vast majority currently use this, therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented
• Very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
[Figure: example tables - Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
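The point that tables are linked by matching values rather than physical pointers can be sketched as follows. The table names come from the slide; the column names and sample rows are invented for the example, and SQLite via Python's sqlite3 module stands in for any relational DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Entertainers (
    EntertainerID INTEGER PRIMARY KEY,
    StageName     TEXT
);
CREATE TABLE Engagements (
    EngagementID  INTEGER PRIMARY KEY,
    EntertainerID INTEGER REFERENCES Entertainers(EntertainerID),
    Venue         TEXT
);
-- Hypothetical sample rows: one entertainer has many engagements.
INSERT INTO Entertainers VALUES (1, 'Hypothetical Trio'), (2, 'Sample Soloist');
INSERT INTO Engagements VALUES
    (10, 1, 'Town Hall'), (11, 1, 'City Park'), (12, 2, 'Library');
""")

# The two tables are related only because EntertainerID values match;
# no stored pointer connects a row in one table to a row in the other.
bookings = conn.execute("""
    SELECT e.StageName, g.Venue
    FROM Engagements AS g
    JOIN Entertainers AS e ON e.EntertainerID = g.EntertainerID
    ORDER BY g.EngagementID
""").fetchall()
print(bookings)
```

This shows a one-to-many relationship; a many-to-many relationship would use a third table holding pairs of matching key values.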
Evaluation of the Relational database model
Advantages
• mechanisms for minimizing data redundancy and inconsistency
• logical database design is separated from physical aspects
• relatively program-data independent
• management of data for access, manipulation, and security
• flexible mechanisms for generating reports and queries
• program development and maintenance costs are reduced
• data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But the #1 problem still is:
• ease of use - many untrained people create and use databases without considering their design - such databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
A database schema defines a database’s structure, tables, relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
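The schema above can be written down as table definitions that the DBMS then enforces. This is a rough sketch in SQLite through Python's sqlite3 module, not the MS Access form used in the laboratories; the column types, the sample rows, and the try_insert helper are assumptions. The integer domain example is not modelled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE PUBLISHER (PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT);
CREATE TABLE AUTHOR (
    AuthID    INTEGER PRIMARY KEY,
    AuthName  TEXT,
    -- Constraint: author phone cannot be left blank (neither NULL nor empty).
    AuthPhone TEXT NOT NULL CHECK (AuthPhone <> '')
);
CREATE TABLE BOOK (
    ISBN   TEXT PRIMARY KEY,
    Title  TEXT,
    AuthID INTEGER REFERENCES AUTHOR(AuthID),
    -- Relationship: each book has one and only one publisher.
    PubID  INTEGER NOT NULL REFERENCES PUBLISHER(PubID),
    -- Constraint: price cannot be less than zero.
    Price  REAL CHECK (Price >= 0)
);
-- Hypothetical sample rows.
INSERT INTO PUBLISHER VALUES (1, 'Example Press', '555-0100');
INSERT INTO AUTHOR VALUES (1, 'A. Writer', '555-0101');
""")

def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

book_sql = "INSERT INTO BOOK VALUES (?, ?, ?, ?, ?)"
ok_book = try_insert(book_sql, ("0-00-000000-0", "Sample Title", 1, 1, 19.95))
negative_price = try_insert(book_sql, ("0-00-000000-1", "Bad Price", 1, 1, -5.0))
blank_phone = try_insert("INSERT INTO AUTHOR VALUES (?, ?, ?)", (2, "No Phone", None))
print(ok_book, negative_price, blank_phone)
```

Because the rules live in the schema, every application that writes to these tables is held to them without repeating the checks in its own code.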
Slide 2
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically
Name:
Address:
City:
Phone:
Date:
Time:
Patient:
OHIP:
Jane Doe
123 Easy St.
London
455-0897
Sept 14, 1955
2:00 p.m.
Jane Doe, 455-0897
123456789
organized by function (use)
The first data management
systems performed clerical
tasks (transactional processing)
such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file
folder analogy); each record for
a single patient; another file for
appointment/ billing
information
Limitations of Data File Systems
Customer
processing
Application
Customer
file
Order
processing
Application
Order
file
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grow in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across organization files are often incompatible with one another (separate,
isolated data)
Data redundancy - duplicate information in two or more
files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - timeconsuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
Historical Roots of Database Systems
Customer
processing
Application
Order
processing
Application
DBMS
Database
Employee
processing
Application
Developed to overcome limitations of file systems, developed initially on
mainframe computers in late 60s and early 70s - a typical early DBMS
cost $100,000 (many are still in use)
First general databases were created for General Electric Company
(GEC) - Integrated Data Store (IDS), designed to run on GEC machines;
B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is
self-describing
(metadata or system
catalogues or data
dictionary)
A database contains
a description of its
own structure (e.g.,
the names of all the
tables, the names
and types of data in
each column in all
the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Data accessed from a source table for sorting and searching is
time-consuming without a “pointer” system, which improves
performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated,
all indexes must also be updated, therefore, reserve index for
cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
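The Office Index above can be sketched as a simple lookup structure. This is only an illustration of the idea, using an in-memory Python dictionary; a real DBMS stores its indexes on disk in more elaborate structures. The function names are hypothetical.

```python
from collections import defaultdict

# The Salesperson table from the slide: (Employee ID, Name, Office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]

def build_office_index(rows):
    """Map each office to the list of employee IDs located there."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return index

def add_salesperson(rows, index, employee_id, name, office):
    """Insert a row and keep the index in step with the table."""
    rows.append((employee_id, name, office))
    # This extra write is the "overhead cost" of maintaining an index.
    index[office].append(employee_id)

office_index = build_office_index(salespeople)
# Finding the Tokyo staff is now a direct lookup instead of a scan of every row.
tokyo_ids = office_index["Tokyo"]
print(tokyo_ids)
```

Every additional index would need a similar extra step on each insert, which is why indexes are best reserved for columns that are actually searched or sorted.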
(d) Application metadata - stores the structure and format of
application components; not every DBMS supports this feature
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy
systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority of databases currently use this model;
therefore, our course’s focus is here
Semantic, Object-Relational, Object-Oriented: very few new databases are
being created using the object-oriented model (not many ODBMSs are
available for businesses to implement it)
The Relational Database Model
Example tables: Agents, Clients, Entertainers, Engagements, Instruments,
Entertainer styles
data is represented by tables (like spreadsheets)
tables are NOT linked with physical pointers; they are related through
shared data values
unlike earlier systems, all three types of relationships (one-to-one,
one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
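A small sketch may help show how relational tables are connected without physical pointers: rows in one table simply carry a value that matches a value in another. The example assumes SQLite via Python's sqlite3 module; the column names (AgentID, AgentName, Venue) and the sample rows are invented for illustration and are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Agents (AgentID INTEGER PRIMARY KEY, AgentName TEXT);
    CREATE TABLE Engagements (EngagementID INTEGER PRIMARY KEY, AgentID INTEGER, Venue TEXT);
    INSERT INTO Agents VALUES (1, 'Agent A'), (2, 'Agent B');
    INSERT INTO Engagements VALUES (10, 1, 'Venue X'), (11, 1, 'Venue Y'), (12, 2, 'Venue Z');
""")

# The two tables are related only by matching AgentID values;
# the DBMS works out how to locate the matching rows.
rows = conn.execute("""
    SELECT a.AgentName, e.Venue
    FROM Agents AS a
    JOIN Engagements AS e ON e.AgentID = a.AgentID
    ORDER BY e.EngagementID
""").fetchall()
print(rows)
```

Here one agent has several engagements, a one-to-many relationship expressed purely through data values.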
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst
organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create
and use databases without considering their design, and usually
incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
A database schema defines the database’s structure: its tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
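The schema above could be declared roughly as follows. This is a sketch under stated assumptions, not the course's own implementation: it uses SQLite through Python's sqlite3 module, and the column types, the choice of primary keys, and the helper names open_database and try_insert are all assumptions added for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE PUBLISHER (
    PubID    INTEGER PRIMARY KEY,
    PubName  TEXT,
    PubPhone TEXT
);
CREATE TABLE AUTHOR (
    AuthID    INTEGER PRIMARY KEY,
    AuthName  TEXT,
    AuthPhone TEXT NOT NULL              -- author phone cannot be left blank
);
CREATE TABLE BOOK (
    ISBN   TEXT PRIMARY KEY,
    Title  TEXT,
    AuthID INTEGER REFERENCES AUTHOR(AuthID),
    PubID  INTEGER NOT NULL REFERENCES PUBLISHER(PubID),  -- exactly one publisher per book
    Price  REAL CHECK (Price >= 0)       -- price cannot be less than zero
);
"""

def open_database():
    """Create an in-memory database with the three tables defined."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
    conn.executescript(SCHEMA)
    return conn

def try_insert(conn, sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

Two limits of the sketch are worth noting: NOT NULL rejects a missing phone value but not an empty string, and the rule that each publisher publishes at least one book is not something these declarations enforce.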
Slide 5
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action: Survey customers; invest in advertising; cut costs, expand product line
Knowledge: Sales have dropped between July and August
Information (organized data): Average for July is 40; Average for Aug is 15
Data (isolated facts): John bought 50 in July; John bought 10 in Aug; Jane bought 30 in July; Jane bought 20 in Aug
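The step from data to information on this slide can be sketched in a few lines of Python. This is only an illustration: the tuple layout and the function name average_by_month are assumptions, not part of the slides.

```python
from collections import defaultdict

# Data (isolated facts): one purchase per customer per month, as on the slide.
purchases = [
    ("John", "July", 50),
    ("John", "Aug", 10),
    ("Jane", "July", 30),
    ("Jane", "Aug", 20),
]


def average_by_month(records):
    """Organize raw purchase facts into per-month averages (information)."""
    quantities = defaultdict(list)
    for _customer, month, quantity in records:
        quantities[month].append(quantity)
    return {month: sum(q) / len(q) for month, q in quantities.items()}


averages = average_by_month(purchases)

# Knowledge: comparing the two averages shows that sales dropped.
sales_dropped = averages["Aug"] < averages["July"]
```

Deciding what to do about the drop (the "Action" level) stays with people; the program only organizes the facts.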
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
The rows of the table are data.
[Chart: Outstanding Invoice Amounts By Order, for orders 201, 209, 214, 221, 235, 239]
The chart built from the table is information.
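As a rough sketch of the same idea in code, the order table can be stored with Python's built-in sqlite3 module and then summarized. SQLite is not one of the products named in these slides; it is used here only because it ships with Python, and the table name orders is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE orders ("
    " OrderNum INTEGER PRIMARY KEY,"
    " InvoiceAmt REAL,"
    " CustomerName TEXT)"
)

# Data: the rows from the slide (owner and phone columns omitted for brevity).
rows = [
    (201, 854.00, "Cottage Grill"),
    (209, 1106.00, "Cleo's Downtown Restaurant"),
    (214, 1070.50, "Jean's Country Restaurant"),
    (221, 1607.00, "Maxwell's Restaurant"),
    (235, 1004.50, "Embers Restaurant"),
    (239, 1426.50, "The Empire"),
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# Information: answers derived from the stored data.
largest = conn.execute(
    "SELECT OrderNum, CustomerName, InvoiceAmt FROM orders "
    "ORDER BY InvoiceAmt DESC LIMIT 1"
).fetchone()
total = conn.execute("SELECT SUM(InvoiceAmt) FROM orders").fetchone()[0]
```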
What is a Database Management
System (DBMS)?
“A set of programs used to define, administer, and process the database and its applications conveniently and efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
[Diagram: user <-> Database Application <-> DBMS (e.g., Oracle, dBase, Access, Paradox) <-> Database]
What is a Database Application?
[Diagram: Database, DBMS, Database application]
A computer program that performs a specific task of practical value in a business situation
An interface that allows the user to enter and manipulate data; the user can request abstract views of data
Created by database designers and developers using a DBMS program or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
Database
• user data
• metadata
• indexes
• application metadata
DBMS
• Design Tools Subsystem: Table Creation Tool, Form Creation Tool, Query Creation Tool, Report Creation Tool, Procedural Language Compiler
• DBMS Engine
• Run Time Subsystem: Form Processor, Query Processor, Report Writer, Procedural Language Run Time
[Diagram labels: developer, users, application programs]
Types of Database Systems
Centralized (single site)
  microcomputer (desktop)
  legacy mainframe/minicomputer (1 CPU)
  client/server architecture (>1 CPU)
Distributed
  >1 site, requires network
  not widely adopted yet due to many problems

Type                              Example                                # of concurrent users  Typical size of database
Personal                          Joe's House Painting Service           1                      < 10 Megabytes
Workgroup                         Video rental store                     < 25                   < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government      hundreds               > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)  possibly hundreds      Any

Our focus: centralized, microcomputer database
Three levels of Database Representation
Conceptual level: database design; logical, abstract description of data elements & their relationships
Internal level: physical implementation - access methods, index construction, data structure; database exists in reality only here
External level: each user group will have its own view of the database; database is accessed from here
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
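One way to picture the three levels is a small sketch with Python's sqlite3 module. The table, index, and view names, and the CreditLimit column, are hypothetical and chosen only for illustration; the course labs use MS Access, not SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical design (data elements and their types).
conn.execute(
    "CREATE TABLE customer ("
    " CustomerID INTEGER PRIMARY KEY,"
    " Name TEXT,"
    " Phone TEXT,"
    " CreditLimit REAL)"
)

# Internal level: a physical access structure. It changes how rows are
# found on disk, not what the data means.
conn.execute("CREATE INDEX idx_customer_name ON customer (Name)")

# External level: one user group's view, which hides the credit limit.
conn.execute("CREATE VIEW sales_contacts AS SELECT Name, Phone FROM customer")

conn.execute(
    "INSERT INTO customer VALUES (1, 'Cottage Grill', '(616) 643-8821', 5000.0)"
)

cursor = conn.execute("SELECT * FROM sales_contacts")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```

Dropping or adding the index would not change what the view returns, which is the sense in which the RDBMS manages the internal level on the user's behalf.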
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (users access and manipulate the database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Patient record - Name: Jane Doe; Address: 123 Easy St.; City: London; Phone: 455-0897
Appointment/billing record - Date: Sept 14, 1955; Time: 2:00 p.m.; Patient: Jane Doe, 455-0897; OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application -> Customer file; Order processing Application -> Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program, therefore often omitted
Difficulty of representing data in the users’ perspective
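Program/data dependence is easier to see with a small example. The sketch below assumes a hypothetical fixed-width patient file (name in characters 0-19, phone in 20-27); the layout and the function name read_phone are made up for illustration.

```python
# Hypothetical fixed-width patient record: name (20 chars) then phone (8 chars).
OLD_RECORD = "Jane Doe".ljust(20) + "455-0897"


def read_phone(record):
    """A program that hard-codes the file layout as character offsets."""
    return record[20:28]


# The file structure changes: a 15-character City field is inserted
# before the phone number.
NEW_RECORD = "Jane Doe".ljust(20) + "London".ljust(15) + "455-0897"

phone_before = read_phone(OLD_RECORD)
phone_after = read_phone(NEW_RECORD)  # silently reads part of the city field
```

Every program that embeds these offsets has to be found and rewritten after the change, which is the maintenance burden the slide describes.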
Historical Roots of Database Systems
[Diagram: Customer processing Application, Order processing Application, and Employee processing Application all access one Database through the DBMS]
DBMSs were developed to overcome the limitations of file systems, initially on mainframe computers in the late 60s and early 70s - a typical early DBMS cost $100,000 (many are still in use)
The first general database system, Integrated Data Store (IDS), was created for General Electric Company (GEC) and designed to run on GEC machines; B.F. Goodrich ported IDS to the IBM 360, and it remained dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) An organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
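A self-describing database can be asked about its own structure. The sketch below uses SQLite through Python's sqlite3 module as a stand-in DBMS (an assumption; the slides do not name it): sqlite_master is SQLite's system catalogue and PRAGMA table_info lists a table's columns. The salesperson table is an illustrative example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salesperson ("
    " EmployeeID INTEGER PRIMARY KEY,"
    " Name TEXT,"
    " Office TEXT)"
)

# Metadata: the names of all tables, read from the system catalogue.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Metadata: the name and declared type of each column in one table.
# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(salesperson)")
]
```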
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching data directly in a source table is time-consuming; an index is a “pointer” system that improves the performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
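The Office Index from part (c) can be mimicked with a plain Python dictionary. This is a simplified sketch, not how a DBMS stores indexes internally; the function names are assumptions.

```python
# The Salesperson table from the slide: (Employee ID, Name, Office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the IDs of the employees who work there."""
    index = {}
    for employee_id, _name, office in rows:
        index.setdefault(office, []).append(employee_id)
    return index


office_index = build_office_index(salespeople)


def add_salesperson(rows, index, employee_id, name, office):
    """Every insert must update the index too: the overhead cost of indexing."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)
```

Looking up office_index["Tokyo"] avoids scanning every row, while add_salesperson shows why each extra index makes updates more expensive.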
Evolution of Database Models
Hierarchical, Network - still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational - the vast majority currently use this; therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented - very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
[Diagram: example tables Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
data is represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
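A minimal sketch of how relational tables are linked by matching values rather than physical pointers, using Python's sqlite3 module. The column names, the linking table entertainer_instruments, and the sample rows are hypothetical; only the table names Entertainers and Instruments come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entertainers (EntertainerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE instruments  (InstrumentID  INTEGER PRIMARY KEY, Name TEXT);

    -- A many-to-many relationship is stored as a third table of ID pairs.
    CREATE TABLE entertainer_instruments (
        EntertainerID INTEGER REFERENCES entertainers (EntertainerID),
        InstrumentID  INTEGER REFERENCES instruments (InstrumentID),
        PRIMARY KEY (EntertainerID, InstrumentID)
    );

    INSERT INTO entertainers VALUES (1, 'Ann'), (2, 'Bo');
    INSERT INTO instruments  VALUES (10, 'Piano'), (20, 'Guitar');
    INSERT INTO entertainer_instruments VALUES (1, 10), (1, 20), (2, 20);
    """
)

# The join matches rows by equal ID values; no stored pointers are followed.
guitarists = [
    row[0]
    for row in conn.execute(
        "SELECT e.Name FROM entertainers e "
        "JOIN entertainer_instruments ei ON ei.EntertainerID = e.EntertainerID "
        "JOIN instruments i ON i.InstrumentID = ei.InstrumentID "
        "WHERE i.Name = 'Guitar' ORDER BY e.Name"
    )
]
```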
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But #1 problem still is ease of use - many untrained people create and use databases without considering their design, so the databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
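The BOOK / PUBLISHER / AUTHOR schema can be written down as a sketch in SQLite via Python's sqlite3 module. The column types, the choice of PubID as the column carrying the 0 < x < 99999 domain, and the helper name accepted are assumptions made for illustration; the slide gives only the table and column names and the rules in words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE publisher (
        PubID    INTEGER PRIMARY KEY CHECK (PubID > 0 AND PubID < 99999),  -- domain (assumed column)
        PubName  TEXT,
        PubPhone TEXT
    );
    CREATE TABLE author (
        AuthID    INTEGER PRIMARY KEY,
        AuthName  TEXT,
        AuthPhone TEXT NOT NULL            -- constraint: phone cannot be left blank
    );
    CREATE TABLE book (
        ISBN   TEXT PRIMARY KEY,
        Title  TEXT,
        AuthID INTEGER REFERENCES author (AuthID),
        PubID  INTEGER NOT NULL REFERENCES publisher (PubID),  -- exactly one publisher
        Price  REAL CHECK (Price >= 0)     -- constraint: price cannot be less than zero
    );
    """
)


def accepted(sql, params):
    """Return True if the DBMS accepts the row, False if a rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ok_pub = accepted("INSERT INTO publisher VALUES (?, ?, ?)", (1, "Example Press", "555-0100"))
ok_auth = accepted("INSERT INTO author VALUES (?, ?, ?)", (1, "A. Writer", "555-0101"))
blank_phone = accepted("INSERT INTO author VALUES (?, ?, ?)", (2, "B. Writer", None))
negative_price = accepted(
    "INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("isbn-1", "Sample", 1, 1, -5.0)
)
unknown_publisher = accepted(
    "INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("isbn-2", "Sample", 1, 99, 10.0)
)
valid_book = accepted(
    "INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("isbn-3", "Sample", 1, 1, 19.95)
)
```

Note that NOT NULL rejects a missing value but would still accept an empty string; a stricter "not blank" rule would need an additional CHECK.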
Slide 7
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Name: Jane Doe
Address: 123 Easy St.
City: London
Phone: 455-0897
Date: Sept 14, 1955
Time: 2:00 p.m.
Patient: Jane Doe, 455-0897
OHIP: 123456789
Limitations of Data File Systems
[Diagram: the customer processing application uses the Customer file; the order processing application uses the Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity, due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program and therefore often omitted
Difficulty of representing data in the users’ perspective
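Program/data dependence can be illustrated with a minimal sketch. The programs of that era would have been written in COBOL or FORTRAN; Python is used here only for readability, and the fixed-width layout, field widths, and the function name read_phone are assumptions made up for this illustration.

```python
# Hypothetical fixed-width patient record: name (10 characters) + phone (8 characters).
OLD_RECORD = "Jane Doe".ljust(10) + "455-0897"


def read_phone(record: str) -> str:
    # The field offsets are hard-coded in the program, not stored with the data.
    return record[10:18]


# The file structure changes: the name field is widened to 15 characters.
NEW_RECORD = "Jane Doe".ljust(15) + "455-0897"

old_phone = read_phone(OLD_RECORD)  # reads the phone field correctly
new_phone = read_phone(NEW_RECORD)  # reads padding plus part of the phone number
```

Every program that embeds the old offsets returns wrong data after the layout change and has to be edited by hand, which is the dependence the slide describes.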
Historical Roots of Database Systems
[Diagram: the customer processing, order processing, and employee processing applications all access one database through the DBMS]
Developed to overcome the limitations of file systems; developed initially on mainframe computers in the late 1960s and early 1970s - a typical early DBMS cost $100,000 (many are still in use)
The first general-purpose databases were created for the General Electric Company (GE) - Integrated Data Store (IDS), designed to run on GE machines; B.F. Goodrich ported IDS to the IBM 360 - it remained dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) An organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata, system catalogues, or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, and the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
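The self-describing property can be tried out with a short sketch. It assumes SQLite (through Python's standard sqlite3 module) rather than the MS Access 97 used in the laboratories, and the Salesperson table and its column types are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Salesperson (EmployeeID INTEGER, Name TEXT, Office TEXT)")

# Ask the database to describe itself: table names come from SQLite's
# catalogue table, sqlite_master ...
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# ... and column names and declared types come from PRAGMA table_info,
# whose rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute(
    "PRAGMA table_info(Salesperson)")]
```

No separate documentation file is consulted here: the structure is read back from the database itself, which is what "metadata" means in this context.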
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching the data in a source table is time-consuming without a “pointer” system (an index), which improves the performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, the indexes on that data must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
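The Office Index above can be sketched in a few lines of Python. This is only an illustration of the idea, not how a DBMS stores indexes internally; the dictionary representation and the function names build_office_index, names_in_office, and move_employee are assumptions.

```python
from collections import defaultdict

# Salesperson table from the slide: Employee ID -> (Name, Office).
salespeople = {
    27: ("Rodney Jones", "Toronto"),
    44: ("Goro Azuma", "Tokyo"),
    35: ("Francine Moire", "Brussels"),
    37: ("Anne Abel", "Tokyo"),
}


def build_office_index(table):
    """Map each office to the list of employee IDs located there."""
    index = defaultdict(list)
    for employee_id, (_name, office) in table.items():
        index[office].append(employee_id)
    return dict(index)


office_index = build_office_index(salespeople)


def names_in_office(office):
    # With the index, there is no need to scan every row of the table.
    return [salespeople[eid][0] for eid in office_index.get(office, [])]


def move_employee(employee_id, new_office):
    """Update the table AND the index: this is the overhead cost of indexing."""
    name, old_office = salespeople[employee_id]
    salespeople[employee_id] = (name, new_office)
    office_index[old_office].remove(employee_id)
    if not office_index[old_office]:
        del office_index[old_office]
    office_index.setdefault(new_office, []).append(employee_id)
```

A lookup by office touches only the index entry, while every change to the Office field costs an extra index update, which is why indexes are reserved for fields that are actually searched or sorted.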
(d) Application Metadata - stores the structure and format of application components; not all DBMSs support this feature
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this, therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented: very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
[Diagram: example tables - Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
data is represented by tables (like spreadsheets)
tables are NOT linked with physical pointers; they are related through shared field values
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
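How tables are related without physical pointers can be shown with a minimal sketch. It assumes SQLite via Python's sqlite3 module; the column names (EntertainerID, StageName, EngagementID, Venue) and the sample rows are made up for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Entertainers (
        EntertainerID INTEGER PRIMARY KEY,
        StageName     TEXT
    );
    CREATE TABLE Engagements (
        EngagementID  INTEGER PRIMARY KEY,
        EntertainerID INTEGER REFERENCES Entertainers (EntertainerID),
        Venue         TEXT
    );
    INSERT INTO Entertainers VALUES (1, 'Act A'), (2, 'Act B');
    INSERT INTO Engagements VALUES (10, 1, 'Hall X'), (11, 1, 'Hall Y'), (12, 2, 'Hall X');
""")

# The one-to-many relationship exists only as matching EntertainerID values;
# the join is computed at query time rather than followed through stored pointers.
rows = conn.execute("""
    SELECT e.StageName, g.Venue
    FROM Entertainers AS e
    JOIN Engagements AS g ON g.EntertainerID = e.EntertainerID
    ORDER BY g.EngagementID
""").fetchall()
```

Because the link is a value rather than an address, the rows can be stored or reordered in any way without breaking the relationship.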
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use databases without considering their design, so the databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
A database schema defines the database’s structure: its tables, relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
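The schema above could be written down roughly as follows. This is a sketch under stated assumptions: SQLite (via Python's sqlite3 module) instead of MS Access 97, the choice of primary keys and column types is assumed because the slide does not state them, "cannot be left blank" is modelled as NOT NULL, and try_insert is a hypothetical helper for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE PUBLISHER (
        PubID    INTEGER PRIMARY KEY,
        PubName  TEXT,
        PubPhone TEXT
    );
    CREATE TABLE AUTHOR (
        AuthID    INTEGER PRIMARY KEY,
        AuthName  TEXT,
        AuthPhone TEXT NOT NULL          -- author phone cannot be left blank
    );
    CREATE TABLE BOOK (
        ISBN   TEXT PRIMARY KEY,
        Title  TEXT,
        AuthID INTEGER REFERENCES AUTHOR (AuthID),
        PubID  INTEGER NOT NULL REFERENCES PUBLISHER (PubID),  -- one and only one publisher
        Price  REAL CHECK (Price >= 0)   -- price cannot be less than zero
    );
""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With the rules declared in the schema, the DBMS rejects a negative price, a missing author phone, or a book that points at a non-existent publisher, so individual application programs do not each have to re-implement those checks.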
Slide 9
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically
Name:
Address:
City:
Phone:
Date:
Time:
Patient:
OHIP:
Jane Doe
123 Easy St.
London
455-0897
Sept 14, 1955
2:00 p.m.
Jane Doe, 455-0897
123456789
organized by function (use)
The first data management
systems performed clerical
tasks (transactional processing)
such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file
folder analogy); each record for
a single patient; another file for
appointment/ billing
information
Limitations of Data File Systems
Customer
processing
Application
Customer
file
Order
processing
Application
Order
file
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grow in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across organization files are often incompatible with one another (separate,
isolated data)
Data redundancy - duplicate information in two or more
files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - timeconsuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
Historical Roots of Database Systems
Customer
processing
Application
Order
processing
Application
DBMS
Database
Employee
processing
Application
Developed to overcome limitations of file systems, developed initially on
mainframe computers in late 60s and early 70s - a typical early DBMS
cost $100,000 (many are still in use)
First general databases were created for General Electric Company
(GEC) - Integrated Data Store (IDS), designed to run on GEC machines;
B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is
self-describing
(metadata or system
catalogues or data
dictionary)
A database contains
a description of its
own structure (e.g.,
the names of all the
tables, the names
and types of data in
each column in all
the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Data accessed from a source table for sorting and searching is
time-consuming without a “pointer” system, which improves
performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated,
all indexes must also be updated, therefore, reserve index for
cases in which they are needed
Salesperson
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma Tokyo
35
Francine Moire Brussels
37
Anne Abel
Tokyo
Office Index
Office
Toronto
Tokyo
Brussels
Employee ID
27
44, 37
35
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical
Network
Relational
still in use in many older (1970s) legacy
systems; very few new databases;
referred to “navigational systems”
the vast majority currently use this,
therefore, our course’s focus is here
Semantic
ObjectRelational
ObjectOriented
Very few new databases are
being created using ObjectOriented Programming (not
many ODBMS for businesses to
implement this model)
The Relational Database Model
Agents
Clients
Entertainers
Engagements
Instruments
Entertainer styles
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships can be
represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
Evaluation of the Relational database model
Advantages
But #1 problem still is
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst
organizations
Disadvantages
ease of use - many untrained people create and use databases
without considering their design, so the databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
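The schema above could be declared roughly as follows, using Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the snake_case column names, the choice of primary keys, the SQLite column types, the helper `violates_constraint`, and all sample values are illustrative and not part of the original slide. The two business rules and the "one and only one publisher" relationship are expressed as `CHECK`, `NOT NULL`, and foreign-key constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE publisher (
        pub_id INTEGER PRIMARY KEY,
        pub_name TEXT,
        pub_phone TEXT
    );
    CREATE TABLE author (
        auth_id INTEGER PRIMARY KEY,
        auth_name TEXT,
        auth_phone TEXT NOT NULL              -- phone cannot be left blank
    );
    CREATE TABLE book (
        isbn TEXT PRIMARY KEY,
        title TEXT,
        auth_id INTEGER REFERENCES author(auth_id),
        pub_id INTEGER NOT NULL REFERENCES publisher(pub_id),  -- exactly one publisher
        price REAL CHECK (price >= 0)         -- price cannot be less than zero
    );
""")

def violates_constraint(sql, params):
    """Return True if the DBMS rejects the statement with an integrity error."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

with conn:
    conn.execute("INSERT INTO publisher VALUES (1, 'Example Press', '555-0100')")
    conn.execute("INSERT INTO author VALUES (1, 'A. Writer', '555-0101')")

print(violates_constraint(
    "INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("000-0", "Bad Price", 1, 1, -5.0)
))
```

Because the rules live in the schema, every application that writes to the database is held to them, instead of each program re-implementing the checks.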
Slide 10
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
                        Word processing  Spreadsheet  Database
Text handling           excellent        fair         poor
Mathematical functions  poor             excellent    very good
Ease of Use             excellent        good         fair
Training Cost           low              moderate     high
Software Cost           low              moderate     high
Volume of data          low              moderate     very high
Multiuser Access        low              moderate     very high
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As the scale of an organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources increases)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
organizational commitment
Adapted from McFadden, F.R. & Hoffer, J.A. (1994). Modern Database
Management. Redwood City, CA: Benjamin/Cummings Publishing (p. 6)
Information is a major organizational resource
Action: Survey customers; invest in advertising; cut costs, expand
product line
Knowledge: Sales have dropped between July and August
Information (organized data): Average/ July is 40; Average/ Aug is 15
Data (isolated facts): John bought 50 in July; John bought 10 in Aug;
Jane bought 30 in July; Jane bought 20 in Aug
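The step from data to information in this example is simple arithmetic, and a short Python sketch makes it concrete. The function name `average_by_month` and the tuple layout are illustrative choices; the figures are the ones on the slide.

```python
from collections import defaultdict

# Data (isolated facts) from the slide: (customer, month, units bought)
purchases = [
    ("John", "July", 50),
    ("John", "Aug", 10),
    ("Jane", "July", 30),
    ("Jane", "Aug", 20),
]

def average_by_month(rows):
    """Information: organize the isolated facts into an average per month."""
    units_by_month = defaultdict(list)
    for _customer, month, units in rows:
        units_by_month[month].append(units)
    return {month: sum(values) / len(values) for month, values in units_by_month.items()}

averages = average_by_month(purchases)
print(averages)  # {'July': 40.0, 'Aug': 15.0}

# Knowledge: an interpretation of the information
sales_dropped = averages["Aug"] < averages["July"]
```

The action (survey customers, invest in advertising, and so on) remains a human decision; the program only gets as far as the knowledge that sales dropped.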
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
[The table above is labelled "data"]
[Chart: Outstanding Invoice Amounts By Order, for orders 201, 209, 214, 221, 235, 239; labelled "information"]
What is a Database Management
System (DBMS)?
“A set of programs used to define, administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
[Diagram: Database - DBMS (e.g., Oracle, dBase, Access, Paradox) - Database Application - user]
What is a Database Application?
[Diagram: Database - DBMS - Database application]
A computer program that performs a specific task of practical value
in a business situation
An interface that allows the user to enter and manipulate data;
the user can request abstract views of data
Created by database designers and developers using a DBMS program
or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
Database
• user data
• metadata
• indexes
• application metadata
DBMS
DBMS Engine
Design Tools Subsystem
• Table Creation Tool
• Form Creation Tool
• Query Creation Tool
• Report Creation Tool
• Procedural Language Compiler
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language RunTime
[Diagram labels: developer; users; application programs]
Types of Database Systems
Centralized (single site)
  microcomputer (desktop)
  legacy mainframe/ mini computer (1 CPU)
  client/server architecture (>1 CPU)
Distributed
  >1 site, requires network
  not widely adopted yet due to many problems
Type                              Example                                # of concurrent users  Typical size of database
Personal                          Joe's House Painting Service           1                      < 10 Megabytes
Workgroup                         Video rental store                     < 25                   < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government      hundreds               > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)  possibly hundreds      Any
our focus: centralized, microcomputer database
Three levels of Database Representation
Conceptual level: database design; logical, abstract description of
data elements & their relationships
Internal level: physical implementation - access methods, index
construction, data structure; database exists in reality only here
External level: each user group will have its own view of the
database; database is accessed from here
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
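One common way to picture the three levels is with a table definition and a view, sketched here with Python's built-in `sqlite3` module. The table `orders`, the view `billing_view`, and the idea of a "billing" user group are assumptions made for illustration; the two sample rows are taken from the order table shown earlier in the deck. The internal level (file layout, access methods) is handled by the DBMS and never appears in the code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the logical design of the data
    CREATE TABLE orders (
        order_num INTEGER PRIMARY KEY,
        invoice_amt REAL,
        customer_name TEXT,
        phone TEXT
    );
    INSERT INTO orders VALUES
        (201, 854.00, 'Cottage Grill', '(616) 643-8821'),
        (209, 1106.00, 'Cleo''s Downtown Restaurant', '(616) 888-2046');

    -- External level: one user group's view of the same data
    CREATE VIEW billing_view AS
        SELECT order_num, invoice_amt FROM orders;
""")

view_rows = conn.execute(
    "SELECT * FROM billing_view ORDER BY order_num"
).fetchall()
print(view_rows)  # [(201, 854.0), (209, 1106.0)]
```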
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing
a model, identifying the tables that are required, designing normalized
tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables)
and database applications (queries, forms, reports, programs) using a
typical microcomputer relational database management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (users access and manipulate the database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical
tasks (transactional processing) such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for
a single patient; another file for appointment/ billing information
Patient file record:
Name: Jane Doe
Address: 123 Easy St.
City: London
Phone: 455-0897
Appointment/ billing file record:
Date: Sept 14, 1955
Time: 2:00 p.m.
Patient: Jane Doe, 455-0897
OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application - Customer file; Order processing Application - Order file]
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is to be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across the organization - files are
often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more
files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
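Program/ data dependence, listed above, is easy to reproduce in a few lines of Python. The fixed-width layout, the field widths, and the function `read_patient` are hypothetical; they stand in for a file-system-era program that has the file structure written into its code.

```python
# Old file layout: name in columns 0-19, phone in columns 20-27
OLD_RECORD = "Jane Doe".ljust(20) + "455-0897"

def read_patient(record):
    """A report program with the file layout hard-coded into it."""
    return {"name": record[0:20].strip(), "phone": record[20:28].strip()}

print(read_patient(OLD_RECORD))

# The file structure changes: the name field is widened to 30 columns.
NEW_RECORD = "Jane Doe".ljust(30) + "455-0897"

# The unmodified program now reads the wrong columns and finds no phone.
print(read_patient(NEW_RECORD))
```

Every program that reads the file carries its own copy of those column positions, which is why a single structural change forced all of them to be modified.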
Historical Roots of Database Systems
[Diagram: Customer processing Application, Order processing Application, and Employee processing Application - DBMS - Database]
Developed to overcome limitations of file systems; developed initially on
mainframe computers in the late 60s and early 70s - a typical early DBMS
cost $100,000 (many are still in use)
First general databases were created for General Electric Company
(GEC) - Integrated Data Store (IDS), designed to run on GEC machines;
B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or
data dictionary)
A database contains a description of its own structure (e.g., the
names of all the tables, the names and types of data in each column
in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
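As one concrete illustration of a self-describing database, SQLite (through Python's built-in `sqlite3` module) lets a program ask the database for its own table names and column types. The table `customer_order` and its columns are invented for this sketch; `sqlite_master` and `PRAGMA table_info` are SQLite's catalogue facilities, and other DBMS products expose their metadata in different ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_order ("
    "order_num INTEGER, invoice_amt REAL, customer_name TEXT)"
)

# Names of all the tables, read from the database's own catalogue
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Name and declared type of each column in one table
# (table_info rows are: cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[2])
    for row in conn.execute("PRAGMA table_info(customer_order)")
]

print(table_names)
print(columns)
```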
Slide 12
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
[Diagram: Database <-> DBMS <-> Database application]
A computer program that performs a specific task of practical value in a business situation
An interface that allows the user to enter and manipulate data; the user can request abstract views of data
Created by database designers and developers using a DBMS program or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
[Diagram, reconstructed from the slide labels]
Database
• user data
• metadata
• indexes
• application metadata
DBMS
Design Tools Subsystem
• Table Creation Tool
• Form Creation Tool
• Query Creation Tool
• Report Creation Tool
• Procedural Language Compiler
DBMS Engine
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language RunTime
People and programs shown: developer, users, application programs
Types of Database Systems
Centralized (single site)
• microcomputer (desktop)
• legacy mainframe/mini computer (1 CPU)
• client/server architecture (>1 CPU)
Distributed
• >1 site, requires network
• not widely adopted yet due to many problems

Type                              Example                                # of concurrent users  Typical size of database
Personal                          Joe's House Painting Service           1                      < 10 Megabytes
Workgroup                         Video rental store                     < 25                   < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government      hundreds               > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)  possibly hundreds      Any

Our focus: centralized, microcomputer database
Three levels of Database Representation
Conceptual level: database design; logical, abstract description of data elements & their relationships
Internal level: physical implementation - access methods, index construction, data structure; database exists in reality only here
External level: each user group will have its own view of the database; database is accessed from here
The primary focus of the lectures in this course is the conceptual level, because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
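The three levels can be loosely illustrated with SQLite through Python's sqlite3 module. This is a sketch under assumptions, not the course's MS Access setup: the table definition stands in for the conceptual level, an index for the internal level, and a view for one user group's external level. The Salesperson rows come from the indexing slide; the view name `tokyo_staff` and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the data elements and how they are described logically.
conn.execute(
    """CREATE TABLE salesperson (
           employee_id INTEGER PRIMARY KEY,
           name        TEXT NOT NULL,
           office      TEXT NOT NULL)"""
)

# Internal level: a physical access structure, managed by the DBMS.
conn.execute("CREATE INDEX idx_salesperson_office ON salesperson (office)")

# External level: the view that one user group (here, the Tokyo office) sees.
conn.execute(
    "CREATE VIEW tokyo_staff AS "
    "SELECT name FROM salesperson WHERE office = 'Tokyo'"
)

conn.executemany(
    "INSERT INTO salesperson VALUES (?, ?, ?)",
    [
        (27, "Rodney Jones", "Toronto"),
        (44, "Goro Azuma", "Tokyo"),
        (35, "Francine Moire", "Brussels"),
        (37, "Anne Abel", "Tokyo"),
    ],
)

# Users query the view; they never mention the index or the storage layout.
tokyo_names = [row[0] for row in conn.execute("SELECT name FROM tokyo_staff ORDER BY name")]
print(tokyo_names)
```

Dropping or rebuilding the index would not change what the view returns, which is the point of separating the levels.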
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
• computer, peripherals, network devices
Software
• DBMS (manages the database)
• operating systems software (manages hardware & software)
• application programs (users access and manipulate the database)
People
• system administrators (manage general operations)
• database designers (architects of database structure)
• database administrators (ensure the database is functioning)
• systems analysts & programmers (design & implement database)
• end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Patient record
  Name: Jane Doe
  Address: 123 Easy St.
  City: London
  Phone: 455-0897
Appointment/billing record
  Date: Sept 14, 1955
  Time: 2:00 p.m.
  Patient: Jane Doe, 455-0897
  OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application <-> Customer file; Order processing Application <-> Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program and therefore often omitted
Difficulty of representing data in the users’ perspective
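Data redundancy and isolated files can be made concrete with a small Python sketch. The two "files" below reuse the Jane Doe patient and appointment records from the earlier slide; the CSV layout, the function names, and the new phone number are assumptions invented for the illustration.

```python
import csv
import io

# Two separate files, each owned by a different application.
patient_file = """name,address,city,phone
Jane Doe,123 Easy St.,London,455-0897
"""

appointment_file = """date,time,patient,phone,ohip
"Sept 14, 1955",2:00 p.m.,Jane Doe,455-0897,123456789
"""


def read_records(text):
    """Parse one flat file into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def phone_mismatches(patients, appointments):
    """Return patient names whose phone number differs between the two files."""
    phones = {p["name"]: p["phone"] for p in patients}
    return [
        a["patient"]
        for a in appointments
        if a["patient"] in phones and phones[a["patient"]] != a["phone"]
    ]


patients = read_records(patient_file)
appointments = read_records(appointment_file)
before = phone_mismatches(patients, appointments)

# The patient application changes the phone number in its own file only
# (hypothetical new number); the duplicate in the appointment file is stale.
patients[0]["phone"] = "455-1234"
after = phone_mismatches(patients, appointments)

print(before, after)
```

Because the phone number is stored twice, every program that touches either file has to know about the other copy, which is the redundancy and program/data dependence problem listed above.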
Historical Roots of Database Systems
[Diagram: Customer processing Application, Order processing Application and Employee processing Application <-> DBMS <-> Database]
Developed to overcome the limitations of file systems; initially built for mainframe computers in the late 1960s and early 1970s - a typical early DBMS cost $100,000 (many are still in use)
The first general databases were created for the General Electric Company (GE) - Integrated Data Store (IDS), designed to run on GE machines; B.F. Goodrich ported IDS to the IBM 360 - became dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
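Self-description is easy to observe in SQLite, used here only as a convenient stand-in for the DBMS in the course. The sketch below creates one table and then asks the database itself for its table names and column types, using SQLite's `sqlite_master` catalogue and the `PRAGMA table_info` command. The table and column names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE orders (
           order_num     INTEGER PRIMARY KEY,
           invoice_amt   REAL,
           customer_name TEXT)"""
)

# The catalogue lists every table the database contains.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# table_info returns one row per column: (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(orders)")]

print(table_names)
print(columns)
```

No user data was inserted, yet the database can already report its own structure; that stored description is the metadata.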
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching a source table is time-consuming without a “pointer” system (an index), which improves the performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo

Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
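The Office Index from item (c) can be mimicked with a plain Python dictionary. This is a simplified sketch, assuming an in-memory list of rows; a real DBMS uses on-disk structures, and the helper names and the added employee are hypothetical.

```python
# The Salesperson table from the slide: (employee_id, name, office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the employee IDs located there."""
    index = {}
    for employee_id, _name, office in rows:
        index.setdefault(office, []).append(employee_id)
    return index


def add_salesperson(rows, index, employee_id, name, office):
    """Insert a row; the index must be updated too (the overhead cost)."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)


office_index = build_office_index(salespeople)
print(office_index)

# Looking up an office no longer requires scanning every row.
tokyo_ids = office_index["Tokyo"]
print(tokyo_ids)
```

Every call to `add_salesperson` does two writes instead of one, which is why the slide recommends indexing only the columns that are actually searched or sorted.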
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this; therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented: very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
Example tables: Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles
data are represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
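That tables are related through shared values rather than physical pointers can be sketched with two of the tables named on the slide. The column names and sample rows below are invented for the illustration, and SQLite (via Python's sqlite3 module) is an assumption, not the course's DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# One agent can book many engagements (a one-to-many relationship).
conn.execute(
    "CREATE TABLE agents (agent_id INTEGER PRIMARY KEY, agent_name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE engagements (
           engagement_id INTEGER PRIMARY KEY,
           agent_id      INTEGER NOT NULL REFERENCES agents (agent_id),
           venue         TEXT NOT NULL)"""
)

# Hypothetical sample rows.
conn.executemany("INSERT INTO agents VALUES (?, ?)", [(1, "Agent A"), (2, "Agent B")])
conn.executemany(
    "INSERT INTO engagements VALUES (?, ?, ?)",
    [(10, 1, "Venue X"), (11, 1, "Venue Y"), (12, 2, "Venue Z")],
)

# The link is the matching agent_id value, resolved when the query runs.
bookings = conn.execute(
    """SELECT a.agent_name, e.venue
       FROM agents AS a
       JOIN engagements AS e ON e.agent_id = a.agent_id
       ORDER BY e.engagement_id"""
).fetchall()

print(bookings)
```

Nothing in the `engagements` rows records where an agent row is stored; the relationship exists only because the `agent_id` values match.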
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use databases without considering their design, so the databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
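The BOOK, PUBLISHER and AUTHOR schema above might be declared as follows. This is a sketch in SQLite through Python's sqlite3 module, with several assumptions: the column types, the snake_case names, the choice to apply the example integer domain to `pub_id`, and all sample rows are invented for the illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE publisher (
    -- Domain (assumed to apply to this column): integers with 0 < x < 99999.
    pub_id    INTEGER PRIMARY KEY CHECK (pub_id > 0 AND pub_id < 99999),
    pub_name  TEXT NOT NULL,
    pub_phone TEXT
);
CREATE TABLE author (
    auth_id    INTEGER PRIMARY KEY,
    auth_name  TEXT NOT NULL,
    -- Constraint: author phone cannot be left blank.
    auth_phone TEXT NOT NULL CHECK (auth_phone <> '')
);
CREATE TABLE book (
    isbn    TEXT PRIMARY KEY,
    title   TEXT NOT NULL,
    auth_id INTEGER NOT NULL REFERENCES author (auth_id),
    -- Relationship: each book has one and only one publisher.
    pub_id  INTEGER NOT NULL REFERENCES publisher (pub_id),
    -- Constraint: price cannot be less than zero.
    price   REAL NOT NULL CHECK (price >= 0)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical sample rows.
ok_publisher = try_insert("INSERT INTO publisher VALUES (?, ?, ?)", (1, "Example Press", None))
ok_author = try_insert("INSERT INTO author VALUES (?, ?, ?)", (1, "A. Author", "555-0100"))
ok_book = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("000-0", "Sample Title", 1, 1, 19.95))

# Each of these violates one rule from the schema.
negative_price = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("000-1", "Bad Price", 1, 1, -5.0))
blank_phone = try_insert("INSERT INTO author VALUES (?, ?, ?)", (2, "B. Author", None))
unknown_publisher = try_insert("INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("000-2", "No Publisher", 1, 99, 10.0))

print(ok_publisher, ok_author, ok_book, negative_price, blank_phone, unknown_publisher)
```

The design choice worth noting is that the rules live in the schema, so the DBMS enforces them for every application rather than each program re-implementing them.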
Slide 14
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically
Name:
Address:
City:
Phone:
Date:
Time:
Patient:
OHIP:
Jane Doe
123 Easy St.
London
455-0897
Sept 14, 1955
2:00 p.m.
Jane Doe, 455-0897
123456789
organized by function (use)
The first data management
systems performed clerical
tasks (transactional processing)
such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file
folder analogy); each record for
a single patient; another file for
appointment/ billing
information
Limitations of Data File Systems
Customer
processing
Application
Customer
file
Order
processing
Application
Order
file
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grow in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across organization files are often incompatible with one another (separate,
isolated data)
Data redundancy - duplicate information in two or more
files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - timeconsuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
Historical Roots of Database Systems
Customer processing application, Order processing application, Employee processing application -> DBMS -> Database
DBMSs were developed to overcome the limitations of file systems, initially on mainframe computers in the late 1960s and early 1970s - a typical early DBMS cost $100,000 (many are still in use)
The first general databases were created for General Electric Company (GEC) - Integrated Data Store (IDS), designed to run on GEC machines; B.F. Goodrich ported IDS to the IBM 360 - became dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users' data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201       854.00      Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
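As a rough sketch, the order table above can be loaded into an in-memory SQLite database using Python's standard `sqlite3` module. The table name `orders` and the column types are assumptions, and SQLite stands in for whichever DBMS is actually used. The point is that an ad hoc question needs only a query, not a new program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE orders (
           OrderNum     INTEGER PRIMARY KEY,
           InvoiceAmt   REAL,
           CustomerName TEXT,
           OwnerName    TEXT,
           Phone        TEXT)"""
)

rows = [
    (201, 854.00, "Cottage Grill", "Ms. Doris Reaume", "(616) 643-8821"),
    (209, 1106.00, "Cleo's Downtown Restaurant", "Ms. Joan Hoffman", "(616) 888-2046"),
    (214, 1070.50, "Jean's Country Restaurant", "Ms. Jean Brooks", "(517) 620-4431"),
    (221, 1607.00, "Maxwell's Restaurant", "Ms. Barbara Feldon", "(219) 333-0000"),
    (235, 1004.50, "Embers Restaurant", "Mr. Clifford Merritt", "(219) 816-2456"),
    (239, 1426.50, "The Empire", "Ms. Curtis Haiar", "(616) 762-9144"),
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)

# Ad hoc query: invoices over 1,100 for customers in area code 616.
big_616 = conn.execute(
    "SELECT OrderNum, CustomerName FROM orders "
    "WHERE InvoiceAmt > 1100 AND Phone LIKE '(616)%' "
    "ORDER BY OrderNum"
).fetchall()

# Another ad hoc question: the total of all invoice amounts.
total = conn.execute("SELECT SUM(InvoiceAmt) FROM orders").fetchone()[0]

print(big_616)  # [(209, "Cleo's Downtown Restaurant"), (239, 'The Empire')]
print(total)    # 7068.5
```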
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata, system catalogues, or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, and the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
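One way to see self-description in practice is to ask a DBMS for its own metadata. The sketch below assumes SQLite (via Python's `sqlite3` module) and a hypothetical `salesperson` table; other DBMSs expose their catalogues through different tables or views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salesperson ("
    "EmployeeID INTEGER PRIMARY KEY, Name TEXT, Office TEXT)"
)

# SQLite records every table it holds in its built-in sqlite_master catalogue.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(salesperson)")]

print(table_names)  # ['salesperson']
print(columns)      # [('EmployeeID', 'INTEGER'), ('Name', 'TEXT'), ('Office', 'TEXT')]
```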
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Data accessed from a source table for sorting and searching is time-consuming without a "pointer" system, which improves performance and accessibility of the database
The "overhead cost" of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
(d) Application Metadata - stores structure and format of application components; not all DBMSs support this feature
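The Office Index in (c) can be sketched as a mapping from each office to the Employee IDs found there. This is a simplified illustration in plain Python, not how a DBMS stores its indexes internally; the function name `add_salesperson` is hypothetical.

```python
from collections import defaultdict

# The Salesperson table as (Employee ID, Name, Office) rows.
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]

# Build the index once: office -> list of Employee IDs.
office_index = defaultdict(list)
for employee_id, _name, office in salespeople:
    office_index[office].append(employee_id)


def add_salesperson(employee_id, name, office):
    """Insert a row and keep the index in step (the 'overhead cost' of indexing)."""
    salespeople.append((employee_id, name, office))
    office_index[office].append(employee_id)


# Lookup by office no longer needs a scan of the whole table.
print(office_index["Tokyo"])  # [44, 37]
```

Every insert now does two writes instead of one, which is why indexes are worth creating only on columns that are actually searched or sorted.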
Evolution of Database Models
Hierarchical, Network - still in use in many older (1970s) legacy systems; very few new databases; referred to as "navigational systems"
Relational - the vast majority of databases currently use this model; therefore, our course's focus is here
Semantic
Object-Relational
Object-Oriented - very few new databases are being created using object-oriented programming (not many ODBMS are available for businesses to implement this model)
The Relational Database Model
Example tables: Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles
data are represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
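The sketch below illustrates how relational tables are related by matching values rather than by physical pointers. The table names echo the example tables listed above, but all column names and sample rows are hypothetical, and SQLite (through Python's `sqlite3` module) is used only as a convenient stand-in for a relational DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE clients (ClientID INTEGER PRIMARY KEY, ClientName TEXT);
    CREATE TABLE entertainers (EntertainerID INTEGER PRIMARY KEY, StageName TEXT);
    -- engagements relates clients to entertainers (many-to-many) by stored values
    CREATE TABLE engagements (
        EngagementID  INTEGER PRIMARY KEY,
        ClientID      INTEGER REFERENCES clients(ClientID),
        EntertainerID INTEGER REFERENCES entertainers(EntertainerID));
    INSERT INTO clients VALUES (1, 'Client A'), (2, 'Client B');
    INSERT INTO entertainers VALUES (10, 'Act X'), (11, 'Act Y');
    INSERT INTO engagements VALUES (100, 1, 10), (101, 1, 11), (102, 2, 10);
    """
)

# The join matches key values at query time; no record holds a pointer to another.
bookings = conn.execute(
    """
    SELECT c.ClientName, e.StageName
    FROM engagements g
    JOIN clients c ON c.ClientID = g.ClientID
    JOIN entertainers e ON e.EntertainerID = g.EntertainerID
    ORDER BY g.EngagementID
    """
).fetchall()

print(bookings)  # [('Client A', 'Act X'), ('Client A', 'Act Y'), ('Client B', 'Act X')]
```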
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use databases without considering their design, and these databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
A database schema defines the database's structure: its tables, relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
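The BOOK/PUBLISHER/AUTHOR schema above might be declared as follows. This is a sketch under assumptions: SQLite via Python's `sqlite3` module, guessed column types, made-up sample rows, and a hypothetical helper `try_insert`. The two constraints and the "one and only one publisher" rule come from the slide; the rest is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE PUBLISHER (PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT);
    CREATE TABLE AUTHOR (
        AuthID    INTEGER PRIMARY KEY,
        AuthName  TEXT,
        AuthPhone TEXT NOT NULL);           -- author phone cannot be left blank
    CREATE TABLE BOOK (
        ISBN   TEXT PRIMARY KEY,
        Title  TEXT,
        AuthID INTEGER REFERENCES AUTHOR(AuthID),
        PubID  INTEGER NOT NULL REFERENCES PUBLISHER(PubID),  -- exactly one publisher
        Price  REAL CHECK (Price >= 0));    -- price cannot be less than zero
    """
)

# Made-up sample rows that satisfy every rule.
conn.execute("INSERT INTO PUBLISHER VALUES (1, 'Example Press', '555-0100')")
conn.execute("INSERT INTO AUTHOR VALUES (1, 'A. Writer', '555-0101')")
conn.execute("INSERT INTO BOOK VALUES ('0-00-000000-0', 'Sample Title', 1, 1, 19.95)")


def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted or rejected it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


negative_price = try_insert("INSERT INTO BOOK VALUES ('0-00-000000-1', 'Bad Price', 1, 1, -5)")
blank_phone = try_insert("INSERT INTO AUTHOR VALUES (2, 'No Phone', NULL)")
unknown_publisher = try_insert("INSERT INTO BOOK VALUES ('0-00-000000-2', 'Orphan', 1, 99, 10)")

print(negative_price, blank_phone, unknown_publisher)  # rejected rejected rejected
```

Declaring the rules in the schema means the DBMS enforces them for every application, instead of each program re-implementing the checks.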
Slide 17
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its components?
What are the levels of database representation?
What were the limitations of the systems that led to the development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Feature | Word processing | Spreadsheet | Database
Text handling | excellent | fair | poor
Mathematical functions | poor | excellent | very good
Ease of Use | excellent | good | fair
Training Cost | low | moderate | high
Software Cost | low | moderate | high
Volume of data | low | moderate | very high
Multiuser Access | low | moderate | very high
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: physical resources and conceptual resources (data & information)
As the scale of an organization grows, it becomes increasingly difficult to manage by observation, so management relies increasingly on conceptual resources
Conceptual resources can be managed just like physical resources or assets (e.g., employees, $$, equipment, widgets, etc.)
Management of data & information means getting it before it’s needed, protecting it, assuring quality, and getting rid of it when no longer required
Management of data & information can be achieved only through organizational commitment
Adapted from McFadden, F.R. & Hoffer, J.A. (1994). Modern Database Management. Redwood City, CA: Benjamin/Cummings Publishing (p. 6)
Information is a major organizational resource
Action: survey customers; invest in advertising; cut costs; expand product line
Knowledge: sales have dropped between July and August
Information (organized data): average for July is 40; average for Aug is 15
Data (isolated facts): John bought 50 in July; John bought 10 in Aug; Jane bought 30 in July; Jane bought 20 in Aug
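The step from data to information in the slide above can be sketched in a few lines of Python. This is only an illustration: the figures come from the slide, while the names `purchases` and `average_by_month` are made up for the example.

```python
from collections import defaultdict

# Data: isolated facts, one per purchase (values taken from the slide).
purchases = [
    ("John", "July", 50),
    ("John", "Aug", 10),
    ("Jane", "July", 30),
    ("Jane", "Aug", 20),
]


def average_by_month(rows):
    """Organize the raw facts into information: average purchase per month."""
    quantities = defaultdict(list)
    for _customer, month, quantity in rows:
        quantities[month].append(quantity)
    return {month: sum(q) / len(q) for month, q in quantities.items()}


averages = average_by_month(purchases)
print(averages)  # {'July': 40.0, 'Aug': 15.0}

# Knowledge: an interpretation drawn from the information.
sales_dropped = averages["Aug"] < averages["July"]
```

Deciding what to do about the drop (the "action" level) stays with people; the program only gets as far as the comparison.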
What is a Database?
Organized collection of related information or data stored on a computer disk for easy, efficient use
OrderNum | InvoiceAmt | CustomerName | OwnerName | Phone
201 | 854.00 | Cottage Grill | Ms. Doris Reaume | (616) 643-8821
209 | 1,106.00 | Cleo's Downtown Restaurant | Ms. Joan Hoffman | (616) 888-2046
214 | 1,070.50 | Jean's Country Restaurant | Ms. Jean Brooks | (517) 620-4431
221 | 1,607.00 | Maxwell's Restaurant | Ms. Barbara Feldon | (219) 333-0000
235 | 1,004.50 | Embers Restaurant | Mr. Clifford Merritt | (219) 816-2456
239 | 1,426.50 | The Empire | Ms. Curtis Haiar | (616) 762-9144
[Figure: chart titled "Outstanding Invoice Amounts By Order" for orders 201 to 239, with the labels "data" and "information"]
What is a Database Management System (DBMS)?
“A set of programs used to define, administer, and process the database and its applications conveniently and efficiently”
Program (or collection of programs) that enables users to create the database. The DBMS manages the storage and retrieval of data, and provides the user with certain functionalities to guarantee that the data will be logically organized and consistently applied.
[Figure: Database - DBMS (e.g., Oracle, dBase, Access, Paradox) - Database Application - user]
What is a Database Application?
[Figure: Database - DBMS - Database application]
A computer program that performs a specific task of practical value in a business situation
An interface that allows the user to enter and manipulate data; the user can request abstract views of data
Created by database designers and developers using a DBMS program or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
Database
• user data
• metadata
• indexes
• application metadata
DBMS
Design Tools Subsystem
• Table Creation Tool
• Form Creation Tool
• Query Creation Tool
• Report Creation Tool
• Procedural Language Compiler
DBMS Engine
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language RunTime
[Figure labels: developer; users; application programs]
Types of Database Systems
Centralized (single site)
• microcomputer (desktop)
• legacy mainframe/minicomputer (1 CPU)
• client/server architecture (>1 CPU)
Distributed
• >1 site, requires network
• not widely adopted yet due to many problems
Type | Example | # of concurrent users | Typical size of database
Personal | Joe's House Painting Service | 1 | < 10 Megabytes
Workgroup | Video rental store | < 25 | < 100 Megabytes
Organizational (enterprise) | Larger Corporations or Government | hundreds | > 1 Trillion bytes
Multimedia (Internet technology) | Holiday resort bookings (with photos) | possibly hundreds | Any
Our focus: centralized, microcomputer database
Three levels of Database Representation
External level: each user group will have its own view of the database; database is accessed from here
Conceptual level: database design; logical, abstract description of data elements & their relationships
Internal level: physical implementation - access methods, index construction, data structure; database exists in reality only here
Primary focus of the lectures of this course is the conceptual level because the creation of a database begins with its design; the focus of the laboratories is the external level, using an RDBMS, which manages the internal level.
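As a rough illustration of the three levels, the sketch below uses Python's built-in `sqlite3` module as a stand-in for the course's RDBMS. The table, view and index names are assumptions made for the example. The table definition plays the role of the conceptual level, a view gives one user group its external level, and the index is one small piece of the internal level that the DBMS manages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical design (data elements and their relationships).
conn.execute(
    "CREATE TABLE orders ("
    " order_num INTEGER PRIMARY KEY,"
    " invoice_amt REAL NOT NULL,"
    " customer_name TEXT NOT NULL,"
    " phone TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (201, 854.00, "Cottage Grill", "(616) 643-8821"),
        (209, 1106.00, "Cleo's Downtown Restaurant", "(616) 888-2046"),
    ],
)

# External level: one user group's view; here a hypothetical billing group
# that sees order numbers and amounts but no phone numbers.
conn.execute(
    "CREATE VIEW billing_view AS SELECT order_num, invoice_amt FROM orders"
)

# Internal level: a physical access structure; the DBMS decides how to
# store it and when to use it.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_name)")

billing_rows = conn.execute(
    "SELECT order_num, invoice_amt FROM billing_view ORDER BY order_num"
).fetchall()
print(billing_rows)  # [(201, 854.0), (209, 1106.0)]
```

Dropping or adding the index changes nothing in the view or the table definition, which is the point of separating the levels.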
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
• computer, peripherals, network devices
Software
• DBMS (manages the database)
• operating systems software (manages hardware & software)
• application programs (users access and manipulate the database)
People
• system administrators (manage general operations)
• database designers (architects of database structure)
• database administrators (ensure the database is functioning)
• systems analysts & programmers (design & implement database)
• end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Patient file record: Name: Jane Doe; Address: 123 Easy St.; City: London; Phone: 455-0897
Appointment file record: Date: Sept 14, 1955; Time: 2:00 p.m.; Patient: Jane Doe, 455-0897; OHIP: 123456789
Limitations of Data File Systems
[Figure: Customer processing Application - Customer file; Order processing Application - Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
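The difference between specifying "how" and specifying only "what" can be sketched as follows. Python stands in for the third-generation language and SQLite (through the standard `sqlite3` module) for the DBMS. The function names and the 1,100 threshold are assumptions for the example; the order data comes from the sample table in this deck.

```python
import sqlite3

orders = [
    (201, 854.00), (209, 1106.00), (214, 1070.50),
    (221, 1607.00), (235, 1004.50), (239, 1426.50),
]


def large_orders_procedural(rows, threshold):
    """File-system style: the program spells out HOW to scan, test and sort."""
    result = []
    for order_num, invoice_amt in rows:
        if invoice_amt > threshold:
            result.append(order_num)
    result.sort()
    return result


def large_orders_declarative(rows, threshold):
    """DBMS style: the query states WHAT is wanted; the DBMS chooses how."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_num INTEGER, invoice_amt REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    found = conn.execute(
        "SELECT order_num FROM orders WHERE invoice_amt > ? ORDER BY order_num",
        (threshold,),
    ).fetchall()
    conn.close()
    return [order_num for (order_num,) in found]


print(large_orders_procedural(orders, 1100))   # [209, 221, 239]
print(large_orders_declarative(orders, 1100))  # [209, 221, 239]
```

Both functions return the same answer. The difference shows up when the question changes: the second needs only a new query string, while the first needs new program logic.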
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program and therefore often omitted
Difficulty of representing data in the users’ perspective
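A small sketch of why data redundancy causes trouble, using the patient and appointment records from the earlier file-system slide. The dictionary layout, the function names and the new phone number are hypothetical; the point is only that a copy kept in a second file goes stale when one application updates its own file.

```python
# Two separate "files", each owned by a different application
# (hypothetical in-memory layout; values from the slide).
patient_file = {
    "Jane Doe": {"address": "123 Easy St.", "city": "London", "phone": "455-0897"},
}
appointment_file = [
    {"date": "Sept 14, 1955", "time": "2:00 p.m.",
     "patient": "Jane Doe", "phone": "455-0897"},
]


def update_patient_phone(name, new_phone):
    """The patient application only knows about its own file."""
    patient_file[name]["phone"] = new_phone


def inconsistent_appointments():
    """Appointments whose copied phone number no longer matches the patient file."""
    return [
        appt for appt in appointment_file
        if appt["phone"] != patient_file[appt["patient"]]["phone"]
    ]


update_patient_phone("Jane Doe", "455-1234")  # made-up new number
stale = inconsistent_appointments()
print(len(stale))  # 1
```

Storing the phone number once and referring to the patient by an identifier would remove the duplicate, which is the approach the relational model takes.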
Historical Roots of Database Systems
[Figure: Customer processing Application, Order processing Application, Employee processing Application - DBMS - Database]
Developed to overcome limitations of file systems; developed initially on mainframe computers in the late 60s and early 70s - a typical early DBMS cost $100,000 (many are still in use)
First general databases were created for General Electric Company (GEC) - Integrated Data Store (IDS), designed to run on GEC machines; B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum | InvoiceAmt | CustomerName | OwnerName | Phone
201 | 854.00 | Cottage Grill | Ms. Doris Reaume | (616) 643-8821
209 | 1,106.00 | Cleo's Downtown Restaurant | Ms. Joan Hoffman | (616) 888-2046
214 | 1,070.50 | Jean's Country Restaurant | Ms. Jean Brooks | (517) 620-4431
221 | 1,607.00 | Maxwell's Restaurant | Ms. Barbara Feldon | (219) 333-0000
235 | 1,004.50 | Embers Restaurant | Mr. Clifford Merritt | (219) 816-2456
239 | 1,426.50 | The Empire | Ms. Curtis Haiar | (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
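To see what "self-describing" means in practice, the sketch below asks a database about its own structure. It uses SQLite through Python's `sqlite3` module as a stand-in; other DBMSs expose their catalogue differently. The `salesperson` table and its column names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salesperson ("
    " employee_id INTEGER PRIMARY KEY, name TEXT, office TEXT)"
)

# SQLite's catalogue table lists the objects the database contains.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(salesperson)")
]

print(table_names)  # ['salesperson']
print(columns)      # [('employee_id', 'INTEGER'), ('name', 'TEXT'), ('office', 'TEXT')]
```

The metadata is queried with the same mechanism as user data, so tools such as form and report builders can discover tables and columns without being told about them in advance.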
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching data directly in a source table is time-consuming; an index acts as a “pointer” system that improves performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID | Name | Office
27 | Rodney Jones | Toronto
44 | Goro Azuma | Tokyo
35 | Francine Moire | Brussels
37 | Anne Abel | Tokyo
Office Index
Office | Employee ID
Toronto | 27
Tokyo | 44, 37
Brussels | 35
(d) Application Metadata - stores structure and format of application components; not all DBMS support this feature
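The Office Index from item (c) can be imitated with a dictionary. This is a teaching sketch and not how a DBMS stores its indexes; the function names and the extra salesperson used in the usage note are made up.

```python
from collections import defaultdict

salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the employee IDs found there (the "pointers")."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)


def add_salesperson(rows, index, employee_id, name, office):
    """Every change to the table must also update the index: the overhead cost."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)


office_index = build_office_index(salespeople)
print(office_index["Tokyo"])  # [44, 37]
```

Looking up an office no longer scans every row, but a call such as `add_salesperson(salespeople, office_index, 50, "New Hire", "Toronto")` has to touch both structures. That extra work on every update is why the slide advises keeping indexes only where they are needed.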
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this; therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented: very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
[Figure: example tables - Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use databases without considering their design, and so usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
The database schema defines the database’s structure: tables, relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
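One way the schema above might be written down is as SQL table definitions, sketched here with SQLite through Python's `sqlite3` module. Several details are assumptions for illustration and are not in the slide: the column types, the lower-case names, applying the example domain (integers between 0 and 99999) to the publisher ID, and the helper functions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE publisher (
    pub_id    INTEGER PRIMARY KEY
              CHECK (pub_id > 0 AND pub_id < 99999),  -- domain (assumed to apply here)
    pub_name  TEXT NOT NULL,
    pub_phone TEXT
);
CREATE TABLE author (
    auth_id    INTEGER PRIMARY KEY,
    auth_name  TEXT NOT NULL,
    auth_phone TEXT NOT NULL CHECK (auth_phone <> '')  -- cannot be left blank
);
CREATE TABLE book (
    isbn    TEXT PRIMARY KEY,
    title   TEXT NOT NULL,
    auth_id INTEGER NOT NULL REFERENCES author (auth_id),
    pub_id  INTEGER NOT NULL REFERENCES publisher (pub_id),  -- exactly one publisher
    price   REAL NOT NULL CHECK (price >= 0)                 -- price cannot be negative
);
"""


def open_bookstore():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
    conn.executescript(SCHEMA)
    return conn


def try_insert_book(conn, isbn, title, auth_id, pub_id, price):
    """Return True if the row satisfies the schema, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO book VALUES (?, ?, ?, ?, ?)",
                (isbn, title, auth_id, pub_id, price),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With one publisher and one author inserted, a book with a price of -5 or with an unknown publisher ID is rejected by the DBMS itself, not by application code. The rule that each publisher publishes one or more books is not captured by these definitions; enforcing a minimum of one related row usually needs application logic or triggers.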
Slide 19
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
Conceptual level: database design; logical, abstract description of data elements & their relationships
Internal level: physical implementation - access methods, index construction, data structure; database exists in reality only here
External level: each user group will have its own view of the database; database is accessed from here
The primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using an RDBMS, which manages the internal level.
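The split between the conceptual and external levels can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module rather than the course's MS Access; the table `patient` and the views `reception_view` and `billing_view` are hypothetical names, not taken from the slides. One conceptual table is defined once, and two user groups each get their own view of it.

```python
import sqlite3

# In-memory SQLite database; the engine handles the internal (physical) level.
conn = sqlite3.connect(":memory:")

# Conceptual level: one logical description of the data (hypothetical table).
conn.execute(
    "CREATE TABLE patient ("
    " patient_id INTEGER PRIMARY KEY,"
    " name TEXT, phone TEXT, ohip TEXT)"
)
conn.execute("INSERT INTO patient VALUES (1, 'Jane Doe', '455-0897', '123456789')")

# External level: each user group sees only its own view of the same data.
conn.execute("CREATE VIEW reception_view AS SELECT name, phone FROM patient")
conn.execute("CREATE VIEW billing_view AS SELECT name, ohip FROM patient")

reception_rows = conn.execute("SELECT * FROM reception_view").fetchall()
billing_rows = conn.execute("SELECT * FROM billing_view").fetchall()
print(reception_rows)
print(billing_rows)
```

Neither view stores data of its own; both are derived from the single table, which is why the database is said to exist in reality only at the internal level.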
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing a model,
identifying the tables that are required, designing normalized tables and
identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables) and
database applications (queries, forms, reports, programs) using a typical
microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
computer, peripherals, network devices
Software
DBMS (manages the database)
operating system software (manages hardware & software)
application programs (users access and manipulate the database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional
processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single
patient; another file for appointment/billing information
[Sample records shown on the slide]
Patient file record
Name: Jane Doe
Address: 123 Easy St.
City: London
Phone: 455-0897
Appointment/billing file record
Date: Sept 14, 1955
Time: 2:00 p.m.
Patient: Jane Doe, 455-0897
OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application <-> Customer file; Order processing Application <-> Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and
reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is to be done as well as how it is to be done
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are often
incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed,
ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program and therefore often omitted
Difficulty of representing data from the users’ perspective
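Program/data dependence from the list above can be illustrated with a small sketch. The fixed-width record layout, the field widths, and the function names below are hypothetical, chosen only to show the effect: a program that hard-codes field positions returns wrong data as soon as a field is added to the file, while access by field name is unaffected.

```python
# Hypothetical fixed-width customer record: name (10 chars), phone (8 chars).
OLD_RECORD = "Jane Doe  455-0897"
# The same customer after the file structure changes: a 6-char city field
# is inserted between name and phone.
NEW_RECORD = "Jane Doe  London455-0897"


def phone_by_position(record: str) -> str:
    """File-system style: the program hard-codes where the phone field sits."""
    return record[10:18]


def phone_by_name(record: dict) -> str:
    """Access by field name: adding fields does not disturb this code."""
    return record["phone"]


old_phone = phone_by_position(OLD_RECORD)     # reads the phone field
broken_phone = phone_by_position(NEW_RECORD)  # now reads part of the city

named_old = {"name": "Jane Doe", "phone": "455-0897"}
named_new = {"name": "Jane Doe", "city": "London", "phone": "455-0897"}

print(old_phone, broken_phone)
print(phone_by_name(named_old), phone_by_name(named_new))
```

Every program that reads the file by position would need the same fix, which is the "ALL programs had to be modified" cost named on the slide.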
Historical Roots of Database Systems
[Diagram: Customer processing, Order processing, and Employee processing Applications all access one Database through the DBMS]
Developed to overcome the limitations of file systems, initially on
mainframe computers in the late 60s and early 70s - a typical early DBMS
cost $100,000 (many are still in use)
The first general-purpose database system, the Integrated Data Store (IDS),
was created at General Electric (GE) and designed to run on GE machines;
B.F. Goodrich ported IDS to the IBM 360 - became dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) Organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format

OrderNum | InvoiceAmt | CustomerName | OwnerName | Phone
201 | 854.00 | Cottage Grill | Ms. Doris Reaume | (616) 643-8821
209 | 1,106.00 | Cleo's Downtown Restaurant | Ms. Joan Hoffman | (616) 888-2046
214 | 1,070.50 | Jean's Country Restaurant | Ms. Jean Brooks | (517) 620-4431
221 | 1,607.00 | Maxwell's Restaurant | Ms. Barbara Feldon | (219) 333-0000
235 | 1,004.50 | Embers Restaurant | Mr. Clifford Merritt | (219) 816-2456
239 | 1,426.50 | The Empire | Ms. Curtis Haiar | (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all
the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
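A short sketch of what "self-describing" looks like in practice, assuming Python's built-in sqlite3 module as a stand-in for the course's DBMS. The table name `orders` and its three columns are an assumed subset of the order table shown earlier. SQLite keeps its catalogue in the `sqlite_master` table, and `PRAGMA table_info` reports each column's name and declared type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " OrderNum INTEGER PRIMARY KEY,"
    " InvoiceAmt REAL,"
    " CustomerName TEXT)"
)

# The database describes itself: sqlite_master is SQLite's catalogue table.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(orders)")]

print(table_names)
print(columns)
```

No user data was inserted here; everything printed comes from the metadata the database stores about its own structure.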
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching data in a source table is time-consuming without a
“pointer” system; an index provides one, which improves
performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated,
all indexes on that data must also be updated; therefore, reserve indexes for
cases in which they are needed

Salesperson
Employee ID | Name | Office
27 | Rodney Jones | Toronto
44 | Goro Azuma | Tokyo
35 | Francine Moire | Brussels
37 | Anne Abel | Tokyo

Office Index
Office | Employee ID
Toronto | 27
Tokyo | 44, 37
Brussels | 35
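The Office Index above can be sketched as a plain mapping from office to employee IDs. This is an illustration of the idea only, not how any particular DBMS stores its indexes; the function names and the added employee (ID 52) are hypothetical.

```python
from collections import defaultdict

# Salesperson table from the slide: (Employee ID, Name, Office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the employee IDs found there, in table order."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)


def add_salesperson(rows, index, employee_id, name, office):
    """The overhead cost: every insert must update the index as well."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)


office_index = build_office_index(salespeople)
tokyo_ids = list(office_index["Tokyo"])  # found without scanning the table

add_salesperson(salespeople, office_index, 52, "Hypothetical Hire", "Toronto")
toronto_ids = office_index["Toronto"]
print(tokyo_ids, toronto_ids)
```

The lookup by office no longer scans every row, but each insert now does two writes instead of one, which is the trade-off the slide describes.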
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems; very
few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this; therefore, our course’s focus
is here
Semantic, Object-Relational, Object-Oriented: very few new databases are being
created using Object-Oriented Programming (not many ODBMS for businesses to
implement this model)
The Relational Database Model
[Diagram: example tables Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
data is represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one,
one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
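A minimal sketch of tables related by matching values rather than physical pointers, assuming Python's built-in sqlite3 module. The table names echo the diagram, but the columns and the sample rows are hypothetical. The one-to-many relationship (one entertainer, many engagements) exists only because `entertainer_id` values match across the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entertainers (
        entertainer_id INTEGER PRIMARY KEY,
        stage_name TEXT
    );
    CREATE TABLE engagements (
        engagement_id INTEGER PRIMARY KEY,
        entertainer_id INTEGER REFERENCES entertainers(entertainer_id),
        venue TEXT
    );
    INSERT INTO entertainers VALUES (1, 'Jazz Trio'), (2, 'Magician');
    INSERT INTO engagements VALUES
        (10, 1, 'Town Hall'), (11, 1, 'Harbour Club'), (12, 2, 'Town Hall');
    """
)

# The join matches rows on a shared value; no stored pointer links the tables.
bookings = conn.execute(
    "SELECT e.stage_name, g.venue"
    " FROM entertainers AS e"
    " JOIN engagements AS g ON g.entertainer_id = e.entertainer_id"
    " ORDER BY g.engagement_id"
).fetchall()
print(bookings)
```

A many-to-many relationship would be handled the same way, with a third table holding pairs of matching key values.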
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst
organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use
databases without considering their design, so the databases usually
incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
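The schema above could be declared roughly as follows. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 module, not the course's MS Access, and the choice of primary keys, the column types, the application of the example integer domain to PubID, and all sample rows are assumptions not given on the slide. "Cannot be left blank" is modelled as NOT NULL, which would still accept an empty string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces REFERENCES only when enabled
conn.executescript(
    """
    CREATE TABLE PUBLISHER (
        -- Domain: assumed to follow the slide's example range 0 < x < 99999
        PubID INTEGER PRIMARY KEY CHECK (PubID > 0 AND PubID < 99999),
        PubName TEXT,
        PubPhone TEXT
    );
    CREATE TABLE AUTHOR (
        AuthID INTEGER PRIMARY KEY,
        AuthName TEXT,
        AuthPhone TEXT NOT NULL          -- constraint: phone cannot be left blank
    );
    CREATE TABLE BOOK (
        ISBN TEXT PRIMARY KEY,
        Title TEXT,
        AuthID INTEGER REFERENCES AUTHOR(AuthID),
        -- Relationship: each book has one and only one publisher
        PubID INTEGER NOT NULL REFERENCES PUBLISHER(PubID),
        Price REAL CHECK (Price >= 0)    -- constraint: price cannot be negative
    );
    """
)
conn.execute("INSERT INTO PUBLISHER VALUES (1, 'Example Press', '555-0100')")
conn.execute("INSERT INTO AUTHOR VALUES (1, 'A. Writer', '555-0101')")
conn.execute("INSERT INTO BOOK VALUES ('0-00-000000-0', 'Sample Title', 1, 1, 19.95)")


def try_insert(sql):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


negative_price_ok = try_insert(
    "INSERT INTO BOOK VALUES ('0-00-000000-1', 'Bad Price', 1, 1, -5.00)"
)
blank_phone_ok = try_insert("INSERT INTO AUTHOR VALUES (2, 'No Phone', NULL)")
book_count = conn.execute("SELECT COUNT(*) FROM BOOK").fetchone()[0]
print(negative_price_ok, blank_phone_ok, book_count)
```

The point of declaring the rules in the schema is that the DBMS rejects bad rows itself, so no application program has to repeat the checks.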
Evaluation of the Relational database model
Advantages
But #1 problem still is
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst
organizations
Disadvantages
ease of use - many untrained people create and use databases
without considering its design - usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table =
file
=
relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column
=
field
=
attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row
=
record
=
tuple
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
Slide 22
Introduction to Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone, “it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its components?
What are the levels of database representation?
What were the limitations of the systems that led to the development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
                        Word processing   Spreadsheet   Database
Text handling           excellent         fair          poor
Mathematical functions  poor              excellent     very good
Ease of Use             excellent         good          fair
Training Cost           low               moderate      high
Software Cost           low               moderate      high
Volume of data          low               moderate      very high
Multiuser Access        low               moderate      very high
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: physical resources and conceptual resources (data & information)
As the scale of an organization grows, it becomes increasingly difficult to manage by observation (i.e., managers must rely on conceptual resources)
Conceptual resources can be managed just like physical resources or assets (e.g., employees, $$, equipment, widgets, etc.)
Management of data & information means getting it before it’s needed, protecting it, assuring quality, and getting rid of it when no longer required
Management of data & information can be achieved only through organizational commitment
Adapted from McFadden, F.R. & Hoffer, J.A. (1994). Modern Database Management. Redwood City, CA: Benjamin/Cummings Publishing (p. 6)
Information is a major organizational resource
Data (isolated facts): John bought 50 in July; John bought 10 in Aug; Jane bought 30 in July; Jane bought 20 in Aug
Information (organized data): Average for July is 40; average for Aug is 15
Knowledge: Sales have dropped between July and August
Action: Survey customers; invest in advertising; cut costs; expand product line
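The step from data to information on this slide can be sketched in a few lines of Python. This is only an illustration: the tuple layout and the names `purchases`, `average_by_month`, and `sales_dropped` are assumptions made for the example, not part of the original slides.

```python
from collections import defaultdict

# Data: isolated facts, copied from the slide as (customer, month, quantity).
purchases = [
    ("John", "July", 50),
    ("John", "Aug", 10),
    ("Jane", "July", 30),
    ("Jane", "Aug", 20),
]


def average_by_month(rows):
    """Information: organize the raw facts into one average per month."""
    quantities = defaultdict(list)
    for _customer, month, quantity in rows:
        quantities[month].append(quantity)
    return {month: sum(q) / len(q) for month, q in quantities.items()}


averages = average_by_month(purchases)

# Knowledge: an interpretation drawn from the organized data.
sales_dropped = averages["Aug"] < averages["July"]

print(averages)       # {'July': 40.0, 'Aug': 15.0}
print(sales_dropped)  # True
```

Deciding what to do about the drop (the "Action" level) remains a human judgment; the code only gets as far as the knowledge level.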
What is a Database?
Organized collection of related information or data stored on a computer disk for easy, efficient use

OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201         854.00    Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144

[Chart: Outstanding Invoice Amounts By Order, for orders 201 to 239; figure labels: data, information]
What is a Database Management System (DBMS)?
“A set of programs used to define, administer, and process the database and its applications conveniently and efficiently”
Program (or collection of programs) that enables users to create the database. The DBMS manages the storage and retrieval of data, and provides the user with certain functionalities to guarantee that the data will be logically organized and consistently applied.
[Diagram: Database - DBMS (e.g., Oracle, dBase, Access, Paradox) - Database Application - user]
What is a Database Application?
[Diagram: Database - DBMS - Database application]
A computer program that performs a specific task of practical value in a business situation
An interface that allows the user to enter and manipulate data; the user can request abstract views of data
Created by database designers and developers using a DBMS program or a programming language
Major Components of a Database Application
1. Form - data entry
2. Report - summarizes & prints
3. Query - asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
Database
• user data
• metadata
• indexes
• application metadata
DBMS
Design Tools Subsystem
• Table Creation Tool
• Form Creation Tool
• Query Creation Tool
• Report Creation Tool
• Procedural Language Compiler
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language RunTime
DBMS Engine
[Diagram labels: developer, users, application programs]
Types of Database Systems
Centralized (single site)
• microcomputer (desktop)
• legacy mainframe/mini computer (1 CPU)
• client/server architecture (>1 CPU)
Distributed
• >1 site, requires network
• not widely adopted yet due to many problems

Type                              Example                                # of concurrent users  Typical size of database
Personal                          Joe's House Painting Service           1                      < 10 Megabytes
Workgroup                         Video rental store                     < 25                   < 100 Megabytes
Organizational (enterprise)       Larger Corporations or Government      hundreds               > 1 Trillion bytes
Multimedia (Internet technology)  Holiday resort bookings (with photos)  possibly hundreds      Any

Our focus: centralized, microcomputer database
Three levels of Database Representation
External level: each user group will have its own view of the database; database is accessed from here
Conceptual level: database design; logical, abstract description of data elements & their relationships
Internal level: physical implementation - access methods, index construction, data structure; database exists in reality only here
Primary focus of the lectures of this course is the conceptual level because the creation of a database begins with its design; the focus of the laboratories is the external level, using an RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationship to one another.
Laboratories
Implement a database at the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (users access and manipulate the database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Example records:
Name: Jane Doe; Address: 123 Easy St.; City: London; Phone: 455-0897
Date: Sept 14, 1955; Time: 2:00 p.m.; Patient: Jane Doe, 455-0897; OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application - Customer file; Order processing Application - Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
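Program/data dependence is easier to see with a small sketch. The fixed-width record layout below (name in columns 0-19, phone in columns 20-27) and the names `OLD_RECORD`, `NEW_RECORD`, and `read_phone` are hypothetical, invented only for this illustration; the slides do not describe a concrete file layout.

```python
# Hypothetical fixed-width record: name in columns 0-19, phone in columns 20-27.
OLD_RECORD = "Jane Doe".ljust(20) + "455-0897"


def read_phone(record):
    """A report program that hard-codes where the phone field lives."""
    return record[20:28]


# The file structure changes: a 15-character city field is inserted before the phone.
NEW_RECORD = "Jane Doe".ljust(20) + "London".ljust(15) + "455-0897"

print(read_phone(OLD_RECORD))          # 455-0897
print(read_phone(NEW_RECORD).strip())  # London (wrong field: the program must be rewritten)
```

Every program that reads the file carries its own copy of the layout, so one structural change means editing all of them.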
Historical Roots of Database Systems
[Diagram: Customer processing, Order processing, and Employee processing Applications all reach one Database through the DBMS]
Developed to overcome limitations of file systems; developed initially on mainframe computers in the late 60s and early 70s - a typical early DBMS cost $100,000 (many are still in use)
First general databases were created for General Electric Company (GEC) - Integrated Data Store (IDS), designed to run on GEC machines; B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format

OrderNum  InvoiceAmt  CustomerName                OwnerName             Phone
201         854.00    Cottage Grill               Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant  Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant   Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant        Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant           Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                  Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
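A self-describing database can be asked about its own structure. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a stand-in (the course itself uses MS Access 97, whose catalogue is organized differently); the `Salesperson` table and its column types are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Salesperson (EmployeeID INTEGER PRIMARY KEY, Name TEXT, Office TEXT)"
)

# SQLite keeps its catalogue (metadata) in a table named sqlite_master.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Salesperson)")]

print(table_names)  # ['Salesperson']
print(columns)      # [('EmployeeID', 'INTEGER'), ('Name', 'TEXT'), ('Office', 'TEXT')]
conn.close()
```

The table names and column types come back from the database itself, not from the program that created them, which is what "self-describing" means here.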
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching data directly in a source table is time-consuming without a “pointer” system (an index), which improves performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed

Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo

Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
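The Office Index can be imitated with an in-memory dictionary. This is a minimal sketch of the idea, not how a DBMS stores indexes on disk; the function names `build_office_index` and `add_salesperson`, and the extra employee 51, are hypothetical.

```python
from collections import defaultdict

# Source table from the slide: (employee_id, name, office).
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]


def build_office_index(rows):
    """Map each office to the employee IDs found there."""
    index = defaultdict(list)
    for employee_id, _name, office in rows:
        index[office].append(employee_id)
    return dict(index)


office_index = build_office_index(salespeople)
print(office_index)  # {'Toronto': [27], 'Tokyo': [44, 37], 'Brussels': [35]}


def add_salesperson(rows, index, employee_id, name, office):
    """Every insert must also touch the index; this is the overhead cost."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)


add_salesperson(salespeople, office_index, 51, "Hypothetical Hire", "Brussels")
print(office_index["Brussels"])  # [35, 51]
```

Looking up everyone in Tokyo now reads one dictionary entry instead of scanning the whole table, at the price of the extra write in `add_salesperson`.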
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical, Network: still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational: the vast majority currently use this, therefore, our course’s focus is here
Semantic, Object-Relational, Object-Oriented: very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
[Example tables: Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles]
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
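Because tables are not linked with physical pointers, related rows are found by matching values in shared columns. The sketch below shows this with SQLite via Python's `sqlite3` module, used here only as a convenient stand-in for the course's RDBMS; the column names and sample rows for `Agents` and `Engagements` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Agents (AgentID INTEGER PRIMARY KEY, AgentName TEXT);
    CREATE TABLE Engagements (
        EngagementID INTEGER PRIMARY KEY,
        AgentID INTEGER REFERENCES Agents (AgentID),
        Venue TEXT
    );
    INSERT INTO Agents VALUES (1, 'Agent A'), (2, 'Agent B');
    INSERT INTO Engagements VALUES
        (10, 1, 'Venue X'), (11, 1, 'Venue Y'), (12, 2, 'Venue Z');
    """
)

# The one-to-many relationship is expressed only by equal AgentID values.
rows = conn.execute(
    """
    SELECT a.AgentName, e.Venue
    FROM Agents AS a
    JOIN Engagements AS e ON e.AgentID = a.AgentID
    ORDER BY e.EngagementID
    """
).fetchall()

print(rows)  # [('Agent A', 'Venue X'), ('Agent A', 'Venue Y'), ('Agent B', 'Venue Z')]
conn.close()
```

Nothing in either table stores the location of a row in the other; the join is computed from the data at query time.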
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But #1 problem still is ease of use - many untrained people create and use databases without considering their design - usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
Database Schema
A database schema defines the database’s structure: tables, relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
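The schema above can be written down as table definitions. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in for the course's RDBMS; the column types, the sample rows, and the helper `try_insert` are assumptions made for illustration. The rule "each publisher publishes one or more books" is not enforced here, since a simple column constraint cannot express it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE PUBLISHER (PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT);
    CREATE TABLE AUTHOR (
        AuthID INTEGER PRIMARY KEY,
        AuthName TEXT,
        AuthPhone TEXT NOT NULL              -- author phone cannot be left blank
    );
    CREATE TABLE BOOK (
        ISBN TEXT PRIMARY KEY,
        Title TEXT,
        AuthID INTEGER REFERENCES AUTHOR (AuthID),
        PubID INTEGER NOT NULL REFERENCES PUBLISHER (PubID),  -- exactly one publisher
        Price REAL CHECK (Price >= 0)        -- price cannot be less than zero
    );
    INSERT INTO PUBLISHER VALUES (1, 'Example Press', '555-0100');
    INSERT INTO AUTHOR VALUES (1, 'Example Author', '555-0101');
    """
)


def try_insert(sql, params):
    """Return None on success, or the constraint error message."""
    try:
        conn.execute(sql, params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


book_sql = "INSERT INTO BOOK VALUES (?, ?, ?, ?, ?)"
ok = try_insert(book_sql, ("0-00-000000-0", "A Title", 1, 1, 19.95))
negative_price = try_insert(book_sql, ("0-00-000000-1", "Another", 1, 1, -5))
unknown_publisher = try_insert(book_sql, ("0-00-000000-2", "Third", 1, 99, 10))
missing_phone = try_insert("INSERT INTO AUTHOR VALUES (?, ?, ?)", (2, "No Phone", None))

# Only the first insert succeeds; the other three are rejected by the schema.
print(ok is None, negative_price, unknown_publisher, missing_phone, sep="\n")
conn.close()
```

The business rules live in the schema rather than in each application program, so every program that writes to the database is held to the same constraints.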
Slide 26
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning… (in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/billing information
Patient record - Name: Jane Doe; Address: 123 Easy St.; City: London; Phone: 455-0897
Appointment/billing record - Date: Sept 14, 1955; Time: 2:00 p.m.; Patient: Jane Doe, 455-0897; OHIP: 123456789
Limitations of Data File Systems
[Diagram: Customer processing Application uses the Customer file; Order processing Application uses the Order file]
Worked adequately if data collection needs were relatively small.
Problems arose as data files, information needs, and reporting requirements grew in complexity due to:
Extensive programming - use of third-generation languages (e.g., COBOL, FORTRAN) in which the programmer must specify what is to be done as well as how it is to be done
Limitations of Data File Systems (cont'd)
Poor mechanisms for sharing data across the organization - files are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/data dependence - if the file structure changed, ALL programs using the file had to be modified - time-consuming
Lack of flexibility - could not do ad hoc queries or reports; required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
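Program/data dependence from the list above can be illustrated with a short sketch. The fixed-width layout here (name in columns 0-19, phone in columns 20-27) is a hypothetical one invented for the example, not a format taken from the slides; the sample values reuse the Jane Doe record.

```python
# Hypothetical fixed-width layout, hard-coded into the program:
# name occupies columns 0-19, phone occupies columns 20-27.
NAME = slice(0, 20)
PHONE = slice(20, 28)

def read_customer(line):
    """Parse one record using the hard-coded column positions."""
    return {"name": line[NAME].strip(), "phone": line[PHONE].strip()}

# A record written in the original file structure parses correctly.
old_record = "Jane Doe".ljust(20) + "455-0897"
print(read_customer(old_record))

# Suppose the file structure changes: a 10-character city field is
# inserted after the name. The unchanged program now reads the city
# where it expects the phone number.
new_record = "Jane Doe".ljust(20) + "London".ljust(10) + "455-0897"
print(read_customer(new_record))
```

Every program that embeds these column positions would need the same correction, which is the maintenance burden the slide describes.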
Historical Roots of Database Systems
[Diagram: Customer processing, Order processing, and Employee processing Applications all access a single Database through the DBMS]
Developed to overcome the limitations of file systems, initially on mainframe computers in the late 1960s and early 1970s - a typical early DBMS cost $100,000 (many are still in use)
The first general-purpose database system, Integrated Data Store (IDS), was created at General Electric (GE) and designed to run on GE machines; B.F. Goodrich ported IDS to the IBM 360 - it became dominant until the 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum  InvoiceAmt  CustomerName                 OwnerName             Phone
201       854.00      Cottage Grill                Ms. Doris Reaume      (616) 643-8821
209       1,106.00    Cleo's Downtown Restaurant   Ms. Joan Hoffman      (616) 888-2046
214       1,070.50    Jean's Country Restaurant    Ms. Jean Brooks       (517) 620-4431
221       1,607.00    Maxwell's Restaurant         Ms. Barbara Feldon    (219) 333-0000
235       1,004.50    Embers Restaurant            Mr. Clifford Merritt  (219) 816-2456
239       1,426.50    The Empire                   Ms. Curtis Haiar      (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary)
A database contains a description of its own structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
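A self-describing database can be queried about its own structure. The sketch below is an illustration only: it uses SQLite through Python's built-in sqlite3 module (not the course's MS Access 97), and the table name Orders is an assumption. SQLite keeps its catalogue in a table called sqlite_master and reports column details through PRAGMA table_info.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (OrderNum INTEGER, InvoiceAmt REAL, CustomerName TEXT)"
)

# Ask the database for the names of all its tables (metadata, not user data).
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Ask for the name and declared type of each column in Orders.
# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Orders)")]

print(table_names)
print(columns)
```

Other DBMSs expose the same kind of information under different names, so the catalogue queries shown here should be read as SQLite-specific.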
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Sorting and searching a source table is time-consuming without a “pointer” system (an index), which improves the performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated; therefore, reserve indexes for cases in which they are needed
Salesperson
Employee ID  Name            Office
27           Rodney Jones    Toronto
44           Goro Azuma      Tokyo
35           Francine Moire  Brussels
37           Anne Abel       Tokyo
Office Index
Office    Employee ID
Toronto   27
Tokyo     44, 37
Brussels  35
(d) Application Metadata - stores structure and format of
application components; not all DBMSs support this feature
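The Office Index in (c) can be sketched as a lookup from each office to the Employee IDs stored under it. This is a simplified in-memory illustration in Python, not how any particular DBMS builds its indexes; the function names are made up for the example.

```python
# The Salesperson table from the slide, as (employee_id, name, office) rows.
salespeople = [
    (27, "Rodney Jones", "Toronto"),
    (44, "Goro Azuma", "Tokyo"),
    (35, "Francine Moire", "Brussels"),
    (37, "Anne Abel", "Tokyo"),
]

def build_office_index(rows):
    """Map each office to the list of employee IDs located there."""
    index = {}
    for employee_id, _name, office in rows:
        index.setdefault(office, []).append(employee_id)
    return index

def add_salesperson(rows, index, employee_id, name, office):
    """Insert a row; the index must be updated too (the overhead cost)."""
    rows.append((employee_id, name, office))
    index.setdefault(office, []).append(employee_id)

office_index = build_office_index(salespeople)
print(office_index["Tokyo"])  # [44, 37]
```

Finding everyone in Tokyo now takes one lookup instead of a scan of the whole table, but every insert does extra work to keep the index current.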
Evolution of Database Models
Hierarchical, Network - still in use in many older (1970s) legacy systems; very few new databases; referred to as “navigational systems”
Relational - the vast majority currently use this, therefore, our course’s focus is here
Semantic
Object-Relational
Object-Oriented - very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
Example tables: Agents, Clients, Entertainers, Engagements, Instruments, Entertainer styles
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships (one-to-one, one-to-many, many-to-many) can be represented
accommodates the design of larger databases that involve complex relationships and intricate manipulations
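The point that relational tables are related by matching values rather than physical pointers can be sketched with a many-to-many relationship. The example uses SQLite via Python's sqlite3 module as a stand-in RDBMS; the Styles table, all column names, and the sample rows are assumptions made for illustration and are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Entertainers (EntertainerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Styles (StyleID INTEGER PRIMARY KEY, StyleName TEXT);

-- Linking table: each row pairs one entertainer with one style,
-- so the many-to-many relationship is held purely as data values.
CREATE TABLE EntertainerStyles (
    EntertainerID INTEGER REFERENCES Entertainers(EntertainerID),
    StyleID       INTEGER REFERENCES Styles(StyleID),
    PRIMARY KEY (EntertainerID, StyleID)
);

INSERT INTO Entertainers VALUES (1, 'Entertainer A'), (2, 'Entertainer B');
INSERT INTO Styles VALUES (10, 'Jazz'), (20, 'Folk');
INSERT INTO EntertainerStyles VALUES (1, 10), (1, 20), (2, 10);
""")

# The join matches key values at query time; no stored pointers are followed.
rows = conn.execute("""
    SELECT e.Name, s.StyleName
    FROM Entertainers e
    JOIN EntertainerStyles es ON es.EntertainerID = e.EntertainerID
    JOIN Styles s ON s.StyleID = es.StyleID
    ORDER BY e.Name, s.StyleName
""").fetchall()
print(rows)
```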
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use databases without considering their design, and usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within
a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about
a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
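The table, field, and record vocabulary above can be made concrete with a small sketch. It models a relation as a Python set of named tuples, which is only an analogy (a real DBMS stores tables differently); the class and variable names are illustrative, and the sample rows reuse the Salesperson data from the indexing slide.

```python
from typing import NamedTuple

class Salesperson(NamedTuple):
    """One record (row, tuple); each attribute is a named field (column)."""
    employee_id: int
    name: str
    office: str

# A table (relation) modelled as a set of records: order of rows is unimportant.
table_a = {
    Salesperson(27, "Rodney Jones", "Toronto"),
    Salesperson(44, "Goro Azuma", "Tokyo"),
}
table_b = {
    Salesperson(44, "Goro Azuma", "Tokyo"),
    Salesperson(27, "Rodney Jones", "Toronto"),
}
same_table = table_a == table_b

# A table may also hold zero records.
empty_table = set()

# Fields are referred to by unique name, so their position does not matter.
offices = sorted(record.office for record in table_a)
print(same_table, len(empty_table), offices)
```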
Database Schema
Database schema defines database’s structure, tables,
relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
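The BOOK, PUBLISHER, and AUTHOR schema above can be sketched as table definitions with the two stated constraints. This is an illustration using SQLite through Python's sqlite3 module, not the course's MS Access 97. The choice of primary keys, the column types, the helper try_insert, and all sample values are assumptions, since the slide lists only attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE PUBLISHER (PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT);
CREATE TABLE AUTHOR (
    AuthID    INTEGER PRIMARY KEY,
    AuthName  TEXT,
    AuthPhone TEXT NOT NULL              -- author phone cannot be left blank
);
CREATE TABLE BOOK (
    ISBN   TEXT PRIMARY KEY,
    Title  TEXT,
    AuthID INTEGER REFERENCES AUTHOR(AuthID),
    PubID  INTEGER NOT NULL REFERENCES PUBLISHER(PubID),  -- exactly one publisher
    Price  REAL CHECK (Price >= 0)       -- price cannot be less than zero
);
INSERT INTO PUBLISHER VALUES (1, 'Example Press', '555-0100');
INSERT INTO AUTHOR VALUES (1, 'A. Author', '555-0101');
""")

def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

book_sql = "INSERT INTO BOOK VALUES (?, ?, ?, ?, ?)"
valid_book = try_insert(book_sql, ("0-00-000000-0", "Sample Title", 1, 1, 19.95))
negative_price = try_insert(book_sql, ("0-00-000000-1", "Bad Price", 1, 1, -5))
blank_phone = try_insert("INSERT INTO AUTHOR VALUES (?, ?, ?)", (2, "B. Author", None))
print(valid_book, negative_price, blank_phone)
```

Declaring the rules in the schema means the DBMS enforces them for every application, rather than each program having to repeat the checks.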
Slide 28
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”
Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important?
What is a database and database schema?
What is a database management system?
What is a database application and what are its
components?
What are the levels of database representation?
What were the limitations of the systems that led to the
development of the current relational database systems?
What are various types of database systems?
What is a table, file and record?
When do I use a Database program?
Word
processing
Spreadsheet
Database
Text handling
excellent
fair
poor
Mathematical
functions
poor
excellent
very good
excellent
good
fair
Training Cost
low
moderate
high
Software Cost
low
moderate
high
Volume of data
low
moderate
very high
Multiuser Access
low
moderate
very high
Ease of Use
Principles of Information Resource Management
Organizational resources flow into and out of the organization
Two types of major organizational resources: Physical resources,
Conceptual resources (data & information)
As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources)
Conceptual resources can be managed just like physical
resources or assets (e.g., employees, $$, equipment, widgets,
etc.)
Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when
no longer required
Management of data & information can be achieved only through
Adapted from McFadden,
F.R. & Hoffer, J.A. (1994). Modern Database
organizational
commitment
Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
processing
Information is a major organizational resource
Action
Knowledge
Information
(organized data)
Data
(isolated facts)
Survey customers; invest in
advertising; cut costs, expand
product line
Sales have dropped between
July and August
Average/ July is 40
Average/ Aug is 15
John bought 50 in July
John bought 10 in Aug
Jane bought 30 in July
Jane bought 20 in Aug
What is a Database?
Organized collection of related information or data
stored on a computer disk for easy, efficient use
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Outstanding Invoice Amounts By Order
201
data
209
214
221
235
239
information
What is a Database Management
System (DBMS)?
“A set of programs used to define,administer, and process
the database and its applications conveniently and
efficiently”
Program (or collection of programs) that enables users to create the
database. The DBMS manages the storage and retrieval of data, and
provides the user with certain functionalities to guarantee that the
data will be logically organized and consistently applied.
Database
DBMS
(e.g., Oracle, dBase,
Access, Paradox)
Database
Application
user
What is a Database Application?
Database
DBMS
Database application
A computer program that
performs a specific task of
practical value in a business
situation
An interface that allows the user
to enter and manipulate data;
User can request abstract views
of data
Created by database designers
and developers using a DBMS
program or a programming
language
Major Components of a Database Application
1. Form- data entry
2. Report- summarizes & prints
3. Query- asks questions of data
4. Menu - organizes components
5. Program - used to automate a database
Features of a DBMS
DBMS
Database
• user data
• metadata
• indexes
• application
metadata
Design Tools Subsystem
D • Table Creation Tool
B • Form Creation Tool
M • Query Creation Tool
S • Report Creation Tool
• Procedural Language
Compiler
E
n
g
i
n
e
Run Time Subsystem
• Form Processor
• Query Processor
• Report Writer
• Procedural Language
RunTime
developer
Application
program
users
Application
program
Types of Database Systems
Centralized (single site)
Distributed
microcomputer (desktop)
>1 site, requires network
legacy mainframe/ mini computer (1
not widely adapted yet
CPU)
due to many problems
client/server architecture (>1 CPU)
# of concurrent
users
Typical size of
database
1
< 10 Megabytes
< 25
< 100 Megabytes
Larger
Organizational
Corporations or
(enterprise)
Government
hundreds
> 1 Trillion bytes
Multimedia
(Internet
technology)
possibly
hundreds
Any
Type
Example
Personal
Joe's House
Painting Service
Workgroup
Video rental store
Holiday resort
bookings (with
photos)
our focus;
centralized,
microcomputer
database
Three levels of Database Representation
data elements
& their
relationships
physical
implementation
- access
methods, index
construction,
data structure;
database exists
in reality only
here
Conceptual level
Internal level
database
design,
logical,
abstract
description of
each user
group will
have its own
view of the
database;
database is
accessed from
here
External level
Primary focus of the lectures of this course is the conceptual level because
the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
Focus of this course
Lectures
Conceptual design of
databases: determining
their purpose, developing
a model, identifying the
tables that are required,
designing normalized
tables and identifying
their relationship to one
another.
Laboratories
Implement a database at
the external level:
create databases (tables)
and database
applications (queries,
forms, reports,
programs) using a
typical microcomputer
relational database
management system,
MS Access 97.
The Database System Environment
Hardware - physical devices
you are here
computer, peripherals, network devices
Software
DBMS (manages the database)
operating systems software (manages hardware & software)
application programs (user access and manipulate database)
People
system administrators (manage general operations)
database designers (architects of database structure)
database administrators (ensure the database is functioning)
systems analysts & programmers (design & implement database)
end users (use application programs)
Procedures - rules of the company governing use of data
Data
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
File systems were typically
Name:
Address:
City:
Phone:
Date:
Time:
Patient:
OHIP:
Jane Doe
123 Easy St.
London
455-0897
Sept 14, 1955
2:00 p.m.
Jane Doe, 455-0897
123456789
organized by function (use)
The first data management
systems performed clerical
tasks (transactional processing)
such as order entry processing,
payroll, work scheduling.
e.g., files for patients (file
folder analogy); each record for
a single patient; another file for
appointment/ billing
information
Limitations of Data File Systems
Customer
processing
Application
Customer
file
Order
processing
Application
Order
file
Worked adequately if data collection needs were
relatively small.
Problems arose as data files, information needs, and
reporting requirements grow in complexity due to:
Extensive programming - use of third-generation languages
(e.g., COBOL, FORTRAN) in which the programmer must
specify what is be done as well as how it is to be done
Limitations of Data File Systems
Poor mechanisms for sharing data across organization files are often incompatible with one another (separate,
isolated data)
Data redundancy - duplicate information in two or more
files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - timeconsuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
Historical Roots of Database Systems
Customer
processing
Application
Order
processing
Application
DBMS
Database
Employee
processing
Application
Developed to overcome limitations of file systems, developed initially on
mainframe computers in late 60s and early 70s - a typical early DBMS
cost $100,000 (many are still in use)
First general databases were created for General Electric Company
(GEC) - Integrated Data Store (IDS), designed to run on GEC machines;
B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases
developed; at present, most database technology is used in workgroups
Better Definition of a Database
A collection of users’ data, organized logically and managed
by a unifying set of principles, procedures, and functionalities,
which help guarantee the consistent application and
interpretation of that data
(a) organized collection of related information or data
stored on a computer disk for easy, efficient use; represented in
tabular format
OrderNum
201
209
214
221
235
239
InvoiceAmt
854.00
1,106.00
1,070.50
1,607.00
1,004.50
1,426.50
CustomerName
Cottage Grill
Cleo's Dow ntow n Restaurant
Jean's Country Restaurant
Maxw ell's Restaurant
Embers Restaurant
The Empire
Ow nerName
Ms. Doris Reaume
Ms. Joan Hoffman
Ms. Jean Brooks
Ms. Barbara Feldon
Mr. Clifford Merritt
Ms. Curtis Haiar
Phone
(616) 643-8821
(616) 888-2046
(517) 620-4431
(219) 333-0000
(219) 816-2456
(616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is
self-describing
(metadata or system
catalogues or data
dictionary)
A database contains
a description of its
own structure (e.g.,
the names of all the
tables, the names
and types of data in
each column in all
the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database
Data accessed from a source table for sorting and searching is
time-consuming without a “pointer” system, which improves
performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated,
all indexes must also be updated, therefore, reserve index for
cases in which they are needed
Salesperson
Employee ID
Name
Office
27
Rodney Jones Toronto
44
Goro Azuma Tokyo
35
Francine Moire Brussels
37
Anne Abel
Tokyo
Office Index
Office
Toronto
Tokyo
Brussels
Employee ID
27
44, 37
35
(d) Application Metadata - stores structure and format of
application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical
Network
Relational
still in use in many older (1970s) legacy
systems; very few new databases;
referred to “navigational systems”
the vast majority currently use this,
therefore, our course’s focus is here
Semantic
ObjectRelational
ObjectOriented
Very few new databases are
being created using ObjectOriented Programming (not
many ODBMS for businesses to
implement this model)
The Relational Database Model
Agents
Clients
Entertainers
Engagements
Instruments
Entertainer styles
represented by tables (like spreadsheets)
tables are NOT linked with physical pointers
unlike earlier systems, all three types of relationships can be
represented
accommodates the design of larger databases that involve
complex relationships and intricate manipulations
Evaluation of the Relational database model
Advantages
mechanisms for minimizing data redundancy and inconsistency
logical database design is separated from physical aspects
relatively program-data independent
management of data for access, manipulation, and security
flexible mechanisms for generating reports and queries
program development and maintenance costs are reduced
data can be accessed in a multiplicity of ways within and amongst organizations
Disadvantages
But the #1 problem still is ease of use - many untrained people create and use databases without considering their design; such databases usually incorporate many errors
Comparison of Database models
File Systems
• data dependence
• structural dependence
• demands upon programmer
Hierarchical, Network DBMS
• data independence
• structural dependence
• demands upon programmer
Relational DBMS
• data independence
• structural independence
• demands upon computer
Table
Users view their data in two-dimensional tables.
table = file = relation
Field
The fields within records contain data.
Data within a field must be of the same data type. Each field within a table must have a unique name. Order of fields is unimportant.
column = field = attribute
Record
A record is a group of related fields of information about a single instance of one object or event in a database.
Tables consist of zero, one, or more records.
Order of rows is unimportant.
row = record = tuple
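The statements that the order of fields and the order of rows are unimportant can be checked with a small model. This sketch treats a table as a set of records and each record as a set of (field name, value) pairs; it is a teaching model only, not how a DBMS stores data, and the helper name as_relation is made up.

```python
def as_relation(records):
    """Model a table as a set of records, each a set of (field, value) pairs."""
    return {frozenset(record.items()) for record in records}


# Two listings of the same data, with rows and fields in different orders.
listing_1 = [
    {"employee_id": 27, "name": "Rodney Jones", "office": "Toronto"},
    {"employee_id": 44, "name": "Goro Azuma", "office": "Tokyo"},
]
listing_2 = [
    {"office": "Tokyo", "employee_id": 44, "name": "Goro Azuma"},
    {"name": "Rodney Jones", "office": "Toronto", "employee_id": 27},
]

same_table = as_relation(listing_1) == as_relation(listing_2)

# A table with zero records is still a valid table.
empty_table = as_relation([])

print(same_table, len(empty_table))
```

Each field is identified by its unique name rather than by its position, which is why the field order can change without changing the table.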
Database Schema
A database schema defines the database’s structure: its tables, relationships, domains, and constraint rules
Tables
BOOK (ISBN, Title, AuthID, PubID, Price)
PUBLISHER (PubID, PubName, PubPhone)
AUTHOR (AuthID, AuthName, AuthPhone)
Relationships
Each book is published by one and only one publisher
Each publisher publishes one or more books
Domains (set of values in a column)
Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules)
Price cannot be less than zero; Author phone field cannot be left blank
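One way the schema above might be declared is sketched below, assuming SQLite via Python's sqlite3 module. The lower-case column names, the data types, the sample rows, and the choice to apply the 0 < x < 99999 domain to pub_id are all assumptions made for the illustration; the slide does not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE publisher (
        -- Domain: integers with 0 < x < 99999 (assumed to apply to this column)
        pub_id    INTEGER PRIMARY KEY CHECK (pub_id > 0 AND pub_id < 99999),
        pub_name  TEXT,
        pub_phone TEXT
    );
    CREATE TABLE author (
        auth_id    INTEGER PRIMARY KEY,
        auth_name  TEXT,
        -- Constraint: author phone cannot be left blank
        auth_phone TEXT NOT NULL CHECK (auth_phone <> '')
    );
    CREATE TABLE book (
        isbn    TEXT PRIMARY KEY,
        title   TEXT,
        auth_id INTEGER REFERENCES author(auth_id),
        -- Relationship: each book has one and only one publisher
        pub_id  INTEGER NOT NULL REFERENCES publisher(pub_id),
        -- Constraint: price cannot be less than zero
        price   REAL CHECK (price >= 0)
    );
""")


def is_rejected(sql, params):
    """Return True if the DBMS refuses the statement on constraint grounds."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# Hypothetical sample rows that satisfy every rule.
conn.execute("INSERT INTO publisher VALUES (1, 'Example Press', '555-0100')")
conn.execute("INSERT INTO author VALUES (1, 'Example Author', '555-0101')")

negative_price = is_rejected(
    "INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("000-0", "Sample Title", 1, 1, -5.0))
blank_phone = is_rejected(
    "INSERT INTO author VALUES (?, ?, ?)", (2, "Another Author", None))
unknown_publisher = is_rejected(
    "INSERT INTO book VALUES (?, ?, ?, ?, ?)", ("000-1", "Sample Title", 1, 99, 10.0))

print(negative_price, blank_phone, unknown_publisher)
```

Declaring the rules in the schema means the DBMS checks them on every insert and update, so individual application programs do not each have to re-implement them.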