Database Fundamentals

Download Report

Transcript Database Fundamentals

Database Fundamentals
Introduction
•
•
•
•
•
•
Introduction to database systems
Database Management Systems (DBMS)
Type of Database
Database Design
Database Design Considerations
Component of Database Systems
Introduction to Database Systems
• A database is a software program that stores
information which relates to a particular activity
or purpose
• Examples:
– A bank needs to store the information relating to
customer accounts
– A hospital needs to keep data about patients and the
medication dispensed,
– A university needs to maintain record of its students,
– A internet sales business needs to ..
Introduction to Database Systems
• A database includes tools to help organise and
maintain the data.
• A database design engineer must know and be
able to use these tools in order to make the
database efficient and accurate.
• In a properly designed database the information
is updated once and all other occurrences of this
within the database will be updated
automatically.
• A principal requirement of any database system
is the ability to generate, store and retrieve data
efficiently.
Basic components of a database
system
In order to convert data into useful information a set of software
tools are need, SQL, Form, etc..
Database Fundamentals
Introduction
•
•
•
•
•
•
Introduction to database systems
Database Management Systems (DBMS)
Type of Database
Database Design
Database Design Considerations
Component of Database Systems
Database Management Systems
(DBMS)
• DBMS is a collection
of programs provided
by the vendor, which
enable the data to be
accessed, filtered
and generally
processed efficiently,
in order to yield
useful information for
the user
Database Management Systems
(DBMS)
• In principle, user will generate a question (query)
and DBMS will search the dB to find the answer,
which will be returned to user
• A DBMS will include additional tools to provide
graphical user interfaces and report generation
programs to facilitate the delivery of information
to the user.
• It will also provide tools to create, edit, process,
and modify data and also to program the
database. SQL is a standard language
accessing relational databases.
What is a Database
Management System?
• A Database Management System (DBMS)
is a program that manages a database:
– Supports a high-level access language (e.g.
SQL).
– Application describes database accesses
using that language.
– DBMS interprets statements of language to
perform requested database access.
SQL examples:
• Example 1:
SELECT Course_Number
FROM
StudentRecords
WHERE student_name = ‘ ‘;
• Example 2:
SELECT student_number
FROM
StudentRecords
WHERE BEngCourse = ‘2388’ AND average_mark >70;
• These examples show that SQL is a very useful
tool for interrogating a database
Database Fundamentals
Introduction
•
•
•
•
•
•
Introduction to database systems
Database Management Systems (DBMS)
Type of Database
Database Design
Database Design Considerations
Component of Database Systems
Types of Database
• Classification by user
– Single user databases:
Access was initially
designed as a single
user database
– If Single user
database system used
for multiple user, there
are two problems need
to be concern: Data
corruption and
Performance & Speed.
Types of Database
• Classification by user
– Single user
databases: If there
are less than 50
users than this type
of database is
referred to as a
workgroup
database, if the
users number is
bigger than 50, it is
referred to as an
Enterprise database
Microsoft’s solutions for the multiuser market
• File-server: Access database is configured as a
network share so that multiple users can use it
simultaneously
• Client/Server: Access work as a client, there is
MS SQL server in the server side
• Database replication: MS networking supports
replication through its directory services.
• Web-based database solutions. ASP/Access or
ASP/SQL server
Classification by location
• Centralised Database: In a centralised
database system there is only one database
which is shared between users.
– Q: What is the main disadvantage of centralised
databases? How do we solve it?
– A: Perhaps the main disadvantage of this type of
database is that a failure of the server running
the database will prevent all users from using the
database until the server is back on-line. To
overcome this problem larger database
applications tend to be of a distributed type
– Q: What is the main disadvantage of centralised
databases? How do we solve it?
– A: Perhaps the main disadvantage of this type of
database is that a failure of the server running
the database will prevent all users from using the
database until the server is back on-line. To
overcome this problem larger database
applications tend to be of a distributed type
Classification by location Distributed Database
Classification by location Distributed Database
• A distributed database is commonly found in
networks. Here the current version of the
database is kept on different servers. Any
changes are replicated across the network so
that all copies of the database are current.
– Q: What are the main disadvantages and
disadvantages of distributed databases?
• The main advantage of this arrangement is perhaps in fault
tolerance. If one of the servers is down, others can take over
the workload. These type of databases are common in
enterprise databases.
• The main disadvantage is cost, both in terms of software
licensing and the cost of the hardware in a network. In heavy
utilisation environment network traffic may also be a problem.
Classification by type
• Classification by type distinguishes between a transactional
database and a data warehouse.
• The distinction is fairly intuitive with the transactional database being
used to process day-to-day transaction information and maintain a
record of decisions made relating to this data. Due to the fact that
many transactions can occur in this database system, and that these
transactions will impact on decisions made on that day, these types
of databases are treated as time critical.
• On the other hand, the data warehouse is used to generate
strategic decisions and the data stored tends to be longer term.
Consequently this data is considered to be NON-time-critical.
• In terms of the number of applications the transactional database
(also referred to as production database) is the more common of the
two database types.
Classification by type
What is a Transaction?
• When an event in the real world
changes the state of the enterprise, a
transaction is executed to cause the
corresponding change in the database
state
– With an on-line database, the event causes
the transaction to be executed in real time
• A transaction is an application program
with special properties - discussed later
- to guarantee it maintains database
correctness
What is a Transaction
Processing System?
• Transaction execution is controlled by a
TP monitor
– Creates the abstraction of a transaction,
analogous to the way an operating system
creates the abstraction of a process
– TP monitor and DBMS together guarantee the
special properties of transactions
• A Transaction Processing System consists
of TP monitor, databases, and
transactions
transactions
Transaction Processing System
DBMS
database
DBMS
database
TP Monitor
System Requirements
• High Availability: on-line => must be
operational while enterprise is functioning
• High Reliability: correctly tracks state,
does not lose data, controlled concurrency
• High Throughput: many users => many
transactions/sec
• Low Response Time: on-line => users
are waiting
System Requirements (con’t)
• Long Lifetime: complex systems are not
easily replaced
– Must be designed so they can be easily
extended as the needs of the enterprise
change
• Security: sensitive information must be
carefully protected since system is
accessible to many users
– Authentication, authorization, encryption
OLTP vs. OLAP
• On-line Transaction Processing (OLTP)
– Day-to-day handling of transactions that result
from enterprise operation
– Maintains correspondence between database
state and enterprise state
• On-line Analytic Processing (OLAP)
– Analysis of information in a database for the
purpose of making management decisions
OLAP
• Analyzes historical data (terabytes)
using complex queries
• Due to volume of data and complexity of
queries, OLAP often uses a data
warehouse
• Data Warehouse - (offline) repository of
historical data generated from OLTP or
other sources
• Data Mining - use of warehouse data to
discover relationships that might
influence enterprise strategy
Examples - Supermarket
• OLTP
– Event is 3 cans of soup and 1 box of crackers
bought; update database to reflect that event
• OLAP
– Last winter in all stores in northeast, how many
customers bought soup and crackers together?
• Data Mining
– Are there any interesting combinations of foods
that customers frequently bought together?
Database Fundamentals
Introduction
•
•
•
•
•
•
Introduction to database systems
Database Management Systems (DBMS)
Type of Database
Database Design
Database Design Considerations
Component of Database Systems
Database Design
Very basic level database components
Database Design
• The design of a database is concerned with the way that
these components are organised and this determines
how efficient the design solution will be.
• To design the data structure we need to ask ourselves a
few questions, for example:
– 1. Do we know all the fields that we need to include so that our
records and files contain all the necessary data?
– 2. What fields will each of our records contain.
– 3. Do we allow fields to duplicate in records? This will create
data redundancy.
– 4. What records are needed in files?
Database Fundamentals
Introduction
•
•
•
•
•
•
Introduction to database systems
Database Management Systems (DBMS)
Type of Database
Database Design
Database Design Considerations
Component of Database Systems
Database design considerations
• Data dependence and Structural dependence
– Data is organised by the designer into a structure that suits its
application. Thus, there is a structural dependence associated
with data in a database.
– Structural dependence deals with the mechanism for changing
database structure.
– Good database design should ensure that it is relatively easy to
make the necessary modifications to the structure.
– For example, the addition of a field to an existing file structure.
– data dependence and is concerned with the issues of data itself,
rather than the structure.
– For example, data dependence will be concerned with how easy
it is to change a field from say being an integer to a decimal.
Database design considerations
• Data Redundancy
– Data Redundancy is perhaps the most common problem in dB
design. This is to say that the same data is stored in a number of
locations. For example, a student’s address is kept in the finance
file as well as enrolment and exam results file.
– Good database design will ensure that the data is not duplicated
in this manner. This will save on storage space (memory) and
also improve the speed of data access and processing.
– data integrity, refers to the consistency of data which is
duplicated.
– For example, if our design requires us to maintain a student’s email address in a number of different locations, then failing to
update a change in e-mail address in all locations will result in
data inconsistency.
Database design considerations
• Data anomalies
– Modification, insertion and deletion anomalies all relate to
data in a one-to-many relationship. For example, there is one
personal tutor to many students. So this is a one-to-many
relationship and the database will keep a record indicating the
personal tutor for each student. If the tutor’s telephone number
changes then, because the relationship is one-to-many, all
student files have to be updated with this new information.
Failure to update will result in a data anomaly, which in this case
is referred to as a modification anomaly.
– Similarly we face problems with for example inserting new tutor,
when this is required and deleting an old tutor. All the students
files have to be updated accordingly, and failure to do this will
result in an insertion anomaly or conversely a deletion
anomaly.
Database Fundamentals
Introduction
•
•
•
•
•
•
Introduction to database systems
Database Management Systems (DBMS)
Type of Database
Database Design
Database Design Considerations
Component of Database Systems
Components of Database Systems
Components of Database Systems
• A Database system will comprise of hardware and
software components.
– The hardware part includes the computer hardware but it is worth
noting that besides the computer and peripheral hardware,
computer networks hardware plays a significant role in modern
database systems.
– The software components include the operating system (O/S), the
database engine (DBMS) and the additional tools including
application and utility software.
• The operating system refers to the software that makes all the
resources available to users. The resources usually include hardware
such as printers, hard disks, network interface cards, multimedia
equipment, but this can also include software, such as applications
that need to be shared between users. Common operating systems
include UNIX, Linux, Microsoft DOS and Windows, Mac and MVS
from IBM. The database engine sits above the O/S and uses it to
communicate with resources. It has its own software tools to help the
users in their efforts to communicate with the database and obtain
the required information. Common DMS include Microsoft access,
Oracle, IBMs DB2.