Document 7531455

Download Report

Transcript Document 7531455

Topics in Database Administration

What is database administration?

What are the tasks involved in establishing,
creating, implementing and maintaining a
database?

How is the quality of data addressed?

What are the tasks involved in:




Database transaction integrity?
Database backup/recovery?
Database security?
DBMS efficiency?
1
What is database administration?
Category
Data Administration Database Administration
Definition
Function responsible for
Function responsible for the
the overall management of technical viability of the
data resources.
database and database
management system.
Scope
Broad: organization-wide. Narrow: technical application.
Long term orientation.
Daily operations.
Major
Tasks
Sets policies and
standards.
Enforces policies and
procedures.
Coordinates and manages
database design.
Chooses and maintains
technology.
Tasks in “data” administration

Planning.
 Assist in the development of the IS architecture.
 Develop enterprise data model.
 Assist in database design for application development.

Policies and procedures.
 Establish metadata.
 Assesses and controls the quality of the data.
 Establish policies towards data ownership.

Internal marketing.
 Explain the concept of data as a shared resource.
 Resolve conflicts between organizational areas.
3
More detail: Data Resource Management Tasks

Assess organizational data strategy.

Evaluate data level of data integration.

Evaluate quality of data.

Create and maintain metadata.

Create and maintain data models.

Identify stakeholders and ensure ongoing participation
in data management.

Evaluate privacy of data. Determine ethical use of data
collection and access.

Make effective use of data resources through business
intelligence capabilities.
4
Tasks in “database” administration

Planning.
 Guide the selection of hardware and DBMS software.
 Choose appropriate tool set for database administration,
development and maintenance.

Procedures and policies.
 Establish security policies.


Access.
Update.
 Establish application development policies.


Use of views.
Use of indexes.
 Establish backup and recovery policies.
 Enforce policies.
5
More detail: Technical Management Tasks

Install and update the DBMS.

Create the database(s).
 Establish tablespace.
 Create users.
 Manage data dictionary.

Populate the database(s).

Ensure data integrity.
 Triggers and stored procedures.
 Referential integrity.
 Additional data constructs implemented through SQL.

Perform data backup and recovery.

Monitor and tune performance.
6
Data Administrator
Education
Experience
Skills
Database Administrator
Three of the critical technical tasks of a DBA

Database backup and recovery.
 Anticipating the event of a database failure.
 Establishing & enforcing database backup & recovery
procedures.

Database security and integrity.
 Defining security requirements based on data and
application needs.
 Establishing and enforcing security procedures.
 Establishing and enforcing data integrity procedures.

DBMS optimization.
 Tracking current response time.
 Modifying DBMS parameters to improve response time.
8
Database backup and recovery

What is backup and recovery?
 Backup: A method of storing data from a database in a
format that can be used to rebuild the database if
necessary.
 Recovery: Mechanisms for restoring a database quickly
and accurately after loss or damage.

Why have backup and recovery?






Human error.
Hardware failure.
Incorrect or invalid data.
Program errors.
Viruses.
Natural catastrophes.
9
Backup and recovery are based on transactions

A transaction is one or more database actions (SQL
statements) that are treated as a single unit of work.


If the transaction is successful, then the transaction is committed.
If the transaction is not successful, then the transaction is rolled
back or aborted.
OrderLine_tbl
Order_tbl
PK
OrderID
FK1
OrderDate
CustomerID
BillingCode
contains
PK,FK1
PK,FK2
OrderID
ProductID
Quantity
Price
Product_tbl
is for
PK
ProductID
Name
Description
QOH
Cost
is
placed
by
Customer_tbl
PK
CustomerID
Name
Address
Zip
Phone
Imagine a database with this
structure
10
Accepting an order for a product
INSERT INTO order_tbl VALUES
(123,’27-apr-06’, 765, ‘net30’);
INSERT INTO orderline_tbl VALUES
(123,6812,10, 34.99);
UPDATE
SET
WHERE
product_tbl
qoh = qoh - 10
prod_no = 6812;
11
DBMS’s have methods to control transactions

Databases that support transactions provide specific
commands for starting, committing, and rolling back
transactions.
 Begin transaction.
 End transaction.
 Commit.
 Rollback.
 Autocommit.

Transaction throughput: The number of transactions
processed per time interval. This is a measure of
transaction processing performance.
12
Transactions have properties

Atomicity: A transaction is an indivisible unit of work.
 Almost all languages that interface with a DBMS have a way to
signify the start and end of a transaction.
 Within the start and end are one or more SQL commands.

Consistency: The transaction moves a database from one state
of consistency, through inconsistency to another state of
consistency.

Isolation: A transaction executes in isolation from other
transactions.
 This is also referred to as the “serializability” of transactions.
 A transaction can affect another transaction, so the transaction
must complete as though it is isolated in order to be accurate.

Durability: Once a transaction is committed, its effects on the
database are durable, or permanent. No subsequent actions or
failures to the database can cause that transaction to be lost.
13
Transaction boundary decision

Division of work into transactions.

Objectives:
 Minimize transaction duration.
 Ensure transaction isolation.

Constraint: enforcement of important integrity
constraints.
14
Transaction example
Transaction boundary choices.

One transaction for the entire form?

One transaction for the main form and one transaction
for all subform records?

One transaction for the main form and separate
transactions for each subform record?
16
Backup is conducted in 3 processes

Backup: A DBMS software utility provides a way to do a complete,
full or incremental backup of the database in a consistent state.




Journalize: A DBMS software utility provides an audit trail of
changes to the database.



Complete: entire database.
Full: all rows of specified tables.
Incremental: rows that have changed since the last full backup.
Transaction log: contains all data used to process changes against the
database.
Database change log: contains a before-image and an after-image of
each row modified by a database transaction.
Checkpoint: A DBMS software utility that periodically suspends all
transaction processing and synchronizes files within the database.

Some databases, such as Oracle, do not actually halt processing. They
simply write checkpoint information to files.
17
Recovery methods

A DBMS has a utility to recover the database. Usually
referred to as the Recovery Manager.

The method of recovery depends on the type of failure.

Recovery Manager usually has the following options:
 Switch: Switches to a replica of the database on a
different storage device.
 Requires that a mirror image of the database is stored.
 Can be expensive.
 Assumption is a storage failure, not a failure in transaction
integrity, occurred.
 Restore/Rerun: Reprocesses the transactions for a given
time period against a correct version of the database.
 Assumption is that a failure in transaction integrity has
occurred.
 Can be very time-consuming.
18
Two common methods of restore/rerun

Backward recovery. Also called “rollback” recovery.
Used to undo unwanted changes to the database.
 Imagine that the current database is inaccurate.
 Before-images are applied to the current database to return
it to a prior state of consistency.
 Used to back out changes that are unwanted.

Forward recovery. Also called “rollforward” recovery.
Used to recover accurate transactions and apply them to
the database.
 Imagine that the current database is inaccurate. This
database must be replaced with a prior, consistent version
of the database before forward recovery can begin.
 After-images are applied to a past version of the database.
Does not require that all transactions are applied - just
takes the most recent after-images.
19
Issues in database backup and recovery

Cost.
 Media.
 Computer overhead (processor, memory, disk) to create
journalizing files, control files, checkpoint files, etc.
 Personnel to supervise and tune.

Time.
 Can result in regularly scheduled downtime.
 Can make the system slower.
20
Potential problems with shared databases

Concurrency control is the process of managing
concurrent operations against a database in order to
maintain data integrity.

Potential problems with shared databases are:
 Lost Update.
 Uncommitted Dependency. “Dirty Read”
 Incorrect Summary.
21
Lost update
Transaction A Time Transaction B
Read SR (10)
T1
T2 Read SR (10)
If SR > 0 then
T3
SR = SR -1
T4 If SR > 0 then
SR = SR -1
Write SR (9)
T5
T6 Write SR (9)
SR: Seats Remaining
22
Uncommitted dependency or “dirty read”
Transaction A
Read SR (10)
SR = SR - 1
Write SR (9)
Rollback
Time Transaction B
T1
T2
T3
T4 Read SR (9)
T5
23
Incorrect summary
Transaction A
Read SR1 (10)
SR1 = SR1 - 1
Write SR1 (9)
Read SR2 (5)
SR2 = SR2 - 1
Write SR2 (4)
Time
T1
T2
T3
T4
T5
T6
T7
T8
T9
T10
Transaction B
Read SR1 (9)
Sum = Sum + SR1
Read SR2 (5)
Sum = Sum + SR2
24
Methods of concurrency control

Scheduler. The scheduler establishes the order in which
the operations within concurrent transactions are
executed.
 The scheduler interleaves the execution of database
operations to ensure serializability.
 Some schedulers have the ability to analyze transaction
content.

Locking: Fundamental tool of concurrency control.

Obtain lock before accessing an item.

Wait if a conflicting lock is held.
 Shared lock: conflicts with exclusive locks
 Exclusive lock: conflicts with all other kinds of locks

Concurrency control manager maintains the lock table
25
More about locking

Locking. A lock guarantees exclusive use of a data item
to a current transaction.
 Locking can be performed programmatically or left to the
DBMS.
 Granularity of locking depends on the DBMS.





Database level.
Table level.
Page level.
Row level.
Column level.
 Can have shared or exclusive locks.
26
Problem in locking

Deadlock: An impasse that occurs when two or
more transactions have locked a common resource
and each is waiting for the other to finish.
27
Database security
Database Security: Protection of the data against accidental
or intentional loss, destruction, or misuse. Threats to database
security include the list below.

Accidental losses attributable to:
 Human error.
 Software failure.
 Hardware failure.

Theft and fraud.

Improper data access:
 Loss of privacy (personal data).
 Loss of confidentiality (corporate data).

Loss of data integrity.

Loss of availability.
28
DBMS security features

Views (frequently referred to as subschemas).

Authorization rules.
 Controls incorporated in the DBMS.
 Restrict access to specific data.
 Restrict actions that can be taken.

User-defined procedures.
 Trigger an authorization procedure which asks additional
identification questions.
 Written in a standard programming language or proprietary
language.

Encryption.

Authentication schemes.
 Biometric devices.
29
Oracle’s approach to security

Multi-user database systems, such as Oracle, include
security features that control how a database is accessed
and used. For example, security mechanisms:
 Prevent unauthorized database access.
 Prevent unauthorized access to schema objects.
 Control system resource usage (such as CPU time or disk
usage).
 Audit user actions.

Associated with each database user is a schema.
 A schema is a logical collection of database objects
(tables, views, sequences, synonyms, indexes, clusters,
procedures, functions, packages, and database links).
 By default, each database user creates and has access to
all objects in the corresponding schema.
30
Oracle’s security mechanisms

The Oracle server provides discretionary access control, which is a
means of restricting access to information based on privileges.


The appropriate privilege must be assigned to a user in order for that user
to access a schema object.
Appropriately privileged users can grant other users privileges at their
discretion; for this reason, this type of security is called "discretionary".
SQL statements used for security
SQL Statement
Action
CREATE USER
Allows the DBA to create a new user.
GRANT
Allows the user to give other users privileges
to access the user's objects.
CREATE ROLE
Allows the DBA to create a collection of
privileges that can be assigned as a group.
ALTER USER
Allows users to change their passwords. Can
also be used to change other attributes of a
user.
Removes privileges on an object from a user,
users, or role.
REVOKE
SQL Statements Used for Data Integrity
SQL Statement
Description
Create Domain
Create custom data types with predefined
CHECK constraints.
Create Assertion
Create constraints involving multiple tables
and calculations. Similar to CHECK
constraints, but allows access to multiple
tables through a SELECT statement.
Triggers and Stored
Procedures
Discussed previously. Programmer can
create customized integrity check through
proprietary procedural language. PL/SQL
for Oracle.
DBMS query optimization

DBMS’s are differentiated by their query optimizers.

A query optimizer is a component of a DBMS.

You do not have the choice of how queries are
implemented on the physical database, the query
optimizer assumes this responsibility.

You can sometimes “help” and thus possibly improve
the optimization process.
34
Query translation process
Query
Syntax and semantic
analysis
Parsed query
Query transformation
Relational algebra query
Access plan
evaluation
Access
Plan
Access
Plan
Access plan
interpretation
Code generation
Query results
Machine code
Query optimizer methods

Rule based.
 Looks at syntax.
 Parses query and executes in the order written according to the
rules pre-established by the person who wrote the query
optimizer.

Cost based.
 Looks at syntax.
 Looks at statistical data about the database.
 Parses query and executes based on the written and the
information about the current and historical data of the database.

Choose.
 Uses the rule based method for tables which have not been
used/analyzed in the past.
 Uses the cost based method for tables which have been
previously analyzed.
36
Helping the query optimizer

Know which type of optimizer is used by the DBMS.

Order the query in a way that eliminates more rows with
the first “where” condition.

Provide comments or suggestions.

Use joins instead of nested or correlated sub-queries.

Use single table conditions instead of joins.

Use queries that perform grouping with aggregate
calculations only when really necessary.

Avoid using the “<>” or not equal condition.
37
Becoming a DBA

Experience.
 Application programming with a database.
 Some DBA tasks.
 Systems programming with a database.

Education.
 Training classes with a specific DBMS.
 For example, Oracle offers a ten-class sequence for DBA’s.
 Master’s degree?

Certification.
 Specific to DBMS.
 Offered by DBMS vendor.
 Formal programs offered for Oracle and SQL Server;
informal programs for other DBMS types.
38