Database Management Systems - The Institute of Finance

Download Report

Transcript Database Management Systems - The Institute of Finance

Database Management Systems (DBMS) Module 1: Introduction

Contents       Historical Perspective DBMS Purpose of database systems Data abstraction Data models Instances and schemes Data independence

Contents       Data Definition Language (DDL) Data Manipulation language (DML) Database manager Database administrator Database users Overall system structure

What is DBMS  A database management system (DBMS), or simply a database system (DBS), consists of   A collection of interrelated and persistent data (usually referred to as the database (DB)).

A set of application programs used to access, update and manage that data (which form the data management system (MS)).

What is DBMS  The goal of a DBMS is to provide an environment that is both convenient and efficient to use in:   Retrieving information from the database.

Storing information into the database.

What is DBMS  Databases are usually designed to manage large bodies of information. This involves  Definition of structures for information storage (data modeling).

What is DBMS  Provision of mechanisms for the manipulation of information (file and systems structure, query processing).

 Providing for the safety of information in the database (crash recovery and security).

 Concurrency control if the system is shared by users.

1. Historical Perspective    From the earliest days of computers, storing and manipulating data have been a major application focus.

The first general-purpose DBMS was designed by Charles Bachman in the early 1960’s and was called the Integrated Data Store.

It formed the basis for the model network data which was standardised by the Conference on Data System Language (CODASYL)

1. Historical Perspective   In the late 1960s, IBM developed the Information Management System (IMS) DBMS, which formed the basis for an alternative data representation framework called the hierarchical data model.

The SABRE system for making airline reservations was jointly developed by IBM and America Airlines which allowed several people to access the same data through computer network.

1. Historical Perspective    In 1970 Edgar Codd at IBM came up with new data representation framework called relation data model.

The SQL query language for relational database developed in 1980s became the part of relational databases.

In the late 1980s and 1990s advances have been made in many areas of DB systems and allow complex analysis of data from all parts of firms.

1. Historical Perspective   Several vendors (e.g. IBM, Oracle, DB2,Microsoft, and Informix UDS) have extended their system to enable storage of images, and video data types.

Specialised systems have been developed by numerous vendors for creating data warehouses consolidating data from several databases and for undertaking analysis.

1. Historical Perspective   An interesting phenomenon is the emergence of Planning (ERP) Enterprise Resource DBMS.

which add a layer of application-oriented features on top of Now interestingly DBMS are part of Internet Techs where it is used to store data which can be accessed via Web.

1. Historical Perspective    Nowadays through DBMS more data is brought online.

Now we have multimedia databases, interactive video, streamlining data, digital libraries, google earth which store and process spatial data, etc.

Also DBMS is the part of E-Commerce apps such as Amazon.com and E Bay.com.

Summary: History of Data Management        Manual Record (4000 BC – AD 1900).

Punched Card Record (1900 - 55).

Programmed Record (1955 - 1970).

On – Line Network Data Management (1965 - 80).

Relational Data Management.

Multimedia Databases.

Web-Based System.

2. Purpose of database systems  The purpose of database management system can be derived from the drawbacks of which was supported by operating system.

file processing system

2. Purpose of database systems  Using that system there are problems which are:  Data redundancy and inconsistency  Same information may be duplicated in several places.

 All copies may not be updated properly.

2. Purpose of database systems   Difficulty in accessing data  May have to write a new application program to satisfy an unusual request.

 E.g. find all customers with the same postal code.

Data isolation  Data in different files.

  Data in different formats.

Difficult to write new application programs.

2. Purpose of database systems  Multiple users    Want concurrency for faster response time.

Need protection for concurrent updates.

E.g. two customers withdrawing funds from the same account at the same time account has $500 in it, and they withdraw $100 and $50. The result could be $350, $400 or $450 if no protection.

2. Purpose of database systems  Security problems  Every user of the system should be able to access only the data they are permitted to see.

  E.g. payroll people only handle employee records, and cannot see customer accounts; tellers only access account data and cannot see payroll data.

Difficult to enforce this with application programs.

2. Purpose of database systems  Integrity problems    Data may be required to satisfy constraints.

E.g. no account balance below $25.00.

Again, difficult to enforce or to change constraints with the le processing approach.

2. Purpose of database systems  Therefore the these problems of earlier file processing approach and others led to the development of database management systems.

2. Purpose of database systems: Summary  Therefore the purpose of database management system is to:    Keep information updated Keep information accurate Keep information organised

3. Data abstraction    This describes level of abstractions in DBMS. The system hides certain details of how data is stored and maintained.

In this complexity data should be hidden from database users.

The data in DBMS is described I three levels of abstraction

3. Data abstraction  The database description consists of a schema at each of these three levels of abstraction:    Conceptual Schema Physical Schema External Schema

Physical Level  This describe:  How the data are stored. E.g. index, B tree, hashing.

 It is the lowest level of abstraction.

 Also it is the complex low-level structures described in detail.

Conceptual Schema/Level    This is the next highest level of abstraction.

It describes:   what data are stored.

relationships among data.

It is the database administrator level.

Conceptual Schema/Level External schemas  The three levels of data abstraction

View Level    It is the highest level of abstraction.

It describes part of the database for a particular group of users.

Can be many different views of a database. E.g. tellers in a bank get a view of customer accounts, but not of payroll data.

4. Data Models   According to (Ramakrshnan and Gehrke, 2008) a data model is collection of high-level data description constructs that hide many low-level storage details.

A DBMS allows a user to define the data to be stored in terms of data model.

3. Data Models   Simply a data model is a collection of conceptual tools used for describing data, data relationships, data semantics and data constraints.

Most DBM today are based on the relational data model.

Types of Data Model  There are number of data models existing, such as hierarchical, network, and Semantic, however these can be grouped into three major data models:    Object-based Logical Models.

Record-based Logical Models.

Physical Data Models.

Object-based Logical Models  Object-based logical models:     Describe data at the conceptual and view levels.

Provide fairly flexible structuring capabilities.

Allow one to specify data constraints explicitly.

Over 30 such models, including

Object-based Logical Models   These can be sectioned into:  Entity-relationship model.

     Object-oriented model.

Binary model.

Semantic data model.

Infological model.

Functional data model.

Our focus will be on the first two models

Object Based Logical Model: E-R Model   The entity-relationship model is based on a perception of the world as consisting of a collection of basic objects (entities) and relationships among these objects.

The overall logical structure of a database can be expressed graphically by an E-R diagram

Object Based Logical Model: E-R Model     An entity is a distinguishable object that exists.

Each entity has associated with it a set of attributes describing it.

E.g. number and balance for an account entity.

A relationship is an association among several entities.

Object Based Logical Model: E-R Model  e.g. A customer acct relationship associates a customer with each account he or she has.

 The set of all entities or relationships of the same type is called the entity set or relationship set.

 Another essential element of the E-R diagram is the mapping cardinalities, which express the number of entities to which another entity can be associated via a relationship set.

Object Based Logical Model: E-R Model  The overall logical structure of a database can be expressed graphically by an E-R diagram:     rectangles: represent entity sets.

ellipses: represent attributes.

diamonds: represent relationships among entity sets.

lines: link attributes to entity sets and entity sets to relationships.

Object Based Logical Model: E-R Model  A sample E-R diagram

Object Based Logical Model: Object-Oriented Model  The object-oriented model is based on a collection of objects, like the E-R model.

 An object contains values stored in instance variables within the object.

 Unlike the record-oriented models, these values are themselves objects.

Object Based Logical Model: Object-Oriented Model      An object also contains bodies of code that operate on the the object.

These bodies of code are called methods.

Objects that contain the same types of values and the same methods are grouped into classes.

A class may be viewed as a type denition for objects.

Analogy: the programming language concept of an abstract data type.

Object Based Logical Model: Object-Oriented Model     The only way in which one object can access the data of another object is by invoking the method of that other object.

This is called sending a message to the object.

Internal parts of the object, the instance variables and method code, are not visible externally.

Result is two levels of data abstraction.

Object Based Logical Model: Object-Oriented Model  For example, consider an object representing a bank account.

 The object contains instance variables number and balance.

   The object contains a method pay-interest which adds interest to the balance.

Under most data models, changing the interest rate entails changing code in application programs.

In the object-oriented model, this only entails a change within the pay-interest method.

Object Based Logical Model: Object-Oriented Model  Unlike entities in the E-R model, each object has its own unique identity, independent of the values it contains:  Two objects containing the same values are distinct.

 Distinction is maintained in physical level by assigning distinct object identiers.

Record-based Logical Models   Also describe data at the conceptual and view levels.

Unlike object-oriented models, are used to   Specify overall logical structure of the database, and Provide a higher-level description of the implementation.

Record-based Logical Models     Named so because the database is structured in xed-format records of several types.

Each record type defines a fixed number of fields, or attributes.

Each field is usually of a fixed length (this simplifies the implementation).

Record-based models do not include a mechanism for direct representation of code in the database.

Record-based Logical Models     Separate languages associated with the model are used to express database queries and updates.

The three most widely-accepted models are the relational, network, and hierarchical.

This course will concentrate on the relational model.

The network and hierarchical models are covered in appendices in the text.

Record-based Logical Models: Relational Model    The central data description construct in this model is a relation, which can be thought of as a set of records.

A description of data is called a schema.

In the relational model the schema for a relation specifies its name, the name for each field (or attribute or column), and the type for each field.

Record-based Logical Models: Relational Model     A relation can be thought as a table.

Data and relationships are represented by a collection of tables.

Each table has a number of columns with unique names, e.g. customer, account.

Consider the figure below:

Record-based Logical Models: Relational Model  A sample relational database

Record-based Logical Models: Relational Model  E.g. Student information in a university database may be stored in a relation with the following schema: Students (sid: string, login : string, age: integer, gpa: real)

Record-based Logical Models: The Network Model    Data are represented by collections of records.

Relationships among data are represented by links.

Organization is that of an arbitrary graph

Record-based Logical Models: The Network Model  Figure shows a sample network database that is the equivalent of the relational database

Record-based Logical Models: The Hierarchical Model   Similar to the network model.

Organization of the records is as a collection of trees, rather than arbitrary graphs.

Record-based Logical Models: The Hierarchical Model  A sample hierarchical database

Physical Data Models  Are used to describe data at the lowest level.

 Very few models, e.g.

  Unifying model.

Frame memory.

 We will not cover physical models.

Instances and Schemes  Databases change over time.

 The information in a database at a particular point in time is called an instance of the database.

 The overall design of the database is called the database scheme.

Instances and Schemes  Analogy with programming languages:   Data type definition - scheme Value of a variable – instance  There are several schemes, corresponding to levels of abstraction:    Physical scheme Conceptual scheme Subscheme (can be many)

Data Independence  The ability to modify a scheme definition in one level without affecting a scheme definition in a higher level is called data independence.

Data Independence  There are two kinds:  Physical data independence   The ability to modify the physical scheme without causing application programs to be rewritten Modifications at this level are usually to improve performance  Logical data independence  The ability to modify the conceptual scheme without causing application programs to be rewritten  Usually done when logical structure of database is altered