CS206 --- Electronic Commerce

Download Report

Transcript CS206 --- Electronic Commerce

+
Constraints and
Transactions and
ACID
CSCI 2141 – W2013
Slides from Ullman (A First Course in Database Systems:
http://infolab.stanford.edu/~ullman/fcdb/aut07/slides/cons.ppt
http://infolab.stanford.edu/~ullman/fcdb/aut07/slides/trans-view-index.ppt
www.cs.sunysb.edu/~cse515/Fall07/slides/ch21.ppt
1
+
2
Housekeeping

Missed 2 classes this week so are on a sprint to the finish

Today will focus on Transactions and ACID,



Included in today’s slide deck are details on constraints, triggers, and
transaction implementation
I will cover the high level and leave it to you to learn the implementation (db
dependent)
Assignment 5 is out and due on March 27th (Wed)

Practice more complex SQL and implementation of checks/constraints

Quiz on March 27th – same as assignment + ACID + concept of
transactions

Monday 25th: noSQL

April 1-3: Query optimization

April 5: DB security

April 8: Quiz (query optimization) + general review
+
3
Constraints and Triggers

A constraint is a relationship among data elements that the DBMS is
required to enforce.


Example: key constraints.
Triggers are only executed when a specified condition occurs, e.g.,
insertion of a tuple.

Easier to implement than complex constraints.
+
4
Kinds of Constraints

Keys.

Foreign-key, or referential-integrity.

Value-based constraints.


Tuple-based constraints.


Constrain values of a particular attribute.
Relationship among components.
Assertions: any SQL boolean expression.
+
Single-Attribute Keys

Place PRIMARY KEY or UNIQUE after the type in the declaration
of the attribute.

Example:
CREATE TABLE Product (
);
name
CHAR(20) UNIQUE,
manf
CHAR(20)
5
+
6
Multi-attribute Key
 The
canteen and product together are the key for Sells:
CREATE TABLE Sells (
canteen
CHAR(20),
product
VARCHAR(20),
priceREAL,
PRIMARY KEY (canteen, product)
);
+
7
Foreign Keys

Values appearing in attributes of one relation must appear
together in certain attributes of another relation.

Example: in Sells(canteen, product, price), we might expect that
a product value also appears in Product.name .
+
8
Expressing Foreign Keys
Use keyword REFERENCES, either:

1.
After an attribute (for one-attribute keys).
2.
As an element of the schema:
FOREIGN KEY (<list of attributes>)
REFERENCES <relation> (<attributes>)

Referenced attributes must be declared PRIMARY KEY or
UNIQUE.
+
9
Example: With Attribute
CREATE TABLE products (
name
CHAR(20) PRIMARY KEY,
manf
CHAR(20) );
CREATE TABLE Sells (
canteen
CHAR(20),
product
CHAR(20) REFERENCES products(name),
price
REAL );
+
Example: As Schema Element
CREATE TABLE products (
name CHAR(20) PRIMARY KEY,
manf CHAR(20) );
CREATE TABLE Sells (
canteen
CHAR(20),
product
CHAR(20),
price
REAL,
FOREIGN KEY(product) REFERENCES products(name));
10
+
Enforcing Foreign-Key Constraints
If there is a foreign-key constraint from relation R to
relation S, two violations are possible:

1.
An insert or update to R introduces values not found in S.
2.
A deletion or update to S causes some tuples of R to
“dangle.”
11
+
12
How should we handle this?

Example: suppose R = Sells, S = products.

An insert or update to Sells that introduces a nonexistent product
must be rejected.

A deletion or update to products that removes a product value
found in some tuples of Sells can be handled in three ways (next
slide).
+
13
3 possible actions
1.
Default : Reject the modification.
2.
Cascade : Make the same changes in Sells.
3.

Deleted product: delete Sells tuple.

Updated product: change value in Sells.
Set NULL : Change the product to NULL.
+
14
Example: Cascade

Delete the Coke tuple from products:


Then delete all tuples from Sells that have product = ’Coke’.
Update the Coke tuple by changing ’Coke’
to ’CokeClassic’:

Then change all Sells tuples with product = ’Coke’ to product
= ’CokeClassic’.
+
15
Example: Set NULL

Delete the Coke tuple from products:


Change all tuples of Sells that have product = ’Coke’ to have
product = NULL.
Update the Coke tuple by changing ’Coke’
to ’CokeClassic’:

Same change as for deletion.
+
16
Choosing a Policy

When we declare a foreign key, we may choose policies SET NULL or
CASCADE independently for deletions and updates.

Follow the foreign-key declaration by:
ON [UPDATE, DELETE][SET NULL CASCADE]

Two such clauses may be used.

Otherwise, the default (reject) is used.
+ Example: Setting Policy
CREATE TABLE Sells (
canteen
CHAR(20),
product
CHAR(20),
price
REAL,
FOREIGN KEY(product)
REFERENCES products(name)
ON DELETE SET NULL
ON UPDATE CASCADE
);
17
+
18
Attribute-Based Checks

Constraints on the value of a particular attribute.

Add CHECK(<condition>) to the declaration for the attribute.

The condition may use the name of the attribute, but any
other relation or attribute name must be in a subquery.
+
Example: Attribute-Based Check
CREATE TABLE Sells (
canteen
CHAR(20),
product
CHAR(20)
CHECK ( product IN
(SELECT name FROM products)),
price
);
REAL CHECK ( price <= 5.00 )
19
+
20
Timing of Checks

Attribute-based checks are performed only when a value for
that attribute is inserted or updated.


Example: CHECK (price <= 5.00) checks every new price and
rejects the modification (for that tuple) if the price is more than $5.
Example: CHECK (product IN (SELECT name FROM products))
not checked if a product is deleted from products (unlike foreign-keys).
+
21
Tuple-Based Checks

CHECK (<condition>) may be added as a relation-schema
element.

The condition may refer to any attribute of the relation.


But other attributes or relations require a subquery.
Checked on insert or update only.
+
22
Example: Tuple-Based Check
 Only
Joe’s canteen can sell product for more than $5:
CREATE TABLE Sells (
canteen
CHAR(20),
product
CHAR(20),
priceREAL,
CHECK (canteen = ‘Joes canteen’ OR
price <= 5.00)
);
+
23
Assertions

These are database-schema elements, like relations or views.

Defined by:
CREATE ASSERTION <name>
CHECK (<condition>);

Condition may refer to any relation or attribute in the
database schema.
+
24
Example: Assertion

In Sells(canteen, product, price), no canteen may charge an
average of more than $5.
CREATE ASSERTION NoRipoffcanteens CHECK (
NOT EXISTS (
SELECT canteen FROM Sells
GROUP BY canteen
HAVING 5.00 < AVG(price)
));
canteens with a
average price
above $5
+
25
Example: Assertion
 In
Customers(name, addr, phone) and
canteens(name, addr, license), there cannot be
more canteens than Customers.
CREATE ASSERTION FewCanteen CHECK (
(SELECT COUNT(*) FROM canteens) <=
(SELECT COUNT(*) FROM Customers)
);
+
26
Timing of Assertion Checks

In principle, we must check every assertion after every
modification to any relation of the database.

A clever system can observe that only certain changes could cause
a given assertion to be violated.

Example: No change to products can affect FewCanteen. Neither can an
insertion to Customers.
+
27
Triggers: Motivation

Assertions are powerful, but the DBMS often can’t tell when
they need to be checked.

Attribute- and tuple-based checks are checked at known times,
but are not powerful.

Triggers let the user decide when to check for any condition.
+
28
Event-Condition-Action Rules

Another name for “trigger” is ECA rule, or event-conditionaction rule.

Event : typically a type of database modification, e.g., “insert
on Sells.”

Condition : Any SQL boolean-valued expression.

Action : Any SQL statements.
+
Preliminary Example: A Trigger

Instead of using a foreign-key constraint and rejecting
insertions into Sells(canteen, product, price) with unknown
products, a trigger can add that product to products, with a
NULL manufacturer.
29
+
30
Example: Trigger Definition
CREATE TRIGGER productTrig
The event
AFTER INSERT ON Sells
REFERENCING NEW ROW AS NewTuple
FOR EACH ROW
WHEN (NewTuple.product NOT IN
The condition
(SELECT name FROM products))
INSERT INTO products(name)
VALUES(NewTuple.product);
The action
+
31
Options: CREATE TRIGGER

CREATE TRIGGER <name>

Or:
CREATE OR REPLACE TRIGGER <name>

Useful if there is a trigger with that name and you want to modify
the trigger.
+
32
Options: The Event

AFTER can be BEFORE.

Also, INSTEAD OF, if the relation is a view.


A clever way to execute view modifications: have triggers
translate them to appropriate modifications on the base tables.
INSERT can be DELETE or UPDATE.

And UPDATE can be UPDATE . . . ON a particular attribute.
+
33
Options: FOR EACH ROW

Triggers are either “row-level” or “statement-level.”

FOR EACH ROW indicates row-level; its absence indicates
statement-level.

Row level triggers : execute once for each modified tuple.

Statement-level triggers : execute once for a SQL statement,
regardless of how many tuples are modified.
+
34
Options: REFERENCING

INSERT statements imply a new tuple (for row-level) or new
table (for statement-level).

The “table” is the set of inserted tuples.

DELETE implies an old tuple or table.

UPDATE implies both.

Refer to these by
[NEW OLD][TUPLE TABLE] AS <name>
+
35
Options: The Condition

Any boolean-valued condition.

Evaluated on the database as it would exist before or after the
triggering event, depending on whether BEFORE or AFTER is
used.


But always before the changes take effect.
Access the new/old tuple/table through the names in the
REFERENCING clause.
+
36
Options: The Action

There can be more than one SQL statement in the action.


Surround by BEGIN . . . END if there is more than one.
But queries make no sense in an action, so we are really
limited to modifications.
+
37
Another Example

Using Sells(canteen, product, price) and a unary relation
Ripoffcanteens(canteen), maintain a list of canteens that raise
the price of any product by more than $1.
+
38
The Trigger
CREATE TRIGGER PriceTrig
AFTER UPDATE OF price ON Sells
The event –
only changes
to prices
REFERENCING
Updates let us
OLD ROW AS ooo
talk about old
and new tuples
NEW ROW AS nnn
We need to consider
FOR EACH ROW
each price change
WHEN(nnn.price > ooo.price + 1.00)
INSERT INTO Ripoffcanteens
VALUES(nnn.canteen);
Condition:
a raise in
price > $1
When the price change
is great enough, add
the canteen to Ripoffcanteen
+
Transactions, Views, Indexes
Controlling Concurrent Behavior
Virtual and Materialized Views
Speeding Accesses to Data
39
+
40
Why Transactions?

Database systems are normally being accessed by many
users or processes at the same time.


Both queries and modifications.
Unlike operating systems, which support interaction of
processes, a DMBS needs to keep processes from
troublesome interactions.
+
41
Example: Bad Interaction

Two people with a joint back account each take $100 from
different ATMs at about the same time.


The DBMS better make sure one account deduction doesn’t get
lost.
Compare: An OS allows two people to edit a document at the
same time. If both write, one’s changes get lost.
+
42
Transactions

Transaction = process involving database queries and/or
modification.

Normally with some strong properties regarding
concurrency.

Formed in SQL from single statements or explicit
programmer control.
+ Transactions
43
 The
execution of each transaction must
maintain the relationship between the
database state and the enterprise state
 Therefore
additional requirements are placed
on the execution of transactions beyond those
placed on ordinary programs:
 Atomicity
 Consistency
 Isolation
 Durability
ACID properties
+
44
Database Consistency
 Enterprise
(Business) Rules limit the
occurrence of certain real-world events
 Student
cannot register for a course if the current
number of registrants equals the maximum allowed
 Correspondingly,
allowable database states are
restricted
cur_reg <= max_reg
 These
limitations are called (static) integrity
constraints: assertions that must be satisfied
by all database states (state invariants).
+
45
Database Consistency
(state invariants)
 Other static consistency requirements are
related to the fact that the database might
store the same information in different ways
 cur_reg
= |list_of_registered_students|
 Such
limitations are also expressed as integrity
constraints
 Database
is consistent if all static integrity
constraints are satisfied
+ Transaction Consistency
46
A
consistent database state does not necessarily
model the actual state of the enterprise

A deposit transaction that increments the balance by the
wrong amount maintains the integrity constraint balance 
0, but does not maintain the relation between the
enterprise and database states
A
consistent transaction maintains database
consistency and the correspondence between the
database state and the enterprise state (implements its
specification)

Specification of deposit transaction includes
balance = balance + amt_deposit ,
(balance is the next value of balance)
+
47
Dynamic Integrity Constraints
(transition invariants)
 Some
constraints restrict allowable state
transitions
A
transaction might transform the database from
one consistent state to another, but the
transition might not be permissible
 Example:
A letter grade in a course (A, B, C, D, F)
cannot be changed to an incomplete (I)
 Dynamic
constraints cannot be checked by
examining the database state
+
48
Transaction Consistency
 Consistent
transaction: if DB is in consistent
state initially, when the transaction
completes:
 All
static integrity constraints are satisfied (but
constraints might be violated in intermediate states)
 New
 No
state satisfies specifications of transaction
dynamic constraints have been violated
+ Checking Integrity Constraints
 Automatic:
Embed constraint in schema.
CHECK, ASSERTION for static constraints
 TRIGGER for dynamic constraints
 Increases confidence in correctness and decreases
maintenance costs
 Not always desirable since unnecessary checking
(overhead) might result


Deposit transaction modifies balance but cannot violate
constraint balance  0
 Manual:
Perform check in application code.
Only necessary checks are performed
 Scatters references to constraint throughout application
 Difficult to maintain as transactions are modified/added

49
+
50
Atomicity
A
real-world event either happens or does not
happen
 Student
either registers or does not register
 Similarly,
the system must ensure that either
the corresponding transaction runs to
completion or, if not, it has no effect at all
 Not
true of ordinary programs. A crash could leave
files partially updated on recovery
+
51
Commit and Abort
 If
the transaction successfully completes
it is said to commit
 The
system is responsible for ensuring that all
changes to the database have been saved
 If
the transaction does not successfully
complete, it is said to abort
 The
system is responsible for undoing, or
rolling back, all changes the transaction has
made
+
52
Reasons for Abort
 System
crash
 Transaction
 Execution
aborted by system
cannot be made atomic (a site is down)
 Execution
did not maintain database consistency
(integrity constraint is violated)
 Execution
was not isolated
 Resources
not available (deadlock)
 Transaction
requests to roll back
+
53
API for Transactions
 DBMS
and Transaction Processing (TP)
monitor provide commands for setting
transaction boundaries. Example:
 begin transaction
 commit
 rollback
 The
commit command is a request
 The
system might commit the transaction, or it
might abort it for one of the reasons on the
previous slide
 The
rollback command is always satisfied
+
54
Durability
 The
system must ensure that once a transaction
commits, its effect on the database state is not
lost in spite of subsequent failures
 Not
true of ordinary programs. A media failure after a
program successfully terminates could cause the file
system to be restored to a state that preceded the
program’s execution
+ Implementing Durability
 Database
stored redundantly on mass storage
devices to protect against media failure
 Architecture
of mass storage devices affects
type of media failures that can be tolerated
 Related
to Availability: extent to which a
(possibly distributed) system can provide
service despite failure


Non-stop DBMS (mirrored disks)
Recovery based DBMS (log)
55
+ Isolation

Serial Execution: transactions execute in sequence

Each one starts after the previous one completes.


Execution of one transaction is not affected by the
operations of another since they do not overlap in time
The execution of each transaction is isolated from all
others.

If the initial database state and all transactions are
consistent, then the final database state will be
consistent and will accurately reflect the real-world
state, but

Serial execution is inadequate from a performance
perspective
56
+
57
Isolation
 Concurrent
execution offers performance benefits:

A computer system has multiple resources capable of
executing independently (e.g., cpu’s, I/O devices), but

A transaction typically uses only one resource at a time

Hence, only concurrently executing transactions can
make effective use of the system

Concurrently executing transactions yield interleaved
schedules
+
begin trans
..
op1,1
..
op1,2
..
commit
Concurrent Execution
T1
op1,1 op1.2
sequence of db
operations output by T1
local computation
op1,1 op2,1 op2.2 op1.2
T2
op2,1 op2.2
DBMS
interleaved sequence of db
operations input to DBMS
local variables
58
+ Isolation
59
 Interleaved
execution of a set of consistent
transactions offers performance benefits, but
might not be correct
 Example:
course registration; cur_reg is number
local computation
of current registrants
not seen by DBMS
T1: r(cur_reg : 29)…………w(cur_reg : 30) commit
T2:
r(cur_reg : 29)……………..…w(cur_reg : 30) commit
time 
Result: Database state no longer corresponds to
real-world state, integrity constraint violated
cur_reg <> |list_of_registered_students|
+
Interaction of Atomicity and
Isolation
T1: r(bal:10) w(bal:1000010)
abort
T2:
r(bal:1000010) w(yes!!!) commit
time 

T1 deposits $1000000

T2 grants credit and commits before T1 completes

T1 aborts and rolls balance back to $10

T1 has had an effect even though it aborted!
60
+
61
Isolation

An interleaved schedule of transactions is isolated
if its effect is the same as if the transactions had
executed serially in some order (serializable)
T1: r(x)
w(x)
T2:
r(y)
w(y)

It follows that serializable schedules are always
correct (for any application)

Serializable is better than serial from a performance
point of view

DBMS uses locking to ensure that concurrent
schedules are serializable
+
62
Isolation in the Real World
 SQL
supports SERIALIZABLE isolation level,
which guarantees serializability and hence
correctness for all applications
 Performance
of applications running at
SERIALIZABLE is often not adequate
 SQL
also supports weaker levels of isolation
with better performance characteristics
 But
beware! -- a particular application might not run
correctly at a weaker level
+
63
Summary
 Application
programmer is responsible for
creating consistent transactions and
choosing appropriate isolation level
 The
system is responsible for creating the
abstractions of atomicity, durability, and
isolation
 Greatly
simplifies programmer’s task since she
does not have to be concerned with failures or
concurrency
+
Up next: Slides from FCDB
More examples and details of transactions
Also, views and indexes
64
+


ACID Transactions
ACID transactions are:

Atomic : Whole transaction or none is done.

Consistent : Database constraints preserved.

Isolated : It appears to the user as if only one process executes at a time.

Durable : Effects of a process survive a crash.
Optional: weaker forms of transactions are often supported as well.
65
+
66
COMMIT

The SQL statement COMMIT causes a transaction to complete.

The database modifications are now permanent in the database.
+
67
ROLLBACK

The SQL statement ROLLBACK also causes the transaction to
end, but by aborting.


No effects on the database.
Failures like division by 0 or a constraint violation can also
cause rollback, even if the programmer does not request it.
+
Example: Interacting Processes

Assume the usual Sells(canteen,product,price) relation, and
suppose that Joe’s canteen sells only Coke for $2.50 and OJ for
$3.00.

Sally is querying Sells for the highest and lowest price Joe
charges.

Joe decides to stop selling Coke and OJ, but to sell only
Chocolate Milk at $3.50.
68
+
69
Sally’s Program

Sally executes the following two SQL statements called (min)
and (max) to help us remember what they do.
(max) SELECT MAX(price) FROM Sells
WHERE canteen = ‘Joes canteen’;
(min)
SELECT MIN(price) FROM Sells
WHERE canteen = ‘Joes canteen’;
+
70
Joe’s Program

At about the same time, Joe executes the following steps: (del) and
(ins).
(del)
DELETE FROM Sells
WHERE canteen = ’Joes canteen’;
(ins)
INSERT INTO Sells
VALUES(’Joes canteen’, ’ChocolateMilk’, 3.50);
+
71
Interleaving of Statements

Although (max) must come before (min), and (del) must
come before (ins), there are no other constraints on the order
of these statements, unless we group Sally’s and/or Joe’s
statements into transactions.
+
72
Example: Strange Interleaving

Suppose the steps execute in the order (max)(del)(ins)(min).
Joe’s Prices:
Statement:
{2.50,3.00} {2.50,3.00}
Result:
(max)
3.00

Sally sees MAX < MIN!
(del)
{3.50}
(ins)
(min)
3.50
+
Fixing the Problem With
Transactions

If we group Sally’s statements (max)(min) into one
transaction, then she cannot see this inconsistency.

She sees Joe’s prices at some fixed time.

Either before or after he changes prices, or in the middle, but the
MAX and MIN are computed from the same prices.
73
+
74
Another Problem: Rollback

Suppose Joe executes (del)(ins), not as a transaction, but after
executing these statements, thinks better of it and issues a
ROLLBACK statement.

If Sally executes her statements after (ins) but before the
rollback, she sees a value, 3.50, that never existed in the
database.
+
75
Solution

If Joe executes (del)(ins) as a transaction, its effect cannot be
seen by others until the transaction executes COMMIT.

If the transaction executes ROLLBACK instead, then its effects can
never be seen.
+
76
Isolation Levels

SQL defines four isolation levels = choices about what
interactions are allowed by transactions that execute at about
the same time.

Only one level (“serializable”) = ACID transactions.

Each DBMS implements transactions in its own way.
+
77
Choosing the Isolation Level
Within a transaction, we can say:

SET TRANSACTION ISOLATION LEVEL X
where X =
1.
SERIALIZABLE
2.
REPEATABLE READ
3.
READ COMMITTED
4.
READ UNCOMMITTED
+
78
Serializable Transactions

If Sally = (max)(min) and Joe = (del)(ins) are each
transactions, and Sally runs with isolation level SERIALIZABLE,
then she will see the database either before or after Joe runs,
but not in the middle.
+
Isolation Level Is Personal Choice

Your choice, e.g., run serializable, affects only how you see the
database, not how others see it.

Example: If Joe Runs serializable, but Sally doesn’t, then Sally
might see no prices for Joe’s canteen.

i.e., it looks to Sally as if she ran in the middle of Joe’s transaction.
79
+
80
Read-Commited Transactions

If Sally runs with isolation level READ COMMITTED, then she
can see only committed data, but not necessarily the same
data each time.

Example: Under READ COMMITTED, the interleaving
(max)(del)(ins)(min) is allowed, as long as Joe commits.

Sally sees MAX < MIN.
+
81
Repeatable-Read Transactions

Requirement is like read-committed, plus: if data is read
again, then everything seen the first time will be seen the
second time.

But the second and subsequent reads may see more tuples as
well.
+
82
Example: Repeatable Read

Suppose Sally runs under REPEATABLE READ, and the order
of execution is (max)(del)(ins)(min).

(max) sees prices 2.50 and 3.00.

(min) can see 3.50, but must also see 2.50 and 3.00, because they
were seen on the earlier read by (max).
+
83
Read Uncommitted

A transaction running under READ UNCOMMITTED can see
data in the database, even if it was written by a transaction
that has not committed (and may never).

Example: If Sally runs under READ UNCOMMITTED, she
could see a price 3.50 even if Joe later aborts.
+ Views

A view is a relation defined in terms of stored tables (called
base tables ) and other views.

Two kinds:
1.
Virtual = not stored in the database; just a query for
constructing the relation.
2.
Materialized = actually constructed and stored.
84
+ Declaring Views

Declare by:
CREATE [MATERIALIZED] VIEW <name> AS <query>;

Default is virtual.
85
+
86
Example: View Definition
 CanDrink(drinker, product)
is a view “containing” the
drinker-product pairs such that the drinker frequents at
least one canteen that serves the product:
CREATE VIEW CanDrink AS
SELECT drinker, product
FROM Frequents, Sells
WHERE Frequents.canteen = Sells.canteen;
+
87
Example: Accessing a View

Query a view as if it were a base table.


Also: a limited ability to modify views if it makes sense as a
modification of one underlying base table.
Example query:
SELECT product FROM CanDrink
WHERE drinker = ’Sally’;
+
88
Triggers on Views

Generally, it is impossible to modify a virtual view, because it
doesn’t exist.

But an INSTEAD OF trigger lets us interpret view modifications
in a way that makes sense.

Example: View Synergy has (drinker, product, canteen) triples
such that the canteen serves the product, the drinker frequents
the canteen and likes the product.
+
89
Example: The View
CREATE VIEW Synergy AS
Pick one copy of
each attribute
SELECT Likes.drinker, Likes.product, Sells.canteen
FROM Likes, Sells, Frequents
WHERE Likes.drinker = Frequents.drinker
AND Likes.product = Sells.product
AND Sells.canteen = Frequents.canteen;
Natural join of Likes,
Sells, and Frequents
+
90
Interpreting a View Insertion

We cannot insert into Synergy --- it is a virtual view.

But we can use an INSTEAD OF trigger to turn a (drinker,
product, canteen) triple into three insertions of projected
pairs, one for each of Likes, Sells, and Frequents.

Sells.price will have to be NULL.
+
91
The Trigger
CREATE TRIGGER ViewTrig
INSTEAD OF INSERT ON Synergy
REFERENCING NEW ROW AS n
FOR EACH ROW
BEGIN
INSERT INTO LIKES VALUES(n.drinker, n.product);
INSERT INTO SELLS(canteen, product) VALUES(n.canteen, n.product);
INSERT INTO FREQUENTS VALUES(n.drinker, n.canteen);
END;
+
92
Materialized Views

Problem: each time a base table changes, the materialized
view may change.


Cannot afford to recompute the view with each change.
Solution: Periodic reconstruction of the materialized view,
which is otherwise “out of date.”
+
Example: Class Mailing List

The class mailing list all-2141 could be a materialized view of
the class enrollment in Banner.

Could be* updated at periodic intervals

Possible enroll in the class and miss an email sent out after you
enroll but before the materialized view is updated
93
+
94
Example: A Data Warehouse

CanteensRus stores every sale at every canteen in a database.

Overnight, the sales for the day are used to update a data
warehouse = materialized views of the sales.

The warehouse is used by analysts to predict trends and move
goods to where they are selling best.
+
95
Typical Data Warehouse
architecture
+
96
Database Index

A data structure that improves the speed of data retrieval
operations on a database table at the cost of slower writes and
the use of more storage space.

Indices can be created using one or more columns of a database
table, providing the basis for both rapid random lookups and
efficient access of ordered records.

In a relational database, an index is a copy of one part of a table.
Some databases extend the power of indexing by allowing
indices to be created on functions or expressions.


For example, an index could be created on upper(last_name), which
would only store the upper case versions of the last_name field in the
index.
Another option sometimes supported is the use of "filtered"
indices, where index entries are created only for those records
that satisfy some conditional expression.
+
97
Indexes

Index = data structure used to speed access to tuples of a
relation, given values of one or more attributes.

Could be a hash table, but in a DBMS it is always a balanced
search tree with giant nodes (a full disk page) called a B-tree.
+
98
Declaring Indexes

No standard!

Typical syntax:
CREATE INDEX productIndex ON products(manf);
CREATE INDEX SellIndex ON Sells(canteen, product);
+
99
Using Indexes

Given a value v, the index takes us to only those tuples that
have v in the attribute(s) of the index.

Example: use productIndex and SellIndex to find the prices
of products manufactured by Pete and sold by Joe. (next
slide)
+ Example of Using Indexes
SELECT price FROM products, Sells
WHERE manf = ‘Pete’ AND
products.name = Sells.product AND
canteen = ’Joes canteen’;
1.
Use productIndex to get all the products made by Pete.
2.
Then use SellIndex to get prices of those products, with
canteen = ’Joes canteen’
100
+
101
Database Tuning

A major problem in making a database run fast is deciding
which indexes to create.

Pro: An index speeds up queries that can use it.

Con: An index slows down all modifications on its relation
because the index must be modified too.
+
102
Example: Tuning
Suppose the only things we did with our products
database was:


1.
Insert new facts into a relation (10%).
2.
Find the price of a given product at a given canteen (90%).
Then SellInd on Sells(canteen, product) would be
wonderful, but productInd on products(manf) would be
harmful.
+
103
Tuning Advisors
A major research thrust.


Because hand tuning is so hard.
An advisor gets a query load, e.g.:

1.
Choose random queries from the history of queries run on
the database, or
2.
Designer provides a sample workload.
+
104
Tuning Advisors

The advisor generates candidate indexes and evaluates each
on the workload.

Feed each sample query to the query optimizer, which assumes
only this one index is available.

Measure the improvement/degradation in the average running
time of the queries.