CS206 --- Electronic Commerce
Download
Report
Transcript CS206 --- Electronic Commerce
+
Constraints and
Transactions and
ACID
CSCI 2141 – W2013
Slides from Ullman (A First Course in Database Systems:
http://infolab.stanford.edu/~ullman/fcdb/aut07/slides/cons.ppt
http://infolab.stanford.edu/~ullman/fcdb/aut07/slides/trans-view-index.ppt
www.cs.sunysb.edu/~cse515/Fall07/slides/ch21.ppt
1
+
2
Housekeeping
Missed 2 classes this week so are on a sprint to the finish
Today will focus on Transactions and ACID,
Included in today’s slide deck are details on constraints, triggers, and
transaction implementation
I will cover the high level and leave it to you to learn the implementation (db
dependent)
Assignment 5 is out and due on March 27th (Wed)
Practice more complex SQL and implementation of checks/constraints
Quiz on March 27th – same as assignment + ACID + concept of
transactions
Monday 25th: noSQL
April 1-3: Query optimization
April 5: DB security
April 8: Quiz (query optimization) + general review
+
3
Constraints and Triggers
A constraint is a relationship among data elements that the DBMS is
required to enforce.
Example: key constraints.
Triggers are only executed when a specified condition occurs, e.g.,
insertion of a tuple.
Easier to implement than complex constraints.
+
4
Kinds of Constraints
Keys.
Foreign-key, or referential-integrity.
Value-based constraints.
Tuple-based constraints.
Constrain values of a particular attribute.
Relationship among components.
Assertions: any SQL boolean expression.
+
Single-Attribute Keys
Place PRIMARY KEY or UNIQUE after the type in the declaration
of the attribute.
Example:
CREATE TABLE Product (
);
name
CHAR(20) UNIQUE,
manf
CHAR(20)
5
+
6
Multi-attribute Key
The
canteen and product together are the key for Sells:
CREATE TABLE Sells (
canteen
CHAR(20),
product
VARCHAR(20),
priceREAL,
PRIMARY KEY (canteen, product)
);
+
7
Foreign Keys
Values appearing in attributes of one relation must appear
together in certain attributes of another relation.
Example: in Sells(canteen, product, price), we might expect that
a product value also appears in Product.name .
+
8
Expressing Foreign Keys
Use keyword REFERENCES, either:
1.
After an attribute (for one-attribute keys).
2.
As an element of the schema:
FOREIGN KEY (<list of attributes>)
REFERENCES <relation> (<attributes>)
Referenced attributes must be declared PRIMARY KEY or
UNIQUE.
+
9
Example: With Attribute
CREATE TABLE products (
name
CHAR(20) PRIMARY KEY,
manf
CHAR(20) );
CREATE TABLE Sells (
canteen
CHAR(20),
product
CHAR(20) REFERENCES products(name),
price
REAL );
+
Example: As Schema Element
CREATE TABLE products (
name CHAR(20) PRIMARY KEY,
manf CHAR(20) );
CREATE TABLE Sells (
canteen
CHAR(20),
product
CHAR(20),
price
REAL,
FOREIGN KEY(product) REFERENCES products(name));
10
+
Enforcing Foreign-Key Constraints
If there is a foreign-key constraint from relation R to
relation S, two violations are possible:
1.
An insert or update to R introduces values not found in S.
2.
A deletion or update to S causes some tuples of R to
“dangle.”
11
+
12
How should we handle this?
Example: suppose R = Sells, S = products.
An insert or update to Sells that introduces a nonexistent product
must be rejected.
A deletion or update to products that removes a product value
found in some tuples of Sells can be handled in three ways (next
slide).
+
13
3 possible actions
1.
Default : Reject the modification.
2.
Cascade : Make the same changes in Sells.
3.
Deleted product: delete Sells tuple.
Updated product: change value in Sells.
Set NULL : Change the product to NULL.
+
14
Example: Cascade
Delete the Coke tuple from products:
Then delete all tuples from Sells that have product = ’Coke’.
Update the Coke tuple by changing ’Coke’
to ’CokeClassic’:
Then change all Sells tuples with product = ’Coke’ to product
= ’CokeClassic’.
+
15
Example: Set NULL
Delete the Coke tuple from products:
Change all tuples of Sells that have product = ’Coke’ to have
product = NULL.
Update the Coke tuple by changing ’Coke’
to ’CokeClassic’:
Same change as for deletion.
+
16
Choosing a Policy
When we declare a foreign key, we may choose policies SET NULL or
CASCADE independently for deletions and updates.
Follow the foreign-key declaration by:
ON [UPDATE, DELETE][SET NULL CASCADE]
Two such clauses may be used.
Otherwise, the default (reject) is used.
+ Example: Setting Policy
CREATE TABLE Sells (
canteen
CHAR(20),
product
CHAR(20),
price
REAL,
FOREIGN KEY(product)
REFERENCES products(name)
ON DELETE SET NULL
ON UPDATE CASCADE
);
17
+
18
Attribute-Based Checks
Constraints on the value of a particular attribute.
Add CHECK(<condition>) to the declaration for the attribute.
The condition may use the name of the attribute, but any
other relation or attribute name must be in a subquery.
+
Example: Attribute-Based Check
CREATE TABLE Sells (
canteen
CHAR(20),
product
CHAR(20)
CHECK ( product IN
(SELECT name FROM products)),
price
);
REAL CHECK ( price <= 5.00 )
19
+
20
Timing of Checks
Attribute-based checks are performed only when a value for
that attribute is inserted or updated.
Example: CHECK (price <= 5.00) checks every new price and
rejects the modification (for that tuple) if the price is more than $5.
Example: CHECK (product IN (SELECT name FROM products))
not checked if a product is deleted from products (unlike foreign-keys).
+
21
Tuple-Based Checks
CHECK (<condition>) may be added as a relation-schema
element.
The condition may refer to any attribute of the relation.
But other attributes or relations require a subquery.
Checked on insert or update only.
+
22
Example: Tuple-Based Check
Only
Joe’s canteen can sell product for more than $5:
CREATE TABLE Sells (
canteen
CHAR(20),
product
CHAR(20),
priceREAL,
CHECK (canteen = ‘Joes canteen’ OR
price <= 5.00)
);
+
23
Assertions
These are database-schema elements, like relations or views.
Defined by:
CREATE ASSERTION <name>
CHECK (<condition>);
Condition may refer to any relation or attribute in the
database schema.
+
24
Example: Assertion
In Sells(canteen, product, price), no canteen may charge an
average of more than $5.
CREATE ASSERTION NoRipoffcanteens CHECK (
NOT EXISTS (
SELECT canteen FROM Sells
GROUP BY canteen
HAVING 5.00 < AVG(price)
));
canteens with a
average price
above $5
+
25
Example: Assertion
In
Customers(name, addr, phone) and
canteens(name, addr, license), there cannot be
more canteens than Customers.
CREATE ASSERTION FewCanteen CHECK (
(SELECT COUNT(*) FROM canteens) <=
(SELECT COUNT(*) FROM Customers)
);
+
26
Timing of Assertion Checks
In principle, we must check every assertion after every
modification to any relation of the database.
A clever system can observe that only certain changes could cause
a given assertion to be violated.
Example: No change to products can affect FewCanteen. Neither can an
insertion to Customers.
+
27
Triggers: Motivation
Assertions are powerful, but the DBMS often can’t tell when
they need to be checked.
Attribute- and tuple-based checks are checked at known times,
but are not powerful.
Triggers let the user decide when to check for any condition.
+
28
Event-Condition-Action Rules
Another name for “trigger” is ECA rule, or event-conditionaction rule.
Event : typically a type of database modification, e.g., “insert
on Sells.”
Condition : Any SQL boolean-valued expression.
Action : Any SQL statements.
+
Preliminary Example: A Trigger
Instead of using a foreign-key constraint and rejecting
insertions into Sells(canteen, product, price) with unknown
products, a trigger can add that product to products, with a
NULL manufacturer.
29
+
30
Example: Trigger Definition
CREATE TRIGGER productTrig
The event
AFTER INSERT ON Sells
REFERENCING NEW ROW AS NewTuple
FOR EACH ROW
WHEN (NewTuple.product NOT IN
The condition
(SELECT name FROM products))
INSERT INTO products(name)
VALUES(NewTuple.product);
The action
+
31
Options: CREATE TRIGGER
CREATE TRIGGER <name>
Or:
CREATE OR REPLACE TRIGGER <name>
Useful if there is a trigger with that name and you want to modify
the trigger.
+
32
Options: The Event
AFTER can be BEFORE.
Also, INSTEAD OF, if the relation is a view.
A clever way to execute view modifications: have triggers
translate them to appropriate modifications on the base tables.
INSERT can be DELETE or UPDATE.
And UPDATE can be UPDATE . . . ON a particular attribute.
+
33
Options: FOR EACH ROW
Triggers are either “row-level” or “statement-level.”
FOR EACH ROW indicates row-level; its absence indicates
statement-level.
Row level triggers : execute once for each modified tuple.
Statement-level triggers : execute once for a SQL statement,
regardless of how many tuples are modified.
+
34
Options: REFERENCING
INSERT statements imply a new tuple (for row-level) or new
table (for statement-level).
The “table” is the set of inserted tuples.
DELETE implies an old tuple or table.
UPDATE implies both.
Refer to these by
[NEW OLD][TUPLE TABLE] AS <name>
+
35
Options: The Condition
Any boolean-valued condition.
Evaluated on the database as it would exist before or after the
triggering event, depending on whether BEFORE or AFTER is
used.
But always before the changes take effect.
Access the new/old tuple/table through the names in the
REFERENCING clause.
+
36
Options: The Action
There can be more than one SQL statement in the action.
Surround by BEGIN . . . END if there is more than one.
But queries make no sense in an action, so we are really
limited to modifications.
+
37
Another Example
Using Sells(canteen, product, price) and a unary relation
Ripoffcanteens(canteen), maintain a list of canteens that raise
the price of any product by more than $1.
+
38
The Trigger
CREATE TRIGGER PriceTrig
AFTER UPDATE OF price ON Sells
The event –
only changes
to prices
REFERENCING
Updates let us
OLD ROW AS ooo
talk about old
and new tuples
NEW ROW AS nnn
We need to consider
FOR EACH ROW
each price change
WHEN(nnn.price > ooo.price + 1.00)
INSERT INTO Ripoffcanteens
VALUES(nnn.canteen);
Condition:
a raise in
price > $1
When the price change
is great enough, add
the canteen to Ripoffcanteen
+
Transactions, Views, Indexes
Controlling Concurrent Behavior
Virtual and Materialized Views
Speeding Accesses to Data
39
+
40
Why Transactions?
Database systems are normally being accessed by many
users or processes at the same time.
Both queries and modifications.
Unlike operating systems, which support interaction of
processes, a DMBS needs to keep processes from
troublesome interactions.
+
41
Example: Bad Interaction
Two people with a joint back account each take $100 from
different ATMs at about the same time.
The DBMS better make sure one account deduction doesn’t get
lost.
Compare: An OS allows two people to edit a document at the
same time. If both write, one’s changes get lost.
+
42
Transactions
Transaction = process involving database queries and/or
modification.
Normally with some strong properties regarding
concurrency.
Formed in SQL from single statements or explicit
programmer control.
+ Transactions
43
The
execution of each transaction must
maintain the relationship between the
database state and the enterprise state
Therefore
additional requirements are placed
on the execution of transactions beyond those
placed on ordinary programs:
Atomicity
Consistency
Isolation
Durability
ACID properties
+
44
Database Consistency
Enterprise
(Business) Rules limit the
occurrence of certain real-world events
Student
cannot register for a course if the current
number of registrants equals the maximum allowed
Correspondingly,
allowable database states are
restricted
cur_reg <= max_reg
These
limitations are called (static) integrity
constraints: assertions that must be satisfied
by all database states (state invariants).
+
45
Database Consistency
(state invariants)
Other static consistency requirements are
related to the fact that the database might
store the same information in different ways
cur_reg
= |list_of_registered_students|
Such
limitations are also expressed as integrity
constraints
Database
is consistent if all static integrity
constraints are satisfied
+ Transaction Consistency
46
A
consistent database state does not necessarily
model the actual state of the enterprise
A deposit transaction that increments the balance by the
wrong amount maintains the integrity constraint balance
0, but does not maintain the relation between the
enterprise and database states
A
consistent transaction maintains database
consistency and the correspondence between the
database state and the enterprise state (implements its
specification)
Specification of deposit transaction includes
balance = balance + amt_deposit ,
(balance is the next value of balance)
+
47
Dynamic Integrity Constraints
(transition invariants)
Some
constraints restrict allowable state
transitions
A
transaction might transform the database from
one consistent state to another, but the
transition might not be permissible
Example:
A letter grade in a course (A, B, C, D, F)
cannot be changed to an incomplete (I)
Dynamic
constraints cannot be checked by
examining the database state
+
48
Transaction Consistency
Consistent
transaction: if DB is in consistent
state initially, when the transaction
completes:
All
static integrity constraints are satisfied (but
constraints might be violated in intermediate states)
New
No
state satisfies specifications of transaction
dynamic constraints have been violated
+ Checking Integrity Constraints
Automatic:
Embed constraint in schema.
CHECK, ASSERTION for static constraints
TRIGGER for dynamic constraints
Increases confidence in correctness and decreases
maintenance costs
Not always desirable since unnecessary checking
(overhead) might result
Deposit transaction modifies balance but cannot violate
constraint balance 0
Manual:
Perform check in application code.
Only necessary checks are performed
Scatters references to constraint throughout application
Difficult to maintain as transactions are modified/added
49
+
50
Atomicity
A
real-world event either happens or does not
happen
Student
either registers or does not register
Similarly,
the system must ensure that either
the corresponding transaction runs to
completion or, if not, it has no effect at all
Not
true of ordinary programs. A crash could leave
files partially updated on recovery
+
51
Commit and Abort
If
the transaction successfully completes
it is said to commit
The
system is responsible for ensuring that all
changes to the database have been saved
If
the transaction does not successfully
complete, it is said to abort
The
system is responsible for undoing, or
rolling back, all changes the transaction has
made
+
52
Reasons for Abort
System
crash
Transaction
Execution
aborted by system
cannot be made atomic (a site is down)
Execution
did not maintain database consistency
(integrity constraint is violated)
Execution
was not isolated
Resources
not available (deadlock)
Transaction
requests to roll back
+
53
API for Transactions
DBMS
and Transaction Processing (TP)
monitor provide commands for setting
transaction boundaries. Example:
begin transaction
commit
rollback
The
commit command is a request
The
system might commit the transaction, or it
might abort it for one of the reasons on the
previous slide
The
rollback command is always satisfied
+
54
Durability
The
system must ensure that once a transaction
commits, its effect on the database state is not
lost in spite of subsequent failures
Not
true of ordinary programs. A media failure after a
program successfully terminates could cause the file
system to be restored to a state that preceded the
program’s execution
+ Implementing Durability
Database
stored redundantly on mass storage
devices to protect against media failure
Architecture
of mass storage devices affects
type of media failures that can be tolerated
Related
to Availability: extent to which a
(possibly distributed) system can provide
service despite failure
Non-stop DBMS (mirrored disks)
Recovery based DBMS (log)
55
+ Isolation
Serial Execution: transactions execute in sequence
Each one starts after the previous one completes.
Execution of one transaction is not affected by the
operations of another since they do not overlap in time
The execution of each transaction is isolated from all
others.
If the initial database state and all transactions are
consistent, then the final database state will be
consistent and will accurately reflect the real-world
state, but
Serial execution is inadequate from a performance
perspective
56
+
57
Isolation
Concurrent
execution offers performance benefits:
A computer system has multiple resources capable of
executing independently (e.g., cpu’s, I/O devices), but
A transaction typically uses only one resource at a time
Hence, only concurrently executing transactions can
make effective use of the system
Concurrently executing transactions yield interleaved
schedules
+
begin trans
..
op1,1
..
op1,2
..
commit
Concurrent Execution
T1
op1,1 op1.2
sequence of db
operations output by T1
local computation
op1,1 op2,1 op2.2 op1.2
T2
op2,1 op2.2
DBMS
interleaved sequence of db
operations input to DBMS
local variables
58
+ Isolation
59
Interleaved
execution of a set of consistent
transactions offers performance benefits, but
might not be correct
Example:
course registration; cur_reg is number
local computation
of current registrants
not seen by DBMS
T1: r(cur_reg : 29)…………w(cur_reg : 30) commit
T2:
r(cur_reg : 29)……………..…w(cur_reg : 30) commit
time
Result: Database state no longer corresponds to
real-world state, integrity constraint violated
cur_reg <> |list_of_registered_students|
+
Interaction of Atomicity and
Isolation
T1: r(bal:10) w(bal:1000010)
abort
T2:
r(bal:1000010) w(yes!!!) commit
time
T1 deposits $1000000
T2 grants credit and commits before T1 completes
T1 aborts and rolls balance back to $10
T1 has had an effect even though it aborted!
60
+
61
Isolation
An interleaved schedule of transactions is isolated
if its effect is the same as if the transactions had
executed serially in some order (serializable)
T1: r(x)
w(x)
T2:
r(y)
w(y)
It follows that serializable schedules are always
correct (for any application)
Serializable is better than serial from a performance
point of view
DBMS uses locking to ensure that concurrent
schedules are serializable
+
62
Isolation in the Real World
SQL
supports SERIALIZABLE isolation level,
which guarantees serializability and hence
correctness for all applications
Performance
of applications running at
SERIALIZABLE is often not adequate
SQL
also supports weaker levels of isolation
with better performance characteristics
But
beware! -- a particular application might not run
correctly at a weaker level
+
63
Summary
Application
programmer is responsible for
creating consistent transactions and
choosing appropriate isolation level
The
system is responsible for creating the
abstractions of atomicity, durability, and
isolation
Greatly
simplifies programmer’s task since she
does not have to be concerned with failures or
concurrency
+
Up next: Slides from FCDB
More examples and details of transactions
Also, views and indexes
64
+
ACID Transactions
ACID transactions are:
Atomic : Whole transaction or none is done.
Consistent : Database constraints preserved.
Isolated : It appears to the user as if only one process executes at a time.
Durable : Effects of a process survive a crash.
Optional: weaker forms of transactions are often supported as well.
65
+
66
COMMIT
The SQL statement COMMIT causes a transaction to complete.
The database modifications are now permanent in the database.
+
67
ROLLBACK
The SQL statement ROLLBACK also causes the transaction to
end, but by aborting.
No effects on the database.
Failures like division by 0 or a constraint violation can also
cause rollback, even if the programmer does not request it.
+
Example: Interacting Processes
Assume the usual Sells(canteen,product,price) relation, and
suppose that Joe’s canteen sells only Coke for $2.50 and OJ for
$3.00.
Sally is querying Sells for the highest and lowest price Joe
charges.
Joe decides to stop selling Coke and OJ, but to sell only
Chocolate Milk at $3.50.
68
+
69
Sally’s Program
Sally executes the following two SQL statements called (min)
and (max) to help us remember what they do.
(max) SELECT MAX(price) FROM Sells
WHERE canteen = ‘Joes canteen’;
(min)
SELECT MIN(price) FROM Sells
WHERE canteen = ‘Joes canteen’;
+
70
Joe’s Program
At about the same time, Joe executes the following steps: (del) and
(ins).
(del)
DELETE FROM Sells
WHERE canteen = ’Joes canteen’;
(ins)
INSERT INTO Sells
VALUES(’Joes canteen’, ’ChocolateMilk’, 3.50);
+
71
Interleaving of Statements
Although (max) must come before (min), and (del) must
come before (ins), there are no other constraints on the order
of these statements, unless we group Sally’s and/or Joe’s
statements into transactions.
+
72
Example: Strange Interleaving
Suppose the steps execute in the order (max)(del)(ins)(min).
Joe’s Prices:
Statement:
{2.50,3.00} {2.50,3.00}
Result:
(max)
3.00
Sally sees MAX < MIN!
(del)
{3.50}
(ins)
(min)
3.50
+
Fixing the Problem With
Transactions
If we group Sally’s statements (max)(min) into one
transaction, then she cannot see this inconsistency.
She sees Joe’s prices at some fixed time.
Either before or after he changes prices, or in the middle, but the
MAX and MIN are computed from the same prices.
73
+
74
Another Problem: Rollback
Suppose Joe executes (del)(ins), not as a transaction, but after
executing these statements, thinks better of it and issues a
ROLLBACK statement.
If Sally executes her statements after (ins) but before the
rollback, she sees a value, 3.50, that never existed in the
database.
+
75
Solution
If Joe executes (del)(ins) as a transaction, its effect cannot be
seen by others until the transaction executes COMMIT.
If the transaction executes ROLLBACK instead, then its effects can
never be seen.
+
76
Isolation Levels
SQL defines four isolation levels = choices about what
interactions are allowed by transactions that execute at about
the same time.
Only one level (“serializable”) = ACID transactions.
Each DBMS implements transactions in its own way.
+
77
Choosing the Isolation Level
Within a transaction, we can say:
SET TRANSACTION ISOLATION LEVEL X
where X =
1.
SERIALIZABLE
2.
REPEATABLE READ
3.
READ COMMITTED
4.
READ UNCOMMITTED
+
78
Serializable Transactions
If Sally = (max)(min) and Joe = (del)(ins) are each
transactions, and Sally runs with isolation level SERIALIZABLE,
then she will see the database either before or after Joe runs,
but not in the middle.
+
Isolation Level Is Personal Choice
Your choice, e.g., run serializable, affects only how you see the
database, not how others see it.
Example: If Joe Runs serializable, but Sally doesn’t, then Sally
might see no prices for Joe’s canteen.
i.e., it looks to Sally as if she ran in the middle of Joe’s transaction.
79
+
80
Read-Commited Transactions
If Sally runs with isolation level READ COMMITTED, then she
can see only committed data, but not necessarily the same
data each time.
Example: Under READ COMMITTED, the interleaving
(max)(del)(ins)(min) is allowed, as long as Joe commits.
Sally sees MAX < MIN.
+
81
Repeatable-Read Transactions
Requirement is like read-committed, plus: if data is read
again, then everything seen the first time will be seen the
second time.
But the second and subsequent reads may see more tuples as
well.
+
82
Example: Repeatable Read
Suppose Sally runs under REPEATABLE READ, and the order
of execution is (max)(del)(ins)(min).
(max) sees prices 2.50 and 3.00.
(min) can see 3.50, but must also see 2.50 and 3.00, because they
were seen on the earlier read by (max).
+
83
Read Uncommitted
A transaction running under READ UNCOMMITTED can see
data in the database, even if it was written by a transaction
that has not committed (and may never).
Example: If Sally runs under READ UNCOMMITTED, she
could see a price 3.50 even if Joe later aborts.
+ Views
A view is a relation defined in terms of stored tables (called
base tables ) and other views.
Two kinds:
1.
Virtual = not stored in the database; just a query for
constructing the relation.
2.
Materialized = actually constructed and stored.
84
+ Declaring Views
Declare by:
CREATE [MATERIALIZED] VIEW <name> AS <query>;
Default is virtual.
85
+
86
Example: View Definition
CanDrink(drinker, product)
is a view “containing” the
drinker-product pairs such that the drinker frequents at
least one canteen that serves the product:
CREATE VIEW CanDrink AS
SELECT drinker, product
FROM Frequents, Sells
WHERE Frequents.canteen = Sells.canteen;
+
87
Example: Accessing a View
Query a view as if it were a base table.
Also: a limited ability to modify views if it makes sense as a
modification of one underlying base table.
Example query:
SELECT product FROM CanDrink
WHERE drinker = ’Sally’;
+
88
Triggers on Views
Generally, it is impossible to modify a virtual view, because it
doesn’t exist.
But an INSTEAD OF trigger lets us interpret view modifications
in a way that makes sense.
Example: View Synergy has (drinker, product, canteen) triples
such that the canteen serves the product, the drinker frequents
the canteen and likes the product.
+
89
Example: The View
CREATE VIEW Synergy AS
Pick one copy of
each attribute
SELECT Likes.drinker, Likes.product, Sells.canteen
FROM Likes, Sells, Frequents
WHERE Likes.drinker = Frequents.drinker
AND Likes.product = Sells.product
AND Sells.canteen = Frequents.canteen;
Natural join of Likes,
Sells, and Frequents
+
90
Interpreting a View Insertion
We cannot insert into Synergy --- it is a virtual view.
But we can use an INSTEAD OF trigger to turn a (drinker,
product, canteen) triple into three insertions of projected
pairs, one for each of Likes, Sells, and Frequents.
Sells.price will have to be NULL.
+
91
The Trigger
CREATE TRIGGER ViewTrig
INSTEAD OF INSERT ON Synergy
REFERENCING NEW ROW AS n
FOR EACH ROW
BEGIN
INSERT INTO LIKES VALUES(n.drinker, n.product);
INSERT INTO SELLS(canteen, product) VALUES(n.canteen, n.product);
INSERT INTO FREQUENTS VALUES(n.drinker, n.canteen);
END;
+
92
Materialized Views
Problem: each time a base table changes, the materialized
view may change.
Cannot afford to recompute the view with each change.
Solution: Periodic reconstruction of the materialized view,
which is otherwise “out of date.”
+
Example: Class Mailing List
The class mailing list all-2141 could be a materialized view of
the class enrollment in Banner.
Could be* updated at periodic intervals
Possible enroll in the class and miss an email sent out after you
enroll but before the materialized view is updated
93
+
94
Example: A Data Warehouse
CanteensRus stores every sale at every canteen in a database.
Overnight, the sales for the day are used to update a data
warehouse = materialized views of the sales.
The warehouse is used by analysts to predict trends and move
goods to where they are selling best.
+
95
Typical Data Warehouse
architecture
+
96
Database Index
A data structure that improves the speed of data retrieval
operations on a database table at the cost of slower writes and
the use of more storage space.
Indices can be created using one or more columns of a database
table, providing the basis for both rapid random lookups and
efficient access of ordered records.
In a relational database, an index is a copy of one part of a table.
Some databases extend the power of indexing by allowing
indices to be created on functions or expressions.
For example, an index could be created on upper(last_name), which
would only store the upper case versions of the last_name field in the
index.
Another option sometimes supported is the use of "filtered"
indices, where index entries are created only for those records
that satisfy some conditional expression.
+
97
Indexes
Index = data structure used to speed access to tuples of a
relation, given values of one or more attributes.
Could be a hash table, but in a DBMS it is always a balanced
search tree with giant nodes (a full disk page) called a B-tree.
+
98
Declaring Indexes
No standard!
Typical syntax:
CREATE INDEX productIndex ON products(manf);
CREATE INDEX SellIndex ON Sells(canteen, product);
+
99
Using Indexes
Given a value v, the index takes us to only those tuples that
have v in the attribute(s) of the index.
Example: use productIndex and SellIndex to find the prices
of products manufactured by Pete and sold by Joe. (next
slide)
+ Example of Using Indexes
SELECT price FROM products, Sells
WHERE manf = ‘Pete’ AND
products.name = Sells.product AND
canteen = ’Joes canteen’;
1.
Use productIndex to get all the products made by Pete.
2.
Then use SellIndex to get prices of those products, with
canteen = ’Joes canteen’
100
+
101
Database Tuning
A major problem in making a database run fast is deciding
which indexes to create.
Pro: An index speeds up queries that can use it.
Con: An index slows down all modifications on its relation
because the index must be modified too.
+
102
Example: Tuning
Suppose the only things we did with our products
database was:
1.
Insert new facts into a relation (10%).
2.
Find the price of a given product at a given canteen (90%).
Then SellInd on Sells(canteen, product) would be
wonderful, but productInd on products(manf) would be
harmful.
+
103
Tuning Advisors
A major research thrust.
Because hand tuning is so hard.
An advisor gets a query load, e.g.:
1.
Choose random queries from the history of queries run on
the database, or
2.
Designer provides a sample workload.
+
104
Tuning Advisors
The advisor generates candidate indexes and evaluates each
on the workload.
Feed each sample query to the query optimizer, which assumes
only this one index is available.
Measure the improvement/degradation in the average running
time of the queries.