16. Concurrency Control and Recovery (only for DBs with updates…..!)- Review  Concurrency Control  Transaction  ACID  Isolation • Schedules • Guaranteeing isolation • • • • • 11/6/2015 Serializability Serializability ⇔ Isolation Locking Strict Two.

Download Report

Transcript 16. Concurrency Control and Recovery (only for DBs with updates…..!)- Review  Concurrency Control  Transaction  ACID  Isolation • Schedules • Guaranteeing isolation • • • • • 11/6/2015 Serializability Serializability ⇔ Isolation Locking Strict Two.

16. Concurrency Control and Recovery
(only for DBs with updates…..!)- Review

Concurrency Control
 Transaction
 ACID
 Isolation
• Schedules
• Guaranteeing isolation
•
•
•
•
•
11/6/2015
Serializability
Serializability ⇔ Isolation
Locking
Strict Two Phase Locking
Strict 2PL ⇒Serializable
PSU’s CS 587
1
Learning Objectives
Define ACID, schedule, isolated, equivalent,
serializable, S2PL, conflict serializable,
precedence graph, recoverable.
 Know the implications on slide 25 and when
the converses hold
 Explain lock management, multiple
granularity locks, phantoms, locking in
BTrees, optimistic concurrency control

11/6/2015
PSU’s CS 587
2
Example Transaction

Transfer $100 from A to B
 Read A; Verify A; Write A-100; then
 Read B; Verify B; Write B+100
Are all 6 steps necessary?
 Which steps require disc access?
 When can an abort occur without damage?

 Write is as in a program’s write
What damage can an abort cause?
 How can you avoid such damage?

11/6/2015
PSU’s CS 587
3
Transaction (cont.)
User (application developer) must indicate:
 Begin transaction
 read/write/modify statements intermixed with
other programming language statements

plus either
 commit - indicates successful completion or
 abort - indicates program wants to roll back (erase
the transaction)

11/6/2015
All or nothing! (Atomic)
PSU’s CS 587
4
Supporting the ACID Properties of
Transactions
Recovery  Atomicity: All actions in a transaction happen in
their entirety or not at all.
System

Programmers
Consistency: If the DB starts in a consistent state, (this
notion is defined by the user; some of it
may be enforced by integrity constraints)
and if a transaction executes with no other
queries active, then the DB ends up
in a consistent state.
Concurrency
 Isolation:
Each transaction is isolated from other
Control
transactions. The effect on the DB is as
if each transaction executed by itself.
System
Recovery  Durability: If a transaction commits, its changes
to the database state persist.
System
11/6/2015
PSU’s CS 587
5
Isolation/Concurrency Control
T1:
BEGIN A+=100, B-=100 END
T2:
BEGIN A=1.06*A, B=1.06*B END
What is each of these transactions doing?
 A schedule of T1 and T2 is an interleaving of
the steps of these transactions so that each
transaction’s order is preserved.

11/6/2015
PSU’s CS 587
6
Which of these is a Schedule of T1 and T2?
T1: A+=100, B-=100
T2:
A=1.06*A, B=1.06*B
T1:
A+=100, B-=100
T2: A=1.06*A, B=1.06*B
T1: A+=100,
B-=100
T2:
A=1.06*A,
B=1.06*B
T1: A+=100,
B-=100
T2:
B=1.06*B,
A=1.06*A
T1: A+=100,
B-=100
T2:
A=1.06*A,B=1.06*B
11/6/2015
PSU’s CS 587
7
Isolated Schedules
A schedule is isolated if its effect on the DB is as
if each transaction executed by itself, serially.
 Which of the schedules on the next page is
isolated?
 Hint: Calculate the effect of the schedule on a
sample state of the DB, for example A has
$1,000, B has $500. This won’t tell you the
effect on all states, but it’s helpful information.

11/6/2015
PSU’s CS 587
8
Which Schedules are Isolated?
T1: A+=100, B-=100
T2:
A=1.06*A, B=1.06*B
T1:
A+=100, B-=100
T2: A=1.06*A, B=1.06*B
T1: A+=100,
B-=100
T2:
A=1.06*A,
B=1.06*B
T1: A+=100,
B-=100
T2:
A=1.06*A,B=1.06*B
T1:
A+=100,B-=100
T2: A=1.06*A,
B=1.06*B

Goal of Concurrency Control subsystem:
Guarantee only isolated schedules.
11/6/2015
PSU’s CS 587
9
Equivalent Schedules
Two schedules are equivalent if given any
starting DB state, they produce the same result.
 Which of these schedules is equivalent?

T1: A+=100, B-=100
T2:
A=1.06*A, B=1.06*B
T1:
A+=100, B-=100
T2: A=1.06*A, B=1.06*B
T1: A+=100,
B-=100
T2:
A=1.06*A,
B=1.06*B
T1: A+=100,
B-=100
T2:
A=1.06*A,B=1.06*B
11/6/2015
PSU’s CS 587
10
Serializable Schedules
A schedule is serializable if it is equivalent to a
serial schedule.
 Which of these schedules is serializable?

T1: A+=100, B-=100
T2:
A=1.06*A, B=1.06*B
T1:
A+=100, B-=100
T2: A=1.06*A, B=1.06*B
T1: A+=100,
B-=100
T2:
A=1.06*A,
B=1.06*B
T1: A+=100,
B-=100
T2:
A=1.06*A,B=1.06*B
11/6/2015
PSU’s CS 587
11
The goal of Concurrency Control


Recall the goal of concurrency control: To
ensure that all schedules are isolated
Theorem: A schedule is Serializable ⇔ it is
Isolated
⇒: Serializable ⇒ equivalent to some serial schedule,
and in a serial schedule, each Xact. is isolated
⇐: If each xact runs alone, the schedule must be serial

But serializability is hard to verify
 How can we, in real time, check each schedule?

11/6/2015
So the Concurrency Control Subsystem needs
more work.
PSU’s CS 587
12
Locking

Transaction must get a lock – before it can read or update
data

There are two kinds of locks:
shared (S) locks and exclusive (X) locks

To read a record you MUST get an S lock
To modify or delete a record you MUST get an X lock

Lock info maintained by a “lock manager”
11/6/2015
PSU’s CS 587
13
How Locks Work


11/6/2015
If a Xact has an S lock on a
data object , new transactions
can get S locks on that object,
but not X locks.
If a Xact has an X lock, no
other Xact can get any lock (S
or X) on that data object.
If a transaction can’t get a
lock, it waits (in a queue).
PSU’s CS 587
lock on data item
-lock you want

S
X
--
ok ok ok
S
ok ok no
X
ok no no
Lock compatibility
14
Strict Two Phase Locking Protocol
(S2PL)
Strict 2PL is a way of managing locks during a transaction
 A Xact gets (S and X) locks gradually, as needed
 The Xact holds all locks until end of transaction
(commit/abort)
All locks
are released
at the end,
upon commit or abort
5
4
3
# of locks
held by a
transaction T 2
1
0
11/6/2015
time 
PSU’s CS 587
15
Strict 2PL guarantees serializability
Idea of the Proof: a Strict 2PL schedule is
equivalent to the serial schedule in which each
transaction runs instantaneously at the time that
it commits
 This is huge: A property of each transaction
(S2PL) implies a property of any set of
transactions (serializability)

 No need to check serializability of any schedules
Real DBMSs use S2PL to enforce serializability
 In reality, users can and do choose lower levels
of concurrency for all but the most sensitive
transactions

11/6/2015
PSU’s CS 587
16
17. Concurrency Control

Conflicts





Conflicting Actions
Conflict Equivalent
Conflict Serializable
Conf. Ser. ⇒ Serializable
Precedence Graph
 Conf. Serializable
⇔Precedence graph is
acyclic

Locks
 Management
 Deadlocks
• Waits-for
 Multiple Granularity

Phantoms
• Predicate, Index
locking
Locking in B+ Trees
 Strict 2PL ⇒Recoverable
 Optimistic CC
 2PL ¬⇒ Recoverable
• Inefficiency of locking

• Optimistic CC idea
11/6/2015
PSU’s CS 587
17
17. CC
17.1 Conflict Serializable Schedules

Conflicting actions: Actions that access the same
data and at least one of which is a write
 Note that changing the order of these two actions might
yield different results.

Two schedules are conflict equivalent if:



They involve the same actions of the same transactions in
the same order
Every pair of conflicting actions is ordered the same way
Schedule S is conflict serializable if S is conflict
equivalent to some serial schedule
11/6/2015
PSU’s CS 587
18
17. CC
Which are conflict serializable?
T1: R(A),W(A),
R(B),W(B)
T2:
R(A),W(A),
R(B),W(B)
T1:
R(A),
T2: R(A),
W(A), R(B)
W(A)
T1:
R(B),
T2: R(A),
W(A), R(B)
W(A), W(B)
T1: R(A),
W(A)
T2:
W(A)
T3:
W(A)
11/6/2015
PSU’s CS 587
19
Conflict Serializable ⇒ Serializable


If two actions do not conflict, then commuting
them results in an equivalent schedule.
Suppose S is conflict serializable. Then there is a
sequence of commuting actions I = {I1,…,In} so that
a) Each of the Ii commutes nonconflicting actions
b) I applied to S is a serial schedule

Because of (a), I does not change the state of any
database. Thus S, and I applied to S, are equivalent
and I applied to S is serial (b), so S is serializable.
11/6/2015
PSU’s CS 587
20
Serializable does NOT imply
Conflict Serializable
T1: R(A),
W(A)
T2:
W(A)
T3:
W(A)

Equivalent to what serial schedule?
 Therefore it is a serializable schedule

11/6/2015
Why is it not conflict serializable? (for now
just give an intuitive reason, later we will
have a proof)
PSU’s CS 587
21
17. CC
Precedence graphs

T1:
T2:
Why is this graph not conflict serializable?
R(A), W(A),
R(A), W(A), R(B), W(B)
R(B), W(B)
A
T1
T2
Precedence graph
B

11/6/2015
The cycle in the graph illustrates the problem.
T1 must precede T2, and T2 must precede T1,
in any conflict equivalent serial schedule.
PSU’s CS 587
22
17. CC
Precedence Graph
Precedence graph: One node per Xact; edge
from Ti to Tj if an action of Ti precedes and
conflicts with an action of Tj.
 Theorem: Schedule is conflict serializable if
and only if its precedence graph is acyclic

 ⇒If there is a cycle in the graph, it cannot be
serializable (see previous page & generalize)
 ⇐ If the graph is acyclic, the schedule is
equivalent to a topologically sorted order of the
actions.
11/6/2015
PSU’s CS 587
23
Example of acyclic graph
T1
T2
T4
T3
Is this graph acyclic?
 What is a topological sort of it?
 Is a schedule, for which this is a precedence graph,
equivalent to a serial schedule?

 Can we move all actions of T4 to occur before T2,
without reversing conflicting actions?
 How about T1 before T4?
11/6/2015
PSU’s CS 587
24
Summary
Each Xact in a schedule
is Isolated

The schedule is Serializable

Serializable Schedule: Same result as a
serial schedule

Conflict Serializable : Conflict
Equivalent to a Serializable Schedule

Strict 2PL: There is a locking schedule
where all locks are held until EOT

Deadlock: There is a cycle in the
Waitsfor graph.

The schedule is Conflict
Serializable
Isolated Xact: same
results as if it ran alone


The Schedule’s Precedence
Graph is Acyclic

The schedule is consistent
with Strict 2PL.

Deadlock is possible
11/6/2015
PSU’s CS 587
25
Strict 2PL⇒Recoverable
A schedule is recoverable if, during it, all
transactions commit only after all transactions
whose data they have read commit.
 Why is recoverability desirable? Otherwise,
T1 may read the data of T2 ( a so-called dirty
read), then T1 commit, then T2 abort and roll
back. Then T1 has read a value that does not
exist.

11/6/2015
PSU’s CS 587
26
17. CC
Two-Phase Locking (2PL)

Two-Phase Locking Protocol


Each Xact must obtain a S (shared) lock on object
before reading, and an X (exclusive) lock on object
before writing.
A transaction can not request additional locks
once it releases any locks.
2PL implies that all schedules have acyclic
precedence graphs, so are serializable.
 However, they are not recoverable, so Strict
2PL is used in practice.

11/6/2015
PSU’s CS 587
27
17. CC
Lock Management


Lock and unlock requests are handled by the lock
manager
Lock table entry:



IDs of transactions currently holding a lock
Type of lock held (shared or exclusive)
Pointer to queue of lock requests
•

If there is an S lock on an object O and T1 requests an X lock,
what happens? What if then T2 requests an S lock?
Locking and unlocking have to be atomic operations
 How is this enforced?

11/6/2015
Lock upgrade: transaction that holds a shared lock
can be upgraded to hold an exclusive lock if no one
else has a shared lock.
PSU’s CS 587
28
Managing a new lock (simplified)
New Lock
Type of lock?
S
X
N
 lock?
N
Grant S
lock
Y
S
EnQ
N
Y
EnQ
Y
 lock?
Queue
empty?
Grant X
lock
Type of
lock?
X
EnQ
11/6/2015
PSU’s CS 587
29
Managing a lock release (simplified)
Release Lock
 Other locks?
Y
Exit
N
DeQ xact from Q,
give it a lock
Y
Is there an S
lock on top
of the Q?
S
N
What type of
lock was it?
X
Exit
11/6/2015
PSU’s CS 587
30
17. CC
Deadlocks
Deadlock: Cycle of transactions waiting for
locks to be released by each other.
 Two ways of dealing with deadlocks:



11/6/2015
Deadlock prevention
Deadlock detection
PSU’s CS 587
31
17. CC
Deadlock Prevention

Theory
 Assign priorities based on timestamps.
• Older transactions get higher priority.
 Assume Ti wants a lock that Tj holds. Two policies are
possible:
•
•
Wait-Die: It Ti has higher priority, Ti waits for Tj; otherwise Ti
aborts
Wound-wait: If Ti has higher priority, Tj aborts; otherwise Ti
waits
 If a transaction re-starts, make sure it has its original
timestamp

11/6/2015
Practice:
http://dev.mysql.com/doc/refman/5.0/en/innodbdeadlocks.html
PSU’s CS 587
32
17. CC
Deadlock Detection

Create a waits-for graph:


Nodes are transactions
There is an edge from Ti to Tj if Ti is waiting for Tj
to release a lock
Periodically check for cycles in the waits-for
graph
 Note that waits-for graph is opposite
direction of precedence graph.

11/6/2015
PSU’s CS 587
33
17. CC
Deadlock Detection (Continued)
Example:
T1: S(A), R(A),
S(B)
T2:
X(B),W(B)
X(C)
T3:
S(C), R(C)
X(A)
T4:
X(B)
11/6/2015
T1
T2
T1
T2
T4
T3
T4
T3
PSU’s CS 587
34
17. CC
Multiple-Granularity Locks
Hard to decide what granularity to lock
(tuples vs. pages vs. tables).
 Shouldn’t have to decide!
 Data “containers” are nested:

Database
contains
Tables
Pages
Tuples
11/6/2015
PSU’s CS 587
35
17. CC
Solution: New Lock Modes, Protocol

v
v
v
11/6/2015
Allow Xacts to lock at each level, but with a
special protocol using new “intention” locks:
Before locking an item, Xact
must set “intention locks”
on all its ancestors.
IX(IS): Intend to X(S) lock a
subset.
SIX: S & IX at the same
time. Used to scan and
update selected records.
PSU’s CS 587
--
IS IX S
X





IS 



IX 





--
S
X

36
17. CC
Multiple Granularity Lock Protocol
Each Xact starts from the root of the hierarchy.
 To get S or IS lock on a node, must hold IS or IX
on parent node.
 To get X or IX or SIX on a node, must hold IX or
SIX on parent node.
 Must release locks in bottom-up order.
 Sometimes hard to decide granularity of locks.
Can start small and use lock escalation.

11/6/2015
PSU’s CS 587
37
17. CC
Examples

T1 scans R, and updates a few tuples:
 T1 gets an SIX lock on R, then repeatedly gets an S
lock on tuples of R, and occasionally upgrades to
X on the tuples.

T2 uses an index to read only part of R:
 T2 gets an IS lock on R, and repeatedly
gets an S lock on tuples of R.

T3 reads all of R:
 T3 gets an S lock on R.
 OR, T3 could behave like T2; can
use lock escalation to decide which.
11/6/2015
PSU’s CS 587
--
IS IX S
X





IS 



IX 





--
S
X

38
17. CC
Dynamic Databases: Phantoms

If we allow updates, even Strict 2PL will
not assure serializability:
 T1 finds oldest sailor in each rank
 T2 inserts(Rohi,1,27) and deletes John
 Schedule is
T1 rank 1
Name Rank Age
Pehr
1
25
John
2
26
Lorr
2
23
T1 rank 2
T2 inserts Rohi, deletes John

This schedule is Strict 2PL, but not
serializable!
 Result of this schedule is (1,Pehr)(2,Lorr)
 Result of T1;T2 is (1,Pehr)(2,John)
 Result of T2;T1 is (1,Rohi)(2,Lorr)
11/6/2015
PSU’s CS 587
39
17. CC
The Problem
When T1 retrieved the oldest sailor of rank 1,
it locked each sailor of rank 1 with a read
lock.
 None of these locks applied to the new
record (a phantom) inserted by T2.
 We need a mechanism to prevent phantoms;
to allow T1 to lock present and future sailors
with rank 1.
 There are two such mechanisms, index
locking and predicate locking.

11/6/2015
PSU’s CS 587
40
Data
Index
Index Locking

r=1
If there is a dense index on the rating field
using Alternative (2), T1 should lock the
index page(s) containing the data entries with
rating = 1.
 If there are no records with rating = 1, T1 must
lock the index page where such a data entry would
be, if it existed!

If there is no suitable index, T1 must lock all
pages, and lock the file/table to prevent new
pages from being added, to ensure that no
new records with rating = 1 are added.
11/6/2015
PSU’s CS 587
41
17. CC
Predicate Locking
Grant lock on all records that satisfy some
logical predicate, e.g. age > 2*salary.
 Index locking is a special case of predicate
locking for which an index supports efficient
implementation of the predicate lock.

 What is the predicate in the sailor example?

11/6/2015
In general, predicate locking has a lot of
locking overhead.
PSU’s CS 587
42
17. CC
Locking in B+ Trees

How can we efficiently lock a B+ tree?
 Btw, don’t confuse this with multiple granularity
locking!
One solution: Ignore the tree structure, just lock
pages while traversing the tree, following 2PL.
 This has terrible performance!

 Root node (and many higher level nodes) become
bottlenecks because every tree access begins at the
root. This single threads all updates to the tree.
11/6/2015
PSU’s CS 587
43
17. CC
Two Useful Observations
Higher levels of the tree only direct searches
for leaf pages.
 For inserts/deletes, a node on a path from
root to modified leaf must be locked (in X
mode, of course), only if a split can propagate
up to it from the modified leaf.
 We can exploit these observations to design
efficient locking protocols that guarantee
serializability even though they violate 2PL.

11/6/2015
PSU’s CS 587
44
17. CC
A Tree Locking Algorithm
Search: Start at root and go down;
repeatedly, S lock child then unlock parent.
 Insert/Delete: Start at root and go down,
obtaining X locks as needed. Once child is
locked, check if it is safe:

 If child is safe, release all locks on ancestors.

Safe node: Node such that the change will
not propagate up beyond this node.
 Inserts: Node is not full.
 Deletes: Node is not half-empty.
11/6/2015
PSU’s CS 587
45
ROOT
A
20
Example
B
35 35
23 35
23
20*
11/6/2015
22*
38 35 44
38
44
F
H
G
23*
24*
Do:
1) Search 38*
2) Delete 38*
3) Insert 45*
4) Insert 25*
I
35*
36*
PSU’s CS 587
C
D
38*
41*
E
44*
46
17. CC
Optimistic CC (Kung-Robinson)
Locking is a conservative approach in which
conflicts are prevented. Disadvantages:
 Lock management overhead.
 Deadlock detection/resolution.
 Lock contention for heavily used objects.
 If conflicts are rare, we might be able to gain
concurrency by not locking, and instead
checking for conflicts before Xacts commit.
 A version of this optimistic approach is used
by PostgreSQL and Oracle

11/6/2015
PSU’s CS 587
47
17. CC
Kung-Robinson Model

Xacts have three phases:
 READ: Xacts read from the database, but
make changes to private copies of objects.
 VALIDATE: Check for conflicts.
 WRITE: Make local copies of changes
public.
old
modified
objects
11/6/2015
ROOT
new
PSU’s CS 587
48
17. CC
Validation

Each Xact is assigned a numeric id.
 Just use a timestamp.
Xact ids assigned at end of READ phase, just
before validation begins.
 ReadSet(Ti): Set of objects read by Xact Ti.
 WriteSet(Ti): Set of objects modified by Ti.

11/6/2015
PSU’s CS 587
49
17. CC
Test 1

For all i and j such that TSi < TSj, check that Ti
completes before Tj begins.
Ti
R
V
Tj
W
R
11/6/2015
PSU’s CS 587
V
W
50
17. CC
Test 2

For all i and j such that Ti < Tj, check that:
 Ti completes before Tj begins its Write phase +
 WriteSet(Ti)
ReadSet(Tj) is empty.
Ti
R
V
W
R
V
W
Tj
Does Tj read dirty data? Does Ti overwrite Tj’s writes?
11/6/2015
PSU’s CS 587
51
17. CC
Test 3

Ti
For all i and j such that Ti < Tj, check that:
 Ti completes Read phase before Tj does +
 WriteSet(Ti)
ReadSet(Tj) is empty +
 WriteSet(Ti)
WriteSet(Tj) is empty.
R
V
R
W
V
W
Tj
Does Tj read dirty data? Does Ti overwrite Tj’s writes?
11/6/2015
PSU’s CS 587
52
17. CC
Overheads in Optimistic CC

Must record read/write activity in ReadSet and
WriteSet per Xact.
 Must create and destroy these sets as needed.

Must check for conflicts during validation, and
must make validated writes ``global’’.
 Critical section can reduce concurrency.
 Scheme for making writes global can reduce clustering
of objects.

Optimistic CC restarts Xacts that fail validation.
 Work done so far is wasted; requires clean-up.
11/6/2015
PSU’s CS 587
53