Transaction Management

Download Report

Transcript Transaction Management

CS 440
Database Management Systems
Concurrency Control
1
Concurrent access to data
Flight (fltNo, fltDate, seatNo, seatStatus)
Database: seats 22A and 22B are available.
1.
2.
3.
4.
John checks for availability and gets seat 22A
John books seat 22A
Mary checks for availability and gets seat 22B
Mary books seat 22B
Database: seats 22A and 22B are reserved.
2
Concurrent access to data: problems
Database: seats 22A and 22B are available.
1.
2.
3.
4.
John checks for availability and gets seat 22A
Mary checks for availability and gets seat 22A
John books seat 22A
Mary books seat 22A
Database: double booking on 22A!!!
• Database is not consistent.
• Solution: Run John’s before Mary’s (or vice versa)
– serial schedule
3
Serial schedules
Database: seats 22A and 22B are available.
1. Admin scans Flight to generate some reports.
2. Mary checks the flights for tomorrow.
It takes a long time!
Database: seats 22A and 22B are available.
• Nobody can use the database until Admin is done!!
How to run programs concurrently without them
messing up each other’s results?
4
Transaction
• a “program” of atomic database operations.
• atomic unit of database transformation.
START TRANSACTION
SQL-statement 1;
SQL-statement 2;
…
SQL-statement n;
COMMIT;
5
The ACID properties
•
•
A tomicity: All actions in the Xact happen, or none happen.
C onsistency: If each Xact is consistent, and the DB starts
consistent, it ends up consistent.
• I solation:
Execution of one Xact is isolated from that of other
Xacts.
•
D urability:
If a Xact commits, its effects persist.
Atomicity
Balance transfer from Acct. 170 to Acct. 103:
1.
2.
3.
4.
5.
6.
check if account #170 has at least $200.
if #170 does not have enough fund => abort transaction
reduce the balance of #170 by $200.
check eligibility of #103, if not eligible => abort transaction
increase the balance of #103 by $200.
commit
• roll_back command to abort the transaction
– database system undoes all transaction’s modifications.
7
Atomicity
Balance transfer from Acct. 170 to Acct. 103:
1.
2.
3.
4.
check if account #170 has at least $200.
reduce the balance of #170 by $200.
increase the balance of #103 by $200.
commit
• Power outage in before SQL statement #3
– transaction aborts
– violates atomicity
– database system should automatically undo all transaction’s
modifications.
8
Isolation
Database: seats 22A and 22B are available.
1.
2.
3.
4.
John checks for availability and gets seat 22A
Mary checks for availability and gets seat 22A
John books seat 22A
Mary books seat 22A
Database: double booking on 22A!!
• Concurrency control:
– ensures that transactions are interleaved correctly.
9
Durability
Balance transfer from Acct. 170 to Acct. 103:
1.
2.
3.
4.
check if account 170 has at least $200.
reduce the balance of 170 by $200.
increase the balance of 103 by $200.
commit
• Database system says commit was successful, but the data is in
buffer.
• Power outage right after commit
– modifications of the transaction are gone!
• violates Durability
• Database system must make sure that the effects of committed
transactions persist.
10
The ACID properties
•
•
A tomicity: All actions in the Xact happen, or none happen.
C onsistency: If each Xact is consistent, and the DB starts
consistent, it ends up consistent.
• I solation:
Xacts.
•
D urability:
Execution of one Xact is isolated from that of other
Today’s topic
If a Xact commits, its effects persist.
Transaction Interleaving
• Before: A = 0, B = 0, then?
• Correct schedule? Wrong schedule? Why?
Xact T1
read(A)
A=A+1
write(A)
Xact T2
Xact T1
read(A)
A=A+1
write(A)
read(A)
A=A*2
write(A)
read(A)
A=A*2
write(A)
read(B)
B=B*2
write(B)
read(B)
B=B+1
write(B)
read (B)
B=B*2
write (B)
12
schedule 1
Xact T2
read (B)
B=B+1
write (B)
schedule 2
Xact T1
read(A)
A=A+1
write(A)
read(B)
B=B+1
write(B)
Xact T2
read(A)
A=A*2
write(A)
read(B)
B=B*2
write(B)
schedule 3
Transaction Interleaving
•
Schedule 1 (interleaved, more concurrent) = Schedule 3 (serial)
– (A=0,B=0) --- T1 --> (A=1,B=1) -- T2 --> (A=2,B=2)
– Consistency and isolation: transform DB in serial
•
•
Serializable schedule: a concurrent schedule that impacts DB like a serial schedule.
How to check if a concurrent schedule is serializable?
Xact T1
read(A)
A=A+1
write(A)
Xact T2
read(A)
A=A*2
write(A)
read(B)
B=B+1
write(B)
read (B)
B=B*2
write (B)
13
schedule 1
Xact T1
read(A)
A=A+1
write(A)
read(B)
B=B+1
write(B)
Xact T2
read(A)
A=A*2
write(A)
read(B)
B=B*2
write(B)
schedule 3
Conflicting Operations
Transaction 1:
Add $100 to
account A
Transaction 2:
Add $200 to
account A
read(A)
A= A+100
Time
read(A)
A= A+200
write(A)
write(A)
lost update => the order of write(A)
and read(A) matters
14
Conflicting operations
• We denote operations of transaction i as writei or readi.
• Conflicting operations:
– operations from different transactions such that:
• two writes of the same data item: writei(A); witej(A).
• a read and a write on the same data item: readi(A); writej(A).
• Change the relative order of conflicting operators =>
change the final state of DB.
– A serializable schedule has the same order of conflicting operators
as a serial schedule.
15
Serialization graph
• Define dependencies between transactions with conflicting operations
• Serialization (precedence) graph: Ti --> Tj for a dependency from Ti to Tj
• Serializable if serialization graph does not have any cycle
Xact T1
read(A)
A=A+1
write(A)
Xact T2
T1
Xact T1
read(A)
A=A+1
write(A)
read(B)
B=B+1
write(B)
16
schedule 1
T1
read(A)
A=A*2
write(A)
read(B)
B=B*2
write(B)
read(A)
A=A*2
write(A)
read (B)
B=B*2
write (B)
Xact T2
T2
read (B)
B=B+1
write (B)
schedule 2
T2
Guaranteeing isolation
• Scheduler guarantees serializability
– restricts the access of transactions on data items.
– enforces some order on conflicting operations.
• Two approaches:
– Pessimistic: There are many conflicting transactions.
– Optimistic: There are a few conflicting transactions.
Locking Protocol
• A “protocol” for accessing data
– well-formed transactions lock/unlock “access units”
before/after using them
– lock manager grants/manages locks
• Goals of locking protocol
– ensure serializability
– preserve high concurrency
• Parameters of a locking protocol?
18
Locking Protocol: Parameters
• What “modes” of locks to provide?
Compatibility?
– e.g., S for shared, X for exclusive
• How to “well-behave” to obtain and hold
locks?
– in what sequence?
– how long to hold?
• What “units” to lock?
– database? table? tuple? what else?
19
Lock modes and compatibility
• Shared lock = read lock= S
– multiple transactions hold a shared lock over a data item.
• Exclusive lock = write lock = X
– at most one transaction holds an exclusive lock over a data item.
• Lock manager gives locks based on compatibility matrix:
X
S
X
N
S
N
N
Y
Motivation: a “simple” protocol
• Lock modes:
– S for shared and X for exclusive access
– compatibility: (S, S) = T, otherwise F
• Behavior:
– lock (the maximum mode) before access
– release lock immediately after
• Unit:
– a relation
21
Simple Protocol: what’s wrong?
Xact T1
Xact T2
X.lock(A’s relation)
read(A)
A=A+1
write(A)
X.release-lock(A’s relation)
X.lock(A’s relation)
read (A)
A=A*2
write (A)
X.release-lock(A’s relation)
X.lock(B’s relation)
read(B)
B=B*2
write(B)
X. release-lock(B’s relation)
X.lock(B’s relation)
read(B)
B=B+1
write(B)
X.release-lock(B’s relation)
22
Solution: 2 Phase Locking (2PL)
• Each transaction has two phases:
– Getting locks (growing)
• acquire lock of the required mode (S or X)
• can only lock data items during this phase.
– may also upgrade the locks (from S-lock to X-lock).
• read/ write the locked data items.
• no release-lock in this phase
– Releasing locks (shrinking)
• can only release locks on the data items.
– may also downgrade the locks (from X-lock to S-lock).
• the phase starts with the first release-lock.
• no locking after the first release-lock.
• Rule: Transactions do not get any new lock after giving up one.
2PL Schedule
Xact: T1
Xact: T2
X.lock(B)
write(B)
S-lock(A)
read(A)
X.release-lock(A)
shrinking
write(B)
X.release-lock(B)
S.lock(A)
read(A)
X.lock(A)
write(A)
X.release-lock(A)
upgrade
Simple protocol => 2PL protocol
• 2PL --> serializable
Xact T1
X.lock(A)
…
Xact T2
X.release-lock(A)
X.lock(A)
…
X.release-lock(A)
X.lock(B)
…
X.release-lock(B)
X.lock(B)
…
X.release-lock(B)
25
Xact T1
S.lock(A)
…
S.lock(B)
…
X.release-lock(A)
Xact T2
X.lock(A)
…
…
X.release-lock(B)
X.lock(B)
…
X. release-lock(A)
X. release-lock(B)
Why 2PL schedule is serializable?
• Locks of conflicting operations are not compatible.
• 2PL does not allow the swap of conflicting operations.
– serial order between conflicting operations
– all conflicting operations of T1 before T2.
• It is possible to swap non-conflicting operations.
– provides concurrency
• Some serializable schedules are not allowed in 2PL!
– 2PL schedules are a subset of all serializable schedules
– 2PL does not provide the ideal degree of concurrency.
Locking Protocol: Granularity?
• Unit of locking: How to increase concurrency?
– coarse units? fine units?
• Granularity:
– concurrency vs. overhead
– hierarchical lockable units:
• Database, relations/ files, pages, tuples, attributes
• Correctness problem:
– T1 S.locks a tuple, T2 X.locks the file?
27
Granularity Locking Example
T1: I
DB
T2: Grant S?
T1: I
Relation Student
Relation Enrollment
T1: X
Tuple
Tuple
Tuple
Tuple
• Transaction T1 writes on a tuple in Student => X lock on tuple
• Transaction T2 wants to scan the whole DB => S lock on DB, allowed??
• T1 has to implicitly notify other transactions
28
– puts “intentions locks” on the tuple’s parents in the lockable units hierarchy
Granularity Locking Example, contd.
T1: IS
DB
T2: Grant S?
T1: IS
Relation Student
Relation Enrollment
T1: S
Tuple
Tuple
Tuple
Tuple
• T1 reads some tuples from Student => S on tuple, I on its parents
• T2 wants to scan the whole DB => S lock on DB, allowed??
• We need different types of intention locks to increase concurrency
– IS and IX
29 – also SIX (intension to upgrade), why is it useful?
Lock Compatibility Table
NL
IS
IX
S
SIX
X
NL
YES
YES
YES
YES
YES
YES
IS
YES
YES
YES
YES
YES
NO
IX
YES
YES
YES
NO
NO
NO
S
YES
NO
YES
NO
NO
SIX YES YES
NO
NO
NO
NO
NO
NO
NO
NO
X
YES
YES
NO
X
SIX
S
IX
IS
NL
privilege ordering
30
Compatibility Example
SIX
SIX
Relation Student
X
Grant S? IS? IX?
DB
SIX
Relation Enrollment
X
Tuple
Tuple
Tuple
• Questions:
– (SIX, S) = No? (SIX, IS) = Yes? (SIX, IX) = No?
31
Tuple
Granularity Locking
• Database: as hierarchy of lockable units
• Locking: to lock a unit,
– first lock all containing units with “intension”
– intension locks: IS, IX, SIX (intension to upgrade)
• Unlocking:
– release all relevant locks at once, or leaf to root
– why this order?
32
Granularity Locking: DAG
• Generalization: DAG of units:
– S locks at least one path to the node
– X locks all paths to the node
33
How “long” to hold a lock?
• 2PL requirement:
– only “shrink” after “growing”
• End of transaction:
– unlock (to make data accessible) at xact commit
– why?
34
Cascading rollback
T1
T2
write(B)
read(A)
write(A)
read(A)
write(B)
commit
T2 Aborts
We should never
have let T1 commit
Handling cascading rollback
• Commit T only after all transactions that wrote
data that T read commit.
• T reads an item after the transaction that last
wrote the item commits.
• Strict 2PL
– 2PL + transaction releases locks after commit.
Consistency: Dirty-Data Based
T does not overwrite dirty data of other xacts
0
T does not commit any writes until EOT
1
T does not read dirty data from other xacts
2
other xacts do not dirty any data read by T
before T completes
3
• Sacrificing semantic guarantees for performance
• How to lock for each degree?
37
Examples
• Garbage reads
– T1: w(A)
A?
– T2: w(A)
– value of A is not clear => prevented by degree 0.
• Lost updates
– T1: w(A)
abort
– T2:
w(A)
– degree 1 can prevent T1.abort from masking T2.w(A).
• Dirty reads:
– T1: w(A)
abort
– T2:
r(A)
– degree 2 can prevent T2.r(A) from reading bogus A.
• Inconsistent (unrepeatable) reads
– T1:
w(A)
– T2: r(A)
r(A)
– degree 3 can ensure repeated reads will be consistent.
38
Degrees of Consistency:
Example of a Control System
• The transaction that reads a gauge and writes values in the database
– degree 0, for performance reasons.
• The transaction that reads the data and computes mean and var.
– degree 1, because mean and var. should be consistent.
• The transaction that reads the mean and prints it.
– degree 2, we do not show an “undone” mean
• The transaction that reads both mean and var.
– degree 3, insures that mean and var. are consistent
39
Degrees of Consistency:
Locking-Based Definition
• We can define and implement degrees of consistency
using locking.
• Types of locks based on their duration:
– short: held for the duration of a single operation (read/
write)
– long: held for the duration of transaction (till EOT)
• All combinations of short/ long and shared (read) /
exclusive (write) locks are possible.
– example: short shared (read) lock, long exclusive (write)
lock
40
Degrees of Consistency:
Locking-Based Definition, cont’d
• T does not overwrite dirty data of other xacts
– set short write locks on dirty data (well-formed on w)
• T does not commit any writes until EOT
– set long write locks (2P/EOT on w)
• T does not read dirty data from other xacts
– set short read locks (well-formed on r)
• Other xacts do not dirty any data read by T before T
completes
– set long read locks (2P/EOT on r)
41
Additional issues
• Deadlock?
– prevent: every transaction must lock all items it needs in
advance!!?
– detect:
• Constitute a wait-for graph
• Check the waiting transactions and break the deadlock by aborting
one of them.
• Concurrency control by timestamps (optimistic=assume
rare unserializable behavior)
– Timestamping
– Validation
42
What you should know
•
•
•
•
•
43
Transaction and ACID properties
Serializability
What is 2PL and granularity of locking?
What are the four degrees of consistency?
How are the degrees of consistency related to
locking?