CS186: Introduction to Database Systems

Download Report

Transcript CS186: Introduction to Database Systems

CS 405G: Introduction to
Database Systems
25 Exercise
Chen Qian
University of Kentucky
Project presentation




Demonstrate all functionalities of your application
Complete some requests
Introduce any additional features.
The bonus part, if you have completed it.
7/17/2015
Chen Qian @ University of Kentucky
2
Exercise: Functional dependency

Suppose you are given a relation R with four attributes ABCD.
For of the following sets of FDs, assuming those are the only
dependencies that hold for R, do the following: (a) Identify the
candidate key(s) for R. (b) Identify the best normal form that R
satisfies (1NF, 2NF, 3NF, or BCNF).

Candidate keys: B
R is in 2NF but not 3NF.

7/17/2015
Chen Qian @ University of Kentucky
3
Review


WorkOn (EID, Ename, email, PID, hour)
We say X -> Y is a partial dependency if there exist a X’
 X such that X’ -> Y


e.g. EID, PID-> Ename
Otherwise, X -> Y is a full dependency

e.g. EID, PID -> hours
EID
PID
Ename
email
Pname
Hours
1234
10
John Smith
[email protected]
B2B platform
10
1123
9
Ben Liu
[email protected]
CRM
40
1234
9
John Smith
[email protected]
CRM
30
1023
10
Susan Sidhuk
7/17/2015
[email protected] B2B platform
Chen Qian @ University of Kentucky
40
4
2nd Normal Form

Note about 2nd Normal Form


by definition, every nonprimary attribute is functionally
dependent on every key of R
In other words, R is in its 2nd normal form if we could not
find a partial dependency of a nonprimary key to a key in
R.
7/17/2015
Chen Qian @ University of Kentucky
5
Third normal form
• 3NF requires that there are no non-trivial
functional dependencies of non-key attributes on
something other than a superset of a candidate key.
• Recall: non-trivial FD means LHS has no
intersection with RHS.
• In summary, all non-key attributes are mutually
independent.
7/17/2015
Chen Qian @ University of Kentucky
6
Exercise: Functional dependency

Suppose you are given a relation R with four attributes ABCD.
For of the following sets of FDs, assuming those are the only
dependencies that hold for R, do the following: (a) Identify the
candidate key(s) for R. (b) Identify the best normal form that R
satisfies (1NF, 2NF, 3NF, or BCNF).

Candidate keys: BD
R is in 1NF but not 2NF.

7/17/2015
Chen Qian @ University of Kentucky
7
Exercise: Functional dependency

Suppose you are given a relation R with four attributes ABCD.
For of the following sets of FDs, assuming those are the only
dependencies that hold for R, do the following: (a) Identify the
candidate key(s) for R. (b) Identify the best normal form that R
satisfies (1NF, 2NF, 3NF, or BCNF).

3. ABC → D, D → A
Candidate keys: ABC, BCD
R is in 3NF but not BCNF


7/17/2015
Chen Qian @ University of Kentucky
8
Boyce-Codd normal form (BCNF)
• BCNF requires that there are no non-trivial
functional dependencies of attributes on something
other than a superset of a candidate key (called a
superkey).
• All attributes are dependent on a key, a whole key
and nothing but a key (excluding trivial
dependencies, like A->A).
7/17/2015
Chen Qian @ University of Kentucky
9
Exercise: Functional dependency

Suppose you are given a relation R with four attributes ABCD.
For of the following sets of FDs, assuming those are the only
dependencies that hold for R, do the following: (a) Identify the
candidate key(s) for R. (b) Identify the best normal form that R
satisfies (1NF, 2NF, 3NF, or BCNF).

4. A → B, BC → D, A → C
Candidate keys: A
R is in 2NF but not 3NF
(because of the FD: BC → D).



7/17/2015
Chen Qian @ University of Kentucky
10
Exercise: Functional dependency

Suppose you are given a relation R with four attributes ABCD.
For of the following sets of FDs, assuming those are the only
dependencies that hold for R, do the following: (a) Identify the
candidate key(s) for R. (b) Identify the best normal form that R
satisfies (1NF, 2NF, 3NF, or BCNF).

5. AB → C, AB → D, C → A, D → B
(a) Candidate keys: AB, BC, CD, AD
R is in 3NF but not BCNF
(because of the FD: C → A).



7/17/2015
Chen Qian @ University of Kentucky
11
Exercise: Concurrency control




Consider the following actions taken by transaction T 1 on
database objects X and Y :
R(X), W(X), R(Y), W(Y)
1. Give an example of another transaction T 2 that, if run
concurrently to transaction T without some form of concurrency
control, could interfere with T 1.
If the transaction T2 performed W(Y ) before T1 performed R(Y ),
and then T2 aborted, the value read by T1 would be invalid and
the abort would be cascaded to T1 (i.e. T1 would also have to
abort).
7/17/2015
Chen Qian @ University of Kentucky
12
Exercise: Concurrency control




Consider the following actions taken by transaction T 1 on
database objects X and Y :
R(X), W(X), R(Y), W(Y)
2. Explain how the use of Strict 2PL would prevent interference
between the two transactions.
Strict 2PL would require T2 to obtain an exclusive lock on Y
before writing to it. This lock would have to be held until T2
committed or aborted; this would block T1 from reading Y until
T2 was finished, thus there would be no interference.
7/17/2015
Chen Qian @ University of Kentucky
13
Exercise: disks




Explain the terms seek time, rotational delay, and
transfer time.
1. Seek time is the time taken to move the disk heads to
the track on which a desired block is located.
2. Rotational delay is the waiting time for the desired
block to rotate under the disk head; it is the time
required for half a rotation on average, and is usually
less than the seek time.
3. Transfer time is the time to actually read or write the
data in the block once the head is positioned, i.e., the
time for the disk to rotate over the block.
7/17/2015
Chen Qian @ University of Kentucky
14
Exercise: disks

If you have a large file that is frequently scanned
sequentially, explain how you would store the pages in
the file on a disk.

A: The pages in the file should be stored ‘sequentially’
on a disk. We should put two ‘logically’ adjacent pages
as close as possible. In decreasing order of closeness,
they could be on the same track, the same cylinder, or an
adjacent cylinder.
7/17/2015
Chen Qian @ University of Kentucky
15
Exercise: disks





Consider a disk with a sector size of 512 bytes, 2000 tracks
per surface, 50 sectors per track, five double-sided platters,
and average seek time of 10 msec.
1. What is the capacity of a track in bytes? What is the
capacity of each surface? What is the capacity of the disk?
bytes/track = bytes/sector × sectors/track = 512 × 50 = 25K
bytes/surface = bytes/track × tracks/surface = 25K × 2000 =
50, 000K
bytes/disk = bytes/surface× surfaces/disk = 50, 000K × 5 ×
2 = 500, 000K
7/17/2015
Chen Qian @ University of Kentucky
16
Exercise: disks


Consider a disk with a sector size of 512 bytes, 2000 tracks
per surface, 50 sectors per track, five double-sided platters,
and average seek time of 10 msec.
3. If the disk platters rotate at 5400 rpm (revolutions per
minute), what is the maximum rotational delay?

If the disk platters rotate at 5400rpm, the time required for one
complete rotation, which is the maximum rotational delay, is

The average rotational delay is half of the rotation time, 0.006
seconds.
7/17/2015
Chen Qian @ University of Kentucky
17
Exercise: disks


Consider a disk with a sector size of 512 bytes, 2000 tracks
per surface, 50 sectors per track, five double-sided platters,
and average seek time of 10 msec.
4. If one track of data can be transferred per revolution, what is
the transfer rate?

The capacity of a track is 25K bytes. Since one track of data can
be transferred per revolution, the data transfer rate is
7/17/2015
Chen Qian @ University of Kentucky
18
Exercise: disks

Consider … average seek time of 10 msec. suppose that a
block size of 1024 bytes is chosen. Suppose that a file containing
100,000 records of 100 bytes each is to be stored on such a disk
and that no record is allowed to span two blocks.



5. What time is required to read a file containing 100,000 records
of 100 bytes each sequentially?
A file containing 100,000 records of 100 bytes needs 40 cylinders
or 400 tracks in this disk. The transfer time of one track of data is
0.011 seconds. Then it takes 400 × 0.011 = 4.4seconds to transfer
400 tracks.
This access seeks the track 40 times. The seek time is 40 × 0.01 =
0.4seconds. Therefore, total access time is 4.4+ 0.4 = 4.8seconds.
7/17/2015
Chen Qian @ University of Kentucky
19
Exercise: disks



6. What is the time required to read a file containing 100,000
records of 100 bytes each in a random order? Assume that each
block request incurs the average seek time and rotational delay.
For any block of data, averageaccesstime = seektime +
rotationaldelay + transfertime
The average access time for a block of data would be 16.44 msec.
For a file containing 100,000 records of 100 bytes, the total
access time would be 164.4 seconds.
7/17/2015
Chen Qian @ University of Kentucky
20
Exercise: disks



6. What is the time required to read a file containing 100,000
records of 100 bytes each in a random order? Assume that each
block request incurs the average seek time and rotational delay.
For any block of data, averageaccesstime = seektime +
rotationaldelay + transfertime
The average access time for a block of data would be 16.44 msec.
For a file containing 100,000 records of 100 bytes, the total
access time would be 164.4 seconds.
7/17/2015
Chen Qian @ University of Kentucky
21
Tree structure


Each intermediate node can hold up to five pointers and four key
values. Each leaf can hold up to four records
Name all the tree nodes to be fetched to answer the following
query: “Get all records with search key greater than 38.”
7/17/2015
Chen Qian @ University of Kentucky
22
Tree structure


Name all the tree nodes to be fetched to answer the following
query: “Get all records with search key greater than 38.”
I1, I2, and everything in the range [L2..L8].
7/17/2015
Chen Qian @ University of Kentucky
23
Tree structure

inserting a record with search key 109
7/17/2015
Chen Qian @ University of Kentucky
24

inserting a record with search key 109
7/17/2015
Chen Qian @ University of Kentucky
25
Tree structure

deleting the record with search key 81 from the original tree.
7/17/2015
Chen Qian @ University of Kentucky
26
7/17/2015
Chen Qian @ University of Kentucky
27
Tree structure

Name a search key value such that inserting it into the (original)
tree would cause an increase in the height of the tree.
7/17/2015
Chen Qian @ University of Kentucky
28
Tree structure

We can infer several things about subtrees A, B, and C. First of
all, they each must have height one, since their “sibling” trees
(those rooted at I2 and I3) have height one. Also, we know the
ranges of these trees (assuming duplicates fit on the same leaf):
subtree A holds search keys less than 10, B contains keys ≥ 10
and < 20, and C has keys ≥ 20 and < 30. In addition, each
intermediate node has at least 2 key values and 3 pointers.
7/17/2015
Chen Qian @ University of Kentucky
29
Tree structure

Suppose that this is an ISAM index. What is the minimum
number of insertions needed to create a chain of three overflow
pages?
7/17/2015
Chen Qian @ University of Kentucky
30
Tree structure

If this is an ISAM tree, we would have to insert at least nine
search keys in order to develop an overflow chain of length three.
These keys could be any that would map to L4, L5, L7, or L8, all
of which are full and thus would need overflow pages on the next
insertion. The first insert to one of these pages would create the
first overflow page, the fifth insert would create the second
overflow page, and the ninth insert would create the third
overflow page (for a total of one leaf and three overflow pages).
7/17/2015
Chen Qian @ University of Kentucky
31


Want to present on 4/21?
HW4 due on 4/21
7/17/2015
Chen Qian @ University of Kentucky
32