Chapter 6 Tutorial

Q6
A database has five transactions. Let min_sup = 60% and
min_conf = 80%.
a) Find all frequent itemsets using Apriori and FP-growth.
b) List all of the strong association rules (with support s and
confidence c) matching the following metarule, where X is a
variable representing customers, and item_i denotes variables
representing items (e.g., “A”, “B”, etc.):
buys(X, item1) ∧ buys(X, item2) => buys(X, item3) [s, c]
Q6.a
Apriori algorithm
• With min_sup = 60% of 5 transactions (a support count of at least 3), Apriori finally yields the complete set of frequent itemsets:
{ e, k, m, o, y, ke, oe, mk, ok, ky, oke }
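The result above can be checked with a short level-wise sketch. This is a minimal illustration of the Apriori join and prune steps, not the tutorial's own code; the names `TRANSACTIONS`, `MIN_COUNT`, and `apriori` are assumptions made for this example, and items are written in upper case as in the transaction table.

```python
from itertools import combinations

# Q6 transactions; the duplicate O in T500 collapses because sets are used.
TRANSACTIONS = [
    {"M", "O", "N", "K", "E", "Y"},   # T100
    {"D", "O", "N", "K", "E", "Y"},   # T200
    {"M", "A", "K", "E"},             # T300
    {"M", "U", "C", "K", "Y"},        # T400
    {"C", "O", "K", "I", "E"},        # T500
]
MIN_COUNT = 3  # 60% of 5 transactions


def apriori(transactions, min_count):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count(itemset):
        return sum(itemset <= t for t in transactions)

    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if count(frozenset([i])) >= min_count}
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = count(itemset)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step: every k-subset must be frequent, then check support.
        level = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k))
            and count(c) >= min_count
        }
        k += 1
    return frequent


frequent = apriori(TRANSACTIONS, MIN_COUNT)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), n)
```

The loop stops once a level produces no frequent candidates, which for this database happens after the single 3-itemset.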
Q6.a
FP-Growth algorithm
1. Scan the DB once and find the frequent 1-itemsets (single-item
patterns), i.e. the items with support count >= 3.
Item   Support count
M      3
O      3
N      2
K      5
E      4
Y      3
D      1
A      1
U      1
C      2
I      1
After checking support (items with count >= 3, in descending order of support):

Item   Support count
K      5
E      4
M      3
O      3
Y      3

TID    Items bought           (Ordered) frequent items
T100   {M, O, N, K, E, Y}     K, E, M, O, Y
T200   {D, O, N, K, E, Y}     K, E, O, Y
T300   {M, A, K, E}           K, E, M
T400   {M, U, C, K, Y}        K, M, Y
T500   {C, O, O, K, I, E}     K, E, O
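The first scan and the reordering of each transaction can be sketched as follows. The tie-breaking rule is an assumption: M, O, and Y all have support 3, and this sketch keeps them in order of first appearance, which happens to reproduce the order used in the table. The names `TRANSACTIONS`, `MIN_COUNT`, `f_list`, and `ordered` are illustrative only.

```python
from collections import Counter

TRANSACTIONS = {
    "T100": ["M", "O", "N", "K", "E", "Y"],
    "T200": ["D", "O", "N", "K", "E", "Y"],
    "T300": ["M", "A", "K", "E"],
    "T400": ["M", "U", "C", "K", "Y"],
    "T500": ["C", "O", "O", "K", "I", "E"],
}
MIN_COUNT = 3  # 60% of 5 transactions


def unique(items):
    """Drop repeated items (the second O in T500) while keeping order."""
    return list(dict.fromkeys(items))


# Scan 1: support count of every item, one vote per transaction.
counts = Counter(item for t in TRANSACTIONS.values() for item in unique(t))

# Frequent items in descending support; sorted() is stable, so ties keep
# their first-appearance order (an assumption of this sketch).
f_list = [item for item, n in sorted(counts.items(), key=lambda kv: -kv[1])
          if n >= MIN_COUNT]

# Scan 2: keep only frequent items in each transaction, in f_list order.
ordered = {tid: [i for i in f_list if i in t] for tid, t in TRANSACTIONS.items()}

for tid, items in ordered.items():
    print(tid, ", ".join(items))
```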
Q6.a
FP-Growth algorithm
• Generate FP-tree
• Generate FP-tree – order table
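Tree generation can be sketched in a few lines: each ordered transaction is inserted as a path from the root, sharing prefixes and incrementing counts along the way. This is a minimal sketch that omits the header table and node links of a full FP-tree; the nested `[count, children]` representation and the names `build_fp_tree` and `render` are assumptions made for illustration.

```python
from collections import Counter

TRANSACTIONS = [
    ["M", "O", "N", "K", "E", "Y"],   # T100
    ["D", "O", "N", "K", "E", "Y"],   # T200
    ["M", "A", "K", "E"],             # T300
    ["M", "U", "C", "K", "Y"],        # T400
    ["C", "O", "O", "K", "I", "E"],   # T500
]


def build_fp_tree(transactions, min_count):
    """Return the root's children as {item: [count, children]}."""
    counts = Counter(item for t in transactions for item in dict.fromkeys(t))
    # Descending support; ties keep first-appearance order (assumption).
    f_list = [i for i, n in sorted(counts.items(), key=lambda kv: -kv[1])
              if n >= min_count]
    root = {}
    for t in transactions:
        children = root
        for item in (i for i in f_list if i in t):
            node = children.setdefault(item, [0, {}])
            node[0] += 1          # one more transaction shares this prefix
            children = node[1]    # descend along the path
    return root


def render(children, depth=0):
    """Return the tree as indented 'item:count' lines."""
    lines = []
    for item, (count, sub) in children.items():
        lines.append("  " * depth + f"{item}:{count}")
        lines.extend(render(sub, depth + 1))
    return lines


print("\n".join(render(build_fp_tree(TRANSACTIONS, 3))))
# K:5
#   E:4
#     M:2
#       O:1
#         Y:1
#     O:2
#       Y:1
#   M:1
#     Y:1
```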
Q6.b
• buys(X, k) ∧ buys(X, o) => buys(X, e) [60%, 100%]
• buys(X, e) ∧ buys(X, o) => buys(X, k) [60%, 100%]
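These two rules can be reproduced by enumerating every frequent 3-itemset and trying each of its items as the consequent. The sketch below assumes the metarule has two items in the antecedent and one in the consequent, as in the answers above; `TRANSACTIONS`, `MIN_COUNT`, `MIN_CONF`, and `support_count` are illustrative names, and items are in upper case as in the transaction table.

```python
from itertools import combinations

TRANSACTIONS = [
    {"M", "O", "N", "K", "E", "Y"},   # T100
    {"D", "O", "N", "K", "E", "Y"},   # T200
    {"M", "A", "K", "E"},             # T300
    {"M", "U", "C", "K", "Y"},        # T400
    {"C", "O", "K", "I", "E"},        # T500
]
MIN_COUNT = 3    # min_sup = 60% of 5 transactions
MIN_CONF = 0.8   # min_conf = 80%


def support_count(itemset):
    """Number of transactions containing every item of itemset."""
    return sum(set(itemset) <= t for t in TRANSACTIONS)


items = sorted(set().union(*TRANSACTIONS))
rules = []
for triple in combinations(items, 3):
    n = support_count(triple)
    if n < MIN_COUNT:
        continue
    for consequent in triple:
        antecedent = tuple(i for i in triple if i != consequent)
        confidence = n / support_count(antecedent)
        if confidence >= MIN_CONF:
            rules.append((antecedent, consequent, n / len(TRANSACTIONS), confidence))

for antecedent, consequent, s, c in rules:
    lhs = " ∧ ".join(f"buys(X, {i})" for i in antecedent)
    print(f"{lhs} => buys(X, {consequent}) [{s:.0%}, {c:.0%}]")
```

The third candidate, with o as the consequent, is rejected because {k, e} occurs in 4 transactions, giving a confidence of 3/4 = 75%.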
Exercise 1
$$\mathrm{support}(A \Rightarrow B) = P(A \cup B)$$
$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)} = \frac{\mathrm{support\_count}(A \cup B)}{\mathrm{support\_count}(A)}$$
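As a worked instance of these definitions, the Q6 transactions can be plugged in for the candidate rule buys(X, k) ∧ buys(X, e) => buys(X, o). The counts are read off the Q6 transaction table: {k, e} occurs in 4 of the 5 transactions and {k, e, o} in 3.

```latex
\begin{align*}
\mathrm{support}(\{k,e\} \Rightarrow \{o\})
  &= P(\{k,e\} \cup \{o\}) = \frac{3}{5} = 60\% \\
\mathrm{confidence}(\{k,e\} \Rightarrow \{o\})
  &= \frac{\mathrm{support\_count}(\{k,e,o\})}{\mathrm{support\_count}(\{k,e\})}
   = \frac{3}{4} = 75\%
\end{align*}
```

The rule meets min_sup = 60% but falls short of min_conf = 80%, so it is not a strong rule.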
• Show an example association rule that matches (a1, a2, a3,
a4, itemX) -> (itemY) [min_support = 2, min_confidence=70%]
• For the association rule a1 -> a6, compute the confidence:
confidence = P(a1 ∪ a6) / P(a1) = (2/5) / (3/5) = 2/3 ≈ 0.67
Exercise 2
Activity
• A dataset has eight transactions. Let minimum
support = 50%.
• Find all frequent itemsets using FP-Growth.
TID   Items bought
T1    {W, O, R, N}
T2    {W, T, U, G}
T3    {X, T, U, G}
T4    {S, N, T, U, G}
T5    {B, R, G, T, D}
T6    {T, X, I, L, U}
T7    {G, U, R, T, X}
T8    {X, O, N, G, T}
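A hand-derived answer to this activity can be checked against a compact FP-growth sketch. It builds an FP-tree with a header table, collects each item's conditional pattern base from the prefix paths, and recurses on it. The class and function names (`Node`, `build_tree`, `fp_growth`) and the alphabetical tie-breaking between equally frequent items are assumptions of this sketch, not part of the tutorial.

```python
import math
from collections import defaultdict

TRANSACTIONS = [
    "WORN", "WTUG", "XTUG", "SNTUG",   # T1-T4, one letter per item
    "BRGTD", "TXILU", "GURTX", "XONGT",  # T5-T8
]
MIN_COUNT = math.ceil(0.5 * len(TRANSACTIONS))  # 50% of 8 transactions


class Node:
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}


def build_tree(weighted, min_count):
    """Build an FP-tree from (items, count) pairs; return supports and header."""
    support = defaultdict(int)
    for items, count in weighted:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_count}
    header = defaultdict(list)  # item -> nodes carrying that item
    root = Node(None, None)
    for items, count in weighted:
        # Descending support, ties broken alphabetically (assumption).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return frequent, header


def fp_growth(weighted, min_count, suffix=frozenset()):
    """Return {frozenset(itemset): support_count} by recursive pattern growth."""
    frequent, header = build_tree(weighted, min_count)
    result = {}
    for item, support in frequent.items():
        itemset = suffix | {item}
        result[itemset] = support
        # Conditional pattern base: the prefix path above each node of `item`.
        base = []
        for node in header[item]:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        result.update(fp_growth(base, min_count, itemset))
    return result


patterns = fp_growth([(t, 1) for t in TRANSACTIONS], MIN_COUNT)
for itemset, n in sorted(patterns.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), n)
```

Because each conditional pattern base contains only items that precede the suffix item in the tree order, every frequent itemset is generated exactly once.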