Mining Uncertain Data with Probabilistic Guarantees
Liwen Sun, Reynold Cheng, David W. Cheung,
Jiefeng Cheng
SIGKDD 2010
Outline
Motivation
Problem Definition
Method
P-Apriori Algorithm
TODIS Algorithm
Probabilistic Association Rules
Experimental Results
Conclusion
Motivation
The goals of this paper are to:
(1) propose definitions of frequent patterns and
association rules for the tuple uncertainty model;
(2) develop efficient algorithms for mining patterns and rules
that are correct under Possible World Semantics (PWS).
The Possible World
P(W2) = T1.p * (1 - T2.p) * (1 - T3.p) * T4.p = 0.6 * 0.5 * 0.3 * 1 = 0.09
P(W4) = (1 - T1.p) * (1 - T2.p) * T3.p * T4.p = 0.4 * 0.5 * 0.7 * 1 = 0.14
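The two products above follow one rule: multiply T.p for every tuple present in the world and (1 - T.p) for every tuple absent from it. A minimal Python sketch of that rule, assuming the tuples exist independently of one another, as the products imply. The tuple probabilities come from the formulas above; the names `world_probability` and `enumerate_worlds` are illustrative and do not come from the paper.

```python
from itertools import product

# Existence probabilities read off the two formulas above.
tuple_probs = {"T1": 0.6, "T2": 0.5, "T3": 0.7, "T4": 1.0}


def world_probability(present, probs):
    """Probability of the world that contains exactly the tuples in `present`."""
    prob = 1.0
    for tid, p in probs.items():
        # Present tuples contribute p, absent tuples contribute (1 - p).
        prob *= p if tid in present else (1.0 - p)
    return prob


def enumerate_worlds(probs):
    """Yield (set of present tuples, probability) for every world with non-zero probability."""
    ids = list(probs)
    for mask in product([False, True], repeat=len(ids)):
        present = frozenset(t for t, keep in zip(ids, mask) if keep)
        pr = world_probability(present, probs)
        if pr > 0:
            yield present, pr


if __name__ == "__main__":
    print(world_probability({"T1", "T4"}, tuple_probs))  # W2, approximately 0.09
    print(world_probability({"T3", "T4"}, tuple_probs))  # W4, approximately 0.14
```

Because T4 has probability 1, only the worlds that contain T4 survive, and their probabilities sum to 1. Enumeration is exponential in the number of uncertain tuples, so it is useful only for checking small examples like this one.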
Probabilistic Frequent Patterns (P-FP)
X: a pattern
sup(X): support count of pattern X
Example: sup({a}) = 1 with probability 0.29
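Under tuple uncertainty, sup(X) is a random variable, so it has a probability mass function (pmf) rather than a single value. The sketch below computes that pmf with a standard dynamic program over the tuples that contain X, adding one tuple at a time. It assumes independent tuples and assumes that {a} occurs in T1, T2 and T3 (probabilities 0.6, 0.5, 0.7) but not in T4; that assignment is an assumption, chosen because it reproduces the 0.29 figure. The names `support_pmf`, `frequentness_probability` and `minsup` are illustrative, and the code is not claimed to be the paper's own algorithm.

```python
def support_pmf(probs):
    """Return pmf where pmf[k] = Pr[sup(X) = k].

    `probs` holds the existence probabilities of the tuples that contain X.
    """
    pmf = [1.0]  # with no tuples considered, the support is 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            nxt[k] += q * (1.0 - p)  # tuple absent: support stays at k
            nxt[k + 1] += q * p      # tuple present: support becomes k + 1
        pmf = nxt
    return pmf


def frequentness_probability(pmf, minsup):
    """Pr[sup(X) >= minsup], the tail mass of the support pmf."""
    return sum(pmf[minsup:])


if __name__ == "__main__":
    # Assumed: {a} appears in T1, T2, T3 with these existence probabilities.
    pmf_a = support_pmf([0.6, 0.5, 0.7])
    print(pmf_a[1])                            # approximately 0.29
    print(frequentness_probability(pmf_a, 2))  # approximately 0.65
```

The loop costs O(n^2) time for n tuples containing X, compared with the 2^n worlds a direct enumeration would visit.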
Probabilistic Association Rule (P-AR)
Pruning Infrequent Patterns
Dynamic-Programming Algorithm
Divide-and-Conquer Algorithm
Inverted Probability List
Lx: Inverted Probability List
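The slide gives only the name of this structure. One plausible reading, stated here as an assumption, is that Lx lists the (tuple id, existence probability) pairs of the tuples that contain pattern X. The sketch below builds such a list; the item contents of the four tuples and the name `inverted_probability_list` are hypothetical and serve only as illustration.

```python
# Hypothetical database: tuple id -> (items, existence probability).
# The item sets are invented for illustration only.
database = {
    "T1": (frozenset({"a", "b"}), 0.6),
    "T2": (frozenset({"a"}), 0.5),
    "T3": (frozenset({"a", "c"}), 0.7),
    "T4": (frozenset({"b", "c"}), 1.0),
}


def inverted_probability_list(db, pattern):
    """Assumed form of Lx: (tid, p) for every tuple that contains `pattern` with p > 0."""
    pattern = frozenset(pattern)
    return [
        (tid, p)
        for tid, (items, p) in db.items()
        if pattern <= items and p > 0  # pattern must be a subset of the tuple's items
    ]


if __name__ == "__main__":
    print(inverted_probability_list(database, {"a"}))
    print(inverted_probability_list(database, {"b", "c"}))
```

With a list like this, the support pmf of X depends only on the probabilities stored in Lx, so tuples that do not contain X never need to be scanned again.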
TODIS Algorithm
Phase 1: Generate candidate patterns.
Phase 2: Top-down support inheritance.
Computing Association Rule Probability
Deriving Association Rules
Experimental Results
Conclusion
We studied efficient algorithms for extracting
frequent patterns from probabilistic databases.
The TODIS algorithm, when used together with the
divide-and-conquer (DC) algorithm, yields the best performance.