Combining Proactive and Reactive Predictions for Data Streams KDD’05


2020/4/27

Combining Proactive and Reactive Predictions for Data Streams

Ying Yang, Xindong Wu and Xingquan Zhu, KDD ’05

Introduction

► Main challenges of mining data streams:
    Data grow without limit, so it is hard to retain a long history
    The underlying concept may change over time
► The paper is set in the context of classification learning

Introduction

► Problems with current data stream prediction approaches:
    They keep only a recent history of raw instances
    They discard outdated concepts
    None is devoted to foreseeing a bigger picture

Introduction

► The goals involve:
    (1) organizing the history of raw data into a history of compact concepts by identifying new concepts as well as re-appearing historical ones
    (2) learning patterns of concept transition from the concept history
    (3) carrying out effective and efficient prediction at two levels: a general level of predicting each oncoming concept, and a specific level of predicting each instance's class


Terminology

► Data stream
    A sequence of instances
    Each instance is a vector of attribute values with a class label
► Concept
    Represented by the learning result of a classification algorithm

Terminology

► Concept change
    Concept drift
    Concept shift
    Sampling change
► A change of data distribution that leads to revising the current model

Building concept history

► Key components
    Classification algorithm
        ► Abstracts a concept from raw data
    Trigger detection algorithm
        ► Finds the instances across which the underlying concept has changed and the prediction model should be modified
    Conceptual equivalence measure
        ► Checks whether a concept is historical or new
    Stable learning size
        ► Specifies the number of instances above which a learned concept can be deemed stable
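As an illustration of the conceptual equivalence component, the sketch below compares two concepts by how often their classifiers agree on a sample of instances. This is an assumed stand-in, not necessarily the measure used in the paper; the names `agreement_rate` and `is_equivalent` and the default threshold of 0.9 are hypothetical.

```python
from typing import Callable, Sequence

# A concept is modelled here as a classifier: attribute vector -> class label.
Attributes = Sequence[float]
Concept = Callable[[Attributes], str]


def agreement_rate(a: Concept, b: Concept, sample: Sequence[Attributes]) -> float:
    """Fraction of sample instances on which the two concepts predict the same class."""
    if not sample:
        raise ValueError("sample must contain at least one instance")
    agree = sum(1 for x in sample if a(x) == b(x))
    return agree / len(sample)


def is_equivalent(a: Concept, b: Concept, sample: Sequence[Attributes],
                  threshold: float = 0.9) -> bool:
    """Treat two concepts as the same when they agree often enough.

    The threshold is an assumed tuning parameter, not a value from the slides.
    """
    return agreement_rate(a, b, sample) >= threshold
```

With a check of this kind, a newly learned concept that is equivalent to one already in the concept history can be recorded as a re-appearance of that concept rather than as a new state.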


Building prediction model

► Window size is 10
► Stable learning size is 30
► Trigger error threshold is 55%
► Figure markers (symbols not preserved) denote:
    An instance where a stable trigger is detected
    An instance where a temporary trigger is detected
    A correctly classified instance
    A wrongly classified instance
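The three settings above can be read as parameters of a sliding-window trigger detector. The sketch below is one plausible interpretation, not the paper's exact algorithm: it assumes a trigger fires when the error rate over the last 10 predictions reaches 55%, and that a trigger counts as stable only if at least 30 instances have been seen since the previous trigger. The class name `TriggerDetector` and its method names are hypothetical.

```python
from collections import deque
from typing import Optional


class TriggerDetector:
    """Sliding-window error monitor (illustrative sketch only)."""

    def __init__(self, window_size: int = 10, error_threshold: float = 0.55,
                 stable_learning_size: int = 30) -> None:
        self.window = deque(maxlen=window_size)   # recent outcomes, True = correct
        self.error_threshold = error_threshold
        self.stable_learning_size = stable_learning_size
        self.seen_since_trigger = 0               # instances under the current concept

    def update(self, correct: bool) -> Optional[str]:
        """Record one prediction outcome; return 'stable', 'temporary' or None."""
        self.window.append(correct)
        self.seen_since_trigger += 1
        if len(self.window) < self.window.maxlen:
            return None                           # not enough evidence yet
        error_rate = self.window.count(False) / len(self.window)
        if error_rate < self.error_threshold:
            return None
        # Assumption: the concept was stable if it lasted long enough to be learned.
        if self.seen_since_trigger >= self.stable_learning_size:
            kind = "stable"
        else:
            kind = "temporary"
        self.window.clear()
        self.seen_since_trigger = 0
        return kind
```

Feeding the detector one boolean per classified instance keeps its memory use bounded by the window size, which matters when the stream grows without limit.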


Proactive mode

► Predicts what the new concept will be when a concept change takes place
► Prepares prediction strategies in advance
► Works before trigger detection and independently of the trigger window
► Advantages:
    Quick response
    Stable prediction model

Proactive mode

► The concept history is treated as a Markov chain in which each distinct concept is a state
► Example: a sequence of arriving weather concepts: spring, summer, autumn, winter, spring, summer, hurricane, autumn, winter, spring, flood, summer, autumn, winter, spring, summer, autumn, winter, spring, summer, hurricane, autumn,...
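To make the Markov chain view concrete, the sketch below estimates first-order transition probabilities from the listed portion of the weather sequence and ranks the likely next concepts. Using a first-order chain with simple frequency counts is an assumption made for illustration; the function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List

# The portion of the example sequence that is written out on the slide.
history = [
    "spring", "summer", "autumn", "winter",
    "spring", "summer", "hurricane", "autumn", "winter",
    "spring", "flood", "summer", "autumn", "winter",
    "spring", "summer", "autumn", "winter",
    "spring", "summer", "hurricane", "autumn",
]


def transition_counts(seq: List[str]) -> Dict[str, Counter]:
    """Count how often each concept is directly followed by each other concept."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(seq, seq[1:]):
        counts[current][following] += 1
    return counts


def next_concept_probs(counts: Dict[str, Counter], current: str) -> Dict[str, float]:
    """Relative frequency of each successor of `current` (empty if never seen)."""
    successors = counts.get(current, Counter())
    total = sum(successors.values())
    return {c: n / total for c, n in successors.items()} if total else {}


def most_likely_next(counts: Dict[str, Counter], current: str) -> str:
    """Successor with the highest estimated probability."""
    probs = next_concept_probs(counts, current)
    return max(probs, key=probs.get)
```

On this sequence, spring is followed by summer four times out of five and by flood once, so a proactive predictor sitting in the spring state would prepare the summer model first and keep flood as a less likely alternative.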


Reactive mode

► Reactive mode waits until the concept has changed, then constructs a prediction model from the trigger instances
► The model can be either contemporary or historical

RePro

► The RePro system incorporates both proactive and reactive prediction
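One way to picture how the two modes might be combined is the selection routine below. It is a sketch under stated assumptions, not RePro's actual procedure: it assumes proactive candidates are tried first against the trigger window, then historical concepts are checked (reactive, historical), and only then is a new model learned from the trigger instances (reactive, contemporary). The function names, the `learn` callback, and the 0.7 accuracy cut-off are hypothetical.

```python
from typing import Callable, Iterable, Sequence, Tuple

Attributes = Sequence[float]
Instance = Tuple[Attributes, str]            # (attribute values, class label)
Concept = Callable[[Attributes], str]        # a concept acts as a classifier


def accuracy(concept: Concept, window: Sequence[Instance]) -> float:
    """Fraction of window instances the concept classifies correctly."""
    if not window:
        return 0.0
    hits = sum(1 for x, label in window if concept(x) == label)
    return hits / len(window)


def choose_concept(proactive_candidates: Iterable[Concept],
                   historical_concepts: Iterable[Concept],
                   trigger_window: Sequence[Instance],
                   learn: Callable[[Sequence[Instance]], Concept],
                   min_accuracy: float = 0.7) -> Tuple[Concept, str]:
    """Pick the next prediction model after a trigger (illustrative only)."""
    # 1. Proactive: concepts predicted in advance from the concept history.
    for concept in proactive_candidates:
        if accuracy(concept, trigger_window) >= min_accuracy:
            return concept, "proactive"
    # 2. Reactive, historical: reuse any past concept that fits the new data.
    for concept in historical_concepts:
        if accuracy(concept, trigger_window) >= min_accuracy:
            return concept, "reactive-historical"
    # 3. Reactive, contemporary: learn a fresh model from the trigger instances.
    return learn(trigger_window), "reactive-contemporary"
```

Ordering the checks this way reflects the advantages listed for proactive mode: a prepared model can be put to work immediately, and learning from the small trigger window is kept as the last resort.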

Conclusion

► Proposed a novel mechanism to organize data into a concept history
► Proposed RePro to make predictions for concept-changing data streams


Thank you very much~
