Process Mining: Data Science in Action


Process Mining
Data Science in Action
Wil van der Aalst
Scientific director of the DSC/e
Dutch Data Science Summit, Eindhoven, 4-5-2014.
https://www.coursera.org/course/procmin
[Figure, shown in three build-up steps: data science draws on statistics, stochastics, data mining, machine learning, process mining, databases, algorithms, large scale distributed computing, industrial engineering, behavioral/social sciences, privacy, domain knowledge, visualization, and visual analytics. The final step adds process science (business process reengineering, business process management, formal methods, model checking, concurrency, Petri nets, BPMN), with process mining placed between data science and process science.]
Internet of Events
©Wil van der Aalst & TU/e (use only with permission & acknowledgements)
Internet of Events: 4 sources of event data
• Internet of Content: “Big Data”
• Internet of People: “social”
• Internet of Things: “cloud”
• Internet of Places: “mobility”
Starting point for process mining: every row is an event
Event data (here: an exam attempt)
student name    course name                    exam date    mark
Peter Jones     Business Information systems   16-1-2014    8
Sandy Scott     Business Information systems   16-1-2014    5
Bridget White   Business Information systems   16-1-2014    9
John Anderson   Business Information systems   16-1-2014    8
Sandy Scott     BPM Systems                    17-1-2014    7
Bridget White   BPM Systems                    17-1-2014    8
Sandy Scott     Process Mining                 20-1-2014    5
Bridget White   Process Mining                 20-1-2014    9
John Anderson   Process Mining                 20-1-2014    8
…               …                              …            …
Column roles: student name = case id, course name = activity name, exam date = timestamp, mark = other data
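A minimal sketch of how rows like the exam attempts become traces, assuming each row carries a case id, an activity name, and a timestamp: group the events by case and order each group by time. The function name `build_traces` and the tuple layout are illustrative choices, not something the talk prescribes.

```python
from collections import defaultdict
from datetime import datetime

# Rows from the exam table: (case id, activity name, timestamp, other data).
EVENTS = [
    ("Peter Jones", "Business Information systems", "16-1-2014", 8),
    ("Sandy Scott", "Business Information systems", "16-1-2014", 5),
    ("Bridget White", "Business Information systems", "16-1-2014", 9),
    ("John Anderson", "Business Information systems", "16-1-2014", 8),
    ("Sandy Scott", "BPM Systems", "17-1-2014", 7),
    ("Bridget White", "BPM Systems", "17-1-2014", 8),
    ("Sandy Scott", "Process Mining", "20-1-2014", 5),
    ("Bridget White", "Process Mining", "20-1-2014", 9),
    ("John Anderson", "Process Mining", "20-1-2014", 8),
]


def build_traces(events):
    """Group events by case id and order each case by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, stamp, _other in events:
        by_case[case_id].append((datetime.strptime(stamp, "%d-%m-%Y"), activity))
    # Sorting the (timestamp, activity) pairs puts each case in time order.
    return {case: [activity for _, activity in sorted(rows)]
            for case, rows in by_case.items()}


if __name__ == "__main__":
    for case, trace in build_traces(EVENTS).items():
        print(case, "->", trace)
```

Each resulting trace is the sequence of activities of one case, which is the form used by the Play-In and Replay slides later in the talk.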
Another event log: order handling
order number   activity         timestamp          user         product    quantity
9901           register order   [email protected]   Sara Jones   iPhone5S   1
9902           register order   [email protected]   Sara Jones   iPhone5S   2
9903           register order   [email protected]   Sara Jones   iPhone4S   1
9901           check stock      [email protected]   Pete Scott   iPhone5S   1
9901           ship order       [email protected]   Sue Fox      iPhone5S   1
9903           check stock      [email protected]   Pete Scott   iPhone4S   1
9901           handle payment   [email protected]   Carol Hope   iPhone5S   1
9902           check stock      [email protected]   Pete Scott   iPhone5S   2
9902           cancel order     [email protected]   Carol Hope   iPhone5S   2
…              …                …                  …            …          …
Column roles: order number = case id, activity = activity name, timestamp = timestamp, user = resource, product and quantity = other data
Another event log: patient treatment
patient   activity            timestamp          doctor       age   cost
5781      make X-ray          [email protected]   Dr. Jones    45    70.00
5541      blood test          [email protected]   Dr. Scott    61    40.00
5833      blood test          [email protected]   Dr. Scott    24    40.00
5781      blood test          [email protected]   Dr. Scott    45    40.00
5781      CT scan             [email protected]   Dr. Fox      45    1200.00
5833      surgery             [email protected]   Dr. Scott    24    2300.00
5781      handle payment      [email protected]   Carol Hope   45    0.00
5541      radiation therapy   [email protected]   Dr. Jones    61    140.00
5541      radiation therapy   [email protected]   Dr. Jones    61    140.00
…         …                   …                  …            …     …
Column roles: patient = case id, activity = activity name, timestamp = timestamp, doctor = resource, age and cost = other data
Let's play
Case   Activity                             Timestamp        Resource
432    register travel request (a)          18-3-2014:9.15   John
432    get support from local manager (b)   18-3-2014:9.25   Mary
432    check budget by finance (d)          19-3-2014:8.55   John
432    decide (e)                           19-3-2014:9.36   Sue
432    accept request (g)                   19-3-2014:9.48   Mary
[Figure: event data and a process model, linked by Play-In, Play-Out, and Replay. The model runs from start to end over the activities register travel request (a), get support from local manager (b), get detailed motivation letter (c), check budget by finance (d), decide (e), reinitiate request (f), accept request (g), and reject request (h).]
Play-Out
[Figure: the events of case 432 (a, b, d, e, g) next to the process model with activities a to h.]
Play Out: A possible scenario
abdeg
[Figure: the process model with its XOR-split, XOR-join, AND-split, and AND-join gateways marked, shown next to the events of case 432 (a, b, d, e, g).]
Play Out: Another scenario
adcefbdeh
[Figure: the process model with activities a to h.]
Play Out: Process model allows for many more scenarios
adcefcdefbdefbdeg adbeh adceg adbeh acdefcdefbdeh
acbefbdeg abdeg abdeg acdefcdefbdeh abdeg
abcefbdeh acdefcdefbdeh adbeh acbefbdeh acbefbdeg
abdeg adceh adceh adcefcdefbdefbdeg adcefcdefbdefbdeg
[Figure: the process model with activities a to h.]
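Play-Out can be sketched as a random walk through the model. The control flow below is an assumption read off the gateway labels and the example runs abdeg and adcefbdeh: after a, one of b or c runs in parallel with d, then e, then either f (which loops back) or g or h. The names `play_out` and `max_loops` are hypothetical, and the loop cap is only there to keep generated traces short.

```python
import random


def play_out(rng, max_loops=3):
    """Generate one trace by walking the assumed model from start to end."""
    trace = ["a"]                               # register travel request
    loops = 0
    while True:
        # AND-split: one of b/c (an XOR choice) runs in parallel with d,
        # so the two activities may appear in either order.
        branch = [rng.choice("bc"), "d"]
        rng.shuffle(branch)
        trace += branch
        trace.append("e")                       # decide
        # XOR-split after e: reinitiate (f), accept (g), or reject (h).
        options = "fgh" if loops < max_loops else "gh"
        outcome = rng.choice(options)
        trace.append(outcome)
        if outcome != "f":
            return "".join(trace)
        loops += 1                              # f sends the case around again


if __name__ == "__main__":
    rng = random.Random(2014)
    print([play_out(rng) for _ in range(5)])
```

Every call yields one possible scenario; calling it repeatedly gives a synthetic event log for the model.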
Play-In
[Figure: the events of case 432 (a, b, d, e, g) next to the process model with activities a to h.]
Loesje van der Aalst
desire line
Play In: Simple process allowing for 4 traces
adbeg adbeh abdeg adbeg abdeg abdeh abdeh
abdeh abdeh adbeh abdeh adbeh adbeh
[Figure: process model from start to end with activities register travel request (a), get support from local manager (b), check budget by finance (d), decide (e), accept request (g), and reject request (h).]
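One classic first step of Play-In is to look at which activities directly follow each other in the log. The sketch below builds that relation for the four distinct traces and classifies each pair; this is the footprint idea known from the alpha algorithm, offered here as an illustration, since the talk does not say which discovery technique produced the model. The names `directly_follows` and `footprint` are illustrative.

```python
from itertools import product

# The four distinct traces on the slide.
LOG = ["abdeg", "abdeh", "adbeg", "adbeh"]


def directly_follows(log):
    """Return the pairs (x, y) where y directly follows x in some trace."""
    return {(x, y) for trace in log for x, y in zip(trace, trace[1:])}


def footprint(log):
    """Classify each ordered activity pair as '->', '<-', '||', or '#'."""
    df = directly_follows(log)
    activities = sorted(set("".join(log)))
    table = {}
    for x, y in product(activities, repeat=2):
        if (x, y) in df and (y, x) in df:
            table[x, y] = "||"   # seen in both orders: candidate concurrency
        elif (x, y) in df:
            table[x, y] = "->"   # x is followed by y, never the reverse
        elif (y, x) in df:
            table[x, y] = "<-"
        else:
            table[x, y] = "#"    # never directly adjacent
    return table


if __name__ == "__main__":
    fp = footprint(LOG)
    print(fp["b", "d"], fp["a", "b"], fp["g", "h"])
```

Here b and d come out as concurrent, a precedes both, and g and h never meet, which is consistent with a model that runs b and d in parallel and ends with a choice between g and h.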
Play In: Process allowing for more traces
abdeg adcefcdefbdefbdeg adcefcdefbdefbdeg adceg adbeh
adbeh acbefbdeg abcefbdeh acdefcdefbdeh abdeg
abdeg adcefcdefbdefbdeg acbefbdeg abdeg acdefcdefbdeh
adbeh acbefbdeh acdefcdefbdeh adceh adceh
[Figure: the process model with activities a to h, including get detailed motivation letter (c) and reinitiate request (f).]
No modeling needed!
Example Process Discovery
(Dutch housing agency, 208 cases, 5987 events)
Example process discovery for hospital
(627 gynecological oncology patients, 24331 events)
Replay
[Figure: the events of case 432 (a, b, d, e, g) next to the process model with activities a to h.]
process model
event data
desire line
very safe system
Replay
acdeg
[Figure: the trace replayed on the process model with activities a to h.]
Replay
aceg
[Figure: the trace replayed on the process model; a "?" marks the deviation.]
check budget (d) is missing!
Replay
achdeg
[Figure: the trace replayed on the process model; a "?" marks the deviation.]
reject request (h) is impossible
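The three replays can be mimicked with a small state machine. This is a sketch under the assumption that the model works as the figures suggest: a, then one of b or c in parallel with d, then e, then f (loop back), g, or h. It stops at the first deviation, whereas real conformance checking techniques typically continue and quantify how well the whole trace fits. The function name `replay` and the message format are illustrative.

```python
def replay(trace):
    """Replay a trace on the assumed model.

    Returns None if the trace fits, otherwise a short diagnostic string.
    """
    state = "start"
    pending = set()   # obligations still open inside the parallel block
    for position, activity in enumerate(trace):
        # Work out which activities the model allows right now.
        if state == "start":
            enabled = {"a"}
        elif state == "parallel":
            enabled = set()
            if "bc" in pending:
                enabled |= {"b", "c"}
            if "d" in pending:
                enabled.add("d")
            if not pending:
                enabled = {"e"}      # both branches done: decide
        elif state == "decided":
            enabled = {"f", "g", "h"}
        else:                        # state == "end"
            enabled = set()
        if activity not in enabled:
            return (f"{activity!r} at position {position} not allowed; "
                    f"expected one of {sorted(enabled)}")
        # Fire the activity and update the state.
        if activity in ("a", "f"):
            state, pending = "parallel", {"bc", "d"}
        elif activity in ("b", "c"):
            pending.discard("bc")
        elif activity == "d":
            pending.discard("d")
        elif activity == "e":
            state = "decided"
        else:                        # g or h
            state = "end"
    return None if state == "end" else f"trace stops early in state {state!r}"


if __name__ == "__main__":
    for t in ("acdeg", "aceg", "achdeg"):
        print(t, "->", replay(t))
```

For aceg the sketch reports that e is not allowed because d is still expected, and for achdeg it reports that h is not allowed at that point, matching the two diagnostics on the slides.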
Conformance Checking
(WOZ objections Dutch municipality, 745 objections, 9583 events, f = 0.988)
Replay with timestamps
a 9.15, c 9.20, d 9.35, e 10.15, g 11.30
[Figure: the process model annotated with these times and with the minutes elapsed between activities: a to c 5, a to d 20, c to e 55, d to e 40, e to g 75.]
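The elapsed times in this example follow directly from the timestamps. A minimal sketch, assuming the arcs of the model taken by this trace are a to c, a to d, c to e, d to e, and e to g, and that each activity occurs only once in the trace (a trace with loops would need to be indexed by position instead). The names `minutes_between`, `TRACE`, and `ARCS` are illustrative.

```python
from datetime import datetime

# Timestamped trace from the slide: (activity, time of day written as h.mm).
TRACE = [("a", "9.15"), ("c", "9.20"), ("d", "9.35"), ("e", "10.15"), ("g", "11.30")]

# Model arcs taken by this trace (assumed: a splits into c and d,
# both join before e, and e is followed by g).
ARCS = [("a", "c"), ("a", "d"), ("c", "e"), ("d", "e"), ("e", "g")]


def minutes_between(trace, arcs):
    """Return {(source, target): elapsed minutes} for each arc."""
    times = {activity: datetime.strptime(stamp, "%H.%M") for activity, stamp in trace}
    return {(x, y): int((times[y] - times[x]).total_seconds() // 60)
            for x, y in arcs}


if __name__ == "__main__":
    for arc, minutes in minutes_between(TRACE, ARCS).items():
        print(arc, minutes)
```

Aggregating such values over many traces gives the waiting times and delays that replay can project onto the model.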
Replay with timestamps for many traces
[Figure: the process model with activities a to h, annotated with the results of replaying many traces.]
• frequencies of activities
• frequencies of paths
• waiting times and other delays between activities
• durations of activities
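Frequencies of activities and paths can be counted in a few lines once the log is a list of traces. The sketch below uses the 13 traces from the "Play In: Simple process allowing for 4 traces" slide, with the counts per variant as read from that slide; the function name `frequencies` is illustrative, and waiting times and durations are left out because that slide carries no timestamps.

```python
from collections import Counter

# Multiset of traces from the simple Play-In example (13 cases in total).
LOG = ["abdeh"] * 5 + ["adbeh"] * 4 + ["abdeg"] * 2 + ["adbeg"] * 2


def frequencies(log):
    """Count how often each activity and each directly-follows path occurs."""
    activity_counts = Counter()
    path_counts = Counter()
    for trace in log:
        activity_counts.update(trace)                 # one count per event
        path_counts.update(zip(trace, trace[1:]))     # one count per adjacent pair
    return activity_counts, path_counts


if __name__ == "__main__":
    activities, paths = frequencies(LOG)
    print(activities.most_common())
    print(paths.most_common())
```

These counts are what a tool would render as node sizes and arc widths when projecting the log onto the model.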
Performance Analysis Using Replay
(WOZ objections Dutch municipality, 745 objections, 9583 events, f = 0.988)
Overview
[Figure: the “world” (business processes, people, machines, components, organizations) is supported and controlled by a software system, which records events (e.g., messages, transactions, etc.) in event logs. A (process) model models and analyzes the world and specifies, configures, implements, and analyzes the software system. Event logs and the model are connected by discovery, conformance, and enhancement, annotated with Play-In, Play-Out, and Replay.]
Process mining toolbox
Process models can be seen as "process maps"
[Figure: process model from start to end with activities register request, examine thoroughly, examine casually, check ticket, decide, pay compensation, reject request, and reinitiate request.]
What we can learn from maps …
• abstraction: leaving out insignificant roads and towns
• layout: positioning of elements has a clear meaning
• aggregation: smaller entities are amalgamated into larger ones (suburbs and cities)
• size and color: highlight more important entities (e.g. highways have a different color)
Compare process models to maps
[Figure: two process models. The first is the travel request model with activities a to h. The second is a Petri net with places start, c1 to c5, and end over the activities register request, examine thoroughly, examine casually, check ticket, decide, pay compensation, reject request, and reinitiate request. The models are annotated with the questions below.]
• size and color?
• abstraction?
• aggregation?
• layout?
Can we see what matters most?
• metropolis or village?
• highway or dirt road?
[Figure: the process model with activities a to h.]
"the map" does not exist …
Zoom
Subway map
Bicycle map
a map is a view on reality
map ≠ reality
same for process models …
Model provides a view on reality (event data), just like a map!
Multiple views depending on purpose (performance, compliance, training, etc.).
breathing life into process models
otherwise they end up in some drawer …
Project on maps:
• traffic jams
• real estate for sale
• location of trucks/trains
• crime rates
• …
Project on process models:
• bottlenecks
• deviations
• costs
• …
Examples
Not that new …
Charles Minard's 1869 chart showing the number of men in Napoleon's 1812 Russian campaign army, their movements, as well as the temperature they encountered on the return path.
[Chart labels: 100.000, 422.000, 175.000, 10.000, 24.000]
Actively using process models …
What can we learn from navigation devices?
detect
prediction
recommendation
Driven by maps, historic information, and current information.
Flexible: Adapts to circumstances and does not force the driver to take a particular route.
Can your information system do this?
Conclusion
• Process models are like maps!
• Connecting event data and process models!
− better models
− live models
Positioning process mining
[Figure: process mining sits between process science and data science. It connects process model analysis (simulation, verification, optimization, gaming, etc.) with data-oriented analysis (data mining, machine learning, business intelligence), and addresses both performance-oriented and compliance-oriented questions, problems and solutions.]