Business process measurement - data mining
Business process measurement
• Balanced scorecard
Goal-oriented development of the organization's IS: clear and measurable goals, effective solutions, measurable results
[Figure: balanced scorecard with four perspectives (Financial, Customer, Business Process, Learning and Growth). Labels in the figure: ROCE, Customer Loyalty, On-time Delivery, Process Quality, Process Cycle Time, Employees, Systems]
Financial perspective: goals and measures
Customer perspective: goals and measures
Process perspective: goals and measures
Learning and growth perspective: goals and measures
Mission and vision of organization
• Strategic goals
• Main tasks
Example Balanced Scorecard: Regional Airline
Mission: Dedication to the highest quality of Customer Service delivered with a sense of warmth, friendliness, individual pride, and Company Spirit.
Vision: Continue building on our unique position -- the only short haul, low-fare, high-frequency, point-to-point carrier in America.
Chapter 17
Process Mining and Simulation
Moe Wynn, Anne Rozinat, Wil van der Aalst, Arthur ter Hofstede
a university for the real world
© 2009, www.yawlfoundation.org
Overview
• Introduction
• Preliminaries
• Process mining (with ProM)
• Process simulation for operational decision support
• Tools: YAWL, ProM & CPN Tools
Introduction
• Correctness, effectiveness and efficiency of business processes are vital to an organization
• Significant gap between what is prescribed and what actually happens
• Process owners have limited info about what is actually happening
• Model-based (static) analysis
– Validation
– Verification (correctness of a model)
– Performance analysis
• Process Mining: post-execution analysis
• Process Simulation: 'what-if' analysis
Preliminaries: Data Logging
• Keeping track of execution data
– Activities that have been carried out
– Timestamps (start and end times of activities)
– Resources involved
– Data
• Purposes
– Audit trails
– Disaster recovery
– Monitoring
– Data Mining
– Process Mining
– Process Simulation
Preliminaries: Process Mining
• Works on event logs (recorded actual behavior)
• Covers a wide range of techniques
• Provides insights into
– control flow dependencies
– data usage
– resource involvement
– performance related statistics etc.
• Identify problems that cannot be identified by inspecting a static model alone
Preliminaries: Process Simulation
• Develop a simulation model at design time
• Carry out experiments under different assumptions
• Used for process reengineering decisions
• Data input is time-consuming and error-prone
• Requires careful interpretation
– Abstraction of the actual behavior
– Different assumptions made
– Inaccurate or incomplete data input
Process Mining
• Process discovery: "What is really happening?"
• Conformance checking: "Do we do what was agreed upon?"
• Performance analysis: "Where are the bottlenecks?"
• Process prediction: "Will this case be late?"
• Process improvement: "How to redesign this process?"
• Etc.
Example: mining student data
• Process discovery: "What is the real curriculum?"
• Conformance checking: "Do students meet the prerequisites?"
• Performance analysis: "Where are the bottlenecks?"
• Process prediction: "Will a student complete his studies (in time)?"
[Figure: the "world" (business processes, people, machines, components, organizations) is supported/controlled by a software system, which records events (e.g., messages, transactions) in event logs. A (process/system) model models and analyzes the world and specifies, configures, implements, and analyzes the software system. Event logs and models are linked by discovery and conformance.]
Where to start?
[Figure: process life cycle with the phases process design, implementation/configuration, process enactment, and diagnosis, with process control and process mining marked as starting points]
ProM framework
• One of the leading approaches to Process Mining (http://www.processmining.org/)
• Covers a wide range of analysis approaches
• 250+ plug-ins
– Process Discovery
– Social Network
– Conformance Checking
• Conversion capabilities between different formalisms
– Petri nets, EPCs, BPMN, BPEL, YAWL
[Figure: performance analysis view showing throughput time, bottlenecks, and flow time from A to B]
Dotted chart analysis
[Figure: dotted chart of 46138 events. Horizontal axis: time (relative). Vertical axis: cases, ranging from short cases to long cases.]
ProM and YAWL
• YAWL logs workflow events and data attributes
• An extractor function available as a ProMImport plug-in
• ProM can analyze YAWL logs in MXML format
• ProM can transform YAWL models into Petri nets
<Process id="Payment_subprocess.ywl">
  <ProcessInstance id="3f9dfc70-5420-40e7-b9f7-329b5c6f0ded">
    <AuditTrailEntry>
      <WorkflowModelElement>Check_PrePaid_Shipments_10</WorkflowModelElement>
      <EventType>start</EventType>
      <Timestamp>2008-07-08T10:11:18.104+01:00</Timestamp>
      <Originator>JohnsI</Originator>
    </AuditTrailEntry>
    <AuditTrailEntry>
      <Data><Attribute name="PrePaidShipment">true</Attribute></Data>
      <WorkflowModelElement>Check_PrePaid_Shipments_10</WorkflowModelElement>
      <EventType>complete</EventType>
      <Timestamp>2008-07-08T10:11:28.167+01:00</Timestamp>
      <Originator>JohnsI</Originator>
    </AuditTrailEntry>
  </ProcessInstance>
</Process>
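A log fragment like this one can be read with a few lines of Python. The following is only a sketch using the standard library, not the ProM or ProMImport API. The function name read_events and the dictionary keys are illustrative assumptions. It extracts the audit trail entries and computes how long the task took from its start and complete events.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# The MXML fragment from the slide, embedded as a string for the example.
MXML = """<Process id="Payment_subprocess.ywl">
  <ProcessInstance id="3f9dfc70-5420-40e7-b9f7-329b5c6f0ded">
    <AuditTrailEntry>
      <WorkflowModelElement>Check_PrePaid_Shipments_10</WorkflowModelElement>
      <EventType>start</EventType>
      <Timestamp>2008-07-08T10:11:18.104+01:00</Timestamp>
      <Originator>JohnsI</Originator>
    </AuditTrailEntry>
    <AuditTrailEntry>
      <Data><Attribute name="PrePaidShipment">true</Attribute></Data>
      <WorkflowModelElement>Check_PrePaid_Shipments_10</WorkflowModelElement>
      <EventType>complete</EventType>
      <Timestamp>2008-07-08T10:11:28.167+01:00</Timestamp>
      <Originator>JohnsI</Originator>
    </AuditTrailEntry>
  </ProcessInstance>
</Process>"""


def read_events(xml_text):
    """Return one dict per AuditTrailEntry (hypothetical helper, keys are our own)."""
    root = ET.fromstring(xml_text)
    events = []
    for instance in root.iter("ProcessInstance"):
        for entry in instance.iter("AuditTrailEntry"):
            events.append({
                "case": instance.get("id"),
                "task": entry.findtext("WorkflowModelElement"),
                "type": entry.findtext("EventType"),
                "time": datetime.fromisoformat(entry.findtext("Timestamp")),
                "who": entry.findtext("Originator"),
                # Data attributes are optional; an entry without them gets {}.
                "data": {a.get("name"): a.text for a in entry.iter("Attribute")},
            })
    return events


events = read_events(MXML)
start, complete = events
duration = (complete["time"] - start["time"]).total_seconds()
print(f"{start['task']} by {start['who']}: {duration:.3f} s")
```

For the fragment above, the start and complete timestamps are 10.063 seconds apart.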
Starting point: event logs
YAWL logs or other event logs, audit trails, databases, message logs, etc.
unified event log (MXML)
Linking process mining to simulation
• Gather process statistics using process mining techniques
• Calibrate simulation experiments with this data
• Analyze simulation logs in the same way as execution logs
Data sources for process characteristics
• Design (Workflow and Organizational Models)
– Control and data flow
– Organizational model
– Initial data values
– Role assignments
• Historical (Event logs)
– Data value range distributions
– Execution time distributions
– Case arrival rate
– Resource availability patterns
• State (Workflow system)
– Progress state
– Data values for running cases
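To make the "Historical" items concrete, here is a rough sketch of how execution times and the case arrival pattern could be derived from an event log. The log entries, task name, and function names are made up for illustration. A real analysis would fit distributions rather than report plain means.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical events: (case id, task, event type, ISO timestamp).
LOG = [
    ("c1", "Check", "start",    "2008-07-08T10:00:00"),
    ("c1", "Check", "complete", "2008-07-08T10:10:00"),
    ("c2", "Check", "start",    "2008-07-08T12:00:00"),
    ("c2", "Check", "complete", "2008-07-08T12:20:00"),
    ("c3", "Check", "start",    "2008-07-08T16:00:00"),
    ("c3", "Check", "complete", "2008-07-08T16:30:00"),
]


def execution_times(log):
    """Minutes between the start and complete event of each task instance."""
    started, durations = {}, defaultdict(list)
    for case, task, kind, stamp in log:
        t = datetime.fromisoformat(stamp)
        if kind == "start":
            started[(case, task)] = t
        elif kind == "complete" and (case, task) in started:
            delta = t - started.pop((case, task))
            durations[task].append(delta.total_seconds() / 60)
    return durations


def mean_interarrival_hours(log):
    """Mean gap in hours between the first events of consecutive cases."""
    first_seen = {}
    for case, _task, _kind, stamp in log:
        t = datetime.fromisoformat(stamp)
        first_seen[case] = min(first_seen.get(case, t), t)
    arrivals = sorted(first_seen.values())
    gaps = [(b - a).total_seconds() / 3600 for a, b in zip(arrivals, arrivals[1:])]
    return mean(gaps)


mean_exec = {task: mean(v) for task, v in execution_times(LOG).items()}
gap = mean_interarrival_hours(LOG)
print(mean_exec, gap)
```

With the made-up log, the mean execution time of "Check" is 20 minutes and a new case arrives every 3 hours on average. Values like these would then be used to calibrate the simulation model.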
Architecture II
• YAWL
– Create and execute process models
– Maintain organizational models
– Extractor functionalities for event logs, organizational models and current state of the workflow system
• ProM
– Translate and integrate all the components into a Petri net model
– Analyze event logs and simulation logs
• CPN Tools
– Run simulation experiments
– Incorporate current state of workflows
Tool: Architecture
• Use existing models
Tool: Architecture II
• Use existing models
• Derive parameters
Tool: Architecture III
• Use existing models
• Derive parameters
• Consider current state
Tool: Architecture IV
• Use existing models
• Derive parameters
• Consider current state
• Simulation logs in MXML
Simulation: Example
[Figure: YAWL model "Payment", from Start to End.
Tasks: Issue Shipment Invoice, Check Pre-paid Shipments, Issue Shipment Remittance Advice, Issue Shipment Payment Order, Approve Shipment Payment Order, Update Shipment Payment Order, Process Shipment Payment, Produce Freight Invoice, Process Freight Payment, Issue Credit Adjustment, Issue Debit Adjustment, Check Invoice Requirement, Complete Invoice Requirement, Finalise.
Branch conditions: [pre-paid shipments] / [else]; [s. order approved] / [s. order not approved]; [payment correct] / [payment incorrect due to overcharge] / [payment incorrect due to underpayment]; [Invoice required] / [else].
Annotations: payment for the shipment; payment for the freight; customer notified of the payment, customer makes the payment; account settled.
Roles attached to tasks (prefixed s:, c:, or o:): Supply Admin Officer, Finance Officer, Senior Finance Officer, Account Manager.]
Simulation: Example
• 13 staff members
– 5 'supply admin officers'
– 3 'finance officers'
– 2 'senior finance officers'
– 3 'account managers'
• Case arrival rate: 50 payments per week
• Throughput time: 5 working days on average
• 30% of shipments are pre-paid
• 50% of orders are approved first-time
• 20% of payments are underpaid
• 10% of payments are overpaid
• 70% of payments are correct
• 80% of orders require invoices
• 20% of orders do not require invoices
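A quick consistency check of these figures can be sketched in Python. The use of Little's law (average work in progress = arrival rate × throughput time) is an addition here, not something the slides state. It assumes a steady state and a 5-day working week.

```python
# Figures from the slide.
staff = {
    "supply admin officers": 5,
    "finance officers": 3,
    "senior finance officers": 2,
    "account managers": 3,
}
total_staff = sum(staff.values())

arrival_rate_per_week = 50          # payments per week
throughput_time_weeks = 5 / 5       # 5 working days, assuming a 5-day week

# Little's law (our assumption): L = lambda * W, valid in steady state.
work_in_progress = arrival_rate_per_week * throughput_time_weeks

# Branching percentages must add up to 100 for each decision point.
payment_outcomes = {"underpaid": 20, "overpaid": 10, "correct": 70}
invoice_requirement = {"required": 80, "not required": 20}

print(total_staff, work_in_progress)
print(sum(payment_outcomes.values()), sum(invoice_requirement.values()))
```

Under these assumptions, about 50 payments are in progress at any time, and the staff counts and branching percentages are internally consistent.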
Simulation: Scenario
• 4 weeks until the end of the financial year
• A backlog of 30 payments (some for more than a week)
• Goal: all payments to be processed in 4 weeks' time
• Run simulation experiments to
– see if the backlog can be cleared using current resources
– evaluate the effect of avoiding underpayments
Four Scenarios
1. An empty initial state ('empty')
2. After loading the current state file with the 30 applications currently in the system ('as is')
3. After loading the current state file but adding 13 extra resources ('to be A')
4. After loading the current state file but changing the model so that underpayments are no longer possible ('to be B')
Simulation for operational decision support
• Combine the real process execution log ('up to now') and the simulation log (which simulates the future 'from now on')
• Look at the process execution in a unified manner
• Track both the history and the future of current cases
Alpha algorithm
Process log
• Minimal information in log: case ids and task ids.
• Additional information: event type, time, resources, and data.
• In this log there are three possible sequences: ABCD, ACBD, and EF.
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D
>, →, ||, # relations
• Direct succession: x>y iff for some case x is directly followed by y.
• Causality: x→y iff x>y and not y>x.
• Parallel: x||y iff x>y and y>x.
• Choice: x#y iff not x>y and not y>x.
Relations derived from the log:
A>B  A>C  B>C  B>D  C>B  C>D  E>F
A→B  A→C  B→D  C→D  E→F
B||C  C||B
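The four relations can be computed mechanically from the log. The sketch below does this in Python for the example log. The function and variable names are our own, and the code follows the definitions on this slide rather than any particular tool.

```python
from itertools import product

# The example log as (case, task) pairs, in the order the events were recorded.
EVENTS = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"),
    (2, "C"), (4, "A"), (2, "B"), (2, "D"), (5, "E"), (4, "C"),
    (1, "D"), (3, "C"), (3, "D"), (4, "B"), (5, "F"), (4, "D"),
]


def traces(events):
    """Group events by case, keeping their order within each case."""
    by_case = {}
    for case, task in events:
        by_case.setdefault(case, []).append(task)
    return by_case


def relations(events):
    """Return the sets of pairs for >, causality, parallel (||) and choice (#)."""
    direct = set()
    for trace in traces(events).values():
        direct.update(zip(trace, trace[1:]))      # x > y
    tasks = sorted({task for _case, task in events})
    causal, parallel, choice = set(), set(), set()
    for x, y in product(tasks, repeat=2):
        if (x, y) in direct and (y, x) in direct:
            parallel.add((x, y))                  # x || y
        elif (x, y) in direct:
            causal.add((x, y))                    # x -> y
        elif (y, x) not in direct:
            choice.add((x, y))                    # x # y
    return direct, causal, parallel, choice


direct, causal, parallel, choice = relations(EVENTS)
print("traces:  ", {c: "".join(t) for c, t in traces(EVENTS).items()})
print("causal:  ", sorted(causal))
print("parallel:", sorted(parallel))
```

Pairs such as (A, D), which never follow each other directly in either order, end up in the choice relation.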
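Basic idea (1)
x→y
[Figure: x and y connected through a place]
Basic idea (2)
x→y, x→z, and y||z
[Figure: AND-split; after x, both y and z are executed in parallel]
Basic idea (3)
x→y, x→z, and y#z
[Figure: XOR-split; after x, either y or z is executed]
Basic idea (4)
x→z, y→z, and x||y
[Figure: AND-join; z follows after both x and y]
Basic idea (5)
x→z, y→z, and x#y
[Figure: XOR-join; z follows after either x or y]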
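It is not that simple: Basic alpha algorithm
Let $W$ be a workflow log over $T$. $\alpha(W)$ is defined as follows.
1. $T_W = \{\, t \in T \mid \exists_{\sigma \in W}\; t \in \sigma \,\}$,
2. $T_I = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{first}(\sigma) \,\}$,
3. $T_O = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{last}(\sigma) \,\}$,
4. $X_W = \{\, (A,B) \mid A \subseteq T_W \wedge B \subseteq T_W \wedge \forall_{a \in A} \forall_{b \in B}\; a \rightarrow_W b \wedge \forall_{a_1,a_2 \in A}\; a_1 \#_W a_2 \wedge \forall_{b_1,b_2 \in B}\; b_1 \#_W b_2 \,\}$,
5. $Y_W = \{\, (A,B) \in X_W \mid \forall_{(A',B') \in X_W}\; A \subseteq A' \wedge B \subseteq B' \Rightarrow (A,B) = (A',B') \,\}$,
6. $P_W = \{\, p_{(A,B)} \mid (A,B) \in Y_W \,\} \cup \{ i_W, o_W \}$,
7. $F_W = \{\, (a, p_{(A,B)}) \mid (A,B) \in Y_W \wedge a \in A \,\} \cup \{\, (p_{(A,B)}, b) \mid (A,B) \in Y_W \wedge b \in B \,\} \cup \{\, (i_W, t) \mid t \in T_I \,\} \cup \{\, (t, o_W) \mid t \in T_O \,\}$, and
8. $\alpha(W) = (P_W, T_W, F_W)$.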
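The eight steps translate almost line by line into code. The following Python sketch is meant as an illustration, not as the ProM implementation. It enumerates all subsets of tasks, so it is only usable for small logs. It also restricts A and B to non-empty sets, which is an assumption added here. Traces are given as strings of single-letter task names, and the place naming scheme is our own.

```python
from itertools import chain, combinations


def alpha(traces):
    """Return (P_W, T_W, F_W) for a list of non-empty traces."""
    direct = {pair for t in traces for pair in zip(t, t[1:])}        # x > y
    causal = {(x, y) for (x, y) in direct if (y, x) not in direct}   # x -> y

    def choice(x, y):                                                # x # y
        return (x, y) not in direct and (y, x) not in direct

    t_w = sorted({task for t in traces for task in t})               # step 1
    t_i = {t[0] for t in traces}                                     # step 2
    t_o = {t[-1] for t in traces}                                    # step 3

    # Non-empty task sets whose members are pairwise in the # relation.
    subsets = chain.from_iterable(
        combinations(t_w, n) for n in range(1, len(t_w) + 1))
    groups = [frozenset(s) for s in subsets
              if all(choice(a, b) for a in s for b in s)]

    # Step 4: every a in A is causally followed by every b in B.
    x_w = [(a, b) for a in groups for b in groups
           if all((x, y) in causal for x in a for y in b)]
    # Step 5: keep only the maximal pairs.
    y_w = [(a, b) for (a, b) in x_w
           if not any((a, b) != (c, d) and a <= c and b <= d
                      for (c, d) in x_w)]

    def place(a, b):
        return "p({%s},{%s})" % (",".join(sorted(a)), ",".join(sorted(b)))

    p_w = {place(a, b) for a, b in y_w} | {"i_W", "o_W"}             # step 6
    f_w = {(x, place(a, b)) for a, b in y_w for x in a}              # step 7
    f_w |= {(place(a, b), y) for a, b in y_w for y in b}
    f_w |= {("i_W", t) for t in t_i}
    f_w |= {(t, "o_W") for t in t_o}
    return p_w, set(t_w), f_w                                        # step 8


places, transitions, arcs = alpha(["ABCD", "ACBD", "EF"])
for p in sorted(places):
    print(p)
```

For the three sequences ABCD, ACBD, and EF, the sketch yields one place per causal pair plus the source place i_W and the sink place o_W.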
Example
[Figure: the example log W (cases 1 to 5 over tasks A to F) and the Petri net α(W) discovered from it]
Alpha algorithm
[Figure: process model discovered from a log with 48 cases and 16 performers. Tasks: invite reviewers, get review 1, time-out 1, get review 2, time-out 2, get review 3, time-out 3, collect reviews, decide, accept, reject, invite additional reviewer, get review X, time-out X]
Logging system
• NLog
• NLog can process diagnostic messages emitted from any .NET language (such as C# or Visual Basic), augment them with contextual information (such as date/time, severity, thread, process, environment), format them according to your preference and send them to one or more targets
Supported targets
• Files - single file or multiple, with automatic file naming and archival
• Event Log - local or remote
• Database - store your logs in databases supported by .NET
• Network - using TCP, UDP, SOAP, MSMQ protocols
• Command-line console - including color coding of messages
• E-mail - you can receive emails whenever application errors occur
• ASP.NET trace
Conclusions
• Introduction
– Concise assessment of reality needed for processes
• Preliminaries
– Data logging, Process Mining, Process Simulation
• Process mining with ProM
– Understanding process characteristics
• Process simulation
– Operational decision support
– Utilizing log info for simulation experiments
• Tools: YAWL, ProM & CPN Tools
– Payment example