• No results found

Chapter 4 Getting the Data

N/A
N/A
Protected

Academic year: 2021

Share "Chapter 4 Getting the Data"

Copied!
24
0
0

Loading.... (view fulltext now)

Full text

(1)

Chapter 4

Getting the Da

prof.dr.ir. Wil van der Aals

(2)

Overview

Part I: Preliminaries

Chapter 2

Process Modeling and

Analysis

Chapter 3

Data Mining

Part II: From Event Logs to Process Models

Chapter 4

Getting the Data

Chapter 5

Process Discovery: An

Introduction

Chapter 6

Advanced Process

Discovery Techniques

Part III: Beyond Process Discovery

Chapter 7

Conformance

Checking

Chapter 8

Mining Additional

Perspectives

Chapter 9

Operational Support

Part IV: Putting Process Mining to Work

Chapter 10

Tool Support

Chapter 11

Analyzing “Lasagna

Processes”

Chapter 12

Analyzing “Spaghetti

Processes”

Part V: Reflection

Chapter 13

Chapter 14

Chapter 1

Introduction

(3)

Goal of process mining

What really happened in the past?

Why did it happen?

What is likely to happen in the future?

When and why do organizations and people deviate?

How to control a process better?

How to redesign a process to improve its

performance?

(4)

Getting the data

software

system

(process)

model

event

logs

models

analyzes

discovery

records

events, e.g.,

messages,

transactions,

etc.

specifies

configures

implements

analyzes

supports/

controls

enhancement

conformance

“world”

people

machines

organizations

components

business

processes

(5)

From heterogeneous data sources to

process mining results

PAGE 4

data

source

data

source

data

source

data

source

data

source

data

warehouse

extract

ELT

ELT

unfiltered event logs

filtered event logs

(process) models

answers

discovery

conformance

enhancement

process mining

ELT

filter

coarse-grained

scoping

fine-grained

scoping

optional

Extract, Transform,

and Load (ELT)

XES, MXML, or

similar

(6)

Example log

A process consists of

cases

.

A case consists of

events

such that each

event relates to precisely

one case.

Events within a case are

ordered

.

Events can have

attributes

.

Examples of typical

attribute names are

activity

,

time

,

costs

, and

(7)

Another view

PAGE 6

process

cases

events

...

attributes

activity= register request

time = 30-12-2010:11.02

resource = Pete

costs = 50

activity= reject request

time = 07-01-2011:14.24

resource = Pete

costs = 200

35654423

35654427

35654424

...

activity= register request

time = 30-12-2010:11.32

resource = Mike

costs = 50

activity= pay compensation

time = 08-01-2011:12.05

resource = Ellen

costs = 200

35654483

35654489

35654485

...

...

1

2

...

activity= register request

time = 30-12-2010:11.32

resource = Pete

costs = 50

activity= pay compensation

time = 15-01-2011:10.45

resource = Ellen

costs = 200

35654521

35654533

35654522

...

3

...

...

...

(8)

Standard transactional life-cycle model

schedule

assign

reassign

suspend

resume

start

autoskip

manualskip

complete

abort_case

abort_activity

withdraw

successful

termination

unsuccessful

termination

(9)

Five activity instances

PAGE 8

schedule assign

start

complete

schedule assign reassign

start

suspend resume complete

a:

b:

start

suspend resume

c:

suspend abort_activity

d:

e:

start

complete

complete

schedule

assign

reassign

suspend

resume

start

autoskip

manualskip

complete

abort_case

abort_activity

withdraw

successful

termination

unsuccessful

termination

(10)

Overlapping activity instances

complete

a:

start

start

complete

complete

a:

start

start

complete

5 hours

6 hours

2 hours

9 hours

schedule

assign

reassign

suspend

resume

start

autoskip

manualskip

complete

abort_case

abort_activity

(11)

Using attributes

PAGE 10

a

start

register

request

b

examine

thoroughly

c

examine

casually

d

check ticket

decide

pay

compensation

reject

request

reinitiate

request

e

g

h

f

end

c1

c2

c3

c4

c5

A

A

A

A

A

E

M

M

Pete

Mike

Ellen

Sue

Sean

Sara

social network showing how work

flows from one person to another

Activity b

Frequency: 456

Waiting time: 15.6 +/- 2.5 hours

Service time: 1.2 +/- 0.5 hours

Costs: 412 +/- 55 euros

performance indicators per activity

Activity g

Frequency: 311

Waiting time: 12.4 +/- 2.1 hours

Service time: 0.5 +/- 0.2 hours

Costs: 198 +/- 35 euros

Activity h

Frequency: 407

Waiting time: 7.4 +/- 1.8 hours

Service time: 1.1 +/- 0.3 hours

Costs: 209 +/- 38 euros

control flow

(12)

XES (eXtensible Event Stream)

See www.xes-standard.org.

Adopted by the IEEE Task Force on Process Mining.

Predecessor: MXML and SA-MXML.

The format is supported by tools such as ProM (as of

version 6), Nitro, XESame, and OpenXES.

(13)
(14)

XES (compatible with MXML)

Event log consists of:

traces (process instances)

− events

Standard extensions:

concept (for naming)

lifecycle (for transactional properties)

org (for the organizational perspective)

time (for timestamps)

(15)

PAGE 14

extensions

loaded

every trace

has a name

every event has a

name and a transition

classifier = name + transition

start of trace (i.e.

process instance)

name of trace

name of event

(activity name)

resource

transition

timestamp

(16)

start of trace

name of trace

name of event (activity name)

resource

data associated to event

timestamp

end of trace (i.e.

process instance)

(17)

Challenges when extracting event logs

Correlation

: Events in an event log are grouped per

case. This simple requirement can be quite challenging

as it requires event correlation, i.e., events need to be

related to each other.

Timestamps

: Events need to be ordered per case.

Typical problems: only dates, different clocks, delayed

logging.

Snapshots

. Cases may have a lifetime extending beyond

the recorded period, e.g., a case was started before the

beginning of the event log.

Scoping

. How to decide which tables to incorporate?

Granularity

: the events in the event log are at a different

level of granularity than the activities relevant for end

(18)

Flattening reality into event logs

Order

Customer : CustID

Amount : Euro

Created : DateTime

Paid : DateTime

Completed : DateTime

Orderline

Product : ProdID

NofItems : PosInt

TotalWeight : Weight

Entered : DateTime

BackOrdered : DateTime

Secured : DateTime

Delivery

DelAddress : Address

Attempt

When : DateTime

1

1..*

0..*

1

0..1

1..*

OrderID : OrderID

OrderID : OrderID

OrderLineID : OrderLineID

DelID : DelID

DelID : DelID

(19)

Tables

PAGE 18

Order

Customer : CustID

Amount : Euro

Created : DateTime

Paid : DateTime

Completed : DateTime

Orderline

Product : ProdID

NofItems : PosInt

TotalWeight : Weight

Entered : DateTime

BackOrdered : DateTime

Secured : DateTime

Delivery

DelAddress : Address

Contact : PhoneNo

Attempt

When : DateTime

Successful : Bool

1

1..*

0..*

1

0..1

1..*

OrderID : OrderID

OrderID : OrderID

OrderLineID : OrderLineID

DelID : DelID

DelID : DelID

(20)

Order instance

Case id: 91245 Activity: create order Timestamp: 28-11-2011:08.12

Customer: John Amount: 100

Order:91245

Case id: 91245 Activity: pay order Timestamp: 02-12-2011:13.45

Customer: John Amount: 100

Case id: 91245 Activity: complete order Timestamp: 05-12-2011:11.33

Customer: John Amount: 100

OrderLine:112345

Case id: 91245 Activity: enter order line Timestamp: 28-11-2011:08.13 OrderLineID: 112345 Product: iPhone 4G NofItems: 1 TotalWeight: 0.250 DellID: 882345 Case id: 91245 Activity: secure order line Timestamp: 28-11-2011:08.55 OrderLineID: 112345 Product: iPhone 4G NofItems: 1 TotalWeight: 0.250 DellID: 882345

Delivery:882345

Attempt:882345-1

Case id: 91245 Activity: delivery attempt Timestamp: 05-12-2011:08.55 DellID: 882345 Successful: false DelAddress: 5513VJ-22a Contact: 0497-2553660

Attempt:882345-2

Case id: 91245 Activity: delivery attempt Timestamp: 06-12-2011:09.12 DellID: 882345 Successful: false DelAddress: 5513VJ-22a Contact: 0497-2553660

Attempt:882345-3

Case id: 91245 Activity: delivery attempt Timestamp: 07-12-2011:08.56 DellID: 882345 Successful: true DelAddress: 5513VJ-22a Contact: 0497-2553660

OrderLine:112346

Case id: 91245 Activity: enter order line Timestamp: 28-11-2011:08.14

OrderLineID: 112346 Product: iPod nano

NofItems: 2 TotalWeight: 0.300

DellID: 882346

Case id: 91245 Activity: create backorder Timestamp: 28-11-2011:08.55

OrderLineID: 112346 Product: iPod nano

NofItems: 2 TotalWeight: 0.300

DellID: 882346

Case id: 91245 Activity: secure order line Timestamp: 30-11-2011:09.06

OrderLineID: 112346 Product: iPod nano

NofItems: 2 TotalWeight: 0.300

DellID: 882346

OrderLine:112347

Case id: 91245 Activity: enter order line Timestamp: 28-11-2011:08.15

OrderLineID: 112347 Product: iPod classic

NofItems: 1 TotalWeight: 0.200

DellID: 882345

Case id: 91245 Activity: secure order line Timestamp: 29-11-2011:10.06

OrderLineID: 112347 Product: iPod classic

NofItems: 1 TotalWeight: 0.200 DellID: 882345

Delivery:882346

Attempt:882346-1

Case id: 91245

Order

Customer : CustID

Amount : Euro

Created : DateTime

Paid : DateTime

Completed : DateTime

Orderline

Product : ProdID

NofItems : PosInt

TotalWeight : Weight

Entered : DateTime

BackOrdered : DateTime

Secured : DateTime

Delivery

DelAddress : Address

Contact : PhoneNo

Attempt

When : DateTime

Successful : Bool

1

1..*

0..*

1

0..1

1..*

OrderID : OrderID

OrderID : OrderID

OrderLineID : OrderLineID

DelID : DelID

DelID : DelID

(21)

PAGE 20

Case id: 91245 Activity: create order Timestamp: 28-11-2011:08.12

Customer: John Amount: 100

Order:91245

Case id: 91245 Activity: pay order Timestamp: 02-12-2011:13.45

Customer: John Amount: 100

Case id: 91245 Activity: complete order Timestamp: 05-12-2011:11.33

Customer: John Amount: 100

OrderLine:112345

Case id: 91245 Activity: enter order line Timestamp: 28-11-2011:08.13 OrderLineID: 112345 Product: iPhone 4G NofItems: 1 TotalWeight: 0.250 DellID: 882345 Case id: 91245 Activity: secure order line Timestamp: 28-11-2011:08.55 OrderLineID: 112345 Product: iPhone 4G NofItems: 1 TotalWeight: 0.250 DellID: 882345

Delivery:882345

Attempt:882345-1

Case id: 91245 Activity: delivery attempt Timestamp: 05-12-2011:08.55 DellID: 882345 Successful: false DelAddress: 5513VJ-22a Contact: 0497-2553660

Attempt:882345-2

Case id: 91245 Activity: delivery attempt Timestamp: 06-12-2011:09.12 DellID: 882345 Successful: false DelAddress: 5513VJ-22a Contact: 0497-2553660

Attempt:882345-3

Case id: 91245 Activity: delivery attempt Timestamp: 07-12-2011:08.56 DellID: 882345 Successful: true DelAddress: 5513VJ-22a Contact: 0497-2553660

OrderLine:112346

Case id: 91245 Activity: enter order line Timestamp: 28-11-2011:08.14

OrderLineID: 112346 Product: iPod nano

NofItems: 2 TotalWeight: 0.300

DellID: 882346

Case id: 91245 Activity: create backorder Timestamp: 28-11-2011:08.55

OrderLineID: 112346 Product: iPod nano

NofItems: 2 TotalWeight: 0.300

DellID: 882346

Case id: 91245 Activity: secure order line Timestamp: 30-11-2011:09.06

OrderLineID: 112346 Product: iPod nano

NofItems: 2 TotalWeight: 0.300

DellID: 882346

OrderLine:112347

Case id: 91245 Activity: enter order line Timestamp: 28-11-2011:08.15

OrderLineID: 112347 Product: iPod classic NofItems: 1 TotalWeight: 0.200

DellID: 882345

Case id: 91245 Activity: secure order line Timestamp: 29-11-2011:10.06

OrderLineID: 112347 Product: iPod classic NofItems: 1 TotalWeight: 0.200 DellID: 882345

Delivery:882346

Attempt:882346-1

Case id: 91245 Activity: delivery attempt Timestamp: 05-12-2011:08.43

DellID: 882346 Successful: true DelAddress: 5513XG-45

(22)

Orderline instance

Case id: 112345

Activity: create order

Timestamp: 28-11-2011:08.12

Customer: John

Amount: 100

Order:91245

Case id: 112345

Activity: pay order

Timestamp: 02-12-2011:13.45

Customer: John

Amount: 100

Case id: 112345

Activity: complete order

Timestamp: 05-12-2011:11.33

Customer: John

Amount: 100

OrderLine:112345

Case id: 112345

Activity: enter order line

Timestamp: 28-11-2011:08.13

OrderLineID: 112345

Product: iPhone 4G

NofItems: 1

TotalWeight: 0.250

DellID: 882345

Case id: 112345

Activity: secure order line

Timestamp: 28-11-2011:08.55

OrderLineID: 112345

Product: iPhone 4G

NofItems: 1

TotalWeight: 0.250

DellID: 882345

Delivery:882345

Attempt:882345-1

Case id: 112345

Activity: delivery attempt

Timestamp: 05-12-2011:08.55

Attempt:882345-2

Case id: 112345

Activity: delivery attempt

Timestamp: 06-12-2011:09.12

Attempt:882345-3

Case id: 112345

Activity: delivery attempt

Timestamp: 07-12-2011:08.56

Order

Customer : CustID

Amount : Euro

Created : DateTime

Paid : DateTime

Completed : DateTime

Orderline

Product : ProdID

NofItems : PosInt

TotalWeight : Weight

Entered : DateTime

BackOrdered : DateTime

Secured : DateTime

Delivery

DelAddress : Address

Contact : PhoneNo

Attempt

When : DateTime

Successful : Bool

1

1..*

0..*

1

0..1

1..*

OrderID : OrderID

OrderID : OrderID

OrderLineID : OrderLineID

DelID : DelID

DelID : DelID

(23)

Other examples

The life cycles of reviewers, authors, papers,

reviews, PC chairs, etc.

The life cycles of job applications and vacancies.

X-ray machine logs: machine, machine day, patient,

treatment, routine, etc.?

Therefore, the selection and scoping of instances is

needed.

Like making deciding on the elements to be put on

map; there may be many maps covering partially

overlapping areas.

(24)

Extracting event logs

Not just a syntactical issue.

Different views are possible.

Important:

− Selecting the right instance notion.

− Ordering of events.

− Selection of events.

References

Related documents

In fact, it can be plausibly argued that it is because of the need for financial and welfare support from the adult children to parents and grandparents that explain the observed

The image in Quarles’s Emblemes which was significantly altered was Emblem V.14, depicting Amor Divinus, retaining his wings and halo, on a heavenly throne, crowned and seated in

Reduced capital recovery allowances and other changes in the 1986 Act will combine to raise average effective tax rates to 36 percent by 1990, compared with 31 percent in the first

On December 3, on the occasion of the International Day of People with disabilities, the Disability Research and Capacity Development Center (DRD) collaborated with the British

“The journalist asked: ‘Will global warming create more superstorms?”’ TURN THE FOLLOWING SENTENCE INTO REPORTED SPEECH:.. “The journalist asked: ‘Will global warming create

As a relatively unknown historical actor and a free man of color from colonial La Española (modern-day Dominican Republic), his story is leveraged to help Dominican and Latinx

In some cases, other countries preference for ‘Substantial Ownership and Effective Control by home country nationals’ as per US’s open skies model is important to Air New

Roll, Josh Frank, "Bicycle Traffic Count Factoring: An Examination of National, State and Locally Derived Daily Extrapolation Factors" (2013).. Dissertations