Process mining is increasingly proving itself to be a Swiss Army knife for a wide range of analysis tasks related to business processes (and beyond).
One of the perhaps trivial, but always required, outcomes is a set of KPIs describing the overall process performance, together with the ability to cluster the process traces according to their characteristics.
Consider the classical purchase-to-pay area. I would like statistical information about the elapsed time required to process a purchase order from entry to approval. This is a KPI (the number of days from order entry to order approval) that can be calculated from the event timestamps in each trace, something not easily obtained from a simple report in your ERP system.
Another example is classifying an order according to its amount: up to 1,000 EUR as a ‘low amount order’, up to 20,000 EUR as a ‘middle amount order’, and so on. This can also be easily obtained from a standard report, but having this information within your process log gives you the ability to relate it to the specific process variants.
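As a toy illustration (my own sketch, not the library's API; the thresholds mirror the example above, and the “released-value” attribute appears in the log shown later in this post), such a classification could look like this:

# Toy sketch: classify a trace by order amount, using the thresholds
# from the example above. The class labels are illustrative.
def classify_amount(trace_attributes):
    amount = trace_attributes.get("released-value", 0.0)
    if amount <= 1000:
        return "low amount order"
    elif amount <= 20000:
        return "middle amount order"
    return "high amount order"

print(classify_amount({"released-value": 48300.0}))  # -> "high amount order"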
In several situations, my team and I have calculated these KPIs within a business intelligence tool in order to provide insights into the process under analysis.
But I believe it is useful to have each calculated KPI stored in the log itself, as a trace, event, or log attribute.
In this way, the attribute can be used within any analysis tool we choose to process the log.
To this end, I have developed a library that performs some basic calculations on events and traces and adds the results as attributes to them.
Here are some examples:
calc = logcalc.LogCalculator(log)
tl = calc.calculateElapsed("Purchase Order create", "Vendor invoice enter", "elapsed-invoice", stop_at_first=True, unit='days')
The ‘calc’ variable contains the formula processor for the specific event log.
The ‘calculateElapsed’ method, for example, calculates the elapsed time, in days, from the first “Purchase Order create” activity in each trace to the first “Vendor invoice enter” in the same trace. The result is stored in each trace as a new attribute, “elapsed-invoice”.
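For readers who want to experiment without the library, here is a minimal sketch of the same idea built directly on pm4py; the function name and the file name are mine, not part of the library described here:

# Minimal sketch (not the library's code) of an elapsed-time
# calculation on an XES log, using pm4py. File name is a placeholder.
from pm4py.objects.log.importer.xes import importer as xes_importer

log = xes_importer.apply("purchase_orders.xes")

def add_elapsed_days(log, start_activity, end_activity, attr_name):
    # Store on each trace the days from the first start_activity
    # to the first end_activity occurring after it.
    for trace in log:
        start_ts, end_ts = None, None
        for event in trace:
            if start_ts is None and event["concept:name"] == start_activity:
                start_ts = event["time:timestamp"]
            elif start_ts is not None and end_ts is None \
                    and event["concept:name"] == end_activity:
                end_ts = event["time:timestamp"]
        if start_ts is not None and end_ts is not None:
            trace.attributes[attr_name] = (end_ts - start_ts).days

add_elapsed_days(log, "Purchase Order create", "Vendor invoice enter",
                 "elapsed-invoice")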
The result in the processed log is the following:
<string key="concept:name" value="PO4500003323"/>
<string key="order-date" value="2010-10-12"/>
<string key="phase" value="RINV"/>
<string key="vendor" value="0000102303"/>
<string key="purchasing-org" value="0A01"/>
<float key="released-value" value="48300.0"/>
<string key="po-type" value="ZLDO"/>
<string key="gr-date" value="2010-10-31"/>
<int key="elapsed-invoice" value="41"/>
<string key="concept:name" value="Purchase Order create"/>
The same kind of information can be seen using a popular process analysis tool, Disco:
Other available functions cover the area of temporal logic, enriched with a few details that are relevant in the real world. Consider the “eventually follows” rule: it states that, if activity A is executed within a process, activity B may only be executed at a later time. So the rule is not fulfilled if:
- Activity B is executed before activity A
- Activity B is executed, but activity A is not
In the real world, the rule may be stated as: “the vendor invoice may be entered only after the order entry”. The rationale of this rule is to check that an order is not entered just to “fix” the situation where an invoice is received without a purchase order having been entered. But in the real world, this also means that the rule should be considered violated if the invoice is entered a very short time after the order entry.
So, our rule-checking routine has the following signature:
calc.checkEventuallyFollows("Purchase Order create", "Vendor invoice enter", "invoice-follows-po", min_delta=10)
The min_delta parameter is the real-world addition to the purely theoretical formulation: it means we expect the “Vendor invoice enter” activity to be performed at least 10 days after the order creation. The routine also supports different units of measure for time, such as months or minutes.
The result of this method is a boolean attribute named invoice-follows-po, added to each trace with the result of the check. Different attribute types for the result, or custom success/failure values, may also be used.
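Continuing the pm4py sketch above, a possible (simplified, my own) implementation of this check with a minimum delay could look like this:

# Sketch (mine, not the library's code) of an "eventually follows"
# check with a real-world minimum delay, on a pm4py event log.
from datetime import timedelta

def check_eventually_follows(log, act_a, act_b, attr_name, min_delta=0):
    # Mark each trace True when every act_b occurrence happens at least
    # min_delta days after the first act_a occurrence, False otherwise.
    for trace in log:
        first_a, fulfilled = None, True
        for event in trace:
            name, ts = event["concept:name"], event["time:timestamp"]
            if name == act_a and first_a is None:
                first_a = ts
            elif name == act_b:
                if first_a is None or ts - first_a < timedelta(days=min_delta):
                    fulfilled = False  # B without A, B before A, or B too soon
                    break
        trace.attributes[attr_name] = fulfilled

check_eventually_follows(log, "Purchase Order create", "Vendor invoice enter",
                         "invoice-follows-po", min_delta=10)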
A process log file in XES format can also be analyzed visually in Power BI (or directly using Python-based visualizations), using as a data source a Python script that converts the XES file into a number of dataframes.
In this picture, the value of the “invoice-follows-po” conformance check is compared to the average values of purchase orders in a bubble chart.
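The conversion script itself can be quite small; here is a possible sketch using pm4py's dataframe conversion (file name and column selection are illustrative), which Power BI can run as a Python data source:

# Sketch of a Power BI "Python script" data source: flatten the enriched
# XES log into pandas dataframes. Power BI exposes every top-level
# dataframe defined by the script. File name is a placeholder.
import pm4py

log = pm4py.read_xes("enriched_purchase_orders.xes")
events = pm4py.convert_to_dataframe(log)  # one row per event; trace attributes are prefixed "case:"

# One row per trace, keeping the enriched KPI / conformance attributes.
traces = events.drop_duplicates("case:concept:name")[
    ["case:concept:name", "case:elapsed-invoice", "case:invoice-follows-po"]
]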
The availability of a number of calculation functions operating on an XES process log makes it possible to enrich the log with attributes that were not available in the original data, such as:
- KPIs, allowing dimensional analysis of the process dynamics (total quantities, number of events, elapsed times, etc.)
- Conformance checking results (temporal logic behaviour, role correctness, segregation of duties, and so on)
This step allows fast analysis of the log using traditional business intelligence tools, which can be used alongside the process model analysis tools commonly used in process mining.
Packaging everything in Python is a feasible approach to obtaining a very flexible pipeline for data extraction, log enrichment, and further process mining analyses.