This post is still under review; it will be completed in the coming days.
Process mining is exciting because it can reveal insights about your processes that are hidden in the transactional data.
But it’s not so easy: real-world process execution does not always follow a straight path, and the transactions in an ERP system are linked by logical relations that don’t fit a one-step-after-another sequence.
In other posts, I have discussed the importance of finding the right boundaries for the process. Here I will describe one approach for finding the best focal point for the process activities.
This is usually a single activity that provides the identity for the process. I use the concept of identity as a means to identify a single process instance, or an autonomous process execution.

Sometimes it is easy to find: in a customer claim management process, a single customer call that opens a new claim is a natural candidate.
But if the process is order fulfillment, the customer has sent two orders, and we are fulfilling them with a single delivery, we have two options:
- Each order identifies one process instance (we have two instances)
- Each delivery identifies an instance (we have one instance)
The first seems more natural, but we will have a single delivery activity repeated in two different process instances. If we want to manage this situation, we can:
- mark the “delivery” activity as “multiple-instance”, identifying the same activity execution as included in multiple process instances
- assign to the activity execution a unique event id
The two techniques may be used together; they are mainly useful when you are going to perform some aggregate analytics on the process log, in order to avoid double counting.
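The double-counting problem can be illustrated with a minimal sketch. The field names (`case_id`, `event_id`, `multi_instance`) and the sample rows are assumptions made for illustration only; they are not taken from any specific mining tool or ERP extract.

```python
from collections import defaultdict

# Hypothetical event log: one row per (case, activity execution).
# The single delivery appears in both order cases, so it carries the
# same event_id twice and is flagged as multi-instance.
event_log = [
    {"case_id": "ORDER-1", "activity": "receive order", "event_id": "E1", "multi_instance": False},
    {"case_id": "ORDER-2", "activity": "receive order", "event_id": "E2", "multi_instance": False},
    {"case_id": "ORDER-1", "activity": "delivery", "event_id": "E3", "multi_instance": True},
    {"case_id": "ORDER-2", "activity": "delivery", "event_id": "E3", "multi_instance": True},
]


def count_executions(log):
    """Count activity executions, counting each event_id only once."""
    seen = defaultdict(set)
    for row in log:
        seen[row["activity"]].add(row["event_id"])
    return {activity: len(ids) for activity, ids in seen.items()}


# Naive count: one per log row, so the shared delivery is counted twice.
naive_counts = defaultdict(int)
for row in event_log:
    naive_counts[row["activity"]] += 1

dedup_counts = count_executions(event_log)
print(dict(naive_counts))
print(dedup_counts)
```

Counting rows reports two deliveries, while counting distinct event ids reports the single delivery that actually took place.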
But beyond the tricks used for process representation, the important point is the meaning of the focal point, or root activity.
Consider a vendor invoice payment process, with a typical implementation in an ERP system. The process involves the following activities (in a simplified form):

- The vendor invoice is posted
- The vendor invoice is approved for payment
- A bank payment order is created and sent to bank
- The electronic bank statement received from the bank (that includes the payment transaction) is matched with the general ledger postings
Usually, a payment order (step 3) may be related to one or more vendor invoices; an account clearing (step 4) usually matches one bank statement line with one GL posting. The following logical class diagram shows the situation:
The temporal logic rules are:
- a vendor invoice may be followed by a payment operation
- a payment operation should be preceded by one or more invoice postings
- a bank clearing operation may follow a payment operation
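The relations and rules above can be sketched as a small data model with a consistency check. The class and field names (`Invoice`, `Payment`, `posted_on`, `executed_on`, `cleared_on`) are hypothetical; a real ERP schema will differ.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Invoice:
    invoice_id: str
    posted_on: date


@dataclass
class Payment:
    payment_id: str
    executed_on: date
    invoices: List[Invoice] = field(default_factory=list)  # one or more expected
    cleared_on: Optional[date] = None  # clearing is optional ("may follow")


def rule_violations(payment):
    """Return the temporal rules that a payment breaks, as readable messages."""
    problems = []
    # A payment should be preceded by one or more invoice postings.
    if not payment.invoices:
        problems.append("payment has no preceding invoice posting")
    for invoice in payment.invoices:
        if invoice.posted_on > payment.executed_on:
            problems.append(f"invoice {invoice.invoice_id} posted after payment")
    # A bank clearing, when present, follows the payment.
    if payment.cleared_on is not None and payment.cleared_on < payment.executed_on:
        problems.append("clearing precedes payment")
    return problems


ok_payment = Payment(
    "P1",
    date(2024, 1, 31),
    [Invoice("I1", date(2024, 1, 10)), Invoice("I2", date(2024, 1, 15))],
    cleared_on=date(2024, 2, 2),
)
orphan_payment = Payment("P2", date(2024, 1, 31))

print(rule_violations(ok_payment))
print(rule_violations(orphan_payment))
```

An invoice without a payment is not a violation here, because the first rule only says that a payment may follow.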
If I choose a straightforward activity-order approach, each vendor invoice will identify a process instance, and I will have the problem that a single payment activity appears in the execution of multiple processes. But the analyst should ask: “What is the process describing? Vendor invoice management, or a payment process?”
The two approaches in fact require different triggers:
- “A vendor invoice is received”, in the first case
- “A payment operation is started” in the second case
In the real world (or at least in the limited cases that I have seen in my 42-year career), the second trigger is the one that applies. A business usually starts the payment cycle on a regular basis (weekly, monthly, or at other regular intervals).

In this process, the focus activity is the first one: the selection of payable open items (the process design is still incomplete, but was simplified for the sake of clarity). If I also want to model the posting of invoices and their authorization, these could be modeled as a separate process.

In this wider representation, the payables data store acts as a synchronization mechanism across the two process pools.
So what?
Remember that we are reasoning about process mining. With this mindset, the analyst probably wants to have a representation of the overall process, including the two pools depicted in the last picture.
Again, with the payment execution as the focal point, we will obtain a process log where:
- multiple “post invoice” activities are represented in each case
- multiple “invoice approval” activities are represented in each case
- a single payment activity and a single subsequent clearing activity are represented in each case.
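A minimal sketch of how such a payment-centric log could be assembled: every invoice event inherits the case id of the payment that settled it. The link table, the activity names and the sample events are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical link table: which payment settled which invoice.
invoice_to_payment = {"I1": "P1", "I2": "P1", "I3": "P2"}

# Hypothetical raw events: (ISO timestamp, activity, document id).
raw_events = [
    ("2024-01-05", "post invoice", "I1"),
    ("2024-01-08", "post invoice", "I2"),
    ("2024-01-09", "invoice approval", "I1"),
    ("2024-01-12", "invoice approval", "I2"),
    ("2024-01-15", "post invoice", "I3"),
    ("2024-01-20", "invoice approval", "I3"),
    ("2024-01-31", "payment", "P1"),
    ("2024-02-02", "clearing", "P1"),
    ("2024-02-29", "payment", "P2"),
]


def build_cases(events, links):
    """Group events into cases identified by the payment operation."""
    payment_ids = set(links.values())
    cases = defaultdict(list)
    for timestamp, activity, doc_id in events:
        if doc_id in links:
            case_id = links[doc_id]  # invoice event: use its payment's id
        elif doc_id in payment_ids:
            case_id = doc_id  # payment or clearing event
        else:
            continue  # unpaid invoice: not part of any payment case yet
        cases[case_id].append((timestamp, activity))
    for trace in cases.values():
        trace.sort()  # ISO dates sort chronologically as strings
    return dict(cases)


cases = build_cases(raw_events, invoice_to_payment)
for case_id, trace in cases.items():
    print(case_id, [activity for _, activity in trace])
```

Case `P1` then contains two postings and two approvals followed by one payment and one clearing, which is the shape described in the list above.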
Each instance execution (case) will gain several performance attributes, for example:
- the number of invoices paid with a single payment operation
- the number of overdue payments due to delayed invoice approval
- the number of vendors paid with a single payment operation
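These attributes could be derived per case roughly as follows. The record layout is hypothetical, and the rule used to attribute an overdue payment to delayed approval (paid after the due date, with the approval itself dated after the due date) is an assumption that a real project would need to agree with the business.

```python
from datetime import date

# Hypothetical case record for one payment operation.
payment_case = {
    "payment_id": "P1",
    "paid_on": date(2024, 1, 31),
    "invoices": [
        {"invoice_id": "I1", "vendor": "V1",
         "due_on": date(2024, 2, 5), "approved_on": date(2024, 1, 9)},
        {"invoice_id": "I2", "vendor": "V1",
         "due_on": date(2024, 1, 20), "approved_on": date(2024, 1, 25)},
        {"invoice_id": "I3", "vendor": "V2",
         "due_on": date(2024, 1, 25), "approved_on": date(2024, 1, 12)},
    ],
}


def case_attributes(case):
    """Compute per-case performance attributes for one payment operation."""
    invoices = case["invoices"]
    # Assumed rule: the invoice was paid late AND approved after its due date.
    overdue_by_approval = [
        inv for inv in invoices
        if case["paid_on"] > inv["due_on"] and inv["approved_on"] > inv["due_on"]
    ]
    return {
        "invoice_count": len(invoices),
        "vendor_count": len({inv["vendor"] for inv in invoices}),
        "overdue_due_to_approval": len(overdue_by_approval),
    }


attributes = case_attributes(payment_case)
print(attributes)
```

In this sample, invoice `I3` is paid late but was approved on time, so under the assumed rule it does not count as delayed by approval.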
Conclusion
I hope to have illustrated how analyzing the process point of view before log data extraction, mining and process analysis begin may help in obtaining analytical results that are aligned with the questions the analysis project is expected to answer.
Stay in touch for the further development of this post!