Process mining requires a point of view

This post is still under review; it will be completed over the next few days.

Process mining is exciting: it can reveal insights into your processes that are hidden in the transactional data.

But it’s not so easy: real-world process execution does not always follow a straight path, and the transactions in an ERP system are linked by logical relations that don’t fit a one-step-after-another sequence.

In other posts I have discussed the importance of finding the right boundaries for a process; here I will describe one approach to finding the best focal point for the process activities.

This is usually a single activity that provides the identity of the process. I use the concept of identity as a means to identify a single process instance, i.e. an autonomous process execution.

Sometimes it is easy to find: in a customer claim management process, a single customer call opening a new claim is a natural candidate.

But if the process is order fulfillment, the customer has sent two orders, and we are fulfilling them with a single delivery, we have two options:

  • Each order identifies one process instance (we have two instances)
  • Each delivery identifies one process instance (we have one instance)

The first seems more natural, but a single delivery activity will then appear twice, in two different process instances. If we want to manage this situation, we can:

  • mark the “delivery” activity as “multiple-instance”, identifying the same activity execution as part of multiple process instances
  • assign a unique event id to the activity execution

The two techniques may be used together; they are mainly useful when you are going to perform some aggregate analytics on the process log, in order to avoid double counting.
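As a minimal sketch (the log structure and event ids are invented for illustration), this is how a unique event id prevents the shared delivery from being counted twice in an aggregate query:

```python
# Two order cases share one physical delivery execution: the same event id
# appears in both cases, flagged as multiple-instance.
event_log = [
    {"case": "ORDER-1", "activity": "Order received", "event_id": "e1"},
    {"case": "ORDER-2", "activity": "Order received", "event_id": "e2"},
    {"case": "ORDER-1", "activity": "Delivery", "event_id": "e3", "multi_instance": True},
    {"case": "ORDER-2", "activity": "Delivery", "event_id": "e3", "multi_instance": True},
]

# Naive per-row count double-counts the delivery; deduplicating on the
# event id counts the single physical execution once.
naive = sum(1 for e in event_log if e["activity"] == "Delivery")
unique = len({e["event_id"] for e in event_log if e["activity"] == "Delivery"})
print(naive, unique)  # 2 1
```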

But beyond the tricks used for process representation, the important point is the meaning of the focal point, or root activity.

Consider a vendor invoice payment process, with a typical implementation in an ERP system. The process involves the following activities (in a simplified form):

[Figure: class diagram of an invoice payment]
  1. The vendor invoice is posted
  2. The vendor invoice is approved for payment
  3. A bank payment order is created and sent to bank
  4. The electronic bank statement received from the bank (that includes the payment transaction) is matched with the general ledger postings

Usually, a payment order (step 3) may be related to one or more vendor invoices; an account clearing (step 4) usually matches one bank statement line with one GL posting. The following logical class diagram shows the situation:

The temporal logic rules are:

  • a vendor invoice may be followed by a payment operation
  • a payment operation should be preceded by one or more invoice postings
  • a bank clearing operation may follow a payment operation
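The three rules above can be encoded as a simple conformance check on one case’s ordered activity list (the rule encoding is my own sketch, not a standard notation; only the two “should/preceded-by” constraints are checkable, since the first rule is a “may”):

```python
def check_payment_rules(trace):
    """Return the list of violated temporal rules for an ordered activity list."""
    violations = []
    if "Payment" in trace:
        # a payment operation should be preceded by one or more invoice postings
        if "Invoice posted" not in trace[:trace.index("Payment")]:
            violations.append("payment without preceding invoice posting")
    if "Bank clearing" in trace:
        # a bank clearing operation may only follow a payment operation
        if "Payment" not in trace[:trace.index("Bank clearing")]:
            violations.append("clearing without preceding payment")
    return violations

print(check_payment_rules(["Invoice posted", "Payment", "Bank clearing"]))  # []
print(check_payment_rules(["Payment"]))  # ['payment without preceding invoice posting']
```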

If I choose a straightforward activity-order approach, each vendor invoice will identify a process instance, and I will have the problem that a single payment activity appears in the execution of multiple processes. But the analyst should ask himself: “What is the process describing? Vendor invoice management, or a payment process?”

The two approaches in fact require different triggers:

  • “A vendor invoice is received”, in the first case
  • “A payment operation is started” in the second case

In the real world (or at least in the limited cases that I have seen in 42 years of career), the second statement is true. A business usually starts the payment cycle on a regular basis (weekly, monthly, or with other regular intervals).

[Figure: payment process]

In this process, the focal activity is the first one: the selection of payable open items (the process design is still incomplete, but was simplified for the sake of clarity). If I also want to model the posting of invoices and their approval, they can be modeled as a separate process.

In this wider representation, the payables data store acts as a synchronization mechanism across the two process pools.

So what?

Remember that we are reasoning about process mining. With this mindset, the analyst probably wants to have a representation of the overall process, including the two pools depicted in the last picture.

Again, with the payment execution as focal point, we will obtain a process log where:

  • multiple “post invoice” activities are represented in each case
  • multiple “invoice approval” activities are represented in each case
  • a single payment activity, followed by a single clearing activity, is represented in each case.

Each instance execution (case) will gain several performance attributes, for example:

  1. the number of invoices paid with a single payment operation
  2. the number of overdue payments due to delayed invoice approval
  3. the number of vendors paid with a single payment operation
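A sketch of how the first and third attribute could be derived from a flat event log (case ids, activity names and vendor values are invented for illustration):

```python
from collections import defaultdict

# One payment case: three invoices from two vendors, paid and cleared together.
log = [
    {"case": "PAY-1", "activity": "Post invoice", "vendor": "ACME"},
    {"case": "PAY-1", "activity": "Post invoice", "vendor": "ACME"},
    {"case": "PAY-1", "activity": "Post invoice", "vendor": "Globex"},
    {"case": "PAY-1", "activity": "Payment"},
    {"case": "PAY-1", "activity": "Bank clearing"},
]

# Group events by case, then derive per-case performance attributes.
cases = defaultdict(list)
for e in log:
    cases[e["case"]].append(e)

metrics = {}
for case_id, events in cases.items():
    invoices = [e for e in events if e["activity"] == "Post invoice"]
    metrics[case_id] = {
        "invoices_paid": len(invoices),                       # attribute 1
        "vendors_paid": len({e["vendor"] for e in invoices}), # attribute 3
    }

print(metrics["PAY-1"])  # {'invoices_paid': 3, 'vendors_paid': 2}
```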


I hope to have illustrated how analyzing the process point of view before starting log data extraction, mining and process analysis helps in obtaining analytical results that are aligned with the questions the analysis project is expected to answer.

Stay tuned for the further development of this post!

ERP, what’s next?

I recently found a really provocative and interesting article about the current status and trends for ERP software applications.

Without any doubt, many customers of the “Big ERP” vendors feel the weight of complex, monolithic systems that have piled up, over the years, an impressive amount of investment. Despite this, I believe that some factors will, sooner or later, spark a completely new generation of solutions for medium, big, and very big businesses.

Looking back at my professional history, the initial growth of the main ERP solutions (SAP, JD Edwards, PeopleSoft, Oracle, etc.) was arguably based on two disruptive technological factors:

– relational databases,

– availability of long-range data connections

The first allowed building scalable applications, able to quickly process vast amounts of data (“vast” on the scale of the 1970s and 1980s).

The second allowed companies to dismiss the multitude of “local”, heavily customized applications and consolidate onto a single, standard application platform accessed remotely from all the plants, sales offices, local branches, etc.

Today we have some new technologies at hand: first of all unstructured databases (NoSQL) and strongly transactional ledgers (blockchain), plus cloud, service orientation, Big Data, and all the commercial hype that vendors are eager to promote.

The big – and emerging – ERP producers are reacting by adding new, fancier user interfaces and additional modules but, in my opinion, are failing on some key points. They are adding stuff on top of existing applications, leaving the underlying architecture untouched.

Furthermore, they have failed to recognise the growing importance of the “personal information” aspect of document processing. This has led to attention on the “presentation” features of user interfaces rather than on their “content”. There are still strong barriers dividing the enterprise side of the information from the personal side, which requires frequent context switches from the user.

So I tried to imagine some features for a “New ERP” software architecture.

  1. Each transaction should have a URL

If you are reading my post, you can easily share it with someone else, simply passing him the link to the browser page (or using the “share” button embedded in all the mobile apps). Your friend will receive your share via mail, or some other sort of generalized messaging or notification system.

If you are looking at an invoice in your ERP system, you cannot do this. Maybe you can, if your colleague operates the same software and is logged on to the same server.

In my dream ERP, every customer order, GL posting, article master record, or purchasing request would be available as a single page (web or mobile, it doesn’t matter) with a simple link (obviously subject to all the authentication and permission checking).
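As a minimal sketch of the idea (the host name, path scheme and permission model are all hypothetical), every business object gets a canonical, shareable URL, and resolving it always passes through an authorization check first:

```python
BASE = "https://erp.example.com"  # hypothetical host

def document_url(doc_type, doc_id):
    """Build the canonical, shareable link for a business object."""
    return f"{BASE}/{doc_type}/{doc_id}"

def resolve(url, user_permissions):
    """Serve a document page only after the permission check passes."""
    doc_type = url.removeprefix(BASE + "/").split("/")[0]
    if doc_type not in user_permissions:
        return "403 Forbidden"
    return f"200 OK: rendering {url}"

link = document_url("vendor-invoice", "4711")
print(link)                               # https://erp.example.com/vendor-invoice/4711
print(resolve(link, {"vendor-invoice"}))  # 200 OK: rendering https://...
print(resolve(link, set()))               # 403 Forbidden
```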

  2. Single repository

All the documents (EDI, pdf, etc.), mails, chats, comments, records should be available in the same “space”.

They may be physically dispersed across several locations or cloud systems, but their index should be unique, exactly as the index of a search engine is unique even if it indexes millions of different servers.

Please note that, for me, email should be a totally integrated feature. This means that the mail server should be a module of the system, able to tag each incoming mail with the relevant references by analyzing the semantic meaning of the message body and attachments.

Microsoft, for example, is pushing and promoting the integration of its Outlook messaging app with its NAV ERP solution.

  3. Search

The fastest way to find something should be the search box.

You type “Invoice June 2016 ACME”, and you get the list of all the invoices issued in June to/from ACME. But ALSO the mail that you have exchanged with the sales rep about an invoice, and maybe a report where it is listed.

You will be able to browse and refine your search.

The key point is that ANYTHING should be indexed, not only the “keys” recorded on transactions. If there is a customer order with a description containing “provided by ACME Corp.”, it should appear in the result list, even if the customer/supplier is not ACME.
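A toy inverted index (not a product feature; record ids and fields are invented) makes the point concrete: the order below matches “ACME” through its free-text description, even though its customer key is not ACME:

```python
import re
from collections import defaultdict

records = [
    {"id": "INV-1", "customer": "ACME", "text": "Invoice June 2016"},
    {"id": "ORD-7", "customer": "Initech", "text": "parts provided by ACME Corp."},
]

# Index every field, keys and free text alike, token by token.
index = defaultdict(set)
for r in records:
    for field in ("customer", "text"):
        for token in re.findall(r"\w+", r[field].lower()):
            index[token].add(r["id"])

print(sorted(index["acme"]))  # ['INV-1', 'ORD-7']
```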

Search should go beyond content searching; I think that menu items, user guides and how-tos should be searchable in the same way. So, if you have to post a new lease-out contract, simply type “new lease-out contract”: the result list will show the link to the page where you can post the contract, as well as a link to the guide, and maybe also an alert telling you that, from the first of July, “lease-out contracts should be posted under a new category”, or something else.

You will be able to save anything in your favourites, be it a document, a menu item, or a mail.

  4. Social

Social ERP doesn’t mean a bad Facebook clone with a different name and your Company directory preloaded.

It means that all the documents/records that you process, your comments on them, your approvals or rejections, will be part of a content stream, categorized under many keys (“Project X”, “Lead Y”, “incoming invoices”, “maintenance requests”) cleverly assigned to each item.

So, if a purchasing request of yours is approved by your boss, you will read this event in your main stream. But if your boss has some issue with it, he will simply annotate the request; you will receive the notification and see it in the context of the request, not as a separate mail.

And a mail coming from a vendor will become a document within the stream related to your purchase requisition. Your colleagues will be able to comment on it and, if properly configured, each comment will become a response mail to the original author.

  5. Workflow and capabilities

Workflow automation is a great tool when you follow a course about process modelling, or when you watch a demo. Why do business executives dislike process models? Because they are a mess. And they are a mess not because your organisation is poorly designed, or overwhelmingly complex, but because reality is flexible and fuzzy, and it has to be so.

A modern ERP should incorporate workflow management, but only as a “main” process path, keeping track of milestone approvals and allowing a flexible “I will take charge of this” pattern. The system should provide a clear view of the process pipeline, highlight exceptions, and allow flexible collaboration on each item.

  6. Related content

This means that you should be able to find, and link, anything related to the content you are viewing.

People working on items tend to manage them mainly as a chronological sequence; all other forms of archiving are useful, but not “natural”. Then, when they read an item, the mental association mechanism kicks in, and their memory suggests the existence of related content. This should be naturally implemented in the system, simply enhancing the natural power of the user’s mind.

Related content may include documents that have a functional relationship (an order with a goods receipt, a VAT posting with an invoice), which will be managed by the system, as well as mails, memos, spreadsheets and drawings that can be useful for evaluation, knowledge sharing, and so on.

  7. Really modular applications

Localization is always an issue. ERP vendors are requested to maintain some functions (e.g. VAT, withholding tax, property tax), or to develop some functions for one country only. This has a huge cost, and customers pay it in terms of maintenance fees. I live in Italy, where the regulator is often tricky and new requirements and fiscal reports blossom every year, and ERP vendors often don’t provide a timely, simple, and fully effective solution. Maybe the fact that the local functions are developed offshore, by people who have never had “field” experience of our regulatory reality, is a factor influencing the final quality.

In my vision, a localized application for VAT would be developed only for Italy (or India, Argentina, the USA…) by a local software developer with strong ties to, and knowledge of, the evolving regulatory landscape. When you post an invoice, you will generate several related (see point 6), different documents:

– the invoice itself, with metadata describing only the essential information (date, number, vendor, total amount, customer);

– the general ledger document, with only account codes and amounts;

– the VAT document;

– the Withholding tax document, if required;

– the financial (account payable cash flow) document;

– the goods-receipt clearing document.

This means that:

– the application footprint is small;

– user interfaces require a careful design (but this is already true for the legacy, monolithic, applications);

– different documents are linked only by a “structured” link, made of the document URL and a few integrity constraints based on document values.
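A sketch of the “one posting, many linked documents” idea (the URL scheme, document types and amounts are invented for illustration): each generated document carries only its own data plus the source document’s URL as a structured link, with a value-level integrity check instead of a shared database row:

```python
def post_vendor_invoice(number, vendor, net, vat_amount):
    """Generate the related documents produced by posting one vendor invoice."""
    invoice_url = f"https://erp.example.com/vendor-invoice/{number}"  # hypothetical scheme
    documents = [
        {"type": "invoice", "url": invoice_url, "vendor": vendor, "total": net + vat_amount},
        {"type": "gl", "source": invoice_url, "amount": net + vat_amount},
        {"type": "vat", "source": invoice_url, "amount": vat_amount},
        {"type": "payable", "source": invoice_url, "amount": net + vat_amount},
    ]
    # integrity constraint based on document values: every satellite document
    # must point back to the invoice URL
    assert all(d.get("source", invoice_url) == invoice_url for d in documents)
    return documents

docs = post_vendor_invoice("4711", "ACME", 100.0, 22.0)
print([d["type"] for d in docs])  # ['invoice', 'gl', 'vat', 'payable']
```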

I have listed only a few elements that I believe will characterize the future generation of ERP systems, made possible by the technologies that emerged in the past years. I am convinced that a new software development approach, conceiving the application as something mixed or meshed with everyday tools like mail, document systems and social networks, will provide agility, ease of management and, in the end, an impact on the bottom line of the P&L!

Some of my thoughts are shared by key players in the ERP software industry (see, for example, this interview with the SAP Fiori guru). But I haven’t yet seen a full conceptual design for a “New ERP”.

Re-using user interfaces

There is a lot of attention in the software development industry on software reuse. It is a good practice, even if reusing “old” software may often be a compromise solution.

Now, I want to share my ideas about “reusing” user interfaces.

In my company we manage an ERP system. Our business architecture is quite complex, and several processes are managed with the ERP system. System users are mainly “expert” users: people who post transactions and interact with the system throughout the workday.

There is also a set of “non-expert” users: field technicians, managers, salespeople. These people follow different interaction paths: they need information, but are not aware of “codes”, “transactions”, “modules”. At the same time they have to interact with the system, but only in a limited way: an approval, a feedback, etc.

Asking them to log in to the system and learn how it works and its internal logic may be hard. At the same time, each user may consume a “seat licence”, which means additional licensing and maintenance fees.

There are many solutions for this common issue: enterprise portals, or web-based applications (for purchase requisition approvals, time sheets, service reports, and so on).

But all these apps still require the user to adopt a new “channel” to interact with the system: a user interface, application logic, etc. With the term “channel”, I mean the aggregate of user interface and human-machine interaction paths, or execution environment. Every time a user has to switch from one channel to another (consider leaving your email client and opening a spreadsheet), he spends an amount of mental energy just because the environment changes and his brain has to “recall” the knowledge about the new channel. This amount of energy increases if the new channel is used much less frequently than the previous one; eventually this may cause a “rejection” of the second channel.

We tried something simple, nothing new, but a comprehensive approach, leveraging the opportunities offered by an integrated cloud service. We used a number of well-known “channels”, like email and the intranet, to deliver some simple interactions with our ERP system.

Today, almost every business operates some basic services to support collaboration:


  • Email
  • File sharing
  • An intranet


Every user is able to interact with these systems.

My company migrated those services to a cloud platform a few years ago, in a project that I managed. It was a good move; in any case, you can find a lot of documents and discussions about this kind of solution by googling around.

But the key advantage of an integrated, cloud-based collaboration solution is search.

The search engine paradigm is one of the easiest user interface patterns: you need something, you type what you need in a box, the system returns a list of results. You can browse through the results, looking for what you need, and eventually finding something useful that you weren’t even aware of.

All this happens in a fraction of a second. Cloud-based search engines are really powerful.

I imagined that leveraging this feature, and using email as a “transactional” interface, might lower the ERP adoption barrier for “non-expert” users.

So, we started to architect and develop some simple software modules, following three main patterns:

  • Publish: a report, as a document or web page;
  • Push: notification about transactions posted, or status changes;
  • Ask confirmations, approvals, or information.


For the first pattern, we used document sharing or intranet; for the other two, email, or web-based forms in some cases.

For example, when a mall manager wants to open a purchasing request, he uses a web form. The system then creates a shared folder that will be used to store RFPs, vendor offers, budget status reports, and all the documents required for the approval process.

The organizational units involved receive an email with a notice of the new request. The central purchasing administration office then creates a request in the ERP system, linking it to the shared folder.

The ERP system then starts publishing documents about the request in the shared folder: budget status, potential vendor lists, etc.

Just imagine: users not directly involved with data entry and transactions on the ERP system are able to find all the information by searching their inbox or their file collection. They may not even be aware that an ERP system is working behind the scenes. All the documents are linked together, just as usual for web pages. The search engine can be used to find everything, typing the name of the vendor, a project description, or a tenant name.


We have applied this approach to several processes. The end result is that a user can simply search his mail/file/intranet archive freely, typing just a few keywords, and find what he needs.

The training required was near to zero.


Now, a few notes for the techies.

Creating a communication link between an ERP server buried in a closed, protected subnetwork and a cloud service (which is usually protected in an insanely thorough way) is not easy.

Luckily our requirements weren’t so strict: no “real-time” interaction was required.

We built our interface on two widely used standards: SMTP and JSON. They are not specific to any platform, they are vendor neutral, and they are well supported on both our subsystems. So:

  • Direct email from the ERP to the user inbox for notifications and requests;
  • Structured data in a JSON document enclosed in an email for publishing.


Some scripts running on the cloud platform do the processing required to publish the data embedded in the JSON files. We developed a simple template engine to speed up the graphical/layout design phase.
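The cloud-side publishing step might look like this sketch (the message content is invented, and `string.Template` stands in for the home-grown template engine): extract the JSON payload from a mail message and push it through a template:

```python
import json
from email.message import EmailMessage
from string import Template

# Build a message like the one the ERP would send for publishing.
msg = EmailMessage()
msg["Subject"] = "PUBLISH budget-status"
msg.set_content(json.dumps({"project": "Mall A", "budget": 120000, "spent": 45000}))

# Cloud-side script: parse the embedded JSON and render a page from a template.
payload = json.loads(msg.get_content())
page = Template("Project $project: spent $spent of $budget").substitute(payload)
print(page)  # Project Mall A: spent 45000 of 120000
```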

In this way we are able to manage the ERP-to-cloud channel.

[Figure: mail/ERP integration]

To get data back from the cloud to the ERP, we still use an email with some control codes embedded in it, delegating to a small Java application the task of retrieving mail messages, parsing their content, and performing a remote function call on the ERP (this could be developed directly within the ERP platform, but our current release doesn’t support the required APIs). Open source tools like Gradle and Maven provided convenient support for packaging the runtime artifacts and retrieving/aligning the required libraries.
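The return channel can be sketched as follows (Python here for illustration, while the real helper was Java; the control-code format and the ERP call are hypothetical): read a message, extract the embedded control code, and dispatch it as a remote function call:

```python
import re
from email.message import EmailMessage

def handle_message(msg, call_erp):
    """Extract a control code like [APPROVE:REQ-42] and dispatch it to the ERP."""
    match = re.search(r"\[(\w+):([\w-]+)\]", msg.get_content())
    if not match:
        return None  # plain mail, nothing to dispatch
    action, doc_id = match.groups()
    return call_erp(action, doc_id)

# A user's reply mail carrying an approval control code.
reply = EmailMessage()
reply.set_content("OK for me. [APPROVE:REQ-42]")

# The ERP remote function call is stubbed out with a lambda.
result = handle_message(reply, lambda action, doc_id: f"RFC {action} on {doc_id}")
print(result)  # RFC APPROVE on REQ-42
```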

Developing everything was quite easy: both systems provide simple conversions from/to JSON. Both have a development environment supporting interfaces for email sending/processing. Both have templating engines, critical to achieving “presentation” components that are easy to maintain.


With this approach, we have a certain degree of independence from each of the connected subsystems.

Clearly, we used a specific ERP and a specific cloud solution, but I don’t want to reference a single product, because this kind of approach can easily be followed using any of the main products available in each segment. All the “my product is clearly the best” marketing claims overestimate the differences among architectures. I think that a professional ERP manager or solution architect can obtain much the same result on any platform.

Hacking the value of integration

During the last two years, my company has started a series of projects to enhance our SAP ECC platform in the direction of business integration across software modules.

This happened because we had spotted some limits in the architecture of this ERP system. SAP is a well-integrated software suite: in its main areas (Financials, Logistics, Sales, Real Estate, HR, etc.) it offers a bulk set of functions that cover the typical business processes of a very wide set of industries. So, where are these limits?

I will try to suggest some ideas and discussion topics across the posts of this blog.

My company operates in the construction and real estate industry. Our business processes run from the acquisition and development of land to the sale of residential properties or the management of commercial properties that are leased out. SAP ECC was installed in late 2005, replacing a previous Italian ERP solution. Initially, SAP was not a successful project. Since 2008 we have carried out a complete re-engineering of the SAP implementation, changing the design from a module/office-centric design to a business-centric one.

Let’s start from a small development that we are carrying out now: cash management from a business-economics perspective.

We have a very basic configuration of the CM module: bank and cash accounts, and customer and supplier accounts, are classified according to categories that are relevant for our business. With these settings we are able to perform a basic analysis of financial flows; this is important, because our business relies heavily on financing from banks, and our finance department needs to monitor cash levels against the financial plan.

Unfortunately, the simple analysis of bank accounts doesn’t let us understand how the sources and destinations of liquidity have influenced their balances. So we need to drill this analysis down across our business units, and across the development projects that are active.

A good perspective on our business is given by the profit center hierarchy, which is roughly divided into the following areas:

  • commercial real estate
  • residential real estate
  • property development
  • operating expenses
  • corporate finance

Each area is divided according to the management perspective of the underlying business: residential real estate, as an example, is detailed at building level; several buildings are grouped together according to the land on which they are built.

We are building a set of reports that allows our financial analysts to separate incoming and outgoing cash flows according to the building, building part, or single controlling-relevant business element from which the income or payment originates.

The key to those reports is a function that, given a cash-management-relevant posting, traces back to the invoice or other FI document, carrying a controlling assignment, that was paid or collected.
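The drill-back function can be sketched like this (the document model, ids and profit center codes are invented for illustration): from a cash-relevant bank posting, follow the clearing chain back to the document that carries the controlling assignment:

```python
# Simplified document store: a bank posting clears an invoice, and the
# invoice carries the controlling assignment (profit center).
documents = {
    "BANK-9": {"type": "bank", "clears": "INV-5"},
    "INV-5": {"type": "invoice", "clears": None,
              "profit_center": "RESIDENTIAL/BLDG-12"},
}

def controlling_assignment(doc_id):
    """Walk the clearing chain until the controlling-relevant document."""
    doc = documents[doc_id]
    while doc.get("clears"):
        doc = documents[doc["clears"]]
    return doc.get("profit_center")

print(controlling_assignment("BANK-9"))  # RESIDENTIAL/BLDG-12
```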

I will discuss in some upcoming posts why this solution fits our needs. The point now is the importance of linking one area of the ECC system – Controlling – with another area – cash flow analysis – giving the financial analyst the same perspective shared by the financial planners, the production manager, etc.