Error management – the most important and difficult stage of automation

Robots, just like us, do not always keep an eye on their surroundings. The difference is that a robot has to anticipate every potential failure and be prepared for it in advance (with help from a developer, of course). I was once taught that error management is a key factor in an automation project, something like 80% of it. I fully agree. When error handling is missing, the robot may not work as requested. The crucial points are the interactions with humans. When input data is not extracted or downloaded from applications but delivered by a human, which means it is prepared manually, there is a high risk that the robot will receive erroneous data or data for a scenario that is not handled. In such cases, preventive measures have to be introduced.

Poka-Yoke

From the Lean Six Sigma methodology comes an approach called Poka-Yoke (Japanese for mistake-proofing). It protects a tool against incorrect usage. For instance, when a robot is started by a mail sent by an operator, an Excel sheet is prepared as a template in which the user provides the input data. Based on this control XLSX file, the robot validates the data and only then runs the planned steps. The solution can be built with Visual Basic for Applications (VBA) or with Excel’s data validation rules. As a result, the user receives an efficient tool that simplifies their work and at the same time keeps the robot from committing errors. Naturally, the aim is that erroneous orders never reach the robot at all. When preliminary data validation is not possible on the operator’s side, the same verification must be performed on the robot’s side.
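The robot-side verification could look roughly like the sketch below. It is only an illustration: the column names, the list of currencies, and the function names are assumptions made for this example, and the rows are passed in as plain dictionaries instead of being read from a real XLSX template.

```python
import math

# Hypothetical template columns and allowed values (assumptions for this sketch).
REQUIRED_COLUMNS = ("order_id", "amount", "currency")
ALLOWED_CURRENCIES = {"EUR", "USD", "PLN"}


def _text(row, column):
    """Return the cell value as stripped text, or '' when the cell is empty."""
    value = row.get(column)
    return "" if value is None else str(value).strip()


def validate_row(row):
    """Return a list of problems found in one input row (empty list = valid)."""
    problems = []
    for column in REQUIRED_COLUMNS:
        if not _text(row, column):
            problems.append(f"missing value in '{column}'")

    amount = _text(row, "amount")
    if amount:
        try:
            number = float(amount)
            if not math.isfinite(number) or number <= 0:
                problems.append("amount must be a positive number")
        except ValueError:
            problems.append(f"amount is not a number: {amount!r}")

    currency = _text(row, "currency")
    if currency and currency not in ALLOWED_CURRENCIES:
        problems.append(f"unsupported currency: {currency!r}")
    return problems


def split_rows(rows):
    """Separate rows the robot may process from rows to return to the operator."""
    accepted, rejected = [], []
    for index, row in enumerate(rows, start=1):
        problems = validate_row(row)
        if problems:
            rejected.append((index, problems))  # row number plus reasons
        else:
            accepted.append(row)
    return accepted, rejected
```

Only the accepted rows go on to the planned steps. The rejected ones, together with the reasons, can be mailed back to the operator, so a faulty order costs the robot nothing more than a validation pass.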

Application environment and the robot’s work

Another source of process errors is a lack of control over the environment the robot works in. For instance, the state of the robot station may be unknown:

  • Did the process that had been running previously leave behind a mess?
  • Were the applications needed for the run started?
  • Are licenses available for each of the applications in use?
  • Are RAM resources sufficient?

We may encounter errors when a process starts as well as while it is running. For the process to start properly, the robot must be able to prepare its work environment, that is, to close all unnecessary applications, both those running in the robot’s own process and those started by other processes on the machine the robot is working on. When several sessions run in parallel on the machine, it is important to close only the applications that belong to the relevant session. Unfortunately, the Kill option does not distinguish between an application we have just started and an application started by another user, so we have to be careful not to let the robot close applications that belong to processes other than ours.
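One way to stay on the safe side is to filter the process list by user and session before anything is closed. The sketch below shows only that selection step. The ProcessInfo record and the function name are hypothetical, and the process list is passed in as plain data, where a real robot would get it from the operating system or from its RPA tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessInfo:
    """Hypothetical snapshot of one running process."""
    pid: int
    name: str
    username: str
    session_id: int


def select_processes_to_close(processes, app_names, robot_user, robot_session):
    """Pick only the listed applications that run as the robot's own user
    in the robot's own session; everything else is left alone."""
    wanted = {name.lower() for name in app_names}
    return [
        process
        for process in processes
        if process.name.lower() in wanted
        and process.username.lower() == robot_user.lower()
        and process.session_id == robot_session
    ]
```

The robot then closes only the returned processes, by PID, so an Excel instance opened by another user or in another session on the same machine is not touched.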

Ready for every scenario

How do we recognize an application’s bottlenecks? Even if we know well how a given program works, it does not mean that the program behaves identically on the client’s side. First of all, you have to work through the process with a business analyst to recognize and name the critical points at which there is a risk of failure when the operations are done manually. Knowing these, a good developer can define the process steps that require special attention. Fortunately, there are many different tools for managing such scenarios, and they can be combined in numerous ways. For instance, in a try-catch structure we can select more than one course of action depending on the type of error returned, so that some steps are attempted once more. An error of type Selector Not Found, for example, may mean that the web application has not finished loading yet, so we wait another 30 seconds and try again. Other errors terminate work on the transaction, and the robot reaches for the next task in the queue.
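The try-catch pattern described above can be sketched in a few lines. This is an illustration under stated assumptions, not the API of any RPA tool: SelectorNotFoundError is a hypothetical stand-in for the tool’s own error type, and the limit of three attempts is an assumption added for the example (only the 30-second wait comes from the text).

```python
import time


class SelectorNotFoundError(Exception):
    """Hypothetical stand-in for an RPA tool's 'Selector Not Found' error."""


def run_transaction(step, max_attempts=3, wait_seconds=30, sleep=time.sleep):
    """Run one transaction step, retrying only when the selector is missing.

    Returns True on success and False when the transaction is abandoned,
    in which case the caller moves on to the next task in the queue.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            step()
            return True
        except SelectorNotFoundError:
            # The page may still be loading: wait and try again,
            # unless the attempts are used up.
            if attempt == max_attempts:
                return False
            sleep(wait_seconds)
        except Exception:
            # Any other error ends work on this transaction immediately.
            return False
    return False
```

The sleep function is a parameter only so that the waiting can be replaced in tests. In production the default time.sleep is used.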

XELTO DIGITAL Framework

No analysis, however long, guarantees that a robot is well prepared for every scenario. That’s why it is very important to build a robot structure in which errors of kinds still unknown today are handled by default. This default treatment can consist of logging out of the application, closing the browser, and restarting the process with the next order in line. We have already mentioned our Framework, which is used in each automation we deliver. One of its key features is this default error handling. Thanks to it, the job can continue even if an unpredictable scenario arises.
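The idea of a default handler can be illustrated with the loop below. It is not the XELTO DIGITAL Framework itself, whose internals are not shown in this article, only a minimal sketch of the principle. The names process_queue, handle, and reset_environment are hypothetical.

```python
from collections import deque


def process_queue(items, handle, reset_environment):
    """Process every item; on any unforeseen error, reset and move on.

    Returns a list of (item, error description) pairs for the failed items.
    """
    failures = []
    queue = deque(items)
    while queue:
        item = queue.popleft()
        try:
            handle(item)
        except Exception as error:
            # Default branch for errors nobody predicted: record the failure,
            # bring the environment back to a known state (for example log out
            # and close the browser), then continue with the next item.
            failures.append((item, f"{type(error).__name__}: {error}"))
            reset_environment()
    return failures
```

One failed order does not stop the run: the remaining items are still processed, and the list of failures is available for reporting afterwards.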

Standardized error nomenclature

The last but not least issue in error handling is communication. Depending on the scenario, a problem may be temporary or may require investigation by an operator or a developer. The Xelto Digital Framework has a built-in mechanism to distinguish between system errors and business errors. Business errors are dispatched to the users working with the robot, whilst system errors are repaired by the provider. Error naming is a very important element of the Framework and helps to organize error handling in processes in which several robots are involved. This way, it does not matter in which robot a particular error occurs: the meaning of the error and the way of handling it are what matter, and these are the same for all robots.
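A shared nomenclature of this kind can be modelled as a small exception hierarchy. The class names, the error codes, and the routing targets below are hypothetical, chosen for illustration only. The Framework’s real names are not given in this article.

```python
class RobotError(Exception):
    """Base class; `code` is an error name shared by all robots."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


class BusinessError(RobotError):
    """Problem with the input data or a business rule; a user can fix it."""


class RobotSystemError(RobotError):
    """Technical problem (application, network, robot); support must fix it."""


def route(error):
    """Decide who is notified about an error, regardless of which robot raised it."""
    if isinstance(error, BusinessError):
        return "operator"
    # System errors and anything unclassified go to the support team.
    return "support"
```

Because the code travels with the error, an error named, say, BUS-001 raised by one robot means the same thing, and is handled the same way, as BUS-001 raised by any other robot.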

Most important and most difficult

The bottom line is that error handling is one of the most difficult and most crucial steps in automation design. Automation has to be multilayered: from input data validation and online connection checking, through bottleneck monitoring, to the handling of unpredictable scenarios.

Author: Rafał Korporowicz – Senior RPA Developer