Use multiple methods to get results

This article explains how to use a Choose First evaluator to take results from multiple searches and provide only one set of results to a field. Because it reuses the same example configuration to demonstrate the principles, you may find it helpful to read the Searching for Text Relative to Other Text how-to article first.

You may need to configure multiple searches, each with its own rules, to get the same piece of data from a particular document type because:

  • The same business-level document type may come in several layouts, and each layout may need a different technique to get the same piece of data

  • If the same data is printed in multiple places within the same document layout, you may want to make the solution robust to OCR errors by having a fallback technique

The simplest way to do this is to connect a Choose First evaluator, which runs multiple searches in the order specified until one of them finds results. Drag one onto the canvas, and use the Provide Results/Parameters connector in the toolbox to drag from each 'value' search to the evaluator, and then from the evaluator to the field.
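The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the "first search with results wins" idea, not Aluma's implementation: the function `choose_first` and the three stand-in searches are hypothetical names invented for this sketch, and a "search" is assumed to be any function that returns a list of results.

```python
from typing import Callable, List, Sequence

# Assumption: a search is modelled as a zero-argument function returning results.
Search = Callable[[], List[str]]


def choose_first(searches: Sequence[Search]) -> List[str]:
    """Run the searches in order and return the first non-empty result list."""
    for search in searches:
        results = search()
        if results:
            return results  # searches later in the list are never called
    return []  # none of the searches found anything


calls: List[str] = []  # records which searches actually ran


def geometric_search() -> List[str]:
    calls.append("geometric")
    return []  # pretend this rule found nothing on the document


def logical_search() -> List[str]:
    calls.append("logical")
    return ["15.00"]


def unused_search() -> List[str]:
    calls.append("unused")
    return ["99.99"]


amount = choose_first([geometric_search, logical_search, unused_search])
print(amount)  # ['15.00']
print(calls)   # ['geometric', 'logical']
```

The third search never runs because the second one already produced a result, which mirrors how the evaluator stops at the first search that finds something.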

📘

Direction of the Arrows is Important

Ensure that the arrows from searches to be evaluated are coming into the Choose First component. Remember that Aluma starts from each field and works backwards through the flow, executing searches on a 'need to run' basis only. If multiple searches are connected to a Choose First and one of them finds results, the other connected searches will not run unless they are required elsewhere.
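As a rough mental model of the 'need to run' behaviour, the sketch below evaluates a flow by starting at each field and pulling results backwards through whatever is connected to it. All class and function names here (`SearchNode`, `ChooseFirst`, `evaluate`) are hypothetical and chosen for illustration; they are not part of Aluma.

```python
from typing import Callable, Dict, List


class SearchNode:
    """Hypothetical search that counts how many times it actually runs."""

    def __init__(self, name: str, finder: Callable[[], List[str]]):
        self.name = name
        self._finder = finder
        self.run_count = 0

    def results(self) -> List[str]:
        self.run_count += 1
        return self._finder()


class ChooseFirst:
    """Pulls from its inputs in order and stops at the first one with results."""

    def __init__(self, inputs: List[SearchNode]):
        self.inputs = list(inputs)

    def results(self) -> List[str]:
        for node in self.inputs:
            found = node.results()
            if found:
                return found
        return []


def evaluate(fields: Dict[str, object]) -> Dict[str, List[str]]:
    """Start from each field and pull results from whatever feeds it."""
    return {name: source.results() for name, source in fields.items()}


# Case 1: the fallback is only connected to the Choose First, so it never runs.
primary = SearchNode("primary", lambda: ["15.00"])
fallback = SearchNode("fallback", lambda: ["15.00"])
values = evaluate({"Amount": ChooseFirst([primary, fallback])})
print(values, fallback.run_count)  # {'Amount': ['15.00']} 0

# Case 2: the second search also feeds another field, so it runs for that field.
first = SearchNode("first", lambda: ["15.00"])
shared = SearchNode("shared", lambda: ["INV-001"])
values2 = evaluate({"Amount": ChooseFirst([first, shared]), "Reference": shared})
print(values2["Reference"], shared.run_count)  # ['INV-001'] 1
```

In the first case the fallback search is skipped entirely; in the second it runs once, but only because the Reference field needs it.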

If you want to change the order in which incoming searches are evaluated, right-click the name of one of the connected searches in the list within the Choose First component, and select Move Up or Move Down.

The following configuration attempts to locate an amount by finding a keyword and then searching for the amount using a geometric proximity rule. If that doesn't return a result then it attempts to find the amount using a logical proximity approach.

[Figure: Identification of amount from multiple methods - example configuration]

In this case, the order displayed is preferred because finding keywords and values in an exact geometric relation is a stronger indication that the extraction will be successful, whereas running the logical search first may produce a false positive with the 'Total' heading for documents that have this layout.
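To see why the order matters, consider the toy example below. It is a sketch under several assumptions: the page layout is invented, words are placed on a simple line/column grid, and `geometric_search` and `logical_search` are deliberately simplified stand-ins rather than Aluma's actual proximity rules. The geometric rule accepts only an amount directly to the right of a 'Total' keyword, while the logical rule takes the first amount that follows the first 'Total' in reading order.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Word:
    text: str
    line: int  # row on the page grid
    col: int   # column on the page grid


AMOUNT = re.compile(r"^\d+\.\d{2}$")

# Invented page, listed in reading order. 'Total' appears both as a column
# heading (line 0) and as the label of the grand total (line 3).
PAGE = [
    Word("Item", 0, 0), Word("Qty", 0, 1), Word("Total", 0, 2),
    Word("Widget", 1, 0), Word("2", 1, 1), Word("10.00", 1, 2),
    Word("Gadget", 2, 0), Word("1", 2, 1), Word("5.00", 2, 2),
    Word("Total", 3, 1), Word("15.00", 3, 2),
]


def geometric_search(page: Sequence[Word]) -> List[str]:
    """Amounts sitting immediately to the right of a 'Total' keyword."""
    hits = []
    for keyword in page:
        if keyword.text != "Total":
            continue
        for word in page:
            same_line = word.line == keyword.line
            if same_line and word.col == keyword.col + 1 and AMOUNT.match(word.text):
                hits.append(word.text)
    return hits


def logical_search(page: Sequence[Word]) -> List[str]:
    """First amount after the first 'Total' keyword in reading order."""
    for index, keyword in enumerate(page):
        if keyword.text == "Total":
            for word in page[index + 1:]:
                if AMOUNT.match(word.text):
                    return [word.text]
            return []
    return []


def choose_first(page: Sequence[Word],
                 searches: Sequence[Callable[[Sequence[Word]], List[str]]]) -> List[str]:
    """Return the results of the first search that finds anything."""
    for search in searches:
        results = search(page)
        if results:
            return results
    return []


preferred = choose_first(PAGE, [geometric_search, logical_search])
reversed_order = choose_first(PAGE, [logical_search, geometric_search])
print(preferred)       # ['15.00']
print(reversed_order)  # ['10.00']
```

With the geometric rule first, the grand total is returned. With the order reversed, the logical rule latches onto the 'Total' column heading and returns the first line-item amount, which is the kind of false positive the preferred ordering avoids.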