Classification output formats

By default, the CLI prints a human-readable table, but the classify command can format its output in several other ways. Use the --format (-f) parameter to select one of the output formats in the following table.

--format (-f)    Description
table            Table with column headings. Includes basic output fields. This is the default.
csv              Comma-separated values. Includes basic output fields and the "relative confidence" field.
json             JSON string. Includes all output fields.

Using table format

The table format is easy to read but includes only the most commonly used output fields. For the classify command, it omits the "relative confidence" field and the individual document type scores.

This output format is the default, so you do not need to specify the --format parameter:

aluma classify myclassifier test/*.*
FILE                                 DOCUMENT TYPE                    CONFIDENT
00001a.xlsx                          Expenses                         true
00001b.pdf                           Invoice                          true

Using CSV format

The csv output format returns plain-text, comma-separated output with no headings. This makes it easy to pass the output to other commands and tools that need to process it.

Using the preceding example with the csv option outputs the following comma-separated results:

aluma classify myclassifier test/*.* -f csv
C:\examples\00001a.xlsx,Expenses,true,1.619
C:\examples\00001b.pdf,Invoice,true,3.159

Note that the csv output includes the following fields (in this order):

  • Full file path
  • Document type
  • Confident (true or false)
  • Relative confidence
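Because the csv output has no heading row, a script that reads it has to supply the column names itself. The following is a minimal Python sketch, assuming the output has already been captured as text. The sample lines are copied from the example above; the function name parse_classify_csv and the dictionary keys are illustrative choices, not part of the CLI.

```python
import csv
import io

# Illustrative labels for the four columns listed above;
# the csv output itself has no header row.
FIELDS = ["file", "document_type", "confident", "relative_confidence"]

def parse_classify_csv(text):
    """Parse headerless csv output from the classify command into dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        row = dict(zip(FIELDS, record))
        # Convert the text fields to native Python types.
        row["confident"] = row["confident"].strip().lower() == "true"
        row["relative_confidence"] = float(row["relative_confidence"])
        rows.append(row)
    return rows

# Sample output copied from the csv example above.
sample = (
    "C:\\examples\\00001a.xlsx,Expenses,true,1.619\n"
    "C:\\examples\\00001b.pdf,Invoice,true,3.159\n"
)

results = parse_classify_csv(sample)
invoices = [r["file"] for r in results if r["document_type"] == "Invoice"]
print(invoices)
```

Converting the "confident" and "relative confidence" fields to a boolean and a float makes later filtering and sorting straightforward.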

The next example pipes the csv output to the PowerShell ConvertFrom-Csv cmdlet to select specific results from the output of the classify command. In this case, it selects the results where the document type is "Invoice".

aluma classify myclassifier *.* -f csv | `
   ConvertFrom-Csv -Header "File", "Type", "Confident", "Confidence" | `
   where { $_.Type -eq "Invoice" }

Using JSON format

The json output format returns a JSON string containing all available output fields. It is intended for other commands and tools that need to process the output and need access to the more advanced output fields.

aluma classify myclassifier test/*.* -f json

The output is in this form (some output omitted for brevity):

[{
  "filename": "C:\\examples\\00001a.xlsx",
  "classification_results": {
    "document_type": "Expenses",
    "is_confident": true,
    "relative_confidence": 1.6189158,
    "document_type_scores": [
      {
        "document_type": "Expenses",
        "score": 49.467617
      },
      {
        "document_type": "Invoice",
        "score": 34.63108
      }
    ]
  }
},
{
  "filename": "C:\\examples\\00001b.pdf",
  ...
}]
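The individual document type scores are only available in the json format, so a script that needs them has to read the nested "document_type_scores" list. The following is a minimal Python sketch, assuming the JSON has been captured as text. It uses only the first result from the example above, because the second is omitted there; the function name ranked_scores is illustrative and not part of the CLI.

```python
import json

# One result copied from the example output above
# (the second result is omitted in the source example).
sample = r'''
[{
  "filename": "C:\\examples\\00001a.xlsx",
  "classification_results": {
    "document_type": "Expenses",
    "is_confident": true,
    "relative_confidence": 1.6189158,
    "document_type_scores": [
      {"document_type": "Expenses", "score": 49.467617},
      {"document_type": "Invoice", "score": 34.63108}
    ]
  }
}]
'''

def ranked_scores(result):
    """Return (document_type, score) pairs, highest score first."""
    scores = result["classification_results"]["document_type_scores"]
    return sorted(
        ((s["document_type"], s["score"]) for s in scores),
        key=lambda pair: pair[1],
        reverse=True,
    )

for result in json.loads(sample):
    print(result["filename"], ranked_scores(result))
```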

Note that the JSON is an array of results. The exception is when you use the --multiple-files/-m parameter to write a result file per input file, in which case the JSON is a single result.
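A script that may receive either shape can normalize both to a list before processing. The following is a minimal Python sketch; the function name load_results and the file names in the sample strings are hypothetical placeholders, and the samples are cut down to a single field for brevity.

```python
import json

def load_results(text):
    """Return a list of results whether the JSON is an array or a single object."""
    data = json.loads(text)
    return data if isinstance(data, list) else [data]

# Hypothetical, shortened samples of the two shapes described above.
array_form = '[{"filename": "a.pdf"}, {"filename": "b.pdf"}]'
single_form = '{"filename": "a.pdf"}'

for text in (array_form, single_form):
    for result in load_results(text):
        print(result["filename"])
```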