Classification output formats
The CLI uses a human-readable table format as its default output option, but offers various ways for you to format the output of the classify
command. Use the --format
/-f
parameter to format the output of the command into one of the output types in the following table.
--format (-f) | Description |
---|---|
table | table with column headings includes basic output fields. table is the default. |
csv | comma-separated values includes basic output fields and "relative confidence" field |
json | JSON string includes all output fields |
Using table format
The table format provides output in an easy-to-read format, but only includes the most commonly-used output fields. Specifically, for the classify
command, the "relative confidence" and individual document type scores are omitted.
This output format is the default, so you do not need to specify the --format
parameter:
aluma classify myclassifier test/*.*
FILE DOCUMENT TYPE CONFIDENT
00001a.xlsx Expenses true
00001b.pdf Invoice true
Using CSV format
The csv
output format returns a simple text-based and comma-separated output with no headings. This format makes it easy to consume the output into other commands and tools that need to process the output in some form.
Using the preceding example with the csv
option outputs the following comma-separated results:
aluma classify myclassifier test/*.* -f csv
C:\examples\00001a.xlsx,Expenses,true,1.619
C:\examples\00001b.pdf,Invoice,true,3.159
Note that the csv
output includes the following fields (in this order):
- Full file path
- Document type
- Confident (
true
orfalse
) - Relative confidence
The next example shows how the csv
output can be piped to the Powershell ConvertFrom-Csv
cmdlet to select specific results from the output of the classify
command. In this case we're selecting results with the document type is "Invoice".
aluma classify myclassifier *.* -f csv | `
ConvertFrom-Csv -Header "File", "Type", "Confident", "Confidence" | `
where { $_.Type -eq "Invoice" }
Using JSON format
The json
output format returns a JSON string containing all available output fields. This format is designed for output into other commands and tools that need to process the output and need access to the more advanced output fields.
aluma classify myclassifier test/*.* -f json
The output is in this form (some output omitted for brevity):
[{
"filename": "C:\\examples\\00001a.xlsx",
"classification_results": {
"document_type": "Expenses",
"is_confident": true,
"relative_confidence": 1.6189158,
"document_type_scores": [
{
"document_type": "Expenses",
"score": 49.467617
},
{
"document_type": "Invoice",
"score": 34.63108
}
]
}
},
{
"filename": "C:\\examples\\00001b.pdf",
...
}]
Note that the JSON is an array of results, except when using the --multiple-files
/-m
parameter to write a result file per input file in which case the JSON is a single result.
Updated about 3 years ago