Selecting files to process

With the CLI you can process either a single file, or multiple files within a directory or directory tree.

This article applies to the aluma classify, aluma extract, aluma redact and aluma read commands.

Process a single file

To process a single file, pass the filename as an argument to the command:

aluma classify myclassifier file1.pdf

The filename may be either relative to the current directory or a full path.

📘

If you have multiple files to process, it is much faster to process them in one command using a file pattern (as discussed below) than to process them one at a time. When processing multiple files, the CLI parallelises requests to the API up to your account throughput limit (normally 30 files at a time).

Process multiple files

To process multiple files, provide a file pattern that matches one or more files. For example, the following command processes all .docx files in the test folder:

aluma classify myclassifier test/*.docx

File patterns are standard globbing patterns that use wildcards to match multiple files.

"*" matches any number of characters within a name

If a file pattern contains a single asterisk, it matches zero or more characters inside a filename or directory name. So d*o matches doodoo, dao, and just do. The wildcard only applies within a single file or directory name, so docs/*.tif matches only .tif files directly in the docs directory, not those in its subdirectories.
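The single-asterisk behaviour can be tried out locally before running a command. The sketch below uses Python's standard fnmatch and glob modules, which follow the same general globbing conventions; it is an illustration only and assumes, rather than guarantees, that the CLI matches in exactly the same way. The file names and the temporary directory are hypothetical.

```python
import fnmatch
import glob
import os
import tempfile

# "*" matches zero or more characters within a single name.
names = ["doodoo", "dao", "do", "dog"]
matched_names = [n for n in names if fnmatch.fnmatchcase(n, "d*o")]

# Build a throwaway tree with one .tif in docs and one in docs/sub.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "docs", "sub"))
for rel in ("docs/a.tif", "docs/sub/b.tif"):
    open(os.path.join(root, *rel.split("/")), "w").close()

# "*" does not cross a directory separator, so docs/sub/b.tif is not matched.
pattern = os.path.join(glob.escape(root), "docs", "*.tif")
matched_files = sorted(
    os.path.relpath(p, root).replace(os.sep, "/") for p in glob.glob(pattern)
)

print(matched_names)
print(matched_files)
```

Here "dog" is rejected because the pattern must end in "o", and only the .tif file directly inside docs is returned.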

"**" matches any number of directories in the tree

If a file pattern contains a double asterisk, it matches zero or more directory levels, so the rest of the pattern applies across the whole directory tree below that point. For example, docs/**/*.pdf matches any .pdf file in the docs directory itself or in any of its subdirectories.
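To preview what a double-asterisk pattern would pick up, a small local check can help. This sketch uses Python's glob module with recursive=True, where "**" also matches zero or more directories; it is an approximation for illustration and assumes the CLI resolves the pattern the same way. The directory layout is hypothetical.

```python
import glob
import os
import tempfile

# Build a throwaway tree with .pdf files at three depths and one .tif file.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "docs", "sub", "deep"))
for rel in (
    "docs/a.pdf",
    "docs/sub/b.pdf",
    "docs/sub/deep/c.pdf",
    "docs/sub/d.tif",
):
    open(os.path.join(root, *rel.split("/")), "w").close()

# With recursive=True, "**" matches zero or more directory levels,
# so files directly in docs are included as well as nested ones.
pattern = os.path.join(glob.escape(root), "docs", "**", "*.pdf")
matched_pdfs = sorted(
    os.path.relpath(p, root).replace(os.sep, "/")
    for p in glob.glob(pattern, recursive=True)
)

print(matched_pdfs)
```

All three .pdf files are returned regardless of depth, while the .tif file is ignored because it does not match the final *.pdf part of the pattern.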

📘

Displaying progress information

If you are processing multiple files and writing results to a file (or files) using the -m or -o parameters, you can display a progress bar using the -p parameter. This will show you how many files have been processed and how many are still pending.

Examples

Pattern          Matches
*.*              All files in the current directory
*.pdf            All files with extension .pdf in the current directory
a*.pdf           All .pdf files whose names start with a in the current directory
test/*.tif       All files with extension .tif in the test directory
test/**/*.tif    All files with extension .tif in test or its subdirectories