Selecting files to process
With the CLI you can process either a single file, or multiple files within a directory or directory tree.
This article applies to the aluma classify, aluma extract, aluma redact and aluma read commands.
Process a single file
To process a single file, just provide the filename within the command:
aluma classify myclassifier file1.pdf
The filename may be either relative to the current directory or a full path.
If you have multiple files to process, it is much faster to process them in one command using a file pattern (as discussed below) than to process them one at a time. When processing multiple files, the CLI parallelises requests to the API up to your account throughput limit (normally 30 files at a time).
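To illustrate why a single command is faster, here is a minimal Python sketch of bounded parallelism. It is not the CLI's implementation: process_all and fake_request are hypothetical names, fake_request stands in for an API call, and the limit of 30 is taken from the typical throughput limit mentioned above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def process_all(files, process_one, limit=30):
    """Run process_one over files with at most `limit` calls in flight."""
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # pool.map returns results in the same order as the input files.
        return list(pool.map(process_one, files))


# Hypothetical stand-in for an API request; it records peak concurrency.
in_flight = 0
peak = 0
lock = threading.Lock()


def fake_request(name):
    global in_flight, peak
    with lock:
        in_flight += 1
        peak = max(peak, in_flight)
    time.sleep(0.01)  # simulate network latency
    with lock:
        in_flight -= 1
    return name.upper()


files = [f"file{i}.pdf" for i in range(100)]
results = process_all(files, fake_request, limit=30)
print(len(results), "files processed; peak concurrency:", peak)
```

With 100 files and a limit of 30, the work completes in a few batches of concurrent requests instead of 100 sequential ones, which is the benefit of passing a file pattern to one command.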
Process multiple files
To process multiple files, you can provide a file pattern that matches one or more files. For example, the following command processes all docx files in the test folder:
aluma classify myclassifier test/*.docx
File patterns are standard globbing patterns that use wildcards to match multiple files.
"*" matches any number of characters within a name
If a file pattern contains a single asterisk, it matches zero or more characters inside a filename or directory name. So d*o matches doodoo, dao, and just do. The wildcard does not cross directory boundaries, so docs/*.tif matches only .tif files directly inside the docs directory, not those in its subdirectories.
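The single-asterisk behaviour can be tried out with Python's standard fnmatch and glob modules, which follow the same general globbing conventions. This is only an analogue for experimenting with patterns and makes no claim about how the CLI implements matching; the directory tree and file names below are made up for the example.

```python
import fnmatch
import glob
import os
import tempfile

# Name-level matching: "*" stands for zero or more characters.
names = ["doodoo", "dao", "do", "dog"]
matched = [n for n in names if fnmatch.fnmatchcase(n, "d*o")]

# Directory boundary: build a throwaway tree and glob it.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "docs", "archive"))
for rel in ["docs/a.tif", "docs/b.pdf", "docs/archive/c.tif"]:
    open(os.path.join(root, *rel.split("/")), "w").close()

# "*" does not descend into docs/archive.
hits = glob.glob(os.path.join(root, "docs", "*.tif"))
top_level = sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in hits)

print(matched)    # names that match d*o
print(top_level)  # .tif files directly inside docs
```

Here dog is rejected because the pattern requires the name to end in o, and docs/archive/c.tif is not returned because the asterisk stays within a single directory level.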
"**" matches any number of characters across directory names in the tree
If a file pattern contains a double asterisk, it matches zero or more characters in directory names across the whole directory tree. Therefore docs/**/*.pdf matches any .pdf file in the root of the docs directory or in any of its subdirectories, at any depth.
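The recursive behaviour of the double asterisk can be previewed in the same way with Python's glob module, where it is enabled by recursive=True. As before, this is a sketch of the general globbing convention under an invented directory layout, not a description of the CLI's internals.

```python
import glob
import os
import tempfile

# Build a throwaway tree with .pdf files at several depths.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "docs", "2023", "q1"))
for rel in [
    "docs/a.pdf",
    "docs/2023/b.pdf",
    "docs/2023/q1/c.pdf",
    "docs/2023/d.tif",
]:
    open(os.path.join(root, *rel.split("/")), "w").close()

# "**" matches zero or more directory levels when recursive=True.
pattern = os.path.join(root, "docs", "**", "*.pdf")
hits = glob.glob(pattern, recursive=True)
recursive_hits = sorted(
    os.path.relpath(p, root).replace(os.sep, "/") for p in hits
)

for path in recursive_hits:
    print(path)
```

The result includes docs/a.pdf (zero intermediate directories) as well as the files one and two levels down, while docs/2023/d.tif is excluded by the .pdf suffix.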
Displaying progress information
If you are processing multiple files and writing results to a file (or files) using the -m or -o parameters, you can display a progress bar using the -p parameter. This shows how many files have been processed and how many are still pending.
Examples
Pattern | Matches |
*.* | All files in the current directory |
*.pdf | All files with extension .pdf in the current directory |
a*.pdf | All .pdf files whose names start with a in the current directory |
test/*.tif | All files with extension .tif in the test directory |
test/**/*.tif | All files with extension .tif in test or its subdirectories |
Updated about 3 years ago