Working with extractors

The CLI makes it easy to work with data extraction. This article explains how to use the various extraction commands. If you aren't familiar with data extraction, you may find it helpful to read the Data extraction overview article.

Uploading a custom extractor built with Extraction Builder

To upload a custom extractor that you've built with the Visual Studio Extraction Builder extension, you can use the upload extractor command:

aluma upload extractor <extractor-name> <extractor-file.fpxlc>

Creating a simple extractor from a list of modules

You can create an extractor from a list of modules with the create extractor command, as long as those modules do not require any parameters:

aluma create extractor from-modules <extractor-name> <module-id> <module-id>
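
As an illustration, the sketch below creates an extractor from two modules and then lists your extractors to confirm that it exists. The extractor name and both module IDs are hypothetical placeholders, not real module IDs; substitute the IDs of modules available to your account. The DRY_RUN switch is also just a convenience of this sketch, not a CLI feature: by default the commands are printed so you can review them before running anything.

```shell
#!/bin/sh
# Sketch: create an extractor from parameterless modules, then list extractors.
# EXTRACTOR and the module IDs are hypothetical placeholders.
EXTRACTOR="invoice-fields"
MODULE_A="example.module-a"
MODULE_B="example.module-b"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run aluma create extractor from-modules "$EXTRACTOR" "$MODULE_A" "$MODULE_B"
run aluma list extractors
```

Run the script with DRY_RUN=0 to execute the commands instead of printing them.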

Creating an extractor from modules and parameters in a template file

You can also use a variant of the same create extractor command to create an extractor from a set of modules and associated parameters defined in a template file:

aluma create extractor from-template <extractor-name> <extractor-template.json>

Listing extractors

To list all of the extractors that you've created, run this command:

aluma list extractors

Deleting an extractor

To delete an extractor, run this command:

aluma delete extractor <extractor-name>

Extracting data from documents

To extract data from files using an extractor, use the following command, specifying the file (or files) to extract and the name of the extractor to use:

aluma extract <extractor-name> <file-pattern>

See Selecting files to process for examples of how to use file patterns to select multiple files.


Processing multiple files in one command is faster

If you are calling the CLI from a script or application, you will get substantially more throughput by processing multiple files with a single extract command than by calling the command once per file. When processing multiple files, the CLI uses all of your account's capacity and makes parallel requests to the service, so the whole set of files is processed faster.

By default, the extract command writes the extraction results to the console in table format. You can specify CSV or JSON format instead, and you can also write the results to a file or files. For example, this command writes extraction results to a single output file in CSV format:

aluma extract extractor-name *.pdf -f csv -o results.csv
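
The throughput advice above can be combined with CSV output in a script. The following is a minimal sketch, not a definitive implementation: the extractor name my-extractor, the ./inbox directory, and the per-file output names are hypothetical, and the DRY_RUN switch belongs to the sketch rather than to the CLI. It contrasts a per-file loop with a single batched call, using only the -f csv and -o options shown above.

```shell
#!/bin/sh
# Sketch: batch extraction to one CSV file, contrasted with a per-file loop.
# The extractor name and input directory are hypothetical placeholders.
EXTRACTOR="${EXTRACTOR:-my-extractor}"
INPUT_DIR="${INPUT_DIR:-./inbox}"

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Slower: one CLI invocation per file, each writing its own CSV file.
extract_one_by_one() {
    for f in "$INPUT_DIR"/*.pdf; do
        [ -e "$f" ] || continue   # the pattern matched nothing
        run aluma extract "$EXTRACTOR" "$f" -f csv -o "${f%.pdf}.csv"
    done
}

# Faster: a single invocation, so the CLI can process the files in parallel.
extract_batch() {
    run aluma extract "$EXTRACTOR" "$INPUT_DIR"/*.pdf -f csv -o results.csv
}

extract_batch
```

In the batched form the shell expands the pattern and passes every matching file to one extract command, which is the case the CLI can parallelize. Set DRY_RUN=0 to run the commands for real.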