Extract invoice data
About this guide
Aluma makes it easy to reliably extract key data from any invoice, regardless of layout. Scanned documents are read automatically, using on-demand OCR where necessary.
You can capture data such as invoice number, date, invoice amounts and supplier's tax ID without any configuration.
If you have a database that contains details of your purchase orders and suppliers, you can also capture line items and match supplier and customer identities against your records.
In this guide we'll extract basic invoice data from some example UK invoices, including both scanned documents and digitally-created PDFs.
Working through this guide should take about 5 minutes.
Before you begin
Before you start you must have:
- Installed the Aluma CLI and logged in to connect it to your account
- Installed the example documents
If you have not completed these steps, follow the Getting started guide and then return to this one.
Extract data from invoices
We'll extract some basic invoice data from the UK invoice example documents in the /invoices/uk directory.
Before starting, make sure that you are in the directory where you installed the examples.
Aluma has a built-in invoice data extractor for UK invoices that captures the following basic fields:
- Invoice Number
- Purchase Order Number
- Invoice Date
- Tax Point
- Net Total, Tax Total and Gross Total
- Currency
- Document Type
- Supplier Tax ID, IBAN, Bank Code, Bank Account Number and Company Number
Line items and supplier identity
You can also capture line items and match supplier and customer identities against your records if you have a database available that contains this data.
Let's use this built-in extractor to extract data from the sample invoice documents.
Enter the following command:
aluma extract aluma.invoices.gb examples/invoices/uk/*.* -f csv
The aluma extract command streams results to the console as each file is processed. The files are processed in parallel, so the order of the results may vary. You will see output like this:
Filename,Invoice Number,PO Number,Invoice Date,Tax Point,Net Total,Tax Total,Gross Total,Currency,Document Type,Supplier Tax ID,Supplier IBAN,Supplier Bank Code,Supplier Bank Account Number,Supplier Company Number
invoices\007.pdf,2507542,,07/02/2018,07/02/2018,9.95,,9.95,GBP,Invoice,,GB75CHAS60924241033583,609242,41033583,
invoices\004.tif,45111,,29/08/2014,29/08/2014,88.50,17.70,106.20,GBP,Invoice,416673738,,201003,83106543,
invoices\006.tif,OM1204,,01/06/2017,01/06/2017,240.00,48.00,288.00,GBP,Invoice,154439894,,090222,10083798,07690103
invoices\001.tif,69792,,31/01/2017,31/01/2017,75.00,15.00,90.00,GBP,Invoice,770560628,,301355,01050669,04172508
...
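Because the results can arrive in any order, it can help to sort them before working with them further. The following Python snippet is a minimal sketch of one way to do that, not part of Aluma itself. It assumes you have the CSV text available as a string (here, the sample rows shown above), and the helper names load_rows and totals_consistent are hypothetical. It also treats an empty Tax Total as zero and assumes Net Total and Gross Total are always present, which may not hold for every invoice.

```python
import csv
import io
from decimal import Decimal

# Sample rows copied from the output above. In practice this text would be
# the CSV that the command printed to the console.
SAMPLE_OUTPUT = r"""Filename,Invoice Number,PO Number,Invoice Date,Tax Point,Net Total,Tax Total,Gross Total,Currency,Document Type,Supplier Tax ID,Supplier IBAN,Supplier Bank Code,Supplier Bank Account Number,Supplier Company Number
invoices\007.pdf,2507542,,07/02/2018,07/02/2018,9.95,,9.95,GBP,Invoice,,GB75CHAS60924241033583,609242,41033583,
invoices\004.tif,45111,,29/08/2014,29/08/2014,88.50,17.70,106.20,GBP,Invoice,416673738,,201003,83106543,
invoices\006.tif,OM1204,,01/06/2017,01/06/2017,240.00,48.00,288.00,GBP,Invoice,154439894,,090222,10083798,07690103
invoices\001.tif,69792,,31/01/2017,31/01/2017,75.00,15.00,90.00,GBP,Invoice,770560628,,301355,01050669,04172508
"""


def load_rows(text):
    """Parse CSV text into dicts and sort by filename for a stable order."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return sorted(rows, key=lambda row: row["Filename"])


def totals_consistent(row):
    """Return True if Net Total + Tax Total equals Gross Total.

    Assumption: an empty Tax Total means no tax was captured, so it is
    treated as zero. Decimal avoids floating-point rounding on money.
    """
    net = Decimal(row["Net Total"])
    tax = Decimal(row["Tax Total"] or "0")
    gross = Decimal(row["Gross Total"])
    return net + tax == gross


if __name__ == "__main__":
    for row in load_rows(SAMPLE_OUTPUT):
        print(row["Filename"], row["Invoice Number"], totals_consistent(row))
```

The csv module returns every value as a string, so identifiers with leading zeros, such as the bank code 090222, are preserved exactly as extracted.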
Invoices from other countries
You can also use Aluma to extract data from invoices originating in other countries. We have extractors for many countries and can make these available on request. Please reach out to us at [email protected] to request access to them.