Getting started with the API

This tutorial gets you started with the API and introduces some core concepts.

The tutorial assumes that you are comfortable making HTTP requests and are set up to do so. For brevity, the examples use the curl command, but you can use any API client, such as Postman or Insomnia, to make requests interactively.

Before you begin

  1. Download the example documents for this tutorial and extract them from the archive.
  2. Create an account using the dashboard.

Generating an access token

Before you can make any API requests, you need an access token. To generate a token, you'll need some API Client credentials:

  1. Log in to the dashboard.
  2. Go to the API Clients page.
  3. If you don't have an API Client already, click the 'Create API Client' button to create one (the name is not important).

Substituting your own API Client ID and Secret from the dashboard, you can now generate an access token as follows:

curl https://api.aluma.io/oauth/token \
-d 'client_id=<YOUR_CLIENT_ID>' \
-d 'client_secret=<YOUR_CLIENT_SECRET>' \
-X POST

To make the rest of this example easier to read, we've saved the access token from the response to this request in a variable called "auth". You will need to include the access token in an Authorization header with every request in the rest of this tutorial.
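
One way to fill the auth variable is to pull the token out of the JSON response in the shell. The sketch below assumes the response is a JSON object with a string field named access_token (the conventional OAuth 2.0 name; this tutorial does not show the response, so check it against what you actually receive). extract_token is a hypothetical helper, not part of the API.

```shell
# Hypothetical helper: print the value of "access_token" from JSON on stdin.
# Assumption: the token response contains a string field "access_token".
extract_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (needs network access and your real credentials):
# auth=$(curl -s https://api.aluma.io/oauth/token \
#   -d 'client_id=<YOUR_CLIENT_ID>' \
#   -d 'client_secret=<YOUR_CLIENT_SECRET>' \
#   -X POST | extract_token)
```

Matching JSON with sed is fragile; if you have jq installed, piping the response through jq -r .access_token is a more robust alternative under the same assumption about the field name.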

Create a classifier

Now that you've got everything set up, let's create a classifier capable of classifying our example documents.

The example documents are US mortgage documents. They are split into two directories:

  • build is a set of samples organised by document type
  • test is a mixed set of test documents of the same types

In the following steps, we assume that you are in the directory where you extracted the examples.

Let's create a classifier called mortgage-classifier using a ZIP file of the samples. You'll need to post two requests, one to create the classifier and another to add the samples.

First, post a request that includes the access token and the name of the classifier.

curl https://api.aluma.io/classifiers/mortgage-classifier \
-H "Authorization: Bearer $auth" \
-X POST

Now post the following request, which includes the contents of the samples ZIP file.

curl https://api.aluma.io/classifiers/mortgage-classifier/samples \
-H "Authorization: Bearer $auth" \
-H "Content-Type: application/zip" -X POST \
-T "./build/samples.zip"

This request will take a few seconds to complete.

Classify a file

Now that we have a classifier, let's classify one of our test files.

We need to make three requests to the API to do this:

  1. Create a document from the file
  2. Classify the document
  3. Delete the document

Document resources are a fundamental building block in the API. In general a document is equivalent to a single file that you want to classify or extract data from.

Create a document resource

Let's create a document from one of the test files in the examples. Issue the following request, which POSTs the contents of the file in the body of the request.

curl https://api.aluma.io/documents \
-H "Authorization: Bearer $auth" \
-H "Content-Type: application/pdf" -X POST \
-T "./test/Assignment of Deed of Trust.pdf"

You will receive a response similar to the one below. Note the id property at the start of the response. This is the new document's ID. You will need this in the subsequent requests.

{
    "id": "BJ9T675Ck028SBu3sfJhaA",
    "_links": {
        "document:classify": {
            "href": "/documents/BJ9T675Ck028SBu3sfJhaA/classify/{classifier_name}",
            "templated": true
        },
        "self": {
            "href": "/documents/BJ9T675Ck028SBu3sfJhaA"
        }
    },
    "_embedded": {
        "files": [
            {
                "id": "5Z3SvihqQ0e38c4XLxNNlg",
                "file_type": "PDF:ImagePlusText",
                "size": 57158,
                "sha256": "36cbc76c846be2c7b05541790d9605c43fe53ff29d336e9c3c60daade6a37734"
            }
        ]
    }
}
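
Rather than copying the document ID by hand, you can capture it in a shell variable. The sketch below relies on the id property appearing at the start of the response, as in the example above, so that it comes before the file's own id. first_id and doc_id are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: print the first "id" string value found in JSON on stdin.
# Assumption: the document's own id appears before any nested file id.
first_id() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' | head -n 1 |
    sed 's/.*"\([^"]*\)"$/\1/'
}

# Usage (needs network access and a valid token in $auth):
# doc_id=$(curl -s https://api.aluma.io/documents \
#   -H "Authorization: Bearer $auth" \
#   -H "Content-Type: application/pdf" -X POST \
#   -T "./test/Assignment of Deed of Trust.pdf" | first_id)
# echo "$doc_id"
```

You could then use $doc_id in place of the literal document ID in the requests that follow.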

Classify the document

Now issue the following request, replacing the document ID in the URL with the document ID from the previous response.

curl https://api.aluma.io/documents/BJ9T675Ck028SBu3sfJhaA/classify/mortgage-classifier \
-H "Authorization: Bearer $auth" \
-X POST

You will receive a response similar to the following, which includes the classification results. This response tells us that the document is an "Assignment of Deed of Trust" and that the classification was confident, which means that we can safely use the document type in any subsequent business process. You can read more about how to use classification confidence and scores in this article.

{
    "_id": "BJ9T675Ck028SBu3sfJhaA",
    "classification_results": {
        "document_type": "Assignment of Deed of Trust",
        "relative_confidence": 3.30463839,
        "is_confident": true,
        "document_type_scores": [
            {
                "document_type": "Assignment of Deed of Trust",
                "score": 66.0927658
            },
            {
                "document_type": "Substitution or Reconveyance",
                "score": 30.3728962
            },
            {
                "document_type": "Warranty Deed",
                "score": 28.0085087
            },
            {
                "document_type": "Deed of Trust",
                "score": 26.7922726
            },
            {
                "document_type": "Notice of Lien",
                "score": 26.7903061
            },
            {
                "document_type": "Correspondence",
                "score": 26.2980633
            },
            {
                "document_type": "Notice of Default",
                "score": 25.9420929
            }
        ]
    }
}
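
As a rough illustration of acting on this response in a script, the sketch below reads the top-level document_type and the is_confident flag and prints a decision. It assumes the top-level document_type appears before the entries in document_type_scores, as in the example above. The function names and the "accept" and "review" labels are hypothetical, and sending non-confident results for manual review is an assumption about your business process, not something the API prescribes.

```shell
# Print the first "document_type" string value found in JSON on stdin.
first_document_type() {
  grep -o '"document_type"[[:space:]]*:[[:space:]]*"[^"]*"' | head -n 1 |
    sed 's/.*"\([^"]*\)"$/\1/'
}

# Succeed (exit status 0) only if the JSON on stdin has "is_confident": true.
is_confident() {
  grep -q '"is_confident"[[:space:]]*:[[:space:]]*true'
}

# Hypothetical routing step: takes the classification response as $1.
route_document() {
  response=$1
  doc_type=$(printf '%s\n' "$response" | first_document_type)
  if printf '%s\n' "$response" | is_confident; then
    echo "accept: $doc_type"
  else
    echo "review: $doc_type"
  fi
}
```

Called with the response shown above, route_document prints "accept: Assignment of Deed of Trust".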

Delete the document

Finally, you should delete the document. Your API client may only have 30 documents created at any one time, so you must delete documents once you are finished with them.

The following request will delete the document (replace the document ID in the URL with the document ID from the Create Document response):

curl https://api.aluma.io/documents/BJ9T675Ck028SBu3sfJhaA \
-H "Authorization: Bearer $auth" \
-X DELETE
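
Because of the 30-document limit, it can help to wrap the three requests in one function so that the document is deleted even when classification fails. The following is a minimal sketch under stated assumptions: classify_and_delete and api_base are hypothetical names, the access token is expected in $auth as elsewhere in this tutorial, and the document ID is taken to be the first "id" in the create response. It does not inspect HTTP status codes.

```shell
api_base=https://api.aluma.io

# Hypothetical helper: classify one PDF file, then always delete the document.
# Arguments: $1 = path to the PDF, $2 = classifier name.
classify_and_delete() {
  file=$1
  classifier=$2

  # Step 1: create the document and keep the first "id" from the response.
  doc_id=$(curl -s "$api_base/documents" \
    -H "Authorization: Bearer $auth" \
    -H "Content-Type: application/pdf" -X POST \
    -T "$file" |
    grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' | head -n 1 |
    sed 's/.*"\([^"]*\)"$/\1/')
  [ -n "$doc_id" ] || { echo "could not create document" >&2; return 1; }

  # Step 2: classify the document; the response is written to stdout.
  curl -s "$api_base/documents/$doc_id/classify/$classifier" \
    -H "Authorization: Bearer $auth" -X POST
  status=$?

  # Step 3: delete the document whether or not classification succeeded.
  curl -s "$api_base/documents/$doc_id" \
    -H "Authorization: Bearer $auth" -X DELETE >/dev/null
  return $status
}

# Usage (needs network access and a valid token in $auth):
# classify_and_delete "./test/Assignment of Deed of Trust.pdf" mortgage-classifier
```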