Data redaction and bookmarking

Aluma makes it easy to extract data from documents and then redact some or all of that data before returning the document, so that it can be passed to a subsequent process or delivered into storage.

Redacted documents are always returned as PDFs. Bookmarks can also be added to make it easy to navigate to the redacted data in the document.

Using redaction with manual review

If you want to manually review redactions in a PDF editor (such as Adobe Acrobat), Aluma can add the redactions to the PDF without applying them.

In this case Aluma leaves the image and any text in the PDF unaltered. Each redaction can be reviewed and then either applied or deleted in the PDF editor. Applying a redaction in the editor alters the image and removes the text.


[Figure: Reviewing a redaction in Adobe Acrobat]

Using redaction without manual review

If you don't want a manual review step, Aluma can apply the redactions to the PDF before returning it.

In this case the redacted data is completely and permanently removed from the PDF. The image area under each redaction is replaced with a black rectangle, and any text in that area is removed.

An applied redaction cannot be undone, even if the PDF is opened in a PDF editor such as Adobe Acrobat.


[Figure: How an applied redaction appears in Adobe Acrobat]


Aluma can also add bookmarks to documents to make it easy to navigate to redacted data. Each bookmark can carry arbitrary text (for example the field name), and bookmarks can be added independently of redactions if you want to include additional ones.


Redacting documents using the CLI

You can redact documents using the CLI's redact command.

Redacting documents using the API

The Get redacted PDF endpoint takes a request containing details of the areas to redact and the bookmarks to create. You can construct this request however you wish, but in most situations it will be based on data extracted using the Extract document data endpoint. In that case the process is as follows:

  1. Create a document from your input file using the Create document endpoint.

  2. Call the Extract document data endpoint with an Accept header of application/vnd.waives.requestformats.redact+json. This returns a response in the redaction request format, with a redaction for the location of every piece of extracted data.

  3. If necessary, modify the returned redaction request (for example to remove redactions for any fields you do not want to redact, or to specify that redactions should be added but not applied).

  4. Send the redaction request to the Get redacted PDF endpoint.

  5. Retrieve the redacted PDF from the body of the response.

  6. Delete the document using the Delete document endpoint.
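
The six steps above can be sketched in Python as follows. This is a minimal illustration, not the official client: the URL paths, the "id" field in the create response, the bearer-token header, and the Content-Type values are all assumptions made for the example, so check them against the API reference before use. Only the Accept header value in step 2 comes from the text above. The redaction request is treated as opaque JSON, and the optional modify callback is a hypothetical hook for step 3.

```python
import json
import urllib.request
from typing import Callable, Dict, Optional, Tuple

# Media type named in step 2 above.
REDACT_REQUEST_FORMAT = "application/vnd.waives.requestformats.redact+json"

# A transport takes (method, path, headers, body) and returns (status, body).
Transport = Callable[[str, str, Dict[str, str], Optional[bytes]], Tuple[int, bytes]]


def make_transport(base_url: str, token: str) -> Transport:
    """Build a transport on urllib. The bearer-token scheme is an assumption."""
    def send(method, path, headers, body):
        req = urllib.request.Request(
            base_url + path,
            data=body,
            method=method,
            headers={"Authorization": f"Bearer {token}", **headers},
        )
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read()
    return send


def _check(status: int) -> None:
    if status >= 300:
        raise RuntimeError(f"request failed with HTTP status {status}")


def redact_document(send: Transport, file_bytes: bytes, extractor_name: str,
                    modify: Optional[Callable[[dict], dict]] = None) -> bytes:
    # 1. Create a document (path and "id" field are assumptions).
    status, body = send("POST", "/documents",
                        {"Content-Type": "application/octet-stream"}, file_bytes)
    _check(status)
    document_id = json.loads(body)["id"]
    try:
        # 2. Extract data, asking for the redaction request format.
        status, body = send("POST",
                            f"/documents/{document_id}/extract/{extractor_name}",
                            {"Accept": REDACT_REQUEST_FORMAT}, None)
        _check(status)
        request = json.loads(body)

        # 3. Optionally modify the redaction request before sending it on.
        if modify is not None:
            request = modify(request)

        # 4. Send the redaction request (path and content types are assumptions).
        status, pdf = send("POST", f"/documents/{document_id}/redact",
                           {"Content-Type": "application/json",
                            "Accept": "application/pdf"},
                           json.dumps(request).encode("utf-8"))
        _check(status)

        # 5. The redacted PDF is the body of the response.
        return pdf
    finally:
        # 6. Delete the document, even if an earlier step failed.
        send("DELETE", f"/documents/{document_id}", {}, None)
```

To use it, build a transport with make_transport (passing your own base URL and token) and call redact_document with the input file's bytes and the name of your extractor. Passing the transport in as a function keeps the workflow separate from the HTTP details, so the same steps can be exercised against a stub in tests.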