Get a PDF of the document with specific areas redacted

Supported file types

Redaction is supported for documents created from PDFs or TIFFs.

Redaction request

Adding marks

The marks property is an array containing one element for each redaction to be made to the document. Each mark object looks like this:

{
  "area": {
    "top": 98,
    "left": 157,
    "bottom": 104,
    "right": 187,
    "page_number": 1
  }
}

The area co-ordinates are relative to the top left of the page and are in points (1/72 inch). The page number is one-based (i.e. the first page of a document is page 1).
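
As a minimal sketch, the Python helper below builds the mark object shown above and converts inches to points. The names make_mark and inches_to_points are illustrative and not part of the API, and the sanity checks are assumptions inferred from the co-ordinate description rather than documented validation rules.

```python
POINTS_PER_INCH = 72  # area co-ordinates are in points (1/72 inch)


def inches_to_points(inches):
    """Convert a measurement in inches to points."""
    return inches * POINTS_PER_INCH


def make_mark(top, left, bottom, right, page_number):
    """Build one mark object (hypothetical helper, not part of the API).

    Co-ordinates are in points, measured from the top left of the page.
    """
    # Assumed sanity checks; the service's own validation may differ.
    if page_number < 1:
        raise ValueError("page_number is one-based, so it must be at least 1")
    if bottom <= top or right <= left:
        raise ValueError("bottom must exceed top and right must exceed left")
    return {
        "area": {
            "top": top,
            "left": left,
            "bottom": bottom,
            "right": right,
            "page_number": page_number,
        }
    }


# The same mark as the JSON example above.
mark = make_mark(top=98, left=157, bottom=104, right=187, page_number=1)
```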

Applying redactions

The apply_marks property controls how redactions are made in the PDF.

If apply_marks is true (the default) then, as well as a redaction object being added to the PDF, the image underlying each marked area is replaced with a black rectangle and any text in that area is removed. The redaction is permanent and cannot be undone, even if the PDF is opened in a PDF editor such as Adobe Acrobat.

If apply_marks is false then a redaction object is added to the PDF but the image and any text in the PDF are left unaltered. The redaction can be reviewed and accepted or deleted in a PDF editor such as Adobe Acrobat. Accepting the redaction in that tool will alter the image and remove the text.
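
The two modes can be sketched as request bodies in Python. This assumes that marks and apply_marks sit side by side at the top level of the JSON body; redaction_body is a hypothetical helper name, not part of the API.

```python
import json

marks = [
    {"area": {"top": 98, "left": 157, "bottom": 104, "right": 187, "page_number": 1}}
]


def redaction_body(marks, apply_marks=True):
    """Assemble a request body (assumed layout: both properties at the top level)."""
    return {"marks": list(marks), "apply_marks": apply_marks}


# Permanent redaction: image blacked out and text removed (the default).
permanent = redaction_body(marks)

# Reviewable redaction: marks are added but the content is left unaltered.
reviewable = redaction_body(marks, apply_marks=False)

# Serialise to JSON before sending as the request body.
payload = json.dumps(reviewable)
```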

Adding bookmarks

The bookmarks property is an array containing one element for each bookmark to add to the PDF. Each bookmark object looks like this:

{
  "text": "Address",
  "page_number": 2
}

The text property specifies the text of the bookmark that will be added. The page_number specifies the page in the document that the bookmark will link to.
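
A short Python sketch of building the bookmarks array follows. make_bookmark is a hypothetical helper; the check that page numbers are one-based is carried over from the marks description as an assumption, and the second bookmark is made-up illustration data.

```python
def make_bookmark(text, page_number):
    """Build one bookmark object (hypothetical helper, not part of the API)."""
    if not text:
        raise ValueError("a bookmark needs some text")
    # Assumption: bookmark page numbers are one-based, like mark page numbers.
    if page_number < 1:
        raise ValueError("page_number must be at least 1")
    return {"text": text, "page_number": page_number}


bookmarks = [
    make_bookmark("Address", 2),    # matches the JSON example above
    make_bookmark("Signature", 5),  # made-up second entry for illustration
]
```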

Creating a redaction request based on extraction results

In most cases you will want to redact areas corresponding to the locations of data extracted using the Extract document data endpoint. Rather than building a redaction request manually you can request a response from that endpoint that you can pass straight to this endpoint.

Make a request to the Extract document data endpoint, specifying an Accept header with the value application/vnd.waives.requestformats.redact+json. The response you receive will be a redaction request that redacts all data extracted from the document. You can either send this directly in a request to this endpoint or edit it first. Each redaction field is labelled with the extraction field it came from, which helps if you want to edit the request, for example to remove some fields.
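
The sketch below shows, in Python, the headers for such an extraction request and one way to drop fields from the returned redaction request before sending it on. Several details are assumptions and not taken from this documentation: the Bearer scheme in the Authorization header, the idea that each label is stored on its mark, and the label property name field_name (passed as label_key so it can be changed). The sample data is made up.

```python
REDACT_FORMAT = "application/vnd.waives.requestformats.redact+json"


def extraction_headers(access_token):
    """Headers asking the extraction endpoint for a redaction request."""
    return {
        "Authorization": f"Bearer {access_token}",  # assumed auth scheme
        "Accept": REDACT_FORMAT,
    }


def drop_fields(redaction_request, unwanted, label_key="field_name"):
    """Return a copy of the request without marks labelled with an unwanted field.

    label_key is a hypothetical property name; check a real response for the
    actual label property before relying on this.
    """
    edited = dict(redaction_request)
    edited["marks"] = [
        mark
        for mark in redaction_request.get("marks", [])
        if mark.get(label_key) not in unwanted
    ]
    return edited


# Made-up response, shaped according to the assumptions above.
sample = {
    "marks": [
        {"area": {"top": 98, "left": 157, "bottom": 104, "right": 187, "page_number": 1},
         "field_name": "Name"},
        {"area": {"top": 120, "left": 157, "bottom": 140, "right": 300, "page_number": 1},
         "field_name": "Address"},
    ],
    "apply_marks": True,
}

# Keep the Name redaction but leave the Address visible.
edited = drop_fields(sample, {"Address"})
```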

PDF Text

The PDF returned in the response will contain any text generated by a read (OCR) operation that was run because a Read, Classify or Extract operation was requested for this document.

RESPONSES

200 The document was redacted and the PDF is in the response body
400 One or more properties in the request were invalid. See the response contents for details.
401 There is no Authorization header or the access token is invalid
404 The specified document does not exist
415 Redaction is not supported for documents created from this document's file type
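
As a sketch of how a client might act on these status codes, the Python function below returns the PDF bytes on 200 and raises an error carrying the documented meaning otherwise. RedactionError and handle_redaction_response are hypothetical names, and the function expects the status code and body from whichever HTTP client you use.

```python
STATUS_MEANINGS = {
    400: "One or more properties in the request were invalid",
    401: "There is no Authorization header or the access token is invalid",
    404: "The specified document does not exist",
    415: "Redaction is not supported for documents created from this document's file type",
}


class RedactionError(Exception):
    """Raised when the redaction request does not return a PDF."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status


def handle_redaction_response(status, body):
    """Return the PDF bytes on success, otherwise raise RedactionError."""
    if status == 200:
        return body
    meaning = STATUS_MEANINGS.get(status, "Unexpected status code")
    raise RedactionError(status, meaning)
```

For a 400 response, the response contents hold the details of what was invalid, so it is worth logging the body before raising.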
