Get a PDF of the document with specific areas redacted
Supported file types
Redaction is supported for documents created from PDFs or TIFFs.
Redaction request
Adding Marks
The marks property is an array containing one element for each redaction to be made to the document. Each mark object looks like this:
{
  "area": {
    "top": 98,
    "left": 157,
    "bottom": 104,
    "right": 187,
    "page_number": 1
  }
}
The area coordinates are measured from the top-left corner of the page and are in points (1/72 inch). The page number is one-based (i.e. the first page of a document is page 1).
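As a minimal sketch of how a mark might be assembled, the helpers below build the object shown above and convert pixel offsets from a scanned image into points. The function names, the rounding to whole points, and the check that the area has positive size are assumptions made for illustration; they are not part of the API.

```python
def pixels_to_points(pixels: float, dpi: float) -> int:
    """Convert a pixel offset at the given resolution to whole points (1/72 inch)."""
    # Rounding to an integer is an assumption; the example mark uses integers.
    return round(pixels * 72 / dpi)


def make_mark(top: int, left: int, bottom: int, right: int, page_number: int) -> dict:
    """Build one element of the marks array (hypothetical helper)."""
    if page_number < 1:
        raise ValueError("page_number is one-based; the first page is 1")
    # Coordinates run from the top left, so bottom and right must be larger.
    if bottom <= top or right <= left:
        raise ValueError("area must have positive height and width")
    return {
        "area": {
            "top": top,
            "left": left,
            "bottom": bottom,
            "right": right,
            "page_number": page_number,
        }
    }


# The mark from the example above.
mark = make_mark(98, 157, 104, 187, 1)

# A region measured in pixels on a 300 DPI scan, converted to points first.
scanned = make_mark(
    *(pixels_to_points(p, 300) for p in (600, 300, 650, 900)),
    page_number=1,
)
```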
Applying redactions
The apply_marks property controls how redactions are made in the PDF.
If apply_marks is true (the default), a redaction object is added to the PDF, the image underlying each marked area is replaced with a black rectangle, and any text in that area is removed. The redaction is permanent and cannot be undone, even if the PDF is loaded into a PDF editor such as Adobe Acrobat.
If apply_marks is false, a redaction object is added to the PDF but the image and any text in the PDF are left unaltered. The redaction can be reviewed and then accepted or deleted in a PDF editor such as Adobe Acrobat. Accepting the redaction in that tool will alter the image and remove the text.
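The two modes can be illustrated by serializing a request body with and without permanent application. This is a sketch only: it assumes marks and apply_marks sit side by side at the top level of the JSON body, and build_redaction_request is a hypothetical helper name.

```python
import json


def build_redaction_request(marks: list, apply_marks: bool = True) -> str:
    """Serialize a redaction request body (assumed top-level layout)."""
    body = {"marks": marks, "apply_marks": apply_marks}
    return json.dumps(body)


marks = [
    {"area": {"top": 98, "left": 157, "bottom": 104, "right": 187, "page_number": 1}}
]

# Permanent redaction: image blacked out and text removed (the default).
permanent = build_redaction_request(marks)

# Reviewable redaction: only the redaction object is added to the PDF.
reviewable = build_redaction_request(marks, apply_marks=False)
```

Sending apply_marks as false suits a workflow where a person checks each redaction in a PDF editor before it becomes permanent.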
Adding bookmarks
The bookmarks property is an array containing one element for each bookmark to add to the PDF. Each bookmark object looks like this:
{
  "text": "Address",
  "page_number": 2
}
The text property specifies the text of the bookmark that will be added. The page_number property specifies the page in the document that the bookmark will link to.
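A bookmarks array could be built along these lines. The helper name and its validation are assumptions, including the assumption that bookmark page numbers are one-based in the same way as mark page numbers.

```python
def make_bookmark(text: str, page_number: int) -> dict:
    """Build one element of the bookmarks array (hypothetical helper)."""
    if not text:
        raise ValueError("bookmark text must not be empty")
    # Assumed to be one-based, matching the page numbering used for marks.
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    return {"text": text, "page_number": page_number}


bookmarks = [
    make_bookmark("Address", 2),
    make_bookmark("Signature", 5),
]
```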
Creating a redaction request based on extraction results
In most cases you will want to redact areas corresponding to the locations of data extracted using the Extract document data endpoint. Rather than building a redaction request manually, you can ask that endpoint for a response that can be passed straight to this endpoint.
Make a request to the Extract document data endpoint, specifying an Accept header with the value application/vnd.waives.requestformats.redact+json. The response you receive will be a redaction request that will redact all data extracted from the document. You can either send this directly in a request to this endpoint or edit it first. Each redaction field is labelled with the extraction field it came from, which helps if you want to edit the request, for example to remove some fields.
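The workflow described above might look like the following sketch. Only the Accept value, the marks array and the area.page_number property come from this page. The Bearer authorization scheme, the helper names, and the idea of filtering by page are assumptions, and the sample response is a hand-written stand-in rather than real output from the Extract endpoint.

```python
import copy

REDACT_REQUEST_FORMAT = "application/vnd.waives.requestformats.redact+json"


def extract_request_headers(access_token: str) -> dict:
    """Headers asking the Extract endpoint for a ready-made redaction request."""
    return {
        "Accept": REDACT_REQUEST_FORMAT,
        # The Bearer scheme is an assumption; use whatever your token requires.
        "Authorization": f"Bearer {access_token}",
    }


def drop_marks_on_pages(redaction_request: dict, pages: set) -> dict:
    """Return a copy of the request without marks on the given pages."""
    edited = copy.deepcopy(redaction_request)
    edited["marks"] = [
        mark
        for mark in edited.get("marks", [])
        if mark["area"]["page_number"] not in pages
    ]
    return edited


headers = extract_request_headers("example-token")

# Hand-written stand-in for a redaction request returned by the Extract endpoint.
generated = {
    "marks": [
        {"area": {"top": 98, "left": 157, "bottom": 104, "right": 187, "page_number": 1}},
        {"area": {"top": 40, "left": 60, "bottom": 52, "right": 200, "page_number": 2}},
    ]
}

# Keep the redactions on page 1 only, leaving the original request untouched.
edited = drop_marks_on_pages(generated, {2})
```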
PDF Text
The PDF returned in the response will contain any text generated by a read (OCR) operation, which is performed when any of the Read, Classify or Extract operations has been requested for this document.
RESPONSES
200 The document was redacted and the PDF is in the response body
400 One or more properties in the request were invalid. See the response contents for details.
401 There is no Authorization header or the access token is invalid
404 The specified document does not exist
415 Redaction is not supported for documents created from this document's file type
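A client could map these status codes to errors as in the sketch below. The exception class and function name are hypothetical, and the status code and body are assumed to come from whichever HTTP library you use.

```python
STATUS_MEANINGS = {
    400: "One or more properties in the request were invalid",
    401: "There is no Authorization header or the access token is invalid",
    404: "The specified document does not exist",
    415: "Redaction is not supported for this document's file type",
}


class RedactionError(Exception):
    """Raised when the redaction request does not return a PDF."""

    def __init__(self, status_code: int, message: str):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


def check_redaction_response(status_code: int, body: bytes) -> bytes:
    """Return the PDF bytes on 200, otherwise raise RedactionError."""
    if status_code == 200:
        return body
    meaning = STATUS_MEANINGS.get(status_code, "Unexpected status code")
    raise RedactionError(status_code, meaning)
```

For a 400 response, the body is worth logging as well, since it carries the details of which property was rejected.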