PDF Redaction

With the internet and electronic documents, information can spread worldwide in seconds with a couple of mouse clicks. Hence, protecting parts of a document from being exposed to the public is becoming increasingly important.

Safely removing sensitive information

Censoring sensitive information in documents is important to many professional document publishers, such as:

  • Media industry, newspapers
  • Government institutions
  • Financing and insurance
  • Health sector

Blacking out text

On paper documents, you would usually erase sensitive text by drawing over it with a black marker pen. The sensitive information is then hidden, while the context and meaning of the surrounding text remain intact. However, this method is not safe. Because the ink is never 100% black, there is a chance that the text can be recovered using a simple scanner and software.

With electronic documents, such as PDFs, software is used to black out sensitive information. The question is: Can you be sure that the information isn't still there "behind" the black box?

Hidden text and metadata

PDFs can contain text that is not visible on the screen. When a scanned document is OCR'd and then stored as a PDF, the scanned image of a page is usually kept as is, and the recognized text is stored "behind" the image. Hence, blacking out text on the scanned image only removes it visually; the hidden text is still there.
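One rough way to check whether a blacked-out phrase survives in a saved file is to search the file's raw bytes and its decompressed streams. The sketch below uses only Python's standard library. The function names are hypothetical, and it assumes the text is stored as plain literal strings in uncompressed or Flate-compressed streams. Text written with custom font encodings, as hexadecimal strings, or split across several drawing operators will not be found, so a negative result does not prove that a redaction is safe.

```python
import re
import zlib

# Heuristic: grab the raw bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def iter_stream_payloads(pdf_bytes: bytes):
    """Yield each stream body, inflated when it looks Flate-compressed."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates a stream whose last byte was trimmed.
            yield zlib.decompressobj().decompress(raw)
        except zlib.error:
            yield raw  # not Flate data (or another filter); scan it as-is


def text_still_present(pdf_bytes: bytes, needle: str) -> bool:
    """Return True if `needle` appears in the file or in any inflated stream."""
    target = needle.encode("latin-1")
    if target in pdf_bytes:
        return True
    return any(target in payload for payload in iter_stream_payloads(pdf_bytes))
```

To try it on a real file, read the file in binary mode and pass its contents together with the phrase that was supposed to be removed, for example `text_still_present(open("redacted.pdf", "rb").read(), "Jane Doe")`, where the file name and the phrase are placeholders.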

Many PDFs also contain metadata: information about the document, such as the name of the author and the date of creation. When redacting a document, it is important to keep in mind that metadata may also contain sensitive information.
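To get a quick impression of what metadata a file carries, the standard entries of the PDF document information dictionary (such as /Author, /Title and /CreationDate) can be looked up in the raw bytes. The following is a minimal sketch using only Python's standard library, with a hypothetical function name. It assumes the entries are stored uncompressed as literal strings; it does not cover hexadecimal or UTF-16 encoded values, entries held inside compressed object streams, or XMP metadata, so an empty result does not mean the file is free of metadata.

```python
import re

# Standard keys of the PDF document information dictionary.
INFO_KEYS = (
    b"Title", b"Author", b"Subject", b"Keywords",
    b"Creator", b"Producer", b"CreationDate", b"ModDate",
)

# Matches "/Key (value)", allowing backslash-escaped characters in the value.
INFO_RE = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\()])*)\)"
)


def list_info_entries(pdf_bytes: bytes) -> dict[str, str]:
    """Return the information-dictionary entries found as literal strings."""
    found = {}
    for key, value in INFO_RE.findall(pdf_bytes):
        found[key.decode("ascii")] = value.decode("latin-1")
    return found
```

Running this on a file before and after redaction shows whether entries such as the author name were cleared in the saved copy.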

Safe redaction

Use the right software to redact your document. There have been many cases in the past where sensitive information has been compromised due to incorrect use of software.

The redaction tool in our PixEdit® Desktop and PixView® software provides a safe way of censoring both born-digital and scanned documents. Any hidden OCR text and metadata, as well as the graphic representation of the redacted text, will be permanently removed. Hence, it will not be possible to retrieve the redacted text after saving and closing a redacted document.

Tutorial video

How to safely remove sensitive text from a PDF document