With the internet and electronic documents, information can be spread worldwide in seconds with a couple of mouse clicks. Hence, protecting parts of a document from being exposed to the public is becoming increasingly important.
Safely removing sensitive information
Censoring sensitive information in documents is important to many professional document publishers, such as:
- Media industry, newspapers
- Government institutions
- Finance and insurance
- Health sector
Blacking out text
On paper documents, you would usually hide sensitive text by drawing over it with a black marker pen. The sensitive information is then no longer readable, while the context and meaning of the surrounding text remain intact. However, this method is not safe. Because the ink is never 100% black, there is a chance that the text can be recovered using a simple scanner and software.
With electronic documents, such as PDFs, software is used to black out sensitive information. The question is: Can you be sure that the information isn't still there "behind" the black box?
Hidden text and metadata
PDFs can contain text that is not visible on the screen. When a scanned document is OCR'd and then stored as a PDF, the scanned image of a page is usually kept as is, and the recognized text is stored "behind" the image. Hence, blacking out the text in the scanned image alone removes it only visually; the hidden text is still there and can be searched, selected and copied.
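The difference between painting over a scan and actually redacting it can be illustrated with a small toy model. This is only a sketch: the Page and Box classes and the black_out and redact functions are hypothetical names invented for this example, not part of any PDF library or of our products, and a real page is far more complex than a list of words.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """Axis-aligned rectangle in page coordinates."""
    x0: float
    y0: float
    x1: float
    y1: float

    def overlaps(self, other: "Box") -> bool:
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


@dataclass
class Page:
    """Toy page: pixels are not modelled, only what lies on top of and behind them."""
    # Black rectangles painted over the scanned image.
    black_boxes: list = field(default_factory=list)
    # Invisible OCR layer: (bounding box, recognized word) pairs.
    hidden_words: list = field(default_factory=list)

    def extract_text(self) -> str:
        # What a search or copy-and-paste operation would see.
        return " ".join(word for _, word in self.hidden_words)


def black_out(page: Page, area: Box) -> None:
    """Unsafe: only paints over the image, the OCR layer is untouched."""
    page.black_boxes.append(area)


def redact(page: Page, area: Box) -> None:
    """Paints over the image and drops every OCR word overlapping the area.

    A real tool must also remove the covered pixels from the image itself.
    """
    page.black_boxes.append(area)
    page.hidden_words = [
        (box, word) for box, word in page.hidden_words
        if not box.overlaps(area)
    ]


page = Page(hidden_words=[
    (Box(0, 0, 40, 10), "Salary:"),
    (Box(45, 0, 80, 10), "90000"),
])
secret = Box(44, 0, 81, 10)

black_out(page, secret)
text_after_black_out = page.extract_text()   # the amount is still extractable

redact(page, secret)
text_after_redact = page.extract_text()      # the amount is gone
```

After black_out the page looks censored, yet extract_text still returns the hidden amount. Only redact, which also edits the text layer, removes it.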
Many PDFs also contain metadata: information about the document, such as the name of the author and the date of creation. When redacting a document, it is important to keep in mind that metadata may also contain sensitive information.
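As a rough illustration of how easily such metadata can be read, the sketch below scans the raw bytes of a PDF for entries of the document information dictionary. It rests on several assumptions: the file is unencrypted, the dictionary is stored uncompressed, and the values are simple literal strings without nested or escaped parentheses. It does not look at XMP metadata, so an empty result does not prove that a file is clean. The function name find_info_metadata and the sample bytes are made up for this example.

```python
import re

# Standard keys of the PDF document information dictionary.
INFO_KEYS = (b"Title", b"Author", b"Subject", b"Keywords",
             b"Creator", b"Producer", b"CreationDate", b"ModDate")

# Matches entries such as "/Author (Jane Doe)". Simplification: the value
# must be a literal string without nested or escaped parentheses.
_INFO_ENTRY = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(([^()\\]*)\)"
)


def find_info_metadata(pdf_bytes: bytes) -> dict:
    """Return Info-dictionary entries that appear as plain text in the raw bytes."""
    found = {}
    for key, value in _INFO_ENTRY.findall(pdf_bytes):
        found[key.decode("ascii")] = value.decode("latin-1")
    return found


# Hypothetical fragment of an unencrypted, uncompressed PDF.
sample = (b"1 0 obj\n"
          b"<< /Author (Jane Doe) /CreationDate (D:20240101120000Z) >>\n"
          b"endobj\n")
leaked = find_info_metadata(sample)
```

In this sample, leaked contains the author name and the creation date, even though neither is shown anywhere on the page.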
Use the right software to redact your document. There have been many cases where sensitive information was compromised because software was used incorrectly.
The redaction tool in our PixEdit® Desktop and PixView® software provides a safe way of censoring both born-digital and scanned documents. Any hidden OCR text and metadata, as well as the graphic representation of the redacted text, will be permanently removed. Hence, it will not be possible to retrieve the redacted text after saving and closing a redacted document.