
Scanning paper documents to Microsoft Word and other Office formats

Making paper documents digital and reusable

"The paperless office" has been announced for at least a couple of decades, yet very few offices today are actually paperless. Letters, invoices, documentation and archived documents often exist only on paper. The usual workflow for these has been scanning and archiving: you can make them searchable and view them, but what if you want to reuse the actual content?

With new technology, you can scan and digitize paper into formatted text documents, spreadsheets, presentations and even web pages. PixEdit Desktop lets you scan documents easily and then use its highly accurate OCR (Optical Character Recognition) technology to save the result, for example, in a Microsoft Office format or as a PDF with formatted text.

This article gives you a quick introduction to the file export function in PixEdit Desktop.

The File Export dialog

At first glance, the File Export dialog may seem a little overwhelming.

So let's try to get an overview first: The upper half is basically a file explorer, which lets you specify which pages of the current document you want to export and the folder and name of your exported file. In the lower half, you have all the options for how to export your document.

Start with the upper half and specify page range, name and folder.

Export Options

In the lower half, start by choosing the file format (File Type) for your export. The dropdown contains quite a few options, from very simple text formats to office document formats and PDF. Each of these formats has its own possibilities and limitations, so the options further down are enabled or disabled depending on the format you choose. This helps guide you toward the right options for your export.

Layout

The most basic option of an export is "Layout". As the name implies, it specifies to what extent the overall layout of the original document is retained. To better understand how this option affects your export, it is probably best to try all available options and look at the results. Generally, "Create body text" results in a document without any layout at all, "Retain word and paragraph format" preserves the basic layout such as paragraphs and font sizes, and "Recreate source document" gives you the most accurate rendition of the original layout. Which option you choose depends largely on how you plan to use the exported document in your workflow. (Note that not all of these choices are available for all formats.)

Modes

The "Color Mode" specifies how (or if) colors are retained. Basically, you can choose between Black & White, Grayscale or Color. The "Unknown" option means "use colors if there are any". The "Retain Color Mode" option is related to "Color Mode": it lets you choose between retaining just the color of the text or the color of both text and background. The relation between the two can be explained with an example: if you have specified "Grayscale" as the "Color Mode" and "Color of text" as the "Retain Color Mode", and the document contains blue text, the blue text will be converted to a shade of gray.
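To make the blue-text example concrete, here is a minimal Python sketch of how a color mode could be applied to a text color. It is only an illustration and not PixEdit's actual implementation: the function names, the mode strings, the Rec. 601 luma weights and the black/white threshold of 128 are all assumptions made for this example.

```python
def to_gray_level(rgb):
    """Map an (R, G, B) color with 0-255 channels to one gray level."""
    r, g, b = rgb
    # Rec. 601 luma weights: an assumption, not PixEdit's documented method.
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def apply_color_mode(rgb, color_mode):
    """Return the exported color for a source color (hypothetical helper).

    color_mode is one of "color", "grayscale" or "black_and_white";
    these strings are made up for this sketch.
    """
    if color_mode == "color":
        return rgb
    gray = to_gray_level(rgb)
    if color_mode == "grayscale":
        return (gray, gray, gray)
    if color_mode == "black_and_white":
        # Assumed threshold: dark colors become black, light ones white.
        return (0, 0, 0) if gray < 128 else (255, 255, 255)
    raise ValueError(f"unknown color mode: {color_mode}")


blue_text = (0, 0, 255)
print(apply_color_mode(blue_text, "color"))
print(apply_color_mode(blue_text, "grayscale"))
```

With these assumed weights, pure blue maps to a fairly dark gray, which matches the intuition that blue text remains readable after a grayscale export.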

The "Column Mode" is best understood if you try out the different options for the file format you are using. For the Microsoft Word format, for example, specifying one of the "Detect" options will result in a normal text flow with paragraphs, while "Ignore" will result in a document with a separate text area for each paragraph. Please try them out and choose the one that is most convenient for your workflow.

Other options

"Merge lines into paragraphs" - Off: A return character is inserted at the end of each line in a paragraph. On: No return character is inserted within a paragraph. This results in a more natural flow if you plan to modify the document.

"Include graphics" - Off: Graphics, i.e. elements not recognized as text, will be left out. On: Graphics will be included as pictures.
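The effect of "Merge lines into paragraphs" can be illustrated with a small Python sketch. This is not how PixEdit works internally; the function name, the input structure (a list of paragraphs, each a list of recognized lines) and the use of a single space to join lines are assumptions made for the example.

```python
def export_text(paragraphs, merge_lines):
    """Build plain text from recognized paragraphs (hypothetical helper).

    paragraphs: list of paragraphs, each given as a list of text lines.
    merge_lines: mimics the "Merge lines into paragraphs" option.
    """
    parts = []
    for lines in paragraphs:
        if merge_lines:
            # On: no return character inside the paragraph, only at its end.
            parts.append(" ".join(lines) + "\n")
        else:
            # Off: a return character after every line of the paragraph.
            parts.append("".join(line + "\n" for line in lines))
    return "".join(parts)


scanned = [
    ["The paperless office", "is still far away."],
    ["Second paragraph."],
]
print(export_text(scanned, merge_lines=False))
print(export_text(scanned, merge_lines=True))
```

With the option on, each paragraph becomes one continuous line that a word processor can reflow when you edit it, which is why it is the more convenient choice for documents you plan to modify.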

Other options apply only to specific formats. For now, we leave you to experiment with these on your own.

We hope this article will help you get started scanning paper documents and converting them to reusable office documents. For more details, please refer to the PixEdit Desktop User Guide. And feel free to leave any comments you may have below.

Don't have PixEdit Desktop yet?

Get a free 30-day trial in our webshop!

 

About the author

The content is produced by PixEdit AS. ©PixEdit AS.
