Recommended Scan Settings for the Best OCR Accuracy
How to Get Optimal OCR Results with Proper Scanner Settings
In today's world, it is hard to imagine an office without a scanner. Every day, millions of documents are scanned to be edited or sent on for various reasons. As technology has advanced, travel-sized scanners have also been made for people on the go. But without the proper settings, which we discuss below, it is difficult to get accurate OCR results. So, you've implemented a document management system and have tons of scanned and digitized documents in your document repository. But, as you know, these digital documents need to be searchable and editable to be truly useful, so you also need good OCR processing. This leads us to the question: how do you get optimal OCR results from scanned documents? What are the recommended scanner settings? In this article, we share some tips.
Resolution
Resolution is an important factor in image quality. So, what resolution should you use for better OCR accuracy? For usual font sizes (10 pt or above), 300 DPI is usually recommended. For smaller font sizes, a higher resolution, such as 400 DPI, is recommended. Lower resolutions produce lower-quality images, which can hurt text recognition. Yes, higher resolutions produce bigger image files and take longer to process. But it's more important to have usable content than to save storage space.
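To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a US Letter page (8.5 x 11 inches) scanned as 8-bit grayscale with no compression, and it treats the font size in points as the nominal height of a line of type (one point is 1/72 inch). The function names are only illustrative.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 of an inch


def text_height_px(font_pt: float, dpi: int) -> float:
    """Nominal height of the type, in pixels, at a given scan resolution."""
    return font_pt / POINTS_PER_INCH * dpi


def page_size_px(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a scanned page."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_in: float, height_in: float, dpi: int) -> int:
    """Raw size of an 8-bit grayscale scan (one byte per pixel)."""
    width_px, height_px = page_size_px(width_in, height_in, dpi)
    return width_px * height_px


# Assumed page size: US Letter, 8.5 x 11 inches.
for dpi in (200, 300, 400):
    width_px, height_px = page_size_px(8.5, 11, dpi)
    size_mb = uncompressed_bytes(8.5, 11, dpi) / 1_000_000
    print(f"{dpi} DPI: {width_px} x {height_px} px, "
          f"10 pt text is about {text_height_px(10, dpi):.1f} px tall, "
          f"about {size_mb:.1f} MB raw")
```

Under these assumptions, going from 300 to 400 DPI makes 10 pt text about a third taller in pixels, while the raw image grows from roughly 8.4 MB to roughly 15 MB per page.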
B/W, Gray or Color
When scanning, you can choose among three color modes: black and white, grayscale, and color. Which color mode yields the best OCR accuracy? Generally, we recommend grayscale. Black and white also works for most text documents with a clear font and good print quality. But when the font is small and the image quality is poor, the details lost in B/W images can undermine recognition. In contrast, grayscale keeps significantly more detail than B/W and is the better option. If your document contains pictures and you need to preserve the colors, then scan in color mode.
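The loss of detail in B/W mode can be illustrated with a small Python sketch. It assumes a simple fixed threshold of 128 for converting gray values to black or white; real scanners may use other (often adaptive) methods, and the pixel values below are made up for illustration.

```python
def to_black_and_white(gray_row: list[int], threshold: int = 128) -> list[int]:
    """Map each 8-bit gray value to pure black (0) or pure white (255)."""
    return [0 if value < threshold else 255 for value in gray_row]


# Hypothetical scanline: white paper (255), a solid stroke (40),
# and a faint, thin stroke (150), such as a light serif or small print.
scanline = [255, 255, 40, 40, 255, 150, 150, 255]

bw = to_black_and_white(scanline)
print(bw)  # [255, 255, 0, 0, 255, 255, 255, 255]
```

The solid stroke survives, but the faint stroke at gray level 150 falls above the threshold and turns into plain white paper. In the grayscale scan, that stroke is still there for the OCR engine to work with.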
File Type and Compression
There are two types of image compression: lossy and lossless. Lossless compression is the option to go with for better OCR recognition. Among the image file types, you can save scanned images as TIFF (uncompressed or with lossless compression) or as PNG, which is always lossless. These hold up better in future processing than, for example, JPEG, which loses quality each time the image is edited and re-saved.
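The difference between the two kinds of compression can be sketched in a few lines of Python. The lossless part uses zlib, which implements DEFLATE, the same compression method PNG uses. The lossy part is only a crude stand-in (rounding each pixel value down to a multiple of 16), not how JPEG actually works, and the pixel values are made up for illustration.

```python
import zlib


def lossless_roundtrip(pixels: bytes) -> bytes:
    """Compress and decompress with DEFLATE; every byte comes back unchanged."""
    return zlib.decompress(zlib.compress(pixels))


def lossy_roundtrip(pixels: bytes, step: int = 16) -> bytes:
    """Crude stand-in for lossy compression: discard the fine gray levels."""
    return bytes((value // step) * step for value in pixels)


# Hypothetical 8-bit grayscale pixel values from a scanned line of text.
pixels = bytes([255, 250, 143, 40, 37, 255])

print(lossless_roundtrip(pixels) == pixels)  # True
print(list(lossy_roundtrip(pixels)))         # [240, 240, 128, 32, 32, 240]
```

After the lossless round trip the data is identical to the original. After the lossy one, neighboring values such as 40 and 37 have collapsed into the same level, and that information cannot be recovered later.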
Brightness
The brightness setting of a scanner adjusts the balance of light and dark shades in your scanned images. By adjusting the brightness, you will get lighter or darker images. A medium brightness value of 50% is suitable in most cases.
Do you have any questions on getting better OCR results when scanning documents? Or do you have more tips to share? Please let us know in the comments section below.