MRC Compression for PDF Files

Reduce the Size of Color Images and Image-Only PDFs by up to 90%

Image segmentation is the process of partitioning the pixels of an image into classes, where the pixels in each class share a coherent characteristic such as color, texture, or intensity. In document digitization, image segmentation is commonly used to partition a page image into text blocks, line-art graphics, images, and the background (texture, shadows from poor lighting, etc.).

MRC PDF Compression

What is MRC?

MRC stands for mixed raster content, an application of image segmentation. It is a method for compressing images that contain both binary-compressible text and continuous-tone components. [1]

With MRC, a document image is represented as three layers:

  • The first layer is the foreground layer, which stores the colors of the text blocks and line-art graphics.
  • The second layer is the mask layer, a binary (black-and-white) image that stores the shapes of the text blocks and line-art graphics without the color information.
  • The third layer is the background layer, which stores the background and the images and typically makes up the majority of the file size.
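
The three-layer split can be illustrated with a minimal sketch. It assumes a simple brightness threshold decides which pixels count as text; real MRC encoders use far more sophisticated segmentation. The function names (split_layers, merge_layers), the threshold value, and the filler colors are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]          # (red, green, blue), each 0-255
Image = List[List[Pixel]]             # rows of pixels

FG_FILLER: Pixel = (0, 0, 0)          # placeholder where the foreground is unused
BG_FILLER: Pixel = (255, 255, 255)    # placeholder where the background is unused


def luminance(pixel: Pixel) -> float:
    """Approximate perceived brightness of an RGB pixel."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def split_layers(image: Image, threshold: float = 128.0):
    """Split an image into (foreground, mask, background).

    Assumption: dark pixels are text or line art. This is a toy rule,
    not the segmentation used by any particular product.
    """
    mask = [[1 if luminance(p) < threshold else 0 for p in row] for row in image]
    foreground = [
        [p if m else FG_FILLER for p, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]
    background = [
        [BG_FILLER if m else p for p, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]
    return foreground, mask, background


def merge_layers(foreground: Image, mask: List[List[int]], background: Image) -> Image:
    """Rebuild the image: take the foreground color where the mask is set."""
    return [
        [f if m else b for f, m, b in zip(f_row, m_row, b_row)]
        for f_row, m_row, b_row in zip(foreground, mask, background)
    ]


# A 2x2 page: white paper, dark red text, dark blue text, slightly tinted paper.
page = [[(255, 255, 255), (120, 0, 0)],
        [(0, 0, 120), (240, 240, 230)]]
fg, mask, bg = split_layers(page)
restored = merge_layers(fg, mask, bg)
```

In this sketch the mask carries only shape, the foreground carries only the text colors, and merging the layers returns the original pixels. In a real encoder each layer would then be handed to its own compressor.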

Because each layer has different compressibility characteristics, a different compression algorithm can be applied to each one. This can yield a compression ratio of up to 10:1 without visibly degrading quality.

  • JBIG2 is an efficient algorithm for compressing bi-level content such as text blocks and line-art graphics.
  • For the background and the color images, JPEG and JPEG 2000 can achieve a good level of compression without losing the smoothness and accuracy of colors.
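
To relate a compression ratio to a percentage reduction, the arithmetic can be sketched as follows. The helper names are hypothetical, and the per-layer byte counts in the usage example are made-up figures for illustration, not measurements.

```python
from typing import Iterable, Tuple


def size_reduction(ratio: float) -> float:
    """Fraction of the original size saved at a given ratio (original / compressed)."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 1.0 - 1.0 / ratio


def overall_ratio(layers: Iterable[Tuple[int, int]]) -> float:
    """Combined ratio for layers given as (raw_bytes, compressed_bytes) pairs."""
    layers = list(layers)
    raw_total = sum(raw for raw, _ in layers)
    compressed_total = sum(compressed for _, compressed in layers)
    return raw_total / compressed_total


# A ratio of 10 corresponds to a reduction of about 90%.
saved = size_reduction(10)

# Illustrative only: a small layer and a large layer, each compressed separately.
combined = overall_ratio([(100, 10), (900, 90)])
```

The overall ratio is the total raw size divided by the total compressed size, so the largest layer (usually the background) dominates the result.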

The three layers, along with instructions on how to reassemble and render them, are stored in a single file.

File Formats that Support MRC Compression

PDF is the most common file format that supports MRC compression.

The MRC 3-layer model has also been implemented in other digital document formats, including .tfx (TIFF-FX), .ldx (LuraDocument), and .djvu (DjVu). [2]

Benefits of MRC Compression

MRC compression was initially developed to compress scanned color pages and images for fax transmission. [2] Today, the method is also applied to document scans and camera snapshots.

The major benefit of Mixed Raster Content compression is a smaller file size. A smaller file needs less bandwidth in transit, which speeds up transmission, and it takes up less storage space in the database.

The MRC method comes with some beneficial side effects too:

  • Sharper text. With the text layer separated from the foreground and the background, it’s possible to sharpen the text to make it easier to read.
  • Cleaned-up background. The 3-layer segmentation also helps with cleaning up the background. Texture and shadows in the background can distract readers, so removing the shadows or softening the texture can improve the reading experience.
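
As a rough illustration of background cleanup, the sketch below snaps light background pixels to pure white while leaving darker pixels (such as embedded photos) untouched. The function name and the white_point cutoff are hypothetical; a production pipeline would likely use a smarter, locally adaptive method.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def clean_background(background: List[List[Pixel]], white_point: int = 200) -> List[List[Pixel]]:
    """Replace near-white pixels with pure white.

    A pixel counts as near-white when all three channels are at or above
    white_point (an assumed cutoff chosen for this example).
    """
    white: Pixel = (255, 255, 255)
    return [
        [white if min(pixel) >= white_point else pixel for pixel in row]
        for row in background
    ]


# One row: a slightly shadowed paper pixel next to a pixel from a color photo.
layer = [[(230, 228, 220), (90, 120, 200)]]
cleaned = clean_background(layer)
```

Because the text already lives in its own layer, this kind of cleanup can be applied to the background without touching the characters.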

MRC Improves the Accuracy of OCR

As discussed above, Mixed Raster Content compression produces clean, isolated text blocks, which improves the accuracy of downstream OCR. Once OCR turns the image-only PDF into a searchable PDF, employees can find information in documents faster.

Support for MRC

Dynamic Web TWAIN will introduce the MRC compression feature soon. Stay tuned.

References

[1] https://en.wikipedia.org/wiki/Mixed_raster_content
[2] http://www.djvu-soft.narod.ru/planetdjvu/the_mrc_mixed_raster_content_model_and_djvu.htm
