How to Reduce Scanned Document File Size: DPI, Bit Depth, and Compression Explained

Scanning documents into digital form helps companies save physical space, speed up data retrieval, cut costs, and collaborate more easily.

Scanned files are sometimes far larger than they need to be, or they are small but have poor visual quality. This article covers the factors that determine the file size of a scanned document and how to optimize them.

The samples use Dynamic Web TWAIN, a JavaScript library that enables document scanning from browsers.

What you’ll build: A web-based document scanning workflow that reduces scanned file sizes by up to 96% using optimized DPI, bit depth, and compression settings with Dynamic Web TWAIN.

Key Takeaways

  • Scanned document file size is determined by three factors: DPI (resolution), bit depth (color depth), and the image compression format.
  • JPEG compression achieves up to 97% size reduction for color and grayscale scans, while TIFF with CCITT Group 4 is optimal for black-and-white documents.
  • PDF containers with automatic per-page compression (JBIG2 for B&W, JPEG for color) deliver the best multi-page results: 96% compression in the three-page test in this article.
  • Dynamic Web TWAIN lets you control DPI, pixel type, bit depth, and output format programmatically in JavaScript.

Common Developer Questions

  • How do I reduce scanned document file size programmatically in JavaScript?
  • What is the best image format and compression for scanned documents — JPEG, PNG, TIFF, or PDF?
  • How does DPI affect scanned document file size and quality?

Prerequisites

To follow along with the code samples, you need:

  • A modern web browser (Chrome, Firefox, Edge).
  • A TWAIN-compatible document scanner connected to your computer.
  • A trial license for Dynamic Web TWAIN (a 30-day free trial is available).

How Scanned Image File Sizes Are Calculated

Before we get started, let’s learn about how image sizes are calculated.

image bytes = width * height * bit depth / 8

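Here, width and height are in pixels and bit depth is in bits per pixel. In other words, the uncompressed size depends on the number of pixels and on how many colors each pixel can represent.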
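
As a quick worked example, the formula can be written as a small function. This is a minimal sketch: `rawImageBytes` is a name chosen here for illustration, and real files may be slightly larger because of headers or row padding.

```javascript
// Uncompressed image size in bytes: width * height * bitDepth / 8.
function rawImageBytes(width, height, bitDepth) {
  return (width * height * bitDepth) / 8;
}

// An 8.5 x 11 inch page scanned at 300 DPI is 2550 x 3300 pixels.
const colorBytes = rawImageBytes(2550, 3300, 24); // 25,245,000 bytes
const grayBytes = rawImageBytes(2550, 3300, 8);   // 8,415,000 bytes
const bwBytes = rawImageBytes(2550, 3300, 1);     // 1,051,875 bytes

console.log(colorBytes, grayBytes, bwBytes);
```

The same page is 24 times larger in color than in black & white, which is why the color mode matters as much as the resolution.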

How DPI Affects Scanned Document File Size

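DPI stands for dots per inch and is a measure of the resolution of a scanned document or photo. The higher the DPI, the more detail the scan captures, but also the larger the file: doubling the DPI doubles both the width and the height in pixels, so the uncompressed size grows by a factor of four.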

Generally, scanning a document at about 300 DPI produces an image with a reasonable size and good quality.

In Dynamic Web TWAIN, we can specify the DPI in the device configuration:

DWObject.AcquireImageAsync({IfShowUI:false,Resolution:300});

Here are the test results of scanning a document at different DPIs.

DPI   Resolution (pixels)   Size
100   850x1100              2741.41KB
200   1700x2200             10957.03KB
300   2550x3300             24659.77KB
600   5100x6600             98613.28KB
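
The relationship between DPI and size can be sketched in a few lines. The pixel dimensions in the table match an 8.5 x 11 inch page, and the sizes appear consistent with 24-bit color; both are assumptions made here, not settings stated by the test. The function names are illustrative.

```javascript
// Pixel dimensions of a page scanned at a given DPI.
function pixelDimensions(widthInches, heightInches, dpi) {
  return {
    width: Math.round(widthInches * dpi),
    height: Math.round(heightInches * dpi),
  };
}

// Estimated uncompressed size in KB, assuming 24-bit color.
function estimatedColorSizeKB(widthInches, heightInches, dpi) {
  const { width, height } = pixelDimensions(widthInches, heightInches, dpi);
  return (width * height * 24) / 8 / 1024;
}

for (const dpi of [100, 200, 300, 600]) {
  const { width, height } = pixelDimensions(8.5, 11, dpi);
  const sizeKB = estimatedColorSizeKB(8.5, 11, dpi).toFixed(2);
  console.log(`${dpi} DPI: ${width}x${height}, about ${sizeKB}KB`);
}
```

The estimates land within a fraction of a percent of the measured sizes, and going from 300 to 600 DPI multiplies the estimate by exactly four.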

How Bit Depth and Color Mode Affect File Size

Bit depth, also known as color depth, is the number of bits used to indicate the color of a single pixel. The bigger the bit depth, the more colors a pixel can represent.

Most document scanners provide the option to scan documents in different color modes: black & white, gray and color.

Their bit depths are as follows:

  • black & white: 1 bit.
  • gray: 8 bits.
  • color: 24 bits.

In Dynamic Web TWAIN, we can specify the color mode (or pixel type) in the device configuration:

let pixelType = Dynamsoft.DWT.EnumDWT_PixelType.TWPT_BW;
DWObject.AcquireImageAsync({IfShowUI:false,PixelType:pixelType});
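
Resolution and pixel type can be set in the same device configuration. The sketch below assumes `DWObject` is an initialized Dynamic Web TWAIN instance with a source already selected; `buildScanConfig` and `scanDocument` are hypothetical helper names, and the numeric pixel type values mirror the TWAIN constants that the library exposes as `Dynamsoft.DWT.EnumDWT_PixelType`.

```javascript
// TWAIN pixel type values. In a page that loads Dynamic Web TWAIN, the same
// values are available as Dynamsoft.DWT.EnumDWT_PixelType.TWPT_BW,
// TWPT_GRAY and TWPT_RGB.
const PIXEL_TYPES = { bw: 0, gray: 1, color: 2 };

// Hypothetical helper: builds the device configuration for a scan.
function buildScanConfig(mode, dpi) {
  if (!Object.prototype.hasOwnProperty.call(PIXEL_TYPES, mode)) {
    throw new Error(`Unknown color mode: ${mode}`);
  }
  return { IfShowUI: false, PixelType: PIXEL_TYPES[mode], Resolution: dpi };
}

// Hypothetical helper: scans with the given color mode and DPI.
async function scanDocument(DWObject, mode, dpi) {
  await DWObject.AcquireImageAsync(buildScanConfig(mode, dpi));
}

// Usage (in the browser, after DWObject is ready):
// scanDocument(DWObject, "gray", 300);
```

Keeping the configuration in one place makes it easy to offer presets such as "text, 300 DPI, black & white" in the user interface.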

We can also change the bit depth of an image that has already been scanned into the buffer:

let imageIndex = 0;
let bitDepth = 4;
let highQuality = false;
DWObject.ChangeBitDepth(imageIndex,bitDepth,highQuality);

Here are the test results of scanning a document in different color modes.

Pixel Type   Resolution (pixels)   Bit Depth   Size
B&W          2550x3507             1           1095.94KB
Gray         2550x3507             8           8740.10KB
Color        2550x3507             24          26206.61KB
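
These sizes are slightly larger than width * height * bit depth / 8 would give. They are consistent with each pixel row being padded to a multiple of 4 bytes, as in the BMP format. That padding rule is an assumption made here to explain the numbers, and the function names are illustrative.

```javascript
// Bytes per pixel row, rounded up to a multiple of 4 (BMP-style alignment).
function paddedRowBytes(width, bitDepth) {
  return Math.ceil((width * bitDepth) / 32) * 4;
}

// Uncompressed size in KB with padded rows.
function paddedImageKB(width, height, bitDepth) {
  return (paddedRowBytes(width, bitDepth) * height) / 1024;
}

for (const bitDepth of [1, 8, 24]) {
  console.log(bitDepth, paddedImageKB(2550, 3507, bitDepth).toFixed(2) + "KB");
}
// 1 1095.94KB
// 8 8740.10KB
// 24 26206.61KB
```

Under this assumption, the computed values match the measured sizes in the table to two decimal places.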

How to Compress Scanned Images with JPEG, PNG, and TIFF

We can compress the image data with different image file formats like JPEG and PNG.

In Dynamic Web TWAIN, we can get the original size of an image and its size after encoding in a given format with the following code:

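let imageIndex = 0;
let width = DWObject.GetImageWidth(imageIndex);
let height = DWObject.GetImageHeight(imageIndex);
let originalSize = DWObject.GetImageSize(imageIndex,width,height); //size before compression
let imageType = Dynamsoft.DWT.EnumDWT_ImageType.IT_JPG; //target format, e.g. IT_BMP, IT_JPG, IT_TIF or IT_PNG
let size = DWObject.GetImageSizeWithSpecifiedType(imageIndex,imageType); //size after compression with a format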
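
To compare several formats at once, the calls above can be wrapped in a loop. This is a sketch, assuming `DWObject` is an initialized instance with at least one image in the buffer; `compressionRate` and `compareFormats` are hypothetical helper names.

```javascript
// Percentage by which a format shrinks the image relative to its original size.
function compressionRate(originalSize, compressedSize) {
  return (1 - compressedSize / originalSize) * 100;
}

// Hypothetical helper: reports the size and compression rate per format.
// `formats` maps a label to an image type value, for example
// { JPG: Dynamsoft.DWT.EnumDWT_ImageType.IT_JPG,
//   PNG: Dynamsoft.DWT.EnumDWT_ImageType.IT_PNG }.
function compareFormats(DWObject, imageIndex, formats) {
  const width = DWObject.GetImageWidth(imageIndex);
  const height = DWObject.GetImageHeight(imageIndex);
  const originalSize = DWObject.GetImageSize(imageIndex, width, height);
  return Object.entries(formats).map(([format, imageType]) => {
    const size = DWObject.GetImageSizeWithSpecifiedType(imageIndex, imageType);
    return { format, size, rate: compressionRate(originalSize, size) };
  });
}

// A 1096.00KB image that becomes 705.89KB is reduced by about 35.59%.
console.log(compressionRate(1096.0, 705.89).toFixed(2) + "%");
```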

Here are the test results of scanning a document in different color modes and saving it in different image formats.

Pixel Type   Resolution (pixels)   Bit Depth   Size         Format   Compression Rate
B&W          2550x3507             1           1096.00KB    BMP      0%
B&W          2550x3507             1           705.89KB     JPG      35.59%
B&W          2550x3507             1           38.09KB      TIF      96.52%
B&W          2550x3507             1           77.45KB      PNG      92.93%
Gray         2550x3507             8           8741.15KB    BMP      0%
Gray         2550x3507             8           590.68KB     JPG      93.24%
Gray         2550x3507             8           1957.32KB    TIF      77.61%
Gray         2550x3507             8           1574.74KB    PNG      81.98%
Color        2550x3507             24          26206.66KB   BMP      0%
Color        2550x3507             24          665.43KB     JPG      97.46%
Color        2550x3507             24          5112.49KB    TIF      80.49%
Color        2550x3507             24          3753.82KB    PNG      85.68%

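We can draw the following conclusions from the table:

  1. BMP stores the image without compression, so it serves as the baseline (0%).
  2. JPEG does not work well for black & white images (35.59%), but it has the best compression rate for gray and color images (93.24% and 97.46%).
  3. For black & white images, TIFF produces the smallest file (96.52%), followed by PNG (92.93%).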
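
The same conclusions can be derived programmatically from the measurements. The sketch below copies the sizes from the table above and picks the smallest format for each pixel type; `smallestFormatPerPixelType` is an illustrative name.

```javascript
// Measured sizes in KB, copied from the table above.
const results = [
  { pixelType: "B&W", format: "BMP", sizeKB: 1096.0 },
  { pixelType: "B&W", format: "JPG", sizeKB: 705.89 },
  { pixelType: "B&W", format: "TIF", sizeKB: 38.09 },
  { pixelType: "B&W", format: "PNG", sizeKB: 77.45 },
  { pixelType: "Gray", format: "BMP", sizeKB: 8741.15 },
  { pixelType: "Gray", format: "JPG", sizeKB: 590.68 },
  { pixelType: "Gray", format: "TIF", sizeKB: 1957.32 },
  { pixelType: "Gray", format: "PNG", sizeKB: 1574.74 },
  { pixelType: "Color", format: "BMP", sizeKB: 26206.66 },
  { pixelType: "Color", format: "JPG", sizeKB: 665.43 },
  { pixelType: "Color", format: "TIF", sizeKB: 5112.49 },
  { pixelType: "Color", format: "PNG", sizeKB: 3753.82 },
];

// Returns the format with the smallest file for each pixel type.
function smallestFormatPerPixelType(rows) {
  const best = {};
  for (const row of rows) {
    const current = best[row.pixelType];
    if (!current || row.sizeKB < current.sizeKB) best[row.pixelType] = row;
  }
  return Object.fromEntries(
    Object.entries(best).map(([pixelType, row]) => [pixelType, row.format])
  );
}

// B&W -> TIF, Gray -> JPG, Color -> JPG
console.log(smallestFormatPerPixelType(results));
```

Size is not the only criterion: JPEG is lossy, so PNG or TIFF may still be preferable when a gray or color scan must be preserved exactly.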

How to Optimize Multi-Page Scanned Documents with TIFF and PDF

Most of the time, the documents we scan have multiple pages. We can use TIFF or PDF as the container to save all the page images in one file. Both containers support several image encodings and compression methods.

Since different compression methods work differently for different color modes, Dynamic Web TWAIN uses the following compression strategies to achieve an optimum result.

For TIFF, it uses the following strategy:

  • For 1-bit images, use TIFF_T6 (CCITT Group 4).
  • For other images, use TIFF_LZW.

For PDF, it uses the following strategy:

  • For 1-bit images, if the PDF version is over 1.4, use JBIG2 encoding; otherwise, use FAX4 (CCITT Group 4 Fax).
  • For 8-bit images, if the image is grayscale, use JPEG encoding; otherwise, use LZW (Lempel-Ziv-Welch).
  • For 24-bit and 32-bit images, use JPEG encoding.
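
The two strategies can be summarized as a pair of decision functions. This is only an illustration of the rules listed above, not the library's actual implementation: the function names and the `jbig2Available` flag (standing in for the PDF version check) are assumptions, and the returned strings are plain labels rather than API constants.

```javascript
// TIFF: 1-bit images use T6 (CCITT Group 4), everything else uses LZW.
function tiffCompressionFor(bitDepth) {
  return bitDepth === 1 ? "TIFF_T6" : "TIFF_LZW";
}

// PDF: choose the encoding from the bit depth and image type.
function pdfCompressionFor({ bitDepth, isGrayscale = false, jbig2Available = true }) {
  if (bitDepth === 1) return jbig2Available ? "JBIG2" : "FAX4";
  if (bitDepth === 8) return isGrayscale ? "JPEG" : "LZW";
  if (bitDepth === 24 || bitDepth === 32) return "JPEG";
  // The rules above do not cover other bit depths.
  throw new Error(`No rule for bit depth ${bitDepth}`);
}

console.log(tiffCompressionFor(1)); // TIFF_T6
console.log(pdfCompressionFor({ bitDepth: 1, jbig2Available: false })); // FAX4
console.log(pdfCompressionFor({ bitDepth: 8, isGrayscale: true })); // JPEG
```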

Here are the test results of saving scanned documents in TIFF and PDF formats. The test file contains three pages: the same document scanned in black & white, gray and color, which together take 36042.64KB uncompressed.

Format   Original Size   Size        Compression Rate
TIFF     36042.64KB      7109.21KB   80.28%
PDF      36042.64KB      1283.96KB   96.44%
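
Saving all scanned pages into one PDF could look like the sketch below. It assumes `DWObject` is an initialized Dynamic Web TWAIN instance with pages in the buffer and relies on the library's callback-style `SaveAllAsPDF` method; `saveAllAsPdf` is a hypothetical wrapper name and "scanned.pdf" is a placeholder file name.

```javascript
// Hypothetical helper: wraps the callback-style save call in a Promise.
function saveAllAsPdf(DWObject, fileName) {
  return new Promise((resolve, reject) => {
    DWObject.SaveAllAsPDF(
      fileName,
      () => resolve(fileName), // success callback
      (errorCode, errorString) =>
        reject(new Error(`${errorCode}: ${errorString}`)) // failure callback
    );
  });
}

// Usage (in the browser, after scanning):
// saveAllAsPdf(DWObject, "scanned.pdf")
//   .then((name) => console.log("Saved " + name))
//   .catch((err) => console.error(err.message));
```

With a Promise wrapper, the save step fits into the same async flow as `AcquireImageAsync`.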

Common Issues and Edge Cases

  • JPEG artifacts on B&W scans: JPEG compression is designed for continuous-tone images. Applying it to 1-bit black-and-white scans produces visible artifacts and poor compression (only ~35%). Use TIFF with CCITT Group 4 or PNG instead.
  • Oversized files from default scanner settings: Many scanners default to 600 DPI and 24-bit color. If the document is text-only, switching to 300 DPI and B&W or grayscale can reduce file size by over 90% with no meaningful quality loss.
  • PDF version compatibility: JBIG2 encoding for B&W pages in PDF requires PDF version 1.4 or higher. If you need to support older PDF readers, the encoder falls back to FAX4 (CCITT Group 4), which still provides strong compression for 1-bit images.

Frequently Asked Questions

What is the best DPI setting for scanning documents?

300 DPI is the standard recommendation for most document scanning workflows. It balances file size and readability well. Use 600 DPI only when you need to capture fine print or detailed graphics.

Why are my scanned PDFs so large?

Scanned PDFs are large because each page is stored as a raster image. The file size depends on the scan resolution (DPI), color mode (B&W vs. color), and whether compression is applied. Switching from 600 DPI color to 300 DPI grayscale with JPEG compression can reduce a single page from ~98 MB to under 1 MB.

Should I use JPEG or PNG for scanned documents?

Use JPEG for grayscale and color document scans — it achieves 93–97% compression. Use PNG or TIFF for black-and-white scans where JPEG performs poorly. For multi-page documents, use PDF with automatic per-page compression for the best results.

Source Code

You can find all the code and online demos in the following repo:

https://github.com/tony-xlh/scan-optimization/