How to Correct Document Image Orientation with JavaScript

Jun 19, 2024

When scanning documents with a scanner, some pages may come out misoriented, for example upside-down. Automatic document feeding makes this more common. We can use image processing to detect the orientation, and there are many ways to do it. For example, in Latin script text, ascenders are more likely to occur than descenders, so the side of a text line with more protruding strokes is probably the top. ¹

anatomy of text line
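
To make the ascender/descender idea concrete, here is a minimal sketch. It is not the method from the cited paper. It assumes we already have one binarized text line as a horizontal projection profile (the number of dark pixels in each pixel row) and the two row indices that bound the x-height band. The function name looksUpsideDown, its parameters, and the sample numbers are all hypothetical.

```javascript
// Hypothetical helper illustrating the ascender/descender heuristic.
// rowInk: dark-pixel count for each pixel row of one text line, top to bottom.
// bandTop / bandBottom: first and last row of the x-height band (assumed known).
function looksUpsideDown(rowInk, bandTop, bandBottom) {
  let above = 0; // ink above the x-height band (ascenders if the line is upright)
  let below = 0; // ink below the band (descenders if the line is upright)
  for (let row = 0; row < rowInk.length; row++) {
    if (row < bandTop) above += rowInk[row];
    else if (row > bandBottom) below += rowInk[row];
  }
  // Upright Latin text usually has more ascender ink than descender ink,
  // so more ink below the band suggests the line is upside-down.
  return below > above;
}

// Example with made-up numbers: rows 0-1 lie above the band, rows 5-6 below it.
const uprightLine = [4, 6, 30, 32, 31, 2, 1];
const flippedLine = [...uprightLine].reverse();
console.log(looksUpsideDown(uprightLine, 2, 4)); // false
console.log(looksUpsideDown(flippedLine, 2, 4)); // true
```

A real detector would aggregate this vote over many text lines. The rest of this article relies on the scanner or on Tesseract-OCR instead.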

In this article, we are going to write a web app that scans documents and corrects their orientation with JavaScript. If the document scanner has a built-in orientation correction capability, we use it. Otherwise, we use Tesseract-OCR to detect the orientation and then rotate the image. Dynamic Web TWAIN SDK is used to interact with document scanners.

Online demo

Demo video:

In the demo video, I first scanned a piece of paper with the auto document orientation feature of the Panasonic KV-N1058X disabled. Then, I scanned it with the feature enabled to see whether it works. Finally, I used Tesseract-OCR to detect which document images were scanned upside-down and rotated them.

Create a Document Scanning Web App

First, let’s write a web app to scan documents.

Create a new HTML file with the following template:

<!DOCTYPE html>
<html>
<head>
  <title>Document Scanning via TWAIN</title>
  <meta name="viewport" content="width=device-width,initial-scale=1.0,maximum-scale=1.0,user-scalable=0" />
</head>
<body>
  <h2>Document Scanning via TWAIN</h2>
  <script type="text/javascript">
  </script>
</body>
</html>

Include the library of Dynamic Web TWAIN in the head.

<script src="https://unpkg.com/dwt@18.5.0/dist/dynamsoft.webtwain.min.js"></script>

Initialize an instance of Dynamic Web TWAIN and bind it to a viewer. You can apply for its license here.

HTML:

<div id="dwtcontrolContainer"></div>

JavaScript:

let DWObject;
let scanners;
initDWT();

function initDWT(){
  Dynamsoft.DWT.AutoLoad = false;
  Dynamsoft.DWT.Containers = [];
  Dynamsoft.DWT.ResourcesPath = "https://unpkg.com/dwt@18.5.0/dist";
  let oneDayTrialLicense = "LICENSE-KEY";
  Dynamsoft.DWT.ProductKey = oneDayTrialLicense;  
  Dynamsoft.DWT.CreateDWTObjectEx(
    {
      WebTwainId: 'dwtcontrol'
    },
    function(obj) {
      DWObject = obj;
      DWObject.Viewer.bind(document.getElementById('dwtcontrolContainer'));
      DWObject.Viewer.height = "480px";
      DWObject.Viewer.width = "360px";
      DWObject.Viewer.show();
      DWObject.Viewer.setViewMode(2,2);
    },
    function(err) {
      console.log(err);
    }
  );
}

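List connected scanners in a select element whose id is select-scanner. Call loadScanners after the Web TWAIN object has been created.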

// scanners is already declared above, next to DWObject
async function loadScanners(){
  scanners = await DWObject.GetDevicesAsync();
  let selScanners = document.getElementById("select-scanner");
  selScanners.innerHTML = "";
  for (let index = 0; index < scanners.length; index++) {
    const scanner = scanners[index];
    let option = new Option(scanner.displayName,index);
    selScanners.appendChild(option);
  }
}

Scan documents using the selected scanner. With IfShowUI set to true, the scanner’s configuration UI is shown before scanning starts.

HTML:

<input type="button" value="Scan" onclick="AcquireImage();" />

JavaScript:

async function AcquireImage() {
  if (DWObject) {
    const selectedIndex = document.getElementById("select-scanner").selectedIndex;
    const options = {
      IfShowUI:true,
    };
    await DWObject.SelectDeviceAsync(scanners[selectedIndex]);
    await DWObject.OpenSourceAsync();
    await DWObject.AcquireImageAsync(options);
    await DWObject.CloseSourceAsync();
  }
}

Use the Document Scanner’s Automatic Rotate Capability

Many document scanners can automatically rotate misoriented scans. Enabling this capability gives us correctly oriented document images straight from the scanner.

Enable Automatic Rotate via UI

We can set this up directly in the scanner’s UI.

The following screenshot is the UI of Panasonic KV-N1058X. We have to enable the Automatic Image Orientation option.

scanner UI

We can also specify the target language.

scanner UI - languages

Enable Automatic Rotate via Code

We can control document scanners via code using TWAIN. Dynamic Web TWAIN exposes the capabilities of document scanners through APIs such as setCapabilities.

We can use the following code to enable the Automatic Rotate capability. Call it after the source has been opened and before acquiring images. If setting the capability fails, the document scanner most likely does not support this feature.

function enableOrientationCorrection(){
  return new Promise((resolve, reject) => {
    let config = {
        "exception": "fail",
        "capabilities": [
            {
                "capability": Dynamsoft.DWT.EnumDWT_Cap.ICAP_AUTOMATICROTATE,
                "curValue": 1 // 0: disabled, 1: enabled
            }
        ]
    };
    DWObject.setCapabilities(config,
    function(e){
      console.log(e);
      resolve();
    },
    function(failData){
      console.log(failData);
      reject(failData); // pass the failure details on to the caller
    })
  });
}
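
One way to wire this into the scanning flow is sketched below. It is an assumption about how the pieces could fit together, not code from the demo. The function scanWithAutoRotate and its parameters are hypothetical: the Web TWAIN object is passed in as dwt, and enableRotate stands for a function like enableOrientationCorrection above. The capability is set while the source is open, and the scanner UI is skipped because settings chosen in the UI may otherwise take precedence over the ones made in code.

```javascript
// Hypothetical helper: scan with the scanner's own rotation if available.
// dwt: the Web TWAIN object (DWObject in this article).
// device: one entry of the list returned by GetDevicesAsync().
// enableRotate: async function that rejects when the capability cannot be set.
async function scanWithAutoRotate(dwt, device, enableRotate) {
  await dwt.SelectDeviceAsync(device);
  await dwt.OpenSourceAsync();
  let rotatedByScanner = true;
  try {
    try {
      await enableRotate(); // capabilities are set while the source is open
    } catch (err) {
      rotatedByScanner = false; // the scanner probably lacks the feature
    }
    await dwt.AcquireImageAsync({ IfShowUI: false });
  } finally {
    await dwt.CloseSourceAsync(); // always release the scanner
  }
  return rotatedByScanner; // false means a software fallback is still needed
}
```

If the function resolves to false, the scanned images can be passed to the Tesseract-based correction described in the next section.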

Use Tesseract-OCR for Orientation Correction

If the document scanner does not have the automatic rotation ability, we can detect the orientation and then rotate the images to correct them. Here, we are going to use Tesseract-OCR to do this.

Include the Tesseract.js library in the page.

<script src="https://cdn.jsdelivr.net/npm/tesseract.js@5/dist/tesseract.min.js"></script>

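Create a worker using the OSD trained data and the legacy engine. Because createWorker returns a promise, this code has to run inside an async function or a module script.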

const worker = await Tesseract.createWorker('osd', 1, {
  legacyCore: true, 
  legacyLang: true,
  logger: m => console.log(m),
});

Convert each scanned document image into a blob and use Tesseract to detect its orientation. If the detected orientation is 180 degrees, the image is upside-down, so rotate it by 180 degrees.

for (let index = 0; index < DWObject.HowManyImagesInBuffer; index++) {
  DWObject.SelectImages([index]);
  let image = await getBlob(index);
  const { data } = await worker.detect(image);
  if (data.orientation_degrees == 180) {
    console.log("need correction")
    DWObject.Rotate(index,180,true);
  }
}
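
The loop above calls getBlob, which is not shown in this article. A minimal sketch of such a helper follows, assuming Dynamic Web TWAIN's callback-based ConvertToBlob method, PNG output, and the global DWObject created in initDWT. The second function, isUpsideDown, is a hypothetical refinement that also checks the orientation_confidence value reported by Tesseract.js. Its default threshold of 1 is an arbitrary assumption that should be tuned on your own scans.

```javascript
// Sketch of the getBlob helper used in the loop above.
// Assumes the global DWObject created in initDWT().
function getBlob(index) {
  return new Promise((resolve, reject) => {
    DWObject.ConvertToBlob(
      [index],                                 // indices of the images to convert
      Dynamsoft.DWT.EnumDWT_ImageType.IT_PNG,  // output format
      (blob) => resolve(blob),
      (errorCode, errorString) => reject(new Error(errorString))
    );
  });
}

// Hypothetical check: only flip the image when Tesseract is reasonably sure.
// minConfidence is an assumed threshold, not a documented recommendation.
function isUpsideDown(data, minConfidence = 1) {
  return data.orientation_degrees === 180 &&
         data.orientation_confidence >= minConfidence;
}
```

In the loop, the condition could then become if (isUpsideDown(data)). Once all images have been processed, calling await worker.terminate() releases the worker.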

Source Code

Get the source code of the demo to try it out:

https://github.com/tony-xlh/Document-Orientation-Correction

References

¹ Joost van Beusekom, Faisal Shafait, and Thomas M. Breuel. 2010. Combined orientation and skew detection using geometric text-line modeling. Int. J. Doc. Anal. Recognit. 13, 2 (June 2010), 79–92. https://doi.org/10.1007/s10032-009-0109-5