Document Normalizer for Your Website - User Guide

With Dynamsoft Document Normalizer JavaScript edition, you can add to your website the ability to take pictures of documents with your camera and normalize them to obtain high-quality images for further processing or archiving purposes.

In this guide, you’ll learn step-by-step how to build such a simple solution in a web page.

Hello World - Simplest Implementation

The solution consists of two steps:

Detect the document boundaries
Normalize the document based on the detected boundaries

Understand the code

The following sample code sets up the SDK and implements boundary detection on a web page, which is just the first step in capturing a normalized image of your document. We’ll cover the second step later in Building Your Own Page.

<!DOCTYPE html>
<html lang="en">

<head>
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <script src="https://cdn.jsdelivr.net/npm/dynamsoft-capture-vision-bundle@2.6.1000/dist/dcv.bundle.js"></script>
</head>

<body>
    <h1>Detect the Boundary of the Document</h1>
    <button onclick="startDetection()">Start Detection</button>
    <div id="cameraViewContainer" style="width: 50vw; height: 45vh; margin-top: 10px; display: none"></div>

    <script>
        const cameraViewContainer = document.querySelector(
            "#cameraViewContainer"
        );
        let router;
        let cameraEnhancer;
        Dynamsoft.License.LicenseManager.initLicense(
                "DLS2eyJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSJ9"
        );
        Dynamsoft.Core.CoreModule.loadWasm(["DDN"]);

        (async function() {
            router = await Dynamsoft.CVR.CaptureVisionRouter.createInstance();
            let view = await Dynamsoft.DCE.CameraView.createInstance();
            cameraEnhancer = await Dynamsoft.DCE.CameraEnhancer.createInstance(
                view
            );
            cameraViewContainer.append(view.getUIElement());
            router.setInput(cameraEnhancer);
        })();
        async function startDetection() {
            cameraViewContainer.style.display = "block";
            await cameraEnhancer.open();
            await router.startCapturing("DetectDocumentBoundaries_Default");
        };
    </script>
</body>

</html>

About the code

Dynamsoft.License.LicenseManager.initLicense(): initializes the license using a license key string.
Dynamsoft.Core.CoreModule.loadWasm(["DDN"]): preloads the DocumentNormalizer module, saving time in preparing for document border detection and image normalization.
Dynamsoft.CVR.CaptureVisionRouter.createInstance(): initializes the router variable by creating an instance of the CaptureVisionRouter class. An instance of CaptureVisionRouter is the core of any solution based on Dynamsoft Capture Vision architecture.

Read more on what CaptureVisionRouter is.
Dynamsoft.DCE.CameraEnhancer.createInstance(view): initializes the cameraEnhancer variable by creating an instance of the CameraEnhancer class.
setInput(): router connects to the image source through the Image Source Adapter interface with the method setInput().

The image source in our case is a CameraEnhancer object created with Dynamsoft.DCE.CameraEnhancer.createInstance(view)

In some cases, a different camera might be required instead of the default one. Also, a different resolution might work better. To change the camera or the resolution, use the CameraEnhancer instance cameraEnhancer. Learn more here.
startCapturing("DetectDocumentBoundaries_Default") : starts to run images through a pre-defined process which, in the case of “DetectDocumentBoundaries_Default”, tries to find the boundary of a document present in the image(s).
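
As a minimal sketch of changing the camera and the resolution, the helpers below pick a rear-facing camera by its label and then request a resolution. The names pickBackCamera and useBackCamera are hypothetical, matching on the label is only a heuristic, and the sketch assumes that getAllCameras(), selectCamera() and setResolution() behave as described in the CameraEnhancer API reference.

```javascript
// Hypothetical helper: choose a camera whose label suggests it faces away from the user.
// Falls back to the first camera, or null when the list is empty.
function pickBackCamera(cameras) {
  const back = cameras.find((cam) => /back|rear|environment/i.test(cam.label || ""));
  return back || cameras[0] || null;
}

// Assumed usage with the `cameraEnhancer` instance from the sample above.
// The default resolution is an arbitrary example value.
async function useBackCamera(cameraEnhancer, resolution = { width: 1920, height: 1080 }) {
  const cameras = await cameraEnhancer.getAllCameras();
  const target = pickBackCamera(cameras);
  if (!target) throw new Error("No camera found");
  await cameraEnhancer.selectCamera(target);
  await cameraEnhancer.setResolution(resolution);
  return target;
}
```

Keeping the selection logic in a separate function makes it easy to swap in another rule, for example letting the user choose from a list.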

Run the example

Create a text file called “Detect-A-Document-Boundary.html”, fill it with the code above and save it. After that, open the example page in your browser, allow the page to access your camera, and the video will be displayed on the page. Afterwards, you will see the detected boundaries displayed on the video in real time.

NOTE:

The sample code requires the following to run:

Internet connection

A supported browser

An accessible Camera

Please note:

Although the page should work properly when opened directly as a file (“file:///”), it’s recommended that you deploy it to a web server and access it via HTTPS.
On first use, you need to wait a few seconds for the SDK to initialize.
The license “DLS2eyJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSJ9” used in this sample is an online license that is good for 24 hours and requires a network connection to work. To test the SDK further, you can request a 30-day trial license via the Request a Trial License link.

If the test doesn’t go as expected, you can contact us.

Building your own page

In this section, we’ll break down and show all the steps needed to build the solution in a web page.

We’ll build on this skeleton page:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
</head>
<body>
    <h1>Detect the Boundary of the Document and Normalize it</h1>
    <script>
      // Write your code here.
    </script>
</body>
</html>

Include the SDK

To use the SDK, first include its resource files in the page.

Use a CDN

The simplest way to include the SDK is to use either the jsDelivr or UNPKG CDN.

jsDelivr

<script src="https://cdn.jsdelivr.net/npm/dynamsoft-capture-vision-bundle@2.6.1000/dist/dcv.bundle.js"></script>

UNPKG

<script src="https://unpkg.com/dynamsoft-capture-vision-bundle@2.6.1000/dist/dcv.bundle.js"></script>

Host the SDK yourself

Besides using a CDN, you can also download the SDK and host its files on your own website or server before including it in your application. When using a CDN, resources related to dynamsoft-image-processing and dynamsoft-capture-vision-std are loaded automatically over the network; when hosting the files locally, these two packages need to be configured manually.

npm
```
npm i dynamsoft-capture-vision-bundle@2.6.1000 -E
# Compared with using CDN, you need to set up more resources.
npm i dynamsoft-capture-vision-std@1.4.21 -E
npm i dynamsoft-image-processing@2.4.31 -E
```
The resources are located at the path node_modules/<pkg>, without @<version>, so the script would be like:
```
<script src="node_modules/dynamsoft-capture-vision-bundle/dist/dcv.bundle.js"></script>
```
Since the version tags (@<version>) are missing, you need to specify the engineResourcePaths so that the SDK can find the resources correctly.

To avoid confusion, we suggest renaming “node_modules” or moving the “dynamsoft-” packages elsewhere for self-hosting, as “node_modules” is reserved for Node.js dependencies.

Specify the location of the “engine” files (optional)

This is usually only required with frameworks such as Angular or React, where the referenced JavaScript files such as cvr.js and ddn.js are compiled into another file, or when using the SDK completely offline.

The purpose is to tell the SDK where to find the engine files (*.worker.js, *.wasm.js and *.wasm, etc.). The API is called Dynamsoft.Core.CoreModule.engineResourcePaths:

// The following code uses the jsDelivr CDN as an example. Feel free to change it to your own location of these files.
Dynamsoft.Core.CoreModule.engineResourcePaths.rootDirectory = "https://cdn.jsdelivr.net/npm/";
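
For self-hosting, a rough sketch of building the per-package paths might look like the following. The helper buildEngineResourcePaths is hypothetical, the base URL is a placeholder, and the keys std and dip are assumed to correspond to dynamsoft-capture-vision-std and dynamsoft-image-processing; verify them against the CoreModule API reference before relying on them.

```javascript
// Hypothetical helper: build per-package resource paths under one self-hosted base URL.
// Assumption: "std" maps to dynamsoft-capture-vision-std and "dip" to dynamsoft-image-processing.
function buildEngineResourcePaths(baseUrl) {
  // Make sure the base ends with exactly one slash before appending package folders.
  const base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
  return {
    std: base + "dynamsoft-capture-vision-std/dist/",
    dip: base + "dynamsoft-image-processing/dist/",
  };
}

// Assumed usage in the page, before any SDK instance is created ("/assets/dynamsoft" is a placeholder):
// Object.assign(Dynamsoft.Core.CoreModule.engineResourcePaths, buildEngineResourcePaths("/assets/dynamsoft"));
```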

Define necessary HTML elements

For this solution, we define three buttons and three <div> elements.

<button id="start-detecting" onclick="startDetecting()">Start Detecting</button>
<button id="restart-detecting" onclick="restartDetecting()" style="display: none">Restart Detecting</button>
<button id="normalize-with-confirmed-quad" disabled>Normalize</button><br />
<div id="div-ui-container" style="margin-top: 10px; height: 450px"></div>
<div id="div-image-container" style="display: none; width: 100%; height: 70vh"></div>
<div id="normalized-result"></div>

const btnStart = document.querySelector("#start-detecting");
const btnRestart = document.querySelector("#restart-detecting");
const btnNormalize = document.querySelector("#normalize-with-confirmed-quad");
const cameraViewContainer = document.querySelector("#div-ui-container");
const imageEditorViewContainer = document.querySelector("#div-image-container");
const normalizedImageContainer = document.querySelector("#normalized-result");

Prepare the SDK for the task

The following function executes as soon as the page loads to get the SDK prepared:

let cameraEnhancer = null;
let router = null;
let items;
let layer;
let originalImageData;
let imageEditorView;
let promiseCVRReady;
let frameCount = 0;

Dynamsoft.License.LicenseManager.initLicense("DLS2eyJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSJ9");
/* Preloads the `DocumentNormalizer` module, saving time in preparing for document border detection and image normalization.*/
Dynamsoft.Core.CoreModule.loadWasm(["DDN"])

async function initDCE() {
  const view = await Dynamsoft.DCE.CameraView.createInstance();
  cameraEnhancer = await Dynamsoft.DCE.CameraEnhancer.createInstance(view);
  imageEditorView = await Dynamsoft.DCE.ImageEditorView.createInstance(imageEditorViewContainer);
  /* Creates an image editing layer for drawing found document boundaries. */
  layer = imageEditorView.createDrawingLayer();
  cameraViewContainer.append(view.getUIElement());
}

let cvrReady = (async function initCVR() {
  await initDCE();
  router = await Dynamsoft.CVR.CaptureVisionRouter.createInstance();
  router.setInput(cameraEnhancer);
})();

The code was explained earlier. Please refer to About the Code.

Start the detection

Once the image processing is complete, the results are sent to all the registered CapturedResultReceiver objects. Each CapturedResultReceiver object may have one or more callback functions associated with various result types. In our task, we need to detect the document border and normalize the document, so we use the callback function onCapturedResultReceived to get both the detected borders and the original image for later normalization.

Read more on CapturedResultReceiver

let cvrReady = (async function initCVR() {
  /**
   * Creates a CaptureVisionRouter instance and configure the task to detect document boundaries.
   * Also, make sure the original image is returned after it has been processed.
   */
  await initDCE();
  router = await Dynamsoft.CVR.CaptureVisionRouter.createInstance();
  router.setInput(cameraEnhancer);
  /**
   * Sets the result types to be returned.
   * Because we need to normalize a document from the original image later, here we set the return result type to
   * include both the quadrilateral and original image data.
   */  
  let newSettings = await router.getSimplifiedSettings("DetectDocumentBoundaries_Default");
  newSettings.capturedResultItemTypes = Dynamsoft.Core.EnumCapturedResultItemType.CRIT_DETECTED_QUAD | Dynamsoft.Core.EnumCapturedResultItemType.CRIT_ORIGINAL_IMAGE;
  await router.updateSettings("DetectDocumentBoundaries_Default", newSettings)
  /* Defines the result receiver for the task.*/
  const resultReceiver = new Dynamsoft.CVR.CapturedResultReceiver();
  resultReceiver.onCapturedResultReceived = handleCapturedResult;
  router.addResultReceiver(resultReceiver);
})();

async function handleCapturedResult(result) {
  /* Update the result of the latest frame to the global variable items*/
  items = result.items;
  /* Do something with the result items*/
}
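
The "Do something with the result items" step could start by separating the original image from the detected quads. The sketch below is one way to do that; splitResultItems is a hypothetical helper, and the enum object is passed in so that the function does not depend on specific numeric values.

```javascript
// Hypothetical helper: split captured result items into the original image and the detected quads.
// `types` is expected to be Dynamsoft.Core.EnumCapturedResultItemType (or an object with the same keys).
function splitResultItems(items, types) {
  const original = items.find((item) => item.type === types.CRIT_ORIGINAL_IMAGE) || null;
  const quads = items.filter((item) => item.type === types.CRIT_DETECTED_QUAD);
  return { original, quads };
}

// Assumed usage inside handleCapturedResult:
// const { original, quads } = splitResultItems(result.items, Dynamsoft.Core.EnumCapturedResultItemType);
```

Filtering by type avoids relying on the order in which items arrive.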

And we define the function startDetecting like this:

async function startDetecting() {
  try {
    await (promiseCVRReady = promiseCVRReady || (async () => {
      await cvrReady;
      /* Starts streaming the video. */
      await cameraEnhancer.open();
      /* Uses the built-in template "DetectDocumentBoundaries_Default" to start a continuous boundary detection task. */
      await router.startCapturing("DetectDocumentBoundaries_Default");
    })());
  } catch (ex) {
    let errMsg = ex.message || ex;
    console.error(errMsg);
    alert(errMsg);
  }
}
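
The expression `promiseCVRReady = promiseCVRReady || (...)` makes sure the start-up sequence runs only once, even if the button is clicked several times. The same idea can be sketched as a small reusable helper; once is a hypothetical name and is not part of the SDK.

```javascript
// Hypothetical helper: wrap an async function so that it runs at most once.
// Later calls reuse the first promise, as `promiseCVRReady` does above.
function once(asyncFn) {
  let promise = null;
  return () => (promise = promise || asyncFn());
}

// Assumed usage:
// const startOnce = once(async () => { await cvrReady; await cameraEnhancer.open(); });
// await startOnce(); // a second call returns the same promise
```

Note that, as in the original code, a failed first attempt stays cached, so later calls receive the same rejection unless the stored promise is reset.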

The steps of the workflow are as follows:

cameraEnhancer streams the video, captures live video frames and stores them in a buffer.
router gets the video frames from Image Source Adapter and passes them to be processed by an internal DocumentNormalizer instance. The cameraEnhancer used here is a special implementation of the Image Source Adapter.
The internal DocumentNormalizer instance returns the found document boundaries, known as quadsResultItems, to router.
The router can output all types of CapturedResults that need to be captured through the onCapturedResultReceived callback function. In this example code we use the callback function to output quadsResultItems and originalImageResultItem.

Also note that the quadsResultItems are drawn over the video automatically to show the detection in action.

Note:

router is designed to continuously request images from the image source.
Three preset templates are at your disposal for document normalizing or border detection:

Template Name	Function
DetectDocumentBoundaries_Default	Detects the document border in images.
NormalizeDocument_Default	Takes an ROI and an image and normalizes the document within that ROI.
DetectAndNormalizeDocument_Default	Detects the document border in images and normalizes the document.

Read more on the preset CaptureVisionTemplates.
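
As an illustration of the third template, the sketch below runs detection and normalization in a single call on one image and keeps only the normalized results. The function detectAndNormalize is hypothetical, and normalizedType is assumed to be Dynamsoft.Core.EnumCapturedResultItemType.CRIT_NORMALIZED_IMAGE.

```javascript
// Sketch: detect and normalize a document in one call on a single image.
// `router` is a CaptureVisionRouter instance created as shown earlier.
// `normalizedType` is assumed to be EnumCapturedResultItemType.CRIT_NORMALIZED_IMAGE.
async function detectAndNormalize(router, image, normalizedType) {
  const result = await router.capture(image, "DetectAndNormalizeDocument_Default");
  // Keep only the normalized images; other item types are ignored here.
  return result.items.filter((item) => item.type === normalizedType);
}
```

This skips the manual review step, so it suits cases where the automatically detected boundary is good enough.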

Review and adjust the boundary

First, we update the callback function and use a specific condition to make sure that the camera has stabilized:

async function handleCapturedResult(result) {
      /* Update the result of the latest frame to the global variable items*/
      items = result.items;
      /* Do something with the result */
      /* Saves the image data of the current frame for subsequent image editing. */
      const originalImage = result.items.filter((item) => item.type === Dynamsoft.Core.EnumCapturedResultItemType.CRIT_ORIGINAL_IMAGE);
      originalImageData = originalImage.length && originalImage[0].imageData;
      if (originalImageData) {
        if (result.items.length <= 1) {
          frameCount = 0;
          return;
        }
        frameCount++;
        /**
         * In our case, we define a good condition for "ready for normalization" as 
         * "getting the document boundary detected for 30 consecutive frames".
         * 
         * NOTE that this condition is not valid if you add a CapturedResultFilter 
         * with ResultDeduplication enabled.
         */
        if (frameCount === 30) {
          frameCount = 0;
          /* Stops the detection task since we assume we have found a good boundary. */
          router.stopCapturing();
          /* Hides the cameraView and shows the imageEditorView. */
          cameraViewContainer.style.display = "none";
          imageEditorViewContainer.style.display = "block";
          /* Draws the image on the imageEditorView first. */
          imageEditorView.setOriginalImage(originalImageData);
          const quads = [];
          /* Collects the document boundaries (quads) to draw over the image. */
          for (let i = 0; i < result.items.length; i++) {
            if (result.items[i].type === Dynamsoft.Core.EnumCapturedResultItemType.CRIT_ORIGINAL_IMAGE) continue;
            const points = result.items[i].location.points;
            const quad = new Dynamsoft.DCE.QuadDrawingItem({ points });
            quads.push(quad);
          }
          /* Adds all quads to the drawing layer once, after the loop, to avoid duplicates. */
          layer.addDrawingItems(quads);
          btnStart.style.display = "none";
          btnRestart.style.display = "inline";
          btnNormalize.disabled = false;
        }
      }
    }
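
The "30 consecutive frames" condition can be isolated into a small counter, which makes the threshold easy to tune and test. The sketch below is one possible form; createStabilityCounter is a hypothetical helper and is not part of the SDK.

```javascript
// Hypothetical helper: count consecutive frames with a detection and
// report when the count reaches `threshold`, then start over.
function createStabilityCounter(threshold = 30) {
  let count = 0;
  return function update(detected) {
    if (!detected) {
      // A frame without a boundary breaks the streak.
      count = 0;
      return false;
    }
    count += 1;
    if (count < threshold) return false;
    count = 0;
    return true;
  };
}

// Assumed usage inside handleCapturedResult:
// const isStable = createStabilityCounter(30);
// if (isStable(result.items.length > 1)) { /* switch to the editor view */ }
```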

The SDK tries to find the boundary of the document in each and every image processed. This happens very fast and we don’t always get the perfect boundary for normalization. Therefore, we can refine the boundary within the ImageEditorView to enhance its quality before proceeding with the normalization process. To do this, we can record the manually edited border information by:

btnNormalize.addEventListener("click", async () => {
  /* Gets the selected quadrilateral. */
  let selectedItems = imageEditorView.getSelectedDrawingItems();
  let quad;
  if (selectedItems.length) {
    quad = selectedItems[0].getQuad();
  } else {
    /* Falls back to the first detected quad; items[0] is assumed to be the original image. */
    quad = items[1].location;
  }
});

Now, the behavior will be:

The page constantly detects the boundary of the document in the video.
When the border is successfully found for 30 consecutive frames, the page hides the video stream and draws both the image and the boundary in the imageEditorView.
The user can adjust the boundary to be more precise.
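
The fallback to items[1] in the handler above depends on the order of the result items. A slightly more defensive sketch looks the quad up by type instead; pickQuad is a hypothetical helper, and quadType is assumed to be Dynamsoft.Core.EnumCapturedResultItemType.CRIT_DETECTED_QUAD.

```javascript
// Hypothetical helper: prefer the quad the user selected in the editor;
// otherwise fall back to the first detected quad among the result items.
function pickQuad(selectedDrawingItems, items, quadType) {
  if (selectedDrawingItems.length) return selectedDrawingItems[0].getQuad();
  const detected = items.find((item) => item.type === quadType);
  return detected ? detected.location : null;
}

// Assumed usage inside the click handler:
// const quad = pickQuad(imageEditorView.getSelectedDrawingItems(), items,
//   Dynamsoft.Core.EnumCapturedResultItemType.CRIT_DETECTED_QUAD);
```

Returning null when nothing is found lets the caller show a message instead of failing on an undefined item.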

Normalize the document

After the user has adjusted the boundary or determined that the found boundary is good enough, they can press the “Normalize” button to carry out the normalization as the last step of the solution. One way to use the adjusted border is to set it as the new ROI in the template NormalizeDocument_Default. You can simply update the code like this:

btnNormalize.addEventListener("click", async () => {
  /* Gets the selected quadrilateral. */
  let selectedItems = imageEditorView.getSelectedDrawingItems();
  let quad;
  if (selectedItems.length) {
    quad = selectedItems[0].getQuad();
  } else {
    /* Falls back to the first detected quad; items[0] is assumed to be the original image. */
    quad = items[1].location;
  }
  /* Hides the imageEditorView. */
  imageEditorViewContainer.style.display = "none";
  /* Removes the old normalized image if any. */
  normalizedImageContainer.innerHTML = "";
  /**
   * Sets the coordinates of the ROI (region of interest)
   * in the built-in template "NormalizeDocument_Default".
   */
  let newSettings = await router.getSimplifiedSettings("NormalizeDocument_Default");
  newSettings.roiMeasuredInPercentage = 0;
  newSettings.roi.points = quad.points;
  await router.updateSettings("NormalizeDocument_Default", newSettings);
  /* Executes the normalization and shows the result on the page. */
  let normalizeResult = await router.capture(originalImageData, "NormalizeDocument_Default");
  if (normalizeResult.items[0]) {
    normalizedImageContainer.append(normalizeResult.items[0].toCanvas());
  }
  layer.clearDrawingItems();
  btnNormalize.disabled = true;
});

The added behavior is:

The user hits “Normalize”
The image gets normalized with the adjusted boundary
The normalized image shows up on the page
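
The handler above sets roiMeasuredInPercentage to 0, so the ROI points are read as pixel coordinates. If percentage coordinates were preferred instead, the points would need converting first. The sketch below shows that conversion; toPercentagePoints is a hypothetical helper, and it assumes that percentage mode expects values from 0 to 100.

```javascript
// Hypothetical helper: convert pixel points to percentages of the image size.
// Assumption: with roiMeasuredInPercentage enabled, coordinates range from 0 to 100.
function toPercentagePoints(points, imageWidth, imageHeight) {
  return points.map(({ x, y }) => ({
    x: (x / imageWidth) * 100,
    y: (y / imageHeight) * 100,
  }));
}

// Assumed usage:
// newSettings.roiMeasuredInPercentage = 1;
// newSettings.roi.points = toPercentagePoints(quad.points, imageWidth, imageHeight);
```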

Output the document as a file

We can output the document as a file with the help of the class Dynamsoft.Utility.ImageManager. To do this, in the click handler of the “Normalize” button, we change the following line:

normalizedImageContainer.append(normalizeResult.items[0].toCanvas());

to:

const imageManager = new Dynamsoft.Utility.ImageManager();
imageManager.saveToFile(normalizeResult.items[0].imageData, "result.jpg", true);

Then, once a document has been normalized, it is downloaded as a JPEG file in the browser.
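
Because the file name is fixed, every download is named "result.jpg". If distinct names are wanted, a timestamped name could be built as sketched below; buildFileName is a hypothetical helper and is not part of the SDK.

```javascript
// Hypothetical helper: build a timestamped file name from a prefix, a Date and an extension.
function buildFileName(prefix, date, extension = "jpg") {
  const pad = (n) => String(n).padStart(2, "0");
  const day = `${date.getFullYear()}${pad(date.getMonth() + 1)}${pad(date.getDate())}`;
  const time = `${pad(date.getHours())}${pad(date.getMinutes())}${pad(date.getSeconds())}`;
  return `${prefix}-${day}-${time}.${extension}`;
}

// Assumed usage:
// imageManager.saveToFile(normalizeResult.items[0].imageData, buildFileName("document", new Date()), true);
```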

Restart task

You can also add a button to restart the entire task:

async function restartDetecting() {
  /* Reset the UI elements and restart the detection task. */
  imageEditorViewContainer.style.display = "none";
  normalizedImageContainer.innerHTML = "";
  cameraViewContainer.style.display = "block";
  btnStart.style.display = "inline";
  btnRestart.style.display = "none";
  btnNormalize.disabled = true;
  layer.clearDrawingItems()
  await router.startCapturing("DetectDocumentBoundaries_Default");
}

You can also test the code above at https://jsfiddle.net/DynamsoftTeam/

System requirements

The SDK requires the following features to work:

Secure context (HTTPS deployment)

When deploying your application / website for production, make sure to serve it via a secure HTTPS connection. This is required for two reasons:
- Access to the camera video stream is only granted in a secure context. Most browsers impose this restriction.
  
  Some browsers like Chrome may grant the access for http://127.0.0.1 and http://localhost or even for pages opened directly from the local disk (file:///...). This can be helpful for temporary development and testing.
- Dynamsoft License requires a secure context to work.

WebAssembly, Blob, URL/createObjectURL, Web Workers

The above four features are required for the SDK to work.
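
A page could check for these features before loading the SDK and show a helpful message when something is missing. The sketch below is one way to do that; findMissingFeatures is a hypothetical helper that takes the global object as a parameter so it can be tested outside a browser.

```javascript
// Hypothetical helper: report which of the required features are missing from a global object.
function findMissingFeatures(globalObj) {
  const hasUrl = typeof globalObj.URL === "function";
  const checks = {
    WebAssembly: typeof globalObj.WebAssembly === "object",
    Blob: typeof globalObj.Blob === "function",
    "URL.createObjectURL": hasUrl && typeof globalObj.URL.createObjectURL === "function",
    Worker: typeof globalObj.Worker === "function",
  };
  return Object.keys(checks).filter((name) => !checks[name]);
}

// Assumed usage in a page:
// const missing = findMissingFeatures(window);
// if (missing.length || !window.isSecureContext) { /* explain what is not supported */ }
```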

The following table is a list of supported browsers based on the above requirements:

Browser Name	Version
Chrome	v78+
Firefox	v68+
Safari	v14+
Edge	v79+

Apart from the browsers, the operating systems may impose some limitations of their own that could restrict the use of the SDK.

Release notes

Learn what is included in each release at https://www.dynamsoft.com/capture-vision/docs/web/programming/javascript/release-notes.

Next steps

Now that you have integrated the SDK, you can move forward in the following directions:

Check out the official samples.
Learn about the available APIs.