How to Scan Documents and Store Them in IndexedDB

IndexedDB is a low-level API for client-side storage of significant amounts of structured data, including files/blobs. It can be used to save images such as scanned documents on the client for persistent storage.

In this article, we are going to talk about how to use Dynamic Web TWAIN to scan documents and store them in IndexedDB.

You can check out the demo videos to see what it does.

  1. Scanning documents.

  2. Loading a previously scanned document from IndexedDB.

Build a Document Scanning Web App

Let’s first build a web app to scan documents.

New Project

Clone a webpack starter project as the template for starting a new project:

git clone https://github.com/wbkd/webpack-starter

Install Dependencies

  1. Install Dynamic Web TWAIN: npm install dwt.

    In addition, we need to copy the resources of Dynamic Web TWAIN to the public folder.

    1. Install ncp.

      npm install --save-dev ncp
      
    2. Modify package.json to copy the resources for the build and start commands.

       "scripts": {
         "lint": "npm run lint:styles; npm run lint:scripts",
         "lint:styles": "stylelint src",
         "lint:scripts": "eslint src",
      -  "build": "cross-env NODE_ENV=production webpack --config webpack/webpack.config.prod.js",
      -  "start": "webpack serve --config webpack/webpack.config.dev.js"
      +  "build": "ncp node_modules/dwt/dist public/dwt-resources && cross-env NODE_ENV=production webpack --config webpack/webpack.config.prod.js",
      +  "start": "ncp node_modules/dwt/dist public/dwt-resources && webpack serve --config webpack/webpack.config.dev.js"
       },
      
    3. Modify webpack.common.js so that the files in the public folder are copied to the root of the output folder rather than to a public subfolder inside it.

       new CopyWebpackPlugin({
      -  patterns: [{ from: Path.resolve(__dirname, '../public'), to: 'public' }],
      +  patterns: [{ from: Path.resolve(__dirname, '../public'), to: '' }],
       }),
      
  2. Install localForage: npm install localforage. localForage is a library which makes it easy to use IndexedDB.

Use Dynamic Web TWAIN to Scan Documents

Here, we are going to use Dynamic Web TWAIN’s remote scan feature to scan documents. By default, Dynamic Web TWAIN requires a local service to be installed on each desktop device to manage scanning. With remote scan, the service only needs to be installed on one device, and any other device, such as a PC or a mobile phone, can then use it to scan documents. You can learn about its setup and usage by checking out the docs.

  1. Set the resources path:

    import Dynamsoft from 'dwt';
    Dynamsoft.DWT.ResourcesPath = "/dwt-resources";
    
  2. Specify the license. A one-day temporary license will be used if it is empty. You can apply for a license here.

    Dynamsoft.DWT.ProductKey = "Your license";
    
  3. Create a new remote scan object using a public proxy server.

    const serverurl = "https://demo.scannerproxy.com/";
    let DWRemoteScanObject;
    DWRemoteScanObject = await Dynamsoft.DWT.CreateRemoteScanObjectAsync(serverurl);
    
  4. List Dynamsoft Services found by the proxy service in a select.

    services = await DWRemoteScanObject.getDynamsoftService();
    if (services.length>0) {
      DWRemoteScanObject.setDefaultDynamsoftService(services[0]);
    }
    let servicesSelect = document.getElementById("services-select");
    servicesSelect.options.length = 0;
    for (let index = 0; index < services.length; index++) {
      const service = services[index];
      if (service.attrs.name.length > 0) {
        servicesSelect.options.add(new Option(service.attrs.name, index));
      } else {
        servicesSelect.options.add(new Option(service.attrs.UUID, index));
      }
    }
    
  5. List devices found by the selected service in a select.

    let selectedService = services[document.getElementById("services-select").selectedIndex];
    devices = await DWRemoteScanObject.getDevices({serviceInfo: selectedService});
    console.log(devices);
    let devicesSelect = document.getElementById("devices-select");
    devicesSelect.options.length = 0;
    for (let index = 0; index < devices.length; index++) {
      const device = devices[index];
      if (device.displayName.length > 0) {
        devicesSelect.options.add(new Option(device.displayName, index));
      } else {
        devicesSelect.options.add(new Option(device.name, index));
      }
    }
    
  6. Acquire a document image using the selected device.

    // Scanning configuration. Check out the docs to learn more:
    // https://www.dynamsoft.com/web-twain/docs/info/api/WebTwain_Acquire.html#acquireimage
    let deviceConfiguration = {
      IfFeederEnabled: false,
      IfCloseSourceAfterAcquire: true,
      Resolution: 200,
      IfShowUI: false
    };
    await DWRemoteScanObject.acquireImage(devices[document.getElementById("devices-select").selectedIndex], deviceConfiguration);
    

Store Scanned Documents in IndexedDB

  1. Create a store named images to store scanned images.

    let imagesStore = localforage.createInstance({
      name: "images"
    });
    

    We need to get the blob of a scanned document image, convert it to an ArrayBuffer for iOS compatibility, and then save it in the store with a timestamp as its ID. Finally, we push the ID to the array of image IDs.

    let images = [];
    const blob = await DWRemoteScanObject.getImages([0],Dynamsoft.DWT.EnumDWT_ImageType.IT_PNG, Dynamsoft.DWT.EnumDWT_ImageFormatType.blob);
    await DWRemoteScanObject.removeImages([0]);
    const buffer = await blobToArrayBuffer(blob);
    const ID = Date.now().toString();
    await imagesStore.setItem(ID,buffer);
    images.push(ID);
    
  2. Create a store named metadata to store items with a unique document ID as the key and the image IDs as the value.

    let metadataStore = localforage.createInstance({
      name: "metadata"
    });
       
    // Return the promise so that callers can wait for the write to finish.
    function saveImagesListToIndexedDB(){
      return metadataStore.setItem(documentID,images);
    }
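
The snippets in this article call blobToArrayBuffer and arrayBufferToBlob without showing them. Below is a minimal sketch of what such helpers could look like. It is an assumption made for illustration, not necessarily the implementation used in the demo: it relies on the standard Blob.arrayBuffer() method where available and falls back to FileReader in older browsers.

```javascript
// Convert a Blob to an ArrayBuffer.
// Uses Blob.prototype.arrayBuffer() when the environment provides it,
// otherwise falls back to FileReader (older browsers).
function blobToArrayBuffer(blob) {
  if (typeof blob.arrayBuffer === "function") {
    return blob.arrayBuffer();
  }
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);
    reader.onerror = () => reject(reader.error);
    reader.readAsArrayBuffer(blob);
  });
}

// Wrap an ArrayBuffer in a Blob. `options` is passed through to the Blob
// constructor, for example {type: "image/png"}.
function arrayBufferToBlob(buffer, options) {
  return new Blob([buffer], options);
}
```

Passing the MIME type when converting back, as loadImageAsBlobFromIndexedDB does with {type: "image/png"}, lets the browser know how to decode the data when the blob is used as an image source.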
    

Display Scanned Documents in a Viewer

We can then display the document images in a customized viewer.

async function displayImagesInIndexedDB(){
  const documentViewer = document.getElementById("document-viewer");
  const pages = documentViewer.getElementsByClassName("page");
  for (let index = 0; index < images.length; index++) {
    const ID = images[index];
    const blob = await loadImageAsBlobFromIndexedDB(ID);
    let page = pages[index];
    if (page) {
      if (page.getAttribute("ID") === ID) {
        if (!page.getAttribute("src")) {
          page.src = URL.createObjectURL(blob);
        }
      }
    }else{
      page = document.createElement("img");
      page.className = "page";
      page.setAttribute("ID",ID);
      page.src = URL.createObjectURL(blob);
      page.addEventListener("click",function(){
        selectPage(ID);
      });
      documentViewer.appendChild(page);
    }
  }
}

async function loadImageAsBlobFromIndexedDB(ID){
  const buffer = await imagesStore.getItem(ID);
  const blob = arrayBufferToBlob(buffer,{type: "image/png"});
  return blob;
}
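
Each call to URL.createObjectURL keeps its blob in memory until the URL is revoked or the page is unloaded, which can add up for documents with many pages. The following is a small sketch of one way to handle this, under the assumption that the image element no longer needs the URL once it has finished loading. The helper name setPageSource is hypothetical and is not part of the demo.

```javascript
// Assign a blob to an <img> element through an object URL and release the
// URL once the image has loaded (or failed to load).
// `setPageSource` is a hypothetical helper name used for this sketch.
function setPageSource(page, blob) {
  const url = URL.createObjectURL(blob);
  const release = () => URL.revokeObjectURL(url);
  page.addEventListener("load", release, { once: true });
  page.addEventListener("error", release, { once: true });
  page.src = url;
  return url;
}
```

With such a helper, the two page.src = URL.createObjectURL(blob) assignments in displayImagesInIndexedDB could be replaced with setPageSource(page, blob).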

Load Scanned Documents from IndexedDB

If the user enters the page with a document ID, we load the list of scanned images for that document and display them in the viewer.

async function loadImagesListFromIndexedDB(){
  const value = await metadataStore.getItem(documentID);
  if (value) {
    images = value;
    await displayImagesInIndexedDB();
  }
}
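
The article does not show where documentID comes from. One possible approach, sketched below, is to read it from the page URL and to generate a new one when it is absent. The query parameter name ID and the helper name getDocumentID are assumptions made for this example. The fallback simply reuses the timestamp scheme that the article uses for image IDs.

```javascript
// Resolve the document ID from a query string such as "?ID=1700000000000".
// The parameter name "ID" and this helper are assumptions for this sketch.
function getDocumentID(search) {
  const params = new URLSearchParams(search);
  const fromURL = params.get("ID");
  if (fromURL) {
    // The user opened an existing document.
    return { documentID: fromURL, isNew: false };
  }
  // No ID in the URL: start a new document with a timestamp-based ID.
  return { documentID: Date.now().toString(), isNew: true };
}
```

In the page, this could be called as getDocumentID(window.location.search), with loadImagesListFromIndexedDB() invoked only when isNew is false.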

Export to PDF

After acquiring the document images, we can export them as a PDF file.

async function exportToPDF(){
  const status = document.getElementById("status");
  status.innerText = "Exporting...";
  try {
    let indices = []; // indices of the pages loaded into Dynamic Web TWAIN's buffer
    let j = 0;
    for (let index = 0; index < images.length; index++) {
      const ID = images[index];
      const buffer = await imagesStore.getItem(ID);
      if (buffer) { // skip IDs whose image data is missing from the store
        indices.push(j);
        const blob = arrayBufferToBlob(buffer);
        await loadImageToDWT(blob);
        j = j + 1;
      }
    }
    await DWRemoteScanObject.saveImages("scanned.pdf",indices,Dynamsoft.DWT.EnumDWT_ImageType.IT_PDF);
    // Clear the buffer so that the next export starts from index 0 again.
    await DWRemoteScanObject.removeImages(indices);
  } finally {
    // Reset the status message even if loading or saving fails.
    status.innerText = "";
  }
}

function loadImageToDWT(blob){
  return new Promise((resolve,reject)=>{
    DWRemoteScanObject._defaultDSScanClient.__LoadImageFromBytesV2(blob, 
      Dynamsoft.DWT.EnumDWT_ImageType.IT_PNG, "", true, 0, 0, false, 3, 
      function(){resolve("OK");}, 
      function(ec,es){reject(es);}
    );
  });
}

All right, we’ve covered the important parts of the app. You can try it out with the online demo.

Source Code

https://github.com/tony-xlh/Scan-and-Save-to-Client-Side-Storage