How to Build a Next.js App to Scan an ID Card via Camera

Nov 22, 2024

An identity document or ID card is any document that can be used to prove a person’s identity. Common forms include driver’s licenses, passports and national identity cards.

Barcodes and MRZs (machine-readable zones) are often printed on ID cards so that their information can be read by a machine.

Driver’s license example:

driver's license

Formal ID Card example:

id card

ID cards are often scanned through a camera or a flatbed scanner. In this article, we are going to build a Next.js app to scan ID cards via cameras. Next.js is a full-stack React framework which lets you create web apps of any size.

This article is Part 3 in a 3-Part Series.

Part 1 - How to Build a Web App to Scan an ID Card using a Flatbed Scanner
Part 2 - How to Build a Web App to Scan an ID Card via Camera
Part 3 - How to Build a Next.js App to Scan an ID Card via Camera

The following SDKs by Dynamsoft are used:

Dynamsoft Camera Enhancer: access cameras and capture frames.
Dynamsoft Document Normalizer: crop ID cards in scanned document images.
Dynamsoft Barcode Reader: read PDF417 on driver’s licenses.
Dynamsoft Label Recognizer: recognize MRZ on ID cards.
Dynamsoft Code Parser: parse MRZ and barcodes to get meaningful data.

Online demo

Overview

The app is straightforward. Pressing the scan button on the home page starts the scanner. The capture is triggered automatically once the detected document is stable, which is judged by the IoUs (intersection over union) of consecutive detection results. After scanning, the cropped image and the card holder’s info are displayed.

home

scanner

New Next.js Project

Create a new Next.js project with the following command:

npx create-next-app@latest

Install Dependencies

Install all the Dynamsoft SDKs at once through the Dynamsoft Capture Vision bundle:

npm install dynamsoft-capture-vision-bundle

Configure Dynamsoft SDKs

Create a file named dcv.ts with the following content to configure the Dynamsoft SDKs. It sets the license and loads the WebAssembly files, the OCR model and the specifications used for parsing. You can apply for a license here.

import "dynamsoft-license";

import "dynamsoft-barcode-reader";
import "dynamsoft-document-normalizer";
import "dynamsoft-label-recognizer";
import "dynamsoft-capture-vision-router";

import { CoreModule } from "dynamsoft-core";
import { LicenseManager } from "dynamsoft-license";
import { CodeParserModule, LabelRecognizerModule } from "dynamsoft-capture-vision-bundle";

let initialized = false;

export async function init(){
  if (initialized === false) {
    console.log("Initializing...");
    await LicenseManager.initLicense("LICENSE-KEY");
    CoreModule.engineResourcePaths.rootDirectory = "https://cdn.jsdelivr.net/npm/";
    await CoreModule.loadWasm(["DDN","DLR","DBR","DCP"]).catch((ex: any) => {
      let errMsg = ex.message || ex;
      console.error(errMsg);
      alert(errMsg);
    });
    await CodeParserModule.loadSpec("MRTD_TD1_ID");
    await CodeParserModule.loadSpec("MRTD_TD2_ID");
    await CodeParserModule.loadSpec("MRTD_TD3_PASSPORT");  
    await CodeParserModule.loadSpec("AAMVA_DL_ID");
    await LabelRecognizerModule.loadRecognitionData("MRZ");
  }
  initialized = true;
  return true;
}

Then in page.tsx, import it and run the initialization. Because the module keeps its initialized flag at module scope, the setup is skipped on later calls once it has completed.

import { useEffect, useState } from "react";
import { init } from "./dcv";

export default function Home() {
  const [initialized,setInitialized] = useState(false);
  useEffect(()=>{
    const initDynamsoft = async () => {
      try {
        const result = await init();
        if (result) {
          setInitialized(true);
        }
      } catch (error) {
        alert(error);
      }
    }
    initDynamsoft();
  },[])

  return (
    <>
    </>
  );
}
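
One caveat with a boolean flag: it is only set after all the awaits have finished, so two calls that overlap (for example, when React Strict Mode runs the effect twice in development) could both run the setup. A possible way around this, sketched below under assumptions, is to cache the promise instead of a flag. The names once and loadEverything are hypothetical, and loadEverything merely stands in for the real license and resource loading.

```typescript
// Hypothetical helper: wraps an async initializer so that it runs at most once,
// even when several callers invoke it before the first call has finished.
function once<T>(initializer: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (!pending) {
      pending = initializer().catch((error) => {
        pending = undefined; // allow a retry after a failure
        throw error;
      });
    }
    return pending;
  };
}

// Stand-in for the real SDK setup (license, WebAssembly, specs, OCR model).
let runs = 0;
async function loadEverything(): Promise<boolean> {
  runs++;
  await Promise.resolve(); // placeholder for the real asynchronous work
  return true;
}

const initOnce = once(loadEverything);

const first = initOnce();
const second = initOnce(); // issued while the first call is still pending

console.log(first === second, runs); // true 1
```

In dcv.ts, the body of init could be passed to once in the same way, so that every caller awaits the same promise.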

Create an ID Card Scanner Component

Next, create an ID card scanner component which captures the image of the ID card and extracts its holder’s info.

Create Scanner.css and Scanner.tsx under app/components.

Scanner.tsx:

import { MutableRefObject, useEffect, useRef, useState } from 'react';
import './Scanner.css';
import { DetectedQuadResultItem } from 'dynamsoft-document-normalizer'

export interface HolderInfo {
  lastName:string;
  firstName:string;
  birthDate:string;
  sex:string;
  docNumber:string;
}

export interface ScannerProps {
  onScanned?: (blob:Blob,info?:HolderInfo) => void;
  onStopped?: () => void;
}

const Scanner: React.FC<ScannerProps> = (props:ScannerProps) => {
  let container: MutableRefObject<HTMLDivElement | null> = useRef(null);
  return (
    <div className="scanner-container" ref={container}>
    </div>
  );
};

export default Scanner;

Scanner.css:

.scanner-container {
  width: 100%;
  height: 100%;
  background: white;  
}

The following sections fill in the implementation step by step.

Access Camera

Add a container for the camera preview.

<div className="scanner-container" ref={container}>
  <div className="dce-video-container"></div>
</div>

CSS:

.dce-video-container {
  position: absolute;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
}

Initialize Dynamsoft Camera Enhancer. Bind the container and start the camera.

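// CameraView and CameraEnhancer come from "dynamsoft-camera-enhancer".
// These refs are declared at the top level of the Scanner component:
const initializing = useRef(false);
const view = useRef<CameraView | null>(null);
const dce = useRef<CameraEnhancer | null>(null);

useEffect((): any => {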
  const init = async () => {
    if (initializing.current) {
      return;
    }
    try {
      view.current = await CameraView.createInstance(container.current!);
      dce.current = await CameraEnhancer.createInstance(view.current);
      dce.current.setResolution({width:1920,height:1080});
      await dce.current.open();
    } catch (ex: any) {
      let errMsg = ex.message || ex;
      console.error(errMsg);
      alert(errMsg);
    }
  }
     
  init();
  initializing.current = true;

  return async () => {
    dce.current?.dispose();
    console.log('Scanner Component Unmount');
  }
}, []);

Add a header toolbar to switch the camera and stop the scanner.

JSX:

<div className="header">
  <div className="switchButton" onClick={switchCamera}>
    <img className="icon" src="/switch.svg" alt="switch"/>
  </div>
  <div className="closeButton" onClick={close}>
    <img className="icon" src="/cross.svg" alt="close"/>
  </div>
</div>

JavaScript:

const switchCamera = async () => {
  if (dce.current) {
    let currentCamera = dce.current.getSelectedCamera();
    let cameras = await dce.current.getAllCameras();
    let currentCameraIndex = cameras.indexOf(currentCamera);
    let desiredIndex = 0
    if (currentCameraIndex < cameras.length - 1) {
      desiredIndex = currentCameraIndex + 1;
    }
    await dce.current.selectCamera(cameras[desiredIndex]);
  }
}
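
Note that indexOf compares object references, so it returns -1 if getSelectedCamera() hands back a different object than the entries of getAllCameras(). The sketch below is one way to make the cycling independent of that, assuming each camera entry exposes a deviceId string. CameraInfo and nextCamera are illustrative names, not part of the SDK.

```typescript
// Minimal shape assumed for a camera entry (an assumption for this sketch).
interface CameraInfo {
  deviceId: string;
  label: string;
}

// Returns the camera that follows the current one, wrapping around at the end.
function nextCamera(
  cameras: CameraInfo[],
  currentDeviceId: string | undefined
): CameraInfo | undefined {
  if (cameras.length === 0) {
    return undefined;
  }
  const currentIndex = cameras.findIndex((c) => c.deviceId === currentDeviceId);
  // findIndex yields -1 when nothing matches, so (-1 + 1) % n picks the first camera.
  return cameras[(currentIndex + 1) % cameras.length];
}

const sampleCameras: CameraInfo[] = [
  { deviceId: "front", label: "Front camera" },
  { deviceId: "back", label: "Back camera" },
];

console.log(nextCamera(sampleCameras, "front")?.deviceId); // back
console.log(nextCamera(sampleCameras, "back")?.deviceId);  // front
```

The modulo keeps the wrap-around in one expression instead of a separate bounds check.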
   
const close = async () => {
  if (props.onStopped) {
    props.onStopped();
  }
}

CSS:

.header {
  position: absolute;
  top: 0;
  left: 0;
  width: 100%;
  height: 30px;
  background: rgba(0, 0, 0, 0.8);
  display: flex;
  justify-content: space-between;
}

.switchButton {
  display: flex;
  align-items: center;
  text-align: center;
  width: 30px;
  height: 30px;
  padding: 5px;
}

.icon {
  width: 100%;
  height: 100%;
  pointer-events: all;
  cursor: pointer;
}

Detect Document

Create an instance of Capture Vision Router, which runs the image processing tasks of the SDKs.

let router: MutableRefObject<CaptureVisionRouter | null> = useRef(null);
router.current = await CaptureVisionRouter.createInstance();

Once the camera starts playing, start an interval that captures a frame and detects documents in it.

const [quadResultItem,setQuadResultItem] = useState<DetectedQuadResultItem|undefined>()
const detecting = useRef(false);
const interval = useRef<any>();
   
dce.current.on("played",async function(){
  startScanning();  
})

const startScanning = async () => {
  stopScanning();
  if (!interval.current) {
    interval.current = setInterval(captureAndDetect,150);
  }
}

const stopScanning = () => {
  clearInterval(interval.current);
  interval.current = null;
}
   
const captureAndDetect = async () => {
  if (detecting.current === true) {
    return;
  }
  if (!router.current || !dce.current) {
    return;
  } 
  if (isSteady.current) {
    return;
  }
  console.log("capture and detect");
  let results:DetectedQuadResultItem[] = [];
  detecting.current = true;
  try {
    let image = dce.current.fetchImage();
    let capturedResult = await router.current?.capture(image,"DetectDocumentBoundaries_Default");
    if (capturedResult.detectedQuadResultItems) {
      results = results.concat(capturedResult.detectedQuadResultItems);
    }
    console.log(results);
    if (results.length>0) {
      setQuadResultItem(results[0]);
      checkIfSteady(results,image);
    }else{
      setQuadResultItem(undefined);
    }
  } catch (error) {
    console.log(error);
  }
  detecting.current = false;
}
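
The detecting flag above keeps interval ticks from overlapping when one detection takes longer than 150 ms. The same idea can be factored into a small wrapper, shown here as a sketch. skipWhileBusy is a hypothetical helper name, and the task below is only a stand-in for the real capture call.

```typescript
// Wraps an async task so that a call is dropped while a previous one is still running.
function skipWhileBusy(task: () => Promise<void>): () => Promise<boolean> {
  let busy = false;
  return async () => {
    if (busy) {
      return false; // a previous run is still in flight, so drop this tick
    }
    busy = true;
    try {
      await task();
      return true;
    } finally {
      busy = false; // always release the guard, even if the task throws
    }
  };
}

let started = 0;
const tick = skipWhileBusy(async () => {
  started++;
  await Promise.resolve(); // stands in for fetching a frame and detecting
});

void tick(); // starts the task
void tick(); // skipped: the first run has not finished yet

console.log(started); // 1
```

Releasing the flag in a finally block means an exception in the task cannot leave the scanner permanently blocked.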

If the detected document is steady, capture and crop it, stop scanning and extract the info.

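// Refs declared at the top level of the Scanner component:
const previousResults = useRef<DetectedQuadResultItem[]>([]);
const isSteady = useRef(false);

const checkIfSteady = async (results:DetectedQuadResultItem[],image:DCEFrame) => {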
  if (results.length>0 && router.current) {
    let result = results[0];
    if (previousResults.current.length >= 3) {
      if (steady() == true) {
        console.log("steady");
        isSteady.current = true;
        let newSettings = await router.current.getSimplifiedSettings("NormalizeDocument_Default");
        newSettings.roiMeasuredInPercentage = false;
        newSettings.roi.points = results[0].location.points;
        await router.current.updateSettings("NormalizeDocument_Default", newSettings);
        let result = await router.current.capture(image,"NormalizeDocument_Default"); //perspective transformation to crop the image
        if (result.normalizedImageResultItems) {
          if (props.onScanned) {
            stopScanning();
            let blob = await result.normalizedImageResultItems[0].toBlob("image/png");
            let info = await extractInfo(blob);
            props.onScanned(blob,info);
          }
        }
      }else{
        console.log("shift and add result");
        previousResults.current.shift();
        previousResults.current.push(result);
      }
    }else{
      console.log("add result");
      previousResults.current.push(result);
    }
  }
}

The document is considered steady when the IoU (intersection over union) of every pair among the last three detection results is above 0.9.

const steady = () => {
  if (previousResults.current[0] && previousResults.current[1] && previousResults.current[2]) {
    let iou1 = intersectionOverUnion(previousResults.current[0].location.points,previousResults.current[1].location.points);
    let iou2 = intersectionOverUnion(previousResults.current[1].location.points,previousResults.current[2].location.points);
    let iou3 = intersectionOverUnion(previousResults.current[2].location.points,previousResults.current[0].location.points);
    if (iou1>0.9 && iou2>0.9 && iou3>0.9) {
      return true;
    }else{
      return false;
    }
  }
  return false;
}
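
The sliding-window logic can also be isolated from the SDK types. The sketch below is a variation rather than the exact code above: it always keeps the latest windowSize items, including the newest one, and reports steady when every pair scores above the threshold. createSteadyChecker and the numeric similarity function are illustrative assumptions.

```typescript
// Returns a function that accepts one item at a time and tells whether
// the last `windowSize` items are all pairwise similar enough.
function createSteadyChecker<T>(
  windowSize: number,
  threshold: number,
  similarity: (a: T, b: T) => number
): (item: T) => boolean {
  const recent: T[] = [];
  return (item: T): boolean => {
    recent.push(item);
    if (recent.length > windowSize) {
      recent.shift(); // drop the oldest item
    }
    if (recent.length < windowSize) {
      return false; // not enough history yet
    }
    return recent.every((a, i) =>
      recent.slice(i + 1).every((b) => similarity(a, b) > threshold)
    );
  };
}

// Toy data: 1-D positions instead of quadrilaterals.
const isSteadyNow = createSteadyChecker<number>(
  3,
  0.9,
  (a, b) => 1 - Math.abs(a - b) / 100
);
const outcomes = [0, 50, 52, 53].map((position) => isSteadyNow(position));

console.log(outcomes); // [ false, false, false, true ]
```

With quadrilaterals, the similarity argument would be the intersectionOverUnion helper applied to the points of two results.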

Helper functions:

import { Point } from "dynamsoft-core";

export function intersectionOverUnion(pts1:Point[] ,pts2:Point[]) : number {
  let rect1 = getRectFromPoints(pts1);
  let rect2 = getRectFromPoints(pts2);
  return rectIntersectionOverUnion(rect1, rect2);
}

function rectIntersectionOverUnion(rect1:Rect, rect2:Rect) : number {
  let leftColumnMax = Math.max(rect1.left, rect2.left);
  let rightColumnMin = Math.min(rect1.right,rect2.right);
  let upRowMax = Math.max(rect1.top, rect2.top);
  let downRowMin = Math.min(rect1.bottom,rect2.bottom);

  if (leftColumnMax>=rightColumnMin || downRowMin<=upRowMax){
    return 0;
  }

  let s1 = rect1.width*rect1.height;
  let s2 = rect2.width*rect2.height;
  let sCross = (downRowMin-upRowMax)*(rightColumnMin-leftColumnMax);
  return sCross/(s1+s2-sCross);
}

function getRectFromPoints(points:Point[]) : Rect {
  if (points[0]) {
    let left:number;
    let top:number;
    let right:number;
    let bottom:number;
       
    left = points[0].x;
    top = points[0].y;
    right = points[0].x;
    bottom = points[0].y;

    points.forEach(point => {
      left = Math.min(point.x,left);
      top = Math.min(point.y,top);
      right = Math.max(point.x,right);
      bottom = Math.max(point.y,bottom);
    });

    let r:Rect = {
      left: left,
      top: top,
      right: right,
      bottom: bottom,
      width: right - left,
      height: bottom - top
    };
       
    return r;
  }else{
    throw new Error("Invalid number of points");
  }
}

export interface Rect {
  left:number;
  right:number;
  top:number;
  bottom:number;
  width:number;
  height:number;
}
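
To get a feel for the 0.9 threshold, here is a small worked example. It is a compact, self-contained sketch of the same bounding-box IoU (boundingBoxIoU and the sample coordinates are made up for illustration): a 400×250 px card shifted by (10, 5) px still counts as steady, while a 40 px horizontal shift does not.

```typescript
interface Point { x: number; y: number; }
interface Box { left: number; top: number; right: number; bottom: number; }

// Axis-aligned bounding box of a set of points.
function toBox(pts: Point[]): Box {
  const xs = pts.map((p) => p.x);
  const ys = pts.map((p) => p.y);
  return {
    left: Math.min(...xs),
    top: Math.min(...ys),
    right: Math.max(...xs),
    bottom: Math.max(...ys),
  };
}

// IoU of the bounding boxes of two quadrilaterals.
function boundingBoxIoU(a: Point[], b: Point[]): number {
  const r1 = toBox(a);
  const r2 = toBox(b);
  const w = Math.min(r1.right, r2.right) - Math.max(r1.left, r2.left);
  const h = Math.min(r1.bottom, r2.bottom) - Math.max(r1.top, r2.top);
  if (w <= 0 || h <= 0) {
    return 0; // no overlap
  }
  const area = (r: Box) => (r.right - r.left) * (r.bottom - r.top);
  const intersection = w * h;
  return intersection / (area(r1) + area(r2) - intersection);
}

const card: Point[] = [
  { x: 100, y: 100 }, { x: 500, y: 100 },
  { x: 500, y: 350 }, { x: 100, y: 350 },
];
const shift = (pts: Point[], dx: number, dy: number): Point[] =>
  pts.map((p) => ({ x: p.x + dx, y: p.y + dy }));

const smallMove = boundingBoxIoU(card, shift(card, 10, 5));
const largeMove = boundingBoxIoU(card, shift(card, 40, 0));

console.log(smallMove.toFixed(3), smallMove > 0.9); // 0.915 true
console.log(largeMove.toFixed(3), largeMove > 0.9); // 0.818 false
```

For the small shift, the intersection is 390 × 245 = 95550 and the union is 200000 − 95550 = 104450, which gives roughly 0.915.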

Read Barcodes

Read the barcodes in the cropped image using the built-in ReadBarcodes_Balance template.

let result = await router.current.capture(blob,"ReadBarcodes_Balance");

OCR of MRZ

Perform OCR to read the MRZ (machine-readable zone) on the image using an MRZ template. Since the MRZ template is not built in, we have to initialize the router’s settings with a JSON template first.

const mrzTemplate = `
{
  "CaptureVisionTemplates": [
    {
      "Name": "ReadPassportAndId",
      "ImageROIProcessingNameArray": ["roi-passport-and-id"],
      "Timeout": 2000
    },
    {
      "Name": "ReadPassport",
      "ImageROIProcessingNameArray": ["roi-passport"],
      "Timeout": 2000
    },
    {
      "Name": "ReadId",
      "ImageROIProcessingNameArray": ["roi-id"],
      "Timeout": 2000
    }
  ],
  "TargetROIDefOptions": [
    {
      "Name": "roi-passport-and-id",
      "TaskSettingNameArray": ["task-passport-and-id"]
    },
    {
      "Name": "roi-passport",
      "TaskSettingNameArray": ["task-passport"]
    },
    {
      "Name": "roi-id",
      "TaskSettingNameArray": ["task-id"]
    }
  ],
  "TextLineSpecificationOptions": [
    {
      "Name": "tls_mrz_passport",
      "BaseTextLineSpecificationName": "tls_base",
      "StringLengthRange": [44, 44],
      "OutputResults": 1,
      "ExpectedGroupsCount": 1,
      "ConcatResults": 1,
      "ConcatSeparator": "\\n",
      "SubGroups": [
        {
          "StringRegExPattern": "(P[A-Z<][A-Z<]{3}[A-Z<]{39}){(44)}",
          "StringLengthRange": [44, 44],
          "BaseTextLineSpecificationName": "tls_base"
        },
        {
          "StringRegExPattern": "([A-Z0-9<]{9}[0-9][A-Z<]{3}[0-9]{2}[0-9<]{4}[0-9][MF<][0-9]{2}[(01-12)][(01-31)][0-9][A-Z0-9<]{14}[0-9<][0-9]){(44)}",
          "StringLengthRange": [44, 44],
          "BaseTextLineSpecificationName": "tls_base"
        }
      ]
    },
    {
      "Name": "tls_mrz_id_td2",
      "BaseTextLineSpecificationName": "tls_base",
      "StringLengthRange": [36, 36],
      "OutputResults": 1,
      "ExpectedGroupsCount": 1,
      "ConcatResults": 1,
      "ConcatSeparator": "\\n",
      "SubGroups": [
        {
          "StringRegExPattern": "([ACI][A-Z<][A-Z<]{3}[A-Z<]{31}){(36)}",
          "StringLengthRange": [36, 36],
          "BaseTextLineSpecificationName": "tls_base"
        },
        {
          "StringRegExPattern": "([A-Z0-9<]{9}[0-9][A-Z<]{3}[0-9]{2}[0-9<]{4}[0-9][MF<][0-9]{2}[(01-12)][(01-31)][0-9][A-Z0-9<]{8}){(36)}",
          "StringLengthRange": [36, 36],
          "BaseTextLineSpecificationName": "tls_base"
        }
      ]
    },
    {
      "Name": "tls_mrz_id_td1",
      "BaseTextLineSpecificationName": "tls_base",
      "StringLengthRange": [30, 30],
      "OutputResults": 1,
      "ExpectedGroupsCount": 1,
      "ConcatResults": 1,
      "ConcatSeparator": "\\n",
      "SubGroups": [
        {
          "StringRegExPattern": "([ACI][A-Z<][A-Z<]{3}[A-Z0-9<]{9}[0-9<][A-Z0-9<]{15}){(30)}",
          "StringLengthRange": [30, 30],
          "BaseTextLineSpecificationName": "tls_base"
        },
        {
          "StringRegExPattern": "([0-9]{2}[(01-12)][(01-31)][0-9][MF<][0-9]{2}[0-9<]{4}[0-9][A-Z<]{3}[A-Z0-9<]{11}[0-9]){(30)}",
          "StringLengthRange": [30, 30],
          "BaseTextLineSpecificationName": "tls_base"
        },
        {
          "StringRegExPattern": "([A-Z<]{30}){(30)}",
          "StringLengthRange": [30, 30],
          "BaseTextLineSpecificationName": "tls_base"
        }
      ]
    },
    {
      "Name": "tls_base",
      "CharacterModelName": "MRZ",
      "CharHeightRange": [5, 1000, 1],
      "BinarizationModes": [
        {
          "BlockSizeX": 30,
          "BlockSizeY": 30,
          "Mode": "BM_LOCAL_BLOCK",
          "EnableFillBinaryVacancy": 0,
          "ThresholdCompensation": 15
        }
      ],
      "ConfusableCharactersCorrection": {
        "ConfusableCharacters": [
          ["0", "O"],
          ["1", "I"],
          ["5", "S"]
        ],
        "FontNameArray": ["OCR_B"]
      }
    }
  ],
  "LabelRecognizerTaskSettingOptions": [
    {
      "Name": "task-passport",
      "ConfusableCharactersPath": "ConfusableChars.data",
      "TextLineSpecificationNameArray": ["tls_mrz_passport"],
      "SectionImageParameterArray": [
        {
          "Section": "ST_REGION_PREDETECTION",
          "ImageParameterName": "ip-mrz"
        },
        {
          "Section": "ST_TEXT_LINE_LOCALIZATION",
          "ImageParameterName": "ip-mrz"
        },
        {
          "Section": "ST_TEXT_LINE_RECOGNITION",
          "ImageParameterName": "ip-mrz"
        }
      ]
    },
    {
      "Name": "task-id",
      "ConfusableCharactersPath": "ConfusableChars.data",
      "TextLineSpecificationNameArray": ["tls_mrz_id_td1", "tls_mrz_id_td2"],
      "SectionImageParameterArray": [
        {
          "Section": "ST_REGION_PREDETECTION",
          "ImageParameterName": "ip-mrz"
        },
        {
          "Section": "ST_TEXT_LINE_LOCALIZATION",
          "ImageParameterName": "ip-mrz"
        },
        {
          "Section": "ST_TEXT_LINE_RECOGNITION",
          "ImageParameterName": "ip-mrz"
        }
      ]
    },
    {
      "Name": "task-passport-and-id",
      "ConfusableCharactersPath": "ConfusableChars.data",
      "TextLineSpecificationNameArray": ["tls_mrz_passport", "tls_mrz_id_td1", "tls_mrz_id_td2"],
      "SectionImageParameterArray": [
        {
          "Section": "ST_REGION_PREDETECTION",
          "ImageParameterName": "ip-mrz"
        },
        {
          "Section": "ST_TEXT_LINE_LOCALIZATION",
          "ImageParameterName": "ip-mrz"
        },
        {
          "Section": "ST_TEXT_LINE_RECOGNITION",
          "ImageParameterName": "ip-mrz"
        }
      ]
    }
  ],
  "CharacterModelOptions": [
    {
      "DirectoryPath": "",
      "Name": "MRZ"
    }
  ],
  "ImageParameterOptions": [
    {
      "Name": "ip-mrz",
      "TextureDetectionModes": [
        {
          "Mode": "TDM_GENERAL_WIDTH_CONCENTRATION",
          "Sensitivity": 8
        }
      ],
      "BinarizationModes": [
        {
          "EnableFillBinaryVacancy": 0,
          "ThresholdCompensation": 21,
          "Mode": "BM_LOCAL_BLOCK"
        }
      ],
      "TextDetectionMode": {
        "Mode": "TTDM_LINE",
        "CharHeightRange": [5, 1000, 1],
        "Direction": "HORIZONTAL",
        "Sensitivity": 7
      }
    }
  ]
}
`

await router.current.initSettings(JSON.parse(mrzTemplate));
let result = await router.current.capture(blob,"ReadPassportAndId");
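
In the second-line patterns of the template, the single [0-9] that follows the document number, the birth date and the expiry date is a check digit. ICAO Doc 9303 defines it as a weighted sum modulo 10 with the repeating weights 7, 3, 1, where digits keep their value, A to Z count as 10 to 35, and < counts as 0. The sketch below shows that calculation as an optional sanity check on raw OCR text. mrzCheckDigit is a hypothetical helper and the sample fields are illustrative; the parser may well perform this validation itself.

```typescript
// Computes the MRZ check digit of a field (weights 7, 3, 1 repeating, modulo 10).
function mrzCheckDigit(field: string): number {
  let sum = 0;
  for (let i = 0; i < field.length; i++) {
    const ch = field.charAt(i);
    let value: number;
    if (ch >= "0" && ch <= "9") {
      value = ch.charCodeAt(0) - 48; // "0".."9" -> 0..9
    } else if (ch >= "A" && ch <= "Z") {
      value = ch.charCodeAt(0) - 55; // "A".."Z" -> 10..35
    } else if (ch === "<") {
      value = 0; // filler character
    } else {
      throw new Error(`Invalid MRZ character: ${ch}`);
    }
    const weight = i % 3 === 0 ? 7 : i % 3 === 1 ? 3 : 1;
    sum += value * weight;
  }
  return sum % 10;
}

console.log(mrzCheckDigit("L898902C3")); // 6 (document number)
console.log(mrzCheckDigit("740812"));    // 2 (birth date, YYMMDD)
console.log(mrzCheckDigit("120415"));    // 9 (expiry date, YYMMDD)
```

A mismatch between the computed and the recognized check digit is a hint that the OCR misread a character, so the frame could be scanned again.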

Parse the Result

After reading the barcodes and MRZ, use Dynamsoft Code Parser to parse them.

Create an instance of Code Parser.

let parser = await CodeParser.createInstance();

Parse the detected barcodes.

for (let index = 0; index < result.barcodeResultItems.length; index++) {
  const item = result.barcodeResultItems[index];
  if (item.format != EnumBarcodeFormat.BF_PDF417) {
    continue;
  }
  let parsedItem = await parser.parse(item.text);
  if (parsedItem.codeType === "AAMVA_DL_ID") {
    let number = parsedItem.getFieldValue("licenseNumber");
    let firstName = parsedItem.getFieldValue("firstName");
    let lastName = parsedItem.getFieldValue("lastName");
    let birthDate = parsedItem.getFieldValue("birthDate");
    let sex = parsedItem.getFieldValue("sex");
    let info:HolderInfo = {
      firstName:firstName,
      lastName:lastName,
      docNumber:number,
      birthDate:birthDate,
      sex:sex
    };
    return info;
  }
}

Parse MRZ.

let parsedItem = await parser.parse(result.textLineResultItems[0].text);
console.log(parsedItem);
if (parsedItem.codeType.indexOf("MRTD") != -1) {
  let number = parsedItem.getFieldValue("documentNumber");
  if (!number) {
    number = parsedItem.getFieldValue("passportNumber");
  }
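  // In an MRZ, the primary identifier is the surname and the secondary identifier holds the given names.
  let lastName = parsedItem.getFieldValue("primaryIdentifier");
  let firstName = parsedItem.getFieldValue("secondaryIdentifier");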
  let birthDate = parsedItem.getFieldValue("dateOfBirth");
  let sex = parsedItem.getFieldValue("sex");
  let info:HolderInfo = {
    firstName:firstName,
    lastName:lastName,
    docNumber:number,
    birthDate:birthDate,
    sex:sex
  };
  return info;
}
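
In MRZ terminology (ICAO Doc 9303), the primary identifier is normally the surname and the secondary identifier holds the given names. In the raw name field the two are separated by <<, and the parts within each are separated by a single <. The following sketch, with the hypothetical helper splitMrzName, illustrates that layout on a made-up name field. Code Parser already does this splitting, so it is for illustration only.

```typescript
interface MrzName {
  primaryIdentifier: string;   // usually the surname
  secondaryIdentifier: string; // usually the given names
}

// Splits a raw MRZ name field such as "SURNAME<<GIVEN<NAMES<<<<" into its two identifiers.
function splitMrzName(nameField: string): MrzName {
  const trimmed = nameField.replace(/<+$/, ""); // drop trailing filler characters
  const separator = trimmed.indexOf("<<");
  const primary = separator === -1 ? trimmed : trimmed.slice(0, separator);
  const secondary = separator === -1 ? "" : trimmed.slice(separator + 2);
  // A single "<" separates the words inside each identifier.
  const toWords = (s: string): string => s.split("<").filter(Boolean).join(" ");
  return {
    primaryIdentifier: toWords(primary),
    secondaryIdentifier: toWords(secondary),
  };
}

const sampleName = splitMrzName("ERIKSSON<<ANNA<MARIA<<<<<<<<<<<<<<<<<<<");

console.log(sampleName.primaryIdentifier);   // ERIKSSON
console.log(sampleName.secondaryIdentifier); // ANNA MARIA
```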

Use the ID Card Scanner Component

With the component complete, let’s use it on the home page.

Import the component using next/dynamic with server-side rendering disabled, since the scanner depends on browser-only APIs such as the camera.

const Scanner = dynamic(() => import("./components/Scanner"), {
  ssr: false,
  loading: () => <p>Initializing ID Card Scanner</p>,
});

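Add 'use client'; at the top of the file so that the page is rendered as a Client Component, which hooks such as useState and useEffect require.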

Use the component to scan ID cards.

JSX:

<div className="footer">
  <button className="shutter-button round" onClick={()=>{startScanning();}}>Scan</button>
</div>
{scanning && (
  <div className="fullscreen">
    <Scanner onScanned={onScanned} onStopped={onStopped}/>
  </div>
)}

JavaScript:

const [scanning,setScanning] = useState(false);
const [initialized,setInitialized] = useState(false);
const [imageURL,setImageURL] = useState("");
const [info,setInfo] = useState<HolderInfo|undefined>();

const startScanning = () => {
  setScanning(true);
}

const onScanned = (blob:Blob,_info?:HolderInfo) => {
  let url = URL.createObjectURL(blob);
  setImageURL(url);
  setInfo(_info);
  setScanning(false);
}

const onStopped = () => {
  setScanning(false);
}

The results are displayed using the following JSX:

{(imageURL && info) && (
  <div className="card">
    <div>
      Image:
      <br/>
      <img src={imageURL} alt="idcard"/>
    </div>
    <div>
      Document number:&nbsp;
      <span>{info.docNumber}</span>
    </div>
    <div>
      First name:&nbsp;
      <span>{info.firstName}</span>
    </div>
    <div>
      Last name:&nbsp;
      <span>{info.lastName}</span>
    </div>
    <div>
      Date of Birth:&nbsp;
      <span>{info.birthDate}</span>
    </div>
    <div>
      Sex:&nbsp;
      <span>{info.sex}</span>
    </div>
  </div>
)}

All right, we’ve now completed the demo.

Source Code

You can find the source code of the demo in the following repo: https://github.com/tony-xlh/NextJS-ID-Card-Scanner