Recognize Text of Passport MRZ in a WebView

Jun 01, 2022

The WebView component in modern mobile operating systems keeps getting more capable. It can show a camera preview through getUserMedia and it supports WebAssembly, which makes it possible to use Dynamsoft Label Recognizer to build a text scanner with web technologies, for example one that reads the machine-readable zone (MRZ) of a passport.

In this article, we first build a web app that recognizes the MRZ text of a passport and then run it in an Android WebView.

Build a Web App to Recognize Text of Passport MRZ

We are going to use Dynamsoft Camera Enhancer and Dynamsoft Label Recognizer to build a web passport MRZ scanner first.

Download the JavaScript package of Dynamsoft Label Recognizer here.
Extract the zip file. In a new folder, place the package's dist folder as dlr and its dce/dist folder as dce.

Create a new HTML file named scanner.html, with the above JS packages included.

 <!DOCTYPE html>
 <html lang="en">
 <head>
   <meta charset="UTF-8">
   <meta name="viewport" content="width=device-width, initial-scale=1.0">
   <title>Text Scanner</title>
   <script src="dlr/dlr.js"></script>
   <script src="dce/dce.js"></script>
 </head>
 <body>
 </body>
 </html>

Initialize the Label Recognizer and the Camera Enhancer when the page is loaded. The initialization will try to load the OCR model for the current setting. The initialization status is shown in a div element.

HTML:

 <div id="status">Initializing...</div>

JavaScript:

 let modelLoading;
 let initialized = false;
 // Specify a license. You can request a trial license at https://www.dynamsoft.com/customer/license/trialLicense/?product=dcv&package=cross-platform
 Dynamsoft.DLR.LabelRecognizer.license = 'LICENSE-KEY';
 let recognizer;
 let cameraEnhancer;
 // Initialize and use the library
 init();

 async function init(){
   Dynamsoft.DLR.LabelRecognizer.onResourcesLoadStarted = (resourcePath) => {
     console.log("Loading " + resourcePath);
     // Show a visual cue that a model file is being downloaded
     modelLoading = document.createElement("div");
     modelLoading.innerText = "Loading model...";
     document.body.prepend(modelLoading);
   };
   Dynamsoft.DLR.LabelRecognizer.onResourcesLoaded = (resourcePath) => {
       console.log("Finished loading " + resourcePath);
       if (modelLoading) {
         modelLoading.remove();
         modelLoading = null;
       }
   };
   recognizer = await Dynamsoft.DLR.LabelRecognizer.createInstance();
   Dynamsoft.DCE.CameraEnhancer.defaultUIElementURL = Dynamsoft.DLR.LabelRecognizer.defaultUIElementURL;
   cameraEnhancer = await Dynamsoft.DCE.CameraEnhancer.createInstance();
   recognizer.setImageSource(cameraEnhancer);
   recognizer.onImageRead = results => {
     let text = "";
     for (let result of results) {
       for (let lineResult of result.lineResults) {
         text = text + lineResult.text + "\n";
       }
     }
     console.log(text);
   };
   await recognizer.updateRuntimeSettingsFromString("video-MRZ"); // will load the MRZ model
   document.getElementById("status").remove();
   initialized = true;
 }

Here, we use await recognizer.updateRuntimeSettingsFromString("video-MRZ"); to update the runtime settings to MRZ video mode.
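
The nested loops in the onImageRead callback can be factored into a small helper, which also makes the flattening step testable without a camera. This is a minimal sketch: `resultsToText` is a hypothetical name, and the input shape (an array of results, each with a `lineResults` array of objects that have a `text` field) is assumed from the callback above. Unlike the original loop, it does not leave a trailing newline.

```javascript
// Hypothetical helper: flattens recognition results into one string,
// one recognized line per row. The result shape is assumed from the
// onImageRead callback in init().
function resultsToText(results) {
  const lines = [];
  for (const result of results || []) {
    for (const lineResult of result.lineResults || []) {
      lines.push(lineResult.text);
    }
  }
  return lines.join("\n");
}

// Made-up sample data in the assumed shape.
const sample = [
  { lineResults: [{ text: "P<UTOERIKSSON<<ANNA<MARIA<<<<<<<<<<<<<<<<<<<" }] },
  { lineResults: [{ text: "L898902C36UTO7408122F1204159ZE184226B<<<<<10" }] },
];
console.log(resultsToText(sample));
```

Inside the callback, `console.log(resultsToText(results))` would then replace the two loops.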

Create a start button to start recognizing MRZ text.

HTML:

 <button id="startButton">Start Scan</button>

JavaScript:

 async function startScan(){
   if (recognizer) {
     await recognizer.startScanning(true); // pass true to show the scanning UI
   }
 }
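
The snippet above defines startScan but does not show how the button invokes it. One way to wire it up, sketched under the assumption that the `initialized` flag and `startScan` function from the earlier code are in scope, is a small helper that ignores clicks until initialization has finished. `bindStartButton` is a hypothetical name, not part of the Dynamsoft API.

```javascript
// Hypothetical wiring helper. `button` is any object with an `onclick`
// property (a DOM button in the page), `isReady` reports whether init()
// has finished, and `start` is the function that starts the scan.
function bindStartButton(button, isReady, start) {
  button.onclick = function () {
    if (!isReady()) {
      console.log("The scanner is still initializing.");
      return;
    }
    // `start` may return a promise; log failures instead of dropping them.
    Promise.resolve(start()).catch((err) => console.error(err));
  };
}

// In the page (skipped when no DOM is available):
if (typeof document !== "undefined") {
  bindStartButton(
    document.getElementById("startButton"),
    () => initialized,
    startScan
  );
}
```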

Pause scanning and show a result verification modal when text is recognized.

HTML:

 <div id="modal" class="modal-window">
   <div class="overflow">
     <pre id="result"></pre>
   </div>
   <button id="correctButton">Correct</button>
   <button id="rescanButton">Rescan</button>
 </div>

CSS:

 .modal-window {
   display: none;
   position: absolute;
   left: 50%;
   max-width: 80%;
   transform: translateX(-50%);
   z-index: 99999;
   background: #fff;
   padding: 20px;
   border: thick double black;
   border-radius: 5px;
   font-family: sans-serif;
   top: 50px;  
 }
 .modal-window.active {
     display: block;
 }

 .overflow {
   overflow: auto;
 }

JavaScript (this replaces the onImageRead callback assigned in init()):

 recognizer.onImageRead = results => {
   if (results.length>0) {
     recognizer.pauseScanning();
     let text = "";
     for (let result of results) {
       for (let lineResult of result.lineResults) {
         text = text + lineResult.text + "\n";
       }
     }
     document.getElementById("result").innerText = text;
     document.getElementById("modal").className += " active";
   }
 };
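
Before the modal is shown, the recognized text could optionally be pre-checked with the check digits that ICAO Doc 9303 defines for MRZ fields, so that obvious misreads never reach the user. This is not part of the original app. The following is a minimal sketch for the second line of a passport (TD3) MRZ; all function names are hypothetical, and the sample line is the commonly published specimen passport line.

```javascript
// Value of one MRZ character: digits keep their value, A-Z map to 10-35,
// and the filler '<' counts as 0.
function mrzCharValue(ch) {
  if (ch >= "0" && ch <= "9") return ch.charCodeAt(0) - 48;
  if (ch >= "A" && ch <= "Z") return ch.charCodeAt(0) - 55;
  if (ch === "<") return 0;
  return NaN; // not a valid MRZ character
}

// Check digit: weighted sum with repeating weights 7, 3, 1, modulo 10.
// Returns NaN if the field contains an invalid character.
function mrzCheckDigit(field) {
  const weights = [7, 3, 1];
  let sum = 0;
  for (let i = 0; i < field.length; i++) {
    sum += mrzCharValue(field[i]) * weights[i % 3];
  }
  return sum % 10;
}

// Hypothetical validator for line 2 of a TD3 MRZ (44 characters).
// Simplification: a '<' in any check digit position is treated as 0,
// which the standard only allows for the optional personal number.
function isValidTd3Line2(line) {
  if (typeof line !== "string" || line.length !== 44) return false;
  // [field start, field end (exclusive), index of its check digit]
  const fields = [
    [0, 9, 9],    // document number
    [13, 19, 19], // date of birth (YYMMDD)
    [21, 27, 27], // date of expiry (YYMMDD)
    [28, 42, 42], // personal number
  ];
  for (const [start, end, checkIndex] of fields) {
    if (mrzCheckDigit(line.slice(start, end)) !== mrzCharValue(line[checkIndex])) {
      return false;
    }
  }
  // Composite check digit over positions 1-10, 14-20 and 22-43 (1-based).
  const composite = line.slice(0, 10) + line.slice(13, 20) + line.slice(21, 43);
  return mrzCheckDigit(composite) === mrzCharValue(line[43]);
}

const specimenLine2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10";
console.log(isValidTd3Line2(specimenLine2));
```

In onImageRead, the callback could split the recognized text on newlines, run the second line through `isValidTd3Line2`, and simply keep scanning when the check fails.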

If the user clicks the Correct button, stop scanning and display the confirmed result (here in an element with the ID confirmedResult, which the page needs to provide). Otherwise, resume scanning.

 let correctButton = document.getElementById("correctButton");
 correctButton.onclick = function(){
   var modal = document.getElementById("modal");
   modal.className = modal.className.replace("active", "");
   recognizer.stopScanning(true);
   document.getElementById("confirmedResult").innerText = "Result: " + document.getElementById("result").innerText;
 }

 let rescanButton = document.getElementById("rescanButton");
 rescanButton.onclick = function(){
   var modal = document.getElementById("modal");
   modal.className = modal.className.replace("active", "");
   recognizer.resumeScanning();
 }
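
Editing the className string works, but each `className += " active"` leaves extra whitespace behind, and `replace("active", "")` only removes the first match and would also hit a class name that merely contains "active". A sketch of the same toggling with the standard classList API follows; `showModal` and `hideModal` are hypothetical helper names.

```javascript
// Hypothetical helpers: toggle the modal through classList instead of
// editing the className string. Adding a class twice is harmless, and
// removing a class that is absent is a no-op.
function showModal(modal) {
  modal.classList.add("active");
}

function hideModal(modal) {
  modal.classList.remove("active");
}

// In the page these would be called with the real element, for example:
//   showModal(document.getElementById("modal"));
//   hideModal(document.getElementById("modal"));
```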

The web app is now finished. Because getUserMedia is only available in secure contexts, we need to serve the page over HTTPS (or from localhost) to test it.
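
For a quick test on the development machine, a small static file server is enough, since browsers treat http://localhost as a secure context. The following Node.js sketch is one possible setup, not the author's; the function names and the default page are assumptions. It serves .wasm files as application/wasm, which may matter because streaming WebAssembly compilation rejects other content types.

```javascript
const http = require("http");
const fs = require("fs");
const path = require("path");

// Content types for the kinds of files the scanner page loads.
const MIME_TYPES = {
  ".html": "text/html; charset=utf-8",
  ".js": "text/javascript; charset=utf-8",
  ".css": "text/css; charset=utf-8",
  ".json": "application/json",
  ".wasm": "application/wasm",
};

function contentTypeFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_TYPES[ext] || "application/octet-stream";
}

// Maps a request URL to a file under rootDir. Returns null for malformed
// URLs and for paths that would escape rootDir.
function resolveRequestPath(rootDir, requestUrl) {
  let pathname;
  try {
    pathname = decodeURIComponent(new URL(requestUrl, "http://localhost").pathname);
  } catch (err) {
    return null;
  }
  // Serve scanner.html for "/" (assumed default page).
  const relative = pathname.replace(/^\/+/, "") || "scanner.html";
  const root = path.resolve(rootDir);
  const resolved = path.resolve(root, relative);
  return resolved.startsWith(root + path.sep) ? resolved : null;
}

// Builds the server without starting it; call .listen(port) to serve.
function createStaticServer(rootDir) {
  return http.createServer((req, res) => {
    const filePath = resolveRequestPath(rootDir, req.url);
    if (!filePath) {
      res.writeHead(400);
      res.end("Bad request");
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end("Not found");
        return;
      }
      res.writeHead(200, { "Content-Type": contentTypeFor(filePath) });
      res.end(data);
    });
  });
}
```

Appending `createStaticServer(".").listen(8080);` (the port is arbitrary) would serve the current folder at http://localhost:8080/. Testing from a phone over the network still requires HTTPS.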

Run the Web App in a WebView

Next, we are going to run the web app in an Android WebView. The steps are much the same as in the previous article about using a barcode scanner in a WebView.

Create and Configure a New Project

Create a new project with Android Studio.

Open src/main/AndroidManifest.xml and add the camera and Internet permissions.

 <uses-permission android:name="android.permission.INTERNET" />
 <uses-permission android:name="android.permission.CAMERA" />

Request Camera Permission

In MainActivity.java, request the camera permission when the app starts.

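import android.Manifest;
import android.content.pm.PackageManager;
import android.os.Bundle;

import androidx.appcompat.app.AppCompatActivity;
import androidx.core.app.ActivityCompat;
import androidx.core.content.ContextCompat;

public class MainActivity extends AppCompatActivity {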
    private static final String[] CAMERA_PERMISSION = new String[]{Manifest.permission.CAMERA};
    private static final int CAMERA_REQUEST_CODE = 10;
    
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        if (!hasCameraPermission()) {
            requestPermission();
        }
    }
    
    private boolean hasCameraPermission() {
        return ContextCompat.checkSelfPermission(
                this,
                Manifest.permission.CAMERA
        ) == PackageManager.PERMISSION_GRANTED;
    }

    private void requestPermission() {
        ActivityCompat.requestPermissions(
                this,
                CAMERA_PERMISSION,
                CAMERA_REQUEST_CODE
        );
    }
}

Add Controls

The layout contains three controls: a Recognize Text button, a TextView for the result, and a WebView.

The WebView control is used to load the web app which opens the camera, recognizes text and then returns the result.

Because we may need to scan multiple times, we put the WebView in the same activity so that the scanner can be reopened quickly. We make it visible when we need to scan and invisible otherwise.

Set Up the WebView

Extra settings are needed for the WebView.

Enable JavaScript and set MediaPlaybackRequiresUserGesture to false so that the camera preview can start playing without a user gesture.

 private void loadWebViewSettings(){
     WebSettings settings = webView.getSettings();
     settings.setJavaScriptEnabled(true);
     settings.setMediaPlaybackRequiresUserGesture(false);
 }

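Grant the page's permission requests, such as camera access, so that getUserMedia can open the camera. Note that this grants every resource the page requests.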

 webView.setWebChromeClient(new WebChromeClient() {
     @Override
     public void onPermissionRequest(final PermissionRequest request) {
         MainActivity.this.runOnUiThread(new Runnable() {
             @Override
             public void run() {
                 request.grant(request.getResources());
             }
         });
     }
 });

Add and Load Web Assets using HTTPS URL

Create an assets folder under src/main, and put the files of the web app: scanner.html, scanner.css and scanner.js in it so that we can load the web page in WebView.

Then we need to use WebViewAssetLoader to load the page through HTTPS.

Add the WebViewAssetLoader dependency in the app’s build.gradle.
```
 implementation 'androidx.webkit:webkit:1.4.0'
```

Create the asset loader, route the WebView’s requests through it, and tighten the file access settings.

 final WebViewAssetLoader assetLoader = new WebViewAssetLoader.Builder()
           .addPathHandler("/assets/", new WebViewAssetLoader.AssetsPathHandler(this))
           .build();
              
 webView.setWebViewClient(new WebViewClientCompat() {
   @Override
   @RequiresApi(21)
   public WebResourceResponse shouldInterceptRequest(WebView view, WebResourceRequest request) {
       return assetLoader.shouldInterceptRequest(request.getUrl());
   }

   @Override
   @SuppressWarnings("deprecation") // for API < 21
   public WebResourceResponse shouldInterceptRequest(WebView view, String url) {
       return assetLoader.shouldInterceptRequest(Uri.parse(url));
   }
 });

 WebSettings webViewSettings = webView.getSettings();
 // Setting this off for security. Off by default for SDK versions >= 16.
 webViewSettings.setAllowFileAccessFromFileURLs(false);
 // Off by default, deprecated for SDK versions >= 30.
 webViewSettings.setAllowUniversalAccessFromFileURLs(false);
 // Keeping these off is less critical but still a good idea, especially if your app is not
 // using file:// or content:// URLs.
 webViewSettings.setAllowFileAccess(false);
 webViewSettings.setAllowContentAccess(false);

Then, we can load the web page with the following URL:

 webView.loadUrl("https://appassets.androidplatform.net/assets/scanner.html");

Call JavaScript Functions to Start Scanning from Java

Now that we can load the web page, we are going to further integrate the web passport MRZ scanner in the Android project.

First, we need to call JavaScript functions from Java to control the scanner.

We can use WebView’s evaluateJavascript to do this.

Set the onclick event for the Recognize Text button to start the scanner.

 Button recognizeTextButton = findViewById(R.id.recognizeTextButton);
 recognizeTextButton.setOnClickListener(new View.OnClickListener() {
     @Override
     public void onClick(View view) {
         startScan();
     }
 });

 private void startScan(){
     if (!initialized) {
         // startScan() is a method of the activity, so the activity itself serves as the Context.
         Toast.makeText(this, "The scanner has not been initialized.", Toast.LENGTH_SHORT).show();
     } else {
         webView.evaluateJavascript("javascript:startScan()", new ValueCallback<String>() {
             @Override
             public void onReceiveValue(String value) {
             }
         });
         webView.setVisibility(View.VISIBLE);
     }
 }

Override onBackPressed so that pressing the back button stops the scan and hides the WebView.

 @Override
 public void onBackPressed() {
     stopScan();
     webView.setVisibility(View.INVISIBLE);
 }
    
 private void stopScan(){
     if (initialized) {
         webView.evaluateJavascript("javascript:stopScan()", new ValueCallback<String>() {
             @Override
             public void onReceiveValue(String value) {
             }
         });
     }
 }

The stopScan JavaScript function:

 function stopScan(){
   if (initialized) {
     recognizer.stopScanning(true); //pass true to close the scanning UI
   }
 }

Return the Text Result to Java

If the user confirms that the MRZ reading result is correct, we stop scanning and display the result in a native TextView. To do this, we need to call Java methods from JavaScript, which WebView’s addJavascriptInterface method makes possible.

Meanwhile, we also need to check whether the recognizer has been initialized.

Define a callback handler to return the text result and the initialization status.

 class ScanHandler {
     public void onScanned(String result) {

     }

     public void onInitialized() {

     }
 }

Define a new JSInterface class.

 public class JSInterface {
     private ScanHandler mHandler;
     JSInterface(ScanHandler handler){
         mHandler = handler;
     }

     @JavascriptInterface
     public void returnResult(String result) {
         mHandler.onScanned(result);
     }

     @JavascriptInterface
     public void onInitialized() {
         mHandler.onInitialized();
     }
 }

Add the JavaScript interface.

 webView.addJavascriptInterface(new JSInterface(new ScanHandler() {
     @Override
     public void onScanned(String result){
         runOnUiThread(new Runnable() {
             @Override
             public void run() {
                 webView.setVisibility(View.INVISIBLE);
                 textView.setText(result);
             }
         });
     }
        
     @Override
     public void onInitialized(){
         initialized = true;
     }
        
 }), "AndroidFunction");

This exposes a global AndroidFunction object to the page. Its methods are the JSInterface methods annotated with @JavascriptInterface.

Call the Java function from JavaScript.

Send the initialization status when the recognizer has been initialized.

 async function init(){
   //...
   AndroidFunction.onInitialized();
 }

Return the text result if the user presses the Correct button to verify the result.

 let correctButton = document.getElementById("correctButton");
 correctButton.onclick = function(){
   var modal = document.getElementById("modal");
   modal.className = modal.className.replace("active", "");
   recognizer.stopScanning(true);
   AndroidFunction.returnResult(document.getElementById("result").innerText);
 }
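
AndroidFunction only exists inside the WebView, so the calls above throw a ReferenceError when the same page is opened in a regular browser. If the page should keep working in both environments, the bridge calls can go through a guard. This is a minimal sketch; `callNative` is a hypothetical helper name.

```javascript
// Hypothetical helper: calls a method on the injected bridge object if it
// exists and reports whether the call was made. Invoking the method as
// bridge[method](...) keeps the bridge object as `this`.
function callNative(bridge, method, ...args) {
  if (bridge && typeof bridge[method] === "function") {
    bridge[method](...args);
    return true;
  }
  return false;
}

// Inside the WebView the bridge is the global AndroidFunction object added
// by addJavascriptInterface; elsewhere it is undefined.
const nativeBridge =
  typeof AndroidFunction !== "undefined" ? AndroidFunction : undefined;

if (!callNative(nativeBridge, "onInitialized")) {
  console.log("No native bridge found; running as a plain web page.");
}
```

The Correct button handler could then call `callNative(nativeBridge, "returnResult", text)` and fall back to showing the text in the page when it returns false.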

Handle Lifecycle Events

We also need to handle lifecycle events.

When the activity is paused, stop scanning. When the activity is resumed and the WebView is visible, start scanning again.

@Override
protected void onResume() {
    super.onResume();
    if (webView.getVisibility() == View.VISIBLE) {
        startScan();
    }
}

@Override
protected void onPause() {
    super.onPause();
    stopScan();
}
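
With these handlers, onPause calls stopScan even when no scan is running, and a scan may be started again while one is already active. Whether the library tolerates such repeated calls is not covered here, so a guard on the JavaScript side is a cheap safety measure. The sketch below wraps the recognizer in a small controller that remembers whether a scan is running; `createScanController` is a hypothetical name, and only the startScanning and stopScanning calls used earlier are assumed.

```javascript
// Hypothetical wrapper that tracks the scanning state so that repeated
// start or stop requests from the native side become no-ops.
function createScanController(recognizer) {
  let scanning = false;
  return {
    isScanning: () => scanning,
    // Resolves to true if a scan was started, false if one was already running.
    async start() {
      if (scanning) return false;
      scanning = true;
      try {
        await recognizer.startScanning(true);
      } catch (err) {
        scanning = false; // starting failed, so allow another attempt
        throw err;
      }
      return true;
    },
    // Returns true if a scan was stopped, false if none was running.
    stop() {
      if (!scanning) return false;
      recognizer.stopScanning(true);
      scanning = false;
      return true;
    },
  };
}
```

For the state to stay accurate, every place that starts or stops scanning, including the Correct button handler, would have to go through the controller instead of calling the recognizer directly.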

Source Code

Check out the source code to try it yourself:

https://github.com/xulihang/Label-Recognition-WebView

There is also an iOS example:

https://github.com/xulihang/Label-Recognition-WebView-iOS