Building a Desktop Document Scanning and Barcode Recognition Application with Qt and Python

A few months ago, I published a cross-platform desktop barcode reading application built with Qt, Python, and Dynamsoft Barcode Reader. The supported input sources include real-time camera streams, image files, and screenshots. In this article, I will demonstrate how to set a document scanner as the input source. The SDK used for document scanning is Dynamic Web TWAIN, a cross-platform JavaScript library supporting Windows, macOS, and Linux.

Prerequisites

  • Dynamic Web TWAIN: A JavaScript library that adds document scanning to web applications. It supports various image acquisition sources, including scanners, webcams, and local image files.

      npm install dwt
    
  • Dynamsoft Barcode Reader: A cross-platform barcode scanning library that enables embedding barcode reading functionality in your web, desktop, or mobile applications. It supports various barcode types, including 1D, 2D, and postal barcodes.

      pip install dbr
    
  • Qt5: A cross-platform application framework that provides user interface components and tools for building desktop applications.

      pip install PyQt5
    
  • PyQtWebEngine: A set of Python bindings for The Qt Company’s Qt WebEngine framework.

      pip install PyQtWebEngine
    
  • License Keys for Dynamsoft Products
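
Before moving on, it can help to confirm that the Python packages above are importable. The following is a minimal sketch that uses only the standard library. The helper name missing_modules is hypothetical, and the module names dbr, PyQt5, and PyQt5.QtWebEngineWidgets are taken from the imports used later in this article.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised, for example, when a parent package such as PyQt5
            # is itself not installed.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    required = ["dbr", "PyQt5", "PyQt5.QtWebEngineWidgets"]
    print("Missing modules:", missing_modules(required) or "none")
```

An empty result only means the modules can be located. It does not check versions or license keys.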

Steps to Build a Cross-Platform Document Scanning and Barcode Recognition Application

We will create a hybrid application that combines HTML5 and Python code.

1. Create a Qt Application with Qt Widgets

Here are the required Qt widgets:

  • QWebEngineView: Used to load HTML and JavaScript code. It displays the document images scanned by the Dynamic Web TWAIN API.
  • QPushButton: One for acquiring images and another for decoding barcodes.
  • QTextEdit: Used to show the barcode results.

First, create an empty Qt window:

from PyQt5.QtWidgets import *
from PyQt5.QtCore import *
from PyQt5.QtGui import *
from PyQt5.QtWebEngineWidgets import QWebEngineView
from PyQt5.QtWebChannel import QWebChannel

app = QApplication([])
win = QWidget()
win.setWindowTitle('Dynamic Web TWAIN and Dynamsoft Barcode Reader')
win.show()
app.exec_()

Next, create a layout and add the widgets to it:

class WebView(QWebEngineView):
    def __init__(self):
        QWebEngineView.__init__(self)

layout = QVBoxLayout()
win.setLayout(layout)

view = WebView()
bt_scan = QPushButton('Scan Barcode Documents')
bt_read = QPushButton('Read Barcode')
text_area = QTextEdit()

layout.addWidget(view)
layout.addWidget(bt_scan)
layout.addWidget(bt_read)
layout.addWidget(text_area)

Shortcut keys are convenient for a desktop application. Use R to reload the web view and Q to quit the application:

def keyPressEvent(event):
    if event.key() == Qt.Key.Key_Q:
        win.close()
    elif event.key() == Qt.Key.Key_R:
        # Reload the page shown in the QWebEngineView
        view.reload()

win.keyPressEvent = keyPressEvent

So far, the basic UI is done. You can run the application:

python app.py

2. Load HTML and JavaScript Code in the Qt Application

Initialize the QWebEngineView and load an index.html file:

import os

class WebView(QWebEngineView):
    def __init__(self):
        QWebEngineView.__init__(self)

        # Load web page and resource files to QWebEngineView
        file_path = os.path.abspath(os.path.join(
            os.path.dirname(__file__), "index.html"))
        local_url = QUrl.fromLocalFile(file_path)
        self.load(local_url)

After re-running the Python application, the web page will be displayed.
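
If index.html is not next to the script, the view has nothing to display, so it can be worth checking the path before loading it. Here is a small sketch of that check using pathlib. The helper page_url is a hypothetical name, and it only illustrates the kind of file:// URL that ends up being loaded. It is not part of the original project.

```python
from pathlib import Path


def page_url(script_path, page_name="index.html"):
    """Return a file:// URL for `page_name` located next to `script_path`."""
    page = Path(script_path).resolve().parent / page_name
    if not page.is_file():
        # Fail early with a clear message instead of showing a blank view.
        raise FileNotFoundError(f"Web page not found: {page}")
    return page.as_uri()
```

Inside the application this would typically be called as page_url(__file__), mirroring the os.path.dirname(__file__) lookup above.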

3. Establish Communication Between Python and JavaScript

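The application depends on two-way communication between Python and JavaScript: Python triggers scanning in the web page, and the web page sends the scanned image back to Python.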

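The index.html file contains the acquireImage() function for scanning documents. The function relies on dwtObject, selectSources, and sourceList, whose initialization is omitted from this snippet (see the source code linked at the end of this article).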

<script src="qwebchannel.js"></script>
<script src="node_modules/dwt/dist/dynamsoft.webtwain.min.js"></script>
<select size="1" id="source" style="position: relative; width: 220px;"></select>
<div id="dwtcontrolContainer"></div>
<script type="text/javascript">
    // Declared async because the Dynamic Web TWAIN calls below are awaited
    async function acquireImage() {
        if (!dwtObject) {
            alert("Please wait for the document to be loaded completely.");
            return;
        }

        if (selectSources) {
            try {
                await dwtObject.SelectDeviceAsync(sourceList[selectSources.selectedIndex]);
                await dwtObject.OpenSourceAsync();
                // Wait for the scan to finish before closing the source
                await dwtObject.AcquireImageAsync({
                    IfDisableSourceAfterAcquire: true
                });
            } catch (e) {
                console.error(e);
            }

            dwtObject.CloseSource();
        } else {
            alert("No Source Available!");
        }
    }
</script>

We can execute the JavaScript function by calling runJavaScript on the Python side:

def acquire_image():
    frame = view.page()
    frame.runJavaScript('acquireImage();')

bt_scan.clicked.connect(acquire_image)
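
runJavaScript takes the script as a plain string, so calling a page function with arguments means building JavaScript source on the Python side. One way to do that without worrying about quoting is to serialize each argument with json.dumps, since JSON literals are valid JavaScript expressions. The sketch below assumes this approach. The helper js_call and the function name selectSource used in the note after it are hypothetical and do not appear in the original project.

```python
import json


def js_call(function_name, *args):
    """Build a JavaScript call statement such as 'acquireImage();'."""
    # Conservative check: only plain identifiers are accepted as names.
    if not function_name.isidentifier():
        raise ValueError(f"Not a plain function name: {function_name!r}")
    # json.dumps quotes and escapes strings, so an argument cannot break
    # out of the generated call.
    serialized = ", ".join(json.dumps(arg) for arg in args)
    return f"{function_name}({serialized});"
```

With this helper, view.page().runJavaScript(js_call('acquireImage')) sends the same string as the hard-coded version above, and a call with arguments, for example js_call('selectSource', 1), is built the same way.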

Once the scanning process is finished, the current image is kept in memory. We can convert it to a base64 string:

function getCurrentImage() {
    if (dwtObject) {
        dwtObject.ConvertToBase64(
            [dwtObject.CurrentImageIndexInBuffer],
            Dynamsoft.DWT.EnumDWT_ImageType.IT_JPG,
            function (result, indices, type) {
                
            },
            function (errorCode, errorString) {
                console.log(errorString);
            }
        );
    }
}

To pass the base64 string from JavaScript to Python, we use QWebChannel. In index.html, include qwebchannel.js (found in the Qt\Examples\Qt-5.12.11\webchannel\shared folder if you have QtCreator installed). Then add the following code to send the base64 string:

var backend;
new QWebChannel(qt.webChannelTransport, function (channel) {
    backend = channel.objects.backend;
});


function getCurrentImage() {
    if (dwtObject) {
        dwtObject.ConvertToBase64(
            [dwtObject.CurrentImageIndexInBuffer],
            Dynamsoft.DWT.EnumDWT_ImageType.IT_JPG,
            function (result, indices, type) {
                backend.onDataReady(result.getData(0, result.getLength()))
            },
            function (errorCode, errorString) {
                console.log(errorString);
            }
        );
    }
}

The onDataReady function needs to be implemented on the Python side:

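import base64

class Backend(QObject):
    # Exposed to JavaScript through QWebChannel as backend.onDataReady(...)
    @pyqtSlot(str)
    def onDataReady(self, base64img):
        # Raw JPEG bytes of the current image; decoded for barcodes in step 4
        imgdata = base64.b64decode(base64img)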

class WebView(QWebEngineView):
    def __init__(self):
        QWebEngineView.__init__(self)

        # Load web page and resource files to QWebEngineView
        file_path = os.path.abspath(os.path.join(
            os.path.dirname(__file__), "index.html"))
        local_url = QUrl.fromLocalFile(file_path)
        self.load(local_url)
        self.backend = Backend(self)
        self.channel = QWebChannel(self.page())
        self.channel.registerObject('backend', self.backend)
        self.page().setWebChannel(self.channel)

def read_barcode():
    frame = view.page()
    frame.runJavaScript('getCurrentImage();')

bt_read.clicked.connect(read_barcode)
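
The string that arrives in onDataReady is ordinary base64 text, so it can be validated before it is handed to a decoder. The sketch below assumes the payload is a JPEG, because the page requests IT_JPG. It also tolerates whitespace and an optional data: URI prefix, which are assumptions of this example rather than something the SDK is known to send. The name decode_jpeg_base64 is hypothetical.

```python
import base64
import binascii

JPEG_MAGIC = b"\xff\xd8"  # JPEG data starts with the SOI marker


def decode_jpeg_base64(payload):
    """Decode a base64 string and check that it looks like a JPEG."""
    # Drop an optional 'data:image/jpeg;base64,' prefix (assumed, see above).
    if payload.startswith("data:"):
        payload = payload.split(",", 1)[-1]
    # Remove any whitespace or line breaks before strict validation.
    payload = "".join(payload.split())
    try:
        data = base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("Payload is not valid base64") from exc
    if not data.startswith(JPEG_MAGIC):
        raise ValueError("Decoded data does not start with a JPEG header")
    return data
```

Raising ValueError in both failure cases lets the slot report a bad payload in one place instead of passing unusable bytes on.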

4. Decode Barcodes from Scanned Documents

Finally, we can use Dynamsoft Barcode Reader to decode barcodes from the base64 string and display the result in the text area.

from dbr import *
import base64

# Initialize Dynamsoft Barcode Reader
reader = BarcodeReader()
# Apply for a trial license https://www.dynamsoft.com/customer/license/trialLicense?product=dbr&source=codepool
reader.init_license('DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==')

class Backend(QObject):
    @pyqtSlot(str)
    def onDataReady(self, base64img):
        imgdata = base64.b64decode(base64img)

        try:
            text_results = reader.decode_file_stream(bytearray(imgdata), '')
            if text_results is not None:
                out = ''
                for text_result in text_results:
                    out += "Barcode Format : "
                    out += text_result.barcode_format_string + '\n'
                    out += "Barcode Text : "
                    out += text_result.barcode_text + '\n'
                    out += "-------------------------------------------------" + '\n'

                text_area.setText(out)
        except BarcodeReaderError as bre:
            print(bre)

Here, onDataReady decodes the base64 string into image bytes and passes them to reader.decode_file_stream, which returns the barcodes it finds (or None if there are none). The format and text of each barcode are then written to text_area.
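
The string building inside onDataReady can also be pulled out into a small function, which makes it testable without a scanner or a license. The following sketch uses a stand-in FakeResult dataclass that only mimics the two attributes read above, barcode_format_string and barcode_text. Both FakeResult and format_results are hypothetical names, and the placeholder message for an empty result is an addition that the original code does not have.

```python
from dataclasses import dataclass

SEPARATOR = "-" * 40  # visual divider between results


@dataclass
class FakeResult:
    """Stand-in exposing the two attributes the article reads."""
    barcode_format_string: str
    barcode_text: str


def format_results(results):
    """Format decoded barcodes for display; `results` may be None."""
    if not results:
        return "No barcode found."
    lines = []
    for result in results:
        lines.append(f"Barcode Format : {result.barcode_format_string}")
        lines.append(f"Barcode Text : {result.barcode_text}")
        lines.append(SEPARATOR)
    return "\n".join(lines) + "\n"
```

In the slot, the loop could then be reduced to text_area.setText(format_results(text_results)).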

Qt application: document scanning and barcode reading

Source Code

https://github.com/yushulx/web-twain-document-scan-management/tree/main/examples/qt