Building a Desktop Document Scanning and Barcode Recognition Application with Qt and Python

Nov 16, 2021

A few months ago, I published a cross-platform desktop barcode reading application built with Qt, Python, and Dynamsoft Barcode Reader. The supported input sources include real-time camera streams, image files, and screenshots. In this article, I will demonstrate how to set a document scanner as the input source. The SDK used for document scanning is Dynamic Web TWAIN, a cross-platform JavaScript library supporting Windows, macOS, and Linux.

Prerequisites

Dynamic Web TWAIN: A JavaScript library for embedding document scanning in web applications. It supports various image acquisition sources, including scanners, webcams, and local image files.
```
  npm install dwt
```
Dynamsoft Barcode Reader: A cross-platform barcode scanning library that enables embedding barcode reading functionality in your web, desktop, or mobile applications. It supports various barcode types, including 1D, 2D, and postal barcodes.
```
  pip install dbr
```
Qt5: A cross-platform application framework widely used for developing desktop applications. It provides a set of user interface components and tools for building desktop applications.
```
  pip install PyQt5
```
PyQtWebEngine: A set of Python bindings for The Qt Company’s Qt WebEngine framework.
```
  pip install PyQtWebEngine
```
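Before going further, it can help to confirm that the Python packages are importable. The following is a minimal sketch, not part of the original project: the helper `find_missing` is hypothetical, and the module names are assumed from the pip packages listed above.

```python
import importlib.util

# Module names assumed from the pip packages above:
# dbr -> "dbr", PyQt5 -> "PyQt5", PyQtWebEngine -> "PyQt5.QtWebEngineWidgets".
REQUIRED_MODULES = ["dbr", "PyQt5", "PyQt5.QtWebEngineWidgets"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except ImportError:
            # Raised when a parent package (e.g. PyQt5) is itself absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All Python prerequisites are importable.")
```

The `dwt` package is installed with npm, so it is not covered by this check.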
License Keys for Dynamsoft Products

Steps to Build a Cross-Platform Document Scanning and Barcode Recognition Application

We will create a hybrid application that combines HTML5 and Python code.

1. Create a Qt Application with Qt Widgets

Here are the required Qt widgets:

QWebEngineView: Used to load HTML and JavaScript code. It displays the document images scanned by the Dynamic Web TWAIN API.
QPushButton: One for acquiring images and another for decoding barcodes.
QTextEdit: Used to show the barcode results.

First, create an empty Qt window:

from PyQt5.QtWidgets import *
from PyQt5.QtCore import *
from PyQt5.QtGui import *
from PyQt5.QtWebEngineWidgets import QWebEngineView
from PyQt5.QtWebChannel import QWebChannel

app = QApplication([])
win = QWidget()
win.setWindowTitle('Dynamic Web TWAIN and Dynamsoft Barcode Reader')
win.show()
app.exec_()

Next, create a layout and add the widgets to it:

class WebView(QWebEngineView):
    def __init__(self):
        QWebEngineView.__init__(self)

layout = QVBoxLayout()
win.setLayout(layout)

view = WebView()
bt_scan = QPushButton('Scan Barcode Documents')
bt_read = QPushButton('Read Barcode')
text_area = QTextEdit()

layout.addWidget(view)
layout.addWidget(bt_scan)
layout.addWidget(bt_read)
layout.addWidget(text_area)

Shortcut keys are convenient for a desktop application. Use R to reload the web view and Q to quit the application:

def refresh_page():
    # Reload the page currently shown in the web view
    view.reload()

def keyPressEvent(event):
    if event.key() == Qt.Key.Key_Q:
        win.close()
    elif event.key() == Qt.Key.Key_R:
        refresh_page()

win.keyPressEvent = keyPressEvent
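If more shortcuts are added later, a lookup table may be easier to maintain than a growing if/elif chain. The sketch below shows the idea without Qt: `make_key_handler` is a hypothetical helper, and the integer key codes are stand-ins for `Qt.Key_Q` and `Qt.Key_R`.

```python
# Stand-in key codes; in the real handler these would be Qt.Key_Q and Qt.Key_R.
KEY_Q = ord("Q")
KEY_R = ord("R")


def make_key_handler(actions):
    """Build a handler that looks up a key code and runs its action.

    Returns True when a shortcut fired, False when the key is unbound.
    """
    def handle(key):
        action = actions.get(key)
        if action is None:
            return False
        action()
        return True
    return handle


log = []
handle_key = make_key_handler({
    KEY_Q: lambda: log.append("quit"),    # would call win.close()
    KEY_R: lambda: log.append("reload"),  # would reload the web view
})

handle_key(KEY_R)
handle_key(KEY_Q)
handle_key(ord("X"))  # unbound key: ignored
print(log)
```

In the Qt application, the handler would be called with `event.key()` inside `keyPressEvent`.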

So far, the basic UI is done. You can run the application:

python app.py

2. Load HTML and JavaScript Code in the Qt Application

Initialize the QWebEngineView and load an index.html file:

import os

class WebView(QWebEngineView):
    def __init__(self):
        QWebEngineView.__init__(self)

        # Load web page and resource files to QWebEngineView
        file_path = os.path.abspath(os.path.join(
            os.path.dirname(__file__), "index.html"))
        local_url = QUrl.fromLocalFile(file_path)
        self.load(local_url)

After re-running the Python application, the web page will be displayed.
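If index.html is missing, the web view stays blank. A small check before loading makes that failure explicit. This is a sketch using only the standard library; `local_page_url` is a hypothetical helper, and the temporary directory stands in for the script's folder.

```python
import tempfile
from pathlib import Path


def local_page_url(base_dir, name="index.html"):
    """Return a file:// URL for a page inside base_dir.

    Raises FileNotFoundError early instead of showing a blank web view.
    """
    page = (Path(base_dir) / name).resolve()
    if not page.is_file():
        raise FileNotFoundError(f"Web page not found: {page}")
    return page.as_uri()


# Demonstrate with a throwaway directory containing an index.html file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "index.html").write_text("<html></html>")
    url = local_page_url(tmp)
    print(url)
```

In the application, the directory would be `os.path.dirname(__file__)`, and the resulting string could be wrapped in `QUrl(...)` before calling `self.load`.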

3. Establish Communication Between Python and JavaScript

The application needs communication in both directions: Python triggers scanning in the web page, and the web page sends the scanned image back to Python.

The index.html file contains the acquireImage() function for scanning documents.

<script src="qwebchannel.js"></script>
<script src="node_modules/dwt/dist/dynamsoft.webtwain.min.js"></script>
<select size="1" id="source" style="position: relative; width: 220px;"></select>
<div id="dwtcontrolContainer"></div>
<script type="text/javascript">
    // dwtObject, selectSources, and sourceList are initialized elsewhere in index.html (not shown here).
    async function acquireImage() {
        if (!dwtObject) {
            alert("Please wait for the document to be loaded completely.");
            return;
        }

        if (selectSources) {
            try {
                await dwtObject.SelectDeviceAsync(sourceList[selectSources.selectedIndex]);
                await dwtObject.OpenSourceAsync();
                // Wait for the scan to finish before closing the source.
                await dwtObject.AcquireImageAsync({
                    IfDisableSourceAfterAcquire: true
                });
            } catch (e) {
                console.error(e);
            }

            dwtObject.CloseSource();
        } else {
            alert("No Source Available!");
        }
    }
</script>

We can execute the JavaScript function by calling runJavaScript on the Python side:

def acquire_image():
    frame = view.page()
    frame.runJavaScript('acquireImage();')

bt_scan.clicked.connect(acquire_image)
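The string passed to runJavaScript is plain JavaScript source, so any arguments have to be rendered as JavaScript literals. The sketch below builds such a call string with `json.dumps`. Both `js_call` and the `selectSource` function name are hypothetical and only serve as illustration.

```python
import json


def js_call(function_name, *args):
    """Build a JavaScript call expression such as 'acquireImage();'.

    Arguments are serialized with json.dumps, which yields valid
    JavaScript literals for strings, numbers, booleans, lists and dicts.
    """
    if not function_name.isidentifier():
        raise ValueError(f"Not a plain function name: {function_name!r}")
    rendered = ", ".join(json.dumps(arg) for arg in args)
    return f"{function_name}({rendered});"


print(js_call("acquireImage"))
print(js_call("selectSource", 'My "A4" scanner'))  # quotes are escaped
```

The result can be passed directly to `view.page().runJavaScript(...)`.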

Once the scanning process is finished, the current image is kept in memory. We can convert it to a base64 string:

function getCurrentImage() {
    if (dwtObject) {
        dwtObject.ConvertToBase64(
            [dwtObject.CurrentImageIndexInBuffer],
            Dynamsoft.DWT.EnumDWT_ImageType.IT_JPG,
            function (result, indices, type) {
                
            },
            function (errorCode, errorString) {
                console.log(errorString);
            }
        );
    }
}

To pass the base64 string from JavaScript to Python, we use QWebChannel. In index.html, include qwebchannel.js (found in the Qt\Examples\Qt-5.12.11\webchannel\shared folder if you have QtCreator installed). Then add the following code to send the base64 string:

var backend;
new QWebChannel(qt.webChannelTransport, function (channel) {
    backend = channel.objects.backend;
});


function getCurrentImage() {
    if (dwtObject) {
        dwtObject.ConvertToBase64(
            [dwtObject.CurrentImageIndexInBuffer],
            Dynamsoft.DWT.EnumDWT_ImageType.IT_JPG,
            function (result, indices, type) {
                // Send the whole base64 string to the Python backend
                backend.onDataReady(result.getData(0, result.getLength()));
            },
            function (errorCode, errorString) {
                console.log(errorString);
            }
        );
    }
}

The onDataReady function needs to be implemented on the Python side:

import base64

class Backend(QObject):
    @pyqtSlot(str)
    def onDataReady(self, base64img):
        imgdata = base64.b64decode(base64img)

class WebView(QWebEngineView):
    def __init__(self):
        QWebEngineView.__init__(self)

        # Load web page and resource files to QWebEngineView
        file_path = os.path.abspath(os.path.join(
            os.path.dirname(__file__), "index.html"))
        local_url = QUrl.fromLocalFile(file_path)
        self.load(local_url)
        self.backend = Backend(self)
        self.channel = QWebChannel(self.page())
        self.channel.registerObject('backend', self.backend)
        self.page().setWebChannel(self.channel)

def read_barcode():
    frame = view.page()
    frame.runJavaScript('getCurrentImage();')

bt_read.clicked.connect(read_barcode)
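Data arriving from the web page is worth validating before it reaches the decoder. The following sketch decodes the base64 string and checks for the JPEG signature. `decode_jpeg_base64` is a hypothetical helper, and handling a `data:` URL prefix is an assumption, since the original code sends a bare base64 string.

```python
import base64
import binascii

JPEG_MAGIC = b"\xff\xd8\xff"  # first bytes of every JPEG file


def decode_jpeg_base64(payload):
    """Decode a base64 string (optionally a data URL) and check it looks like JPEG."""
    # Drop a "data:image/jpeg;base64," prefix if one is present.
    if payload.startswith("data:") and "," in payload:
        payload = payload.split(",", 1)[1]
    # Remove whitespace and line breaks, which strict validation would reject.
    payload = "".join(payload.split())
    try:
        data = base64.b64decode(payload, validate=True)
    except binascii.Error as err:
        raise ValueError(f"Invalid base64 payload: {err}") from err
    if not data.startswith(JPEG_MAGIC):
        raise ValueError("Decoded data does not start with a JPEG signature")
    return data


# Fake payload: a JPEG signature followed by placeholder bytes.
sample = base64.b64encode(JPEG_MAGIC + b"\xe0rest-of-image").decode("ascii")
data = decode_jpeg_base64("data:image/jpeg;base64," + sample)
print(len(data))
```

Inside `onDataReady`, a `ValueError` from this helper could be caught and reported in the text area instead of being passed on to the barcode reader.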

4. Decode Barcodes from Scanned Documents

Finally, we can use Dynamsoft Barcode Reader to decode barcodes from the base64 string and display the result in the text area.

from dbr import *
import base64

# Initialize Dynamsoft Barcode Reader
reader = BarcodeReader()
# Apply for a trial license https://www.dynamsoft.com/customer/license/trialLicense?product=dbr&source=codepool
reader.init_license('DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==')

class Backend(QObject):
    @pyqtSlot(str)
    def onDataReady(self, base64img):
        imgdata = base64.b64decode(base64img)

        try:
            text_results = reader.decode_file_stream(bytearray(imgdata), '')
            if text_results is not None:
                out = ''
                for text_result in text_results:
                    out += "Barcode Format : "
                    out += text_result.barcode_format_string + '\n'
                    out += "Barcode Text : "
                    out += text_result.barcode_text + '\n'
                    out += "-------------------------------------------------" + '\n'

                text_area.setText(out)
        except BarcodeReaderError as bre:
            print(bre)

When the web page calls backend.onDataReady, the slot decodes the base64 string into image bytes and passes them to reader.decode_file_stream. For each barcode found, the format and text are appended to the output shown in text_area. If decoding fails, the BarcodeReaderError is printed to the console.
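The formatting step can be separated from the Qt slot so that it is easy to test on its own. This is a sketch under stated assumptions: `format_results` is a hypothetical helper, the "No barcode found." message is an addition not present in the original code, and `SimpleNamespace` objects stand in for the result objects, which expose `barcode_format_string` and `barcode_text` as used above.

```python
from types import SimpleNamespace

SEPARATOR = "-" * 40


def format_results(text_results):
    """Turn a list of barcode results into the text shown in the text area."""
    if not text_results:
        # decode_file_stream results are None when nothing was found.
        return "No barcode found."
    lines = []
    for result in text_results:
        lines.append(f"Barcode Format : {result.barcode_format_string}")
        lines.append(f"Barcode Text : {result.barcode_text}")
        lines.append(SEPARATOR)
    return "\n".join(lines) + "\n"


# Stand-in results for demonstration only.
fake_results = [
    SimpleNamespace(barcode_format_string="QR_CODE", barcode_text="hello"),
    SimpleNamespace(barcode_format_string="CODE_128", barcode_text="12345"),
]
print(format_results(fake_results))
```

With this helper, the slot body would reduce to `text_area.setText(format_results(text_results))`.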

Qt application: document scanning and barcode reading

Source Code

https://github.com/yushulx/web-twain-document-scan-management/tree/main/examples/qt
