Building a Python Flask Web Document Scanner Using Dynamsoft Document SDK

The document-scanner-sdk package provides Python bindings for the Dynamsoft C/C++ Document Scanner SDK v1.x, enabling developers to quickly create document scanner applications for Windows and Linux desktop environments. This article demonstrates how to build a web-based document scanner using Python Flask and the Python Document Scanner SDK. The application allows you to capture documents using a connected camera, process them on the server, and display the scanned results directly in a web browser.


Prerequisites

  • Obtain a 30-day free trial license for the Dynamsoft Document Normalizer SDK.
  • Install the required dependencies:

      pip install flask document-scanner-sdk opencv-python
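Before going further, it can help to confirm that the three packages are importable. The snippet below is a minimal sketch that uses only the standard library. The helper name missing_modules is illustrative, and the module names checked (flask, docscanner, cv2) are the import names used in this article.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names differ from the pip package names:
    # document-scanner-sdk -> docscanner, opencv-python -> cv2.
    missing = missing_modules(["flask", "docscanner", "cv2"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

An empty list means every module was found; any name that is printed needs to be installed with pip first.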

Creating a Scanner Class for Document Processing

Start by creating a file named document.py (the later examples import the class from the document module) and defining a Scanner class to handle document image processing:

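import cv2
import numpy as np
import docscanner

# Replace LICENSE-KEY with your own trial license key.
docscanner.initLicense("LICENSE-KEY")


class Scanner(object):
    def __init__(self):
        self.scanner = docscanner.createInstance()
        # Configure the scanner with the color template.
        self.scanner.setParameters(docscanner.Templates.color)

    def __del__(self):
        # No explicit cleanup is performed here.
        pass

    def detect_edge(self, image, enabled_transform=False):
        results = self.scanner.detectMat(image)
        normalized_image = None
        for result in results:
            x1 = result.x1
            y1 = result.y1
            x2 = result.x2
            y2 = result.y2
            x3 = result.x3
            y3 = result.y3
            x4 = result.x4
            y4 = result.y4

            # Outline the detected quadrilateral on the input image.
            cv2.drawContours(
                image, [np.intp([(x1, y1), (x2, y2), (x3, y3), (x4, y4)])], 0, (0, 255, 0), 2)

            if enabled_transform:
                # Rectify the document, then convert the result to a cv2 matrix.
                normalized_image = self.scanner.normalizeBuffer(
                    image, x1, y1, x2, y2, x3, y3, x4, y4)
                normalized_image = docscanner.convertNormalizedImage2Mat(
                    normalized_image)

        return image, normalized_image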


The Scanner class relies on the following SDK functions:

  • initLicense: Initializes the SDK with the provided license key.
  • createInstance: Creates an instance of the document scanner.
  • setParameters: Configures the scanner parameters using the color template.
  • detectMat: Detects the document edges in the input image.
  • normalizeBuffer: Normalizes the document image based on detected edges.
  • convertNormalizedImage2Mat: Converts the normalized image to a cv2 matrix.
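Each detection result exposes its corners as x1, y1 through x4, y4, which detect_edge reads one attribute at a time. As a hedged sketch, the helpers below collect those corners into a list and estimate how large the rectified page would be from the edge lengths. The names quad_points and estimate_size are illustrative and not part of the SDK, a SimpleNamespace stands in for a real detection result, and the corner order (consecutive corners around the quadrilateral) is an assumption.

```python
import math
from types import SimpleNamespace


def quad_points(result):
    """Collect the four corners of a detection result as (x, y) tuples."""
    return [(getattr(result, f"x{i}"), getattr(result, f"y{i}"))
            for i in range(1, 5)]


def estimate_size(points):
    """Estimate (width, height) of the flattened page from its corners.

    Assumes the corners are listed consecutively around the quadrilateral,
    so edges 1-2 and 3-4 are opposite, as are edges 2-3 and 4-1.
    """
    p1, p2, p3, p4 = points
    width = max(math.dist(p1, p2), math.dist(p3, p4))
    height = max(math.dist(p2, p3), math.dist(p4, p1))
    return round(width), round(height)


# Stand-in for one item returned by detectMat (hypothetical values).
fake_result = SimpleNamespace(x1=0, y1=0, x2=300, y2=0,
                              x3=300, y3=400, x4=0, y4=400)
corners = quad_points(fake_result)
print(corners)
print(estimate_size(corners))
```

This is only meant to illustrate the geometry; the SDK's normalizeBuffer decides the actual output size itself.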

Implementing a Simple Desktop Document Scanner

  1. Create a file and add the following code:

     import cv2
     from document import Scanner

     cap = cv2.VideoCapture(0)
     scanner = Scanner()
     while cap.isOpened():
         ret, frame = cap.read()
         video_frame = None
         image_frame = None
         # Poll the keyboard once per frame so a key press is not consumed twice.
         key = cv2.waitKey(10) & 0xFF
         if key == ord('q'):
             break
         if ret:
             if key == ord('p'):
                 # Detect the edges and rectify the document.
                 video_frame, image_frame = scanner.detect_edge(frame, True)
             else:
                 video_frame, _ = scanner.detect_edge(frame)
             if video_frame is not None:
                 cv2.imshow("Edge Detection", video_frame)
             if image_frame is not None:
                 cv2.imshow("Rectified Document", image_frame)

     # Free the camera and close the preview windows on exit.
     cap.release()
     cv2.destroyAllWindows()

    This code captures frames from the camera using OpenCV, continuously detects document edges, and displays the rectified document when the p key is pressed. Press q to exit the application.

  2. Run the script with Python to start the desktop document scanner.


    Figure: Python desktop document scanner
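The desktop loop masks the return value of cv2.waitKey with 0xFF before comparing it with a key. A small, hypothetical helper (key_action is not part of OpenCV) illustrates why: the mask keeps only the low byte, so any high bits are ignored and the "no key pressed" value -1 becomes 255, which matches neither q nor p.

```python
def key_action(raw_key):
    """Map a cv2.waitKey return value to an action name."""
    key = raw_key & 0xFF  # keep only the low byte
    if key == ord('q'):
        return "quit"
    if key == ord('p'):
        return "capture"
    return "preview"


print(key_action(ord('q')))             # quit
print(key_action(ord('p') | 0x100000))  # capture: high bits are ignored
print(key_action(-1))                   # preview: -1 & 0xFF == 255
```

Reading the key once per iteration and branching on the stored value also avoids calling waitKey twice in the same frame.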

Building a Flask Web Document Scanner

  1. Create a file to manage the camera, detect document edges, and rectify the document:

     import cv2
     from document import Scanner


     class VideoCamera(object):
         def __init__(self):
             self.cap = cv2.VideoCapture(0)
             self.is_record = False
             self.out = None
             self.transformed_frame = None
             self.scanner = Scanner()
             self.cached_frame = None

         def __del__(self):
             # Release the camera when the object is destroyed.
             self.cap.release()

         def get_video_frame(self):
             ret, frame = self.cap.read()
             if not ret:
                 return None
             frame, _ = self.scanner.detect_edge(frame)
             ret, jpeg = cv2.imencode('.jpg', frame)
             # Cache the encoded bytes so the stream can resend the last frame.
             self.cached_frame = jpeg.tobytes()
             return self.cached_frame

         def capture_frame(self):
             ret, frame = self.cap.read()
             if not ret:
                 return None
             _, document = self.scanner.detect_edge(frame, True)
             if document is None:
                 # No document was detected, so there is nothing to encode.
                 return None
             ret, jpeg = cv2.imencode('.jpg', document)
             self.transformed_frame = jpeg.tobytes()
             return self.transformed_frame

         def get_cached_frame(self):
             return self.cached_frame

         def get_image_frame(self):
             return self.transformed_frame

  2. Create an HTML template with an image element to display the camera frames served by Flask:

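     <!DOCTYPE html>
     <html>
     <head><title>Document Scanner</title></head>
     <body>
       <h1>Document Edge Detection and Perspective Transformation</h1>
       <div id="controller">
         <button id="capture">Capture</button>
         <!-- Set src to the location of controller.js. -->
         <script type="text/javascript" src=""></script>
       </div>
       <!-- Set src to the video stream served by video_viewer(). -->
       <img id="video" src="" width="640" height="480">
       <img id="image" style="max-width:640px; max-height:480px">
     </body>
     </html>
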
  3. Encode each frame as a JPEG image and stream it to the client:

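     from flask import Flask, Response
     # Also import VideoCamera from the file created in step 1.

     app = Flask(__name__)
     video_camera = None

     def video_frame():
         global video_camera
         if video_camera is None:
             video_camera = VideoCamera()
         while True:
             frame = video_camera.get_video_frame()
             if frame is None:
                 # Reading failed: fall back to the last cached frame.
                 frame = video_camera.get_cached_frame()
             if frame is not None:
                 yield (b'--frame\r\n'
                        b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')

     # The route path is assumed here; use it as the src of the video image element.
     @app.route('/video_viewer')
     def video_viewer():
         return Response(video_frame(),
                         mimetype='multipart/x-mixed-replace; boundary=frame')
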
  4. Create a controller.js file to handle the capture event:

     var buttonCapture = document.getElementById("capture");
     buttonCapture.onclick = function() {
         var xhr = new XMLHttpRequest();
         xhr.onreadystatechange = function() {
             if (xhr.readyState == 4 && xhr.status == 200) {
                 // Append a timestamp so the browser does not reuse a cached image.
                 var image = document.getElementById("image");
                 image.src = "/image_viewer?" + new Date().getTime();
             }
         };
         xhr.open("POST", "/capture_status");
         xhr.setRequestHeader("Content-Type", "application/json;charset=UTF-8");
         xhr.send(JSON.stringify({ status: "true" }));
     };

  5. Stream the transformed document image to the client:

     def image_frame():
         global video_camera
         if video_camera is None:
             video_camera = VideoCamera()
         frame = video_camera.get_image_frame()
         if frame is not None:
             yield (b'--frame\r\n'
                    b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')

     # controller.js requests this path after a successful capture.
     # app is the Flask application object.
     @app.route('/image_viewer')
     def image_viewer():
         return Response(image_frame(),
                         mimetype='multipart/x-mixed-replace; boundary=frame')

  6. Run the Flask application.

  7. Open the address that Flask prints on startup in your web browser to use the document scanner.
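controller.js posts to /capture_status, but the handler for that route does not appear in the steps above. The following is only a sketch of what such a handler could do, written as a plain function so that it runs without Flask or a camera. handle_capture and FakeCamera are hypothetical names, the response fields are invented for illustration, and the JSON field status mirrors what controller.js sends. In a Flask application the same logic would sit inside a view registered for POST requests on /capture_status.

```python
import json


class FakeCamera:
    """Stand-in for VideoCamera so the sketch runs without hardware."""

    def __init__(self):
        self.transformed_frame = None

    def capture_frame(self):
        # The real class would grab a frame and rectify the document.
        self.transformed_frame = b"jpeg-bytes"

    def get_image_frame(self):
        return self.transformed_frame


def handle_capture(camera, body):
    """Process the JSON body sent by controller.js.

    Returns a (response_dict, http_status) pair.
    """
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return {"result": "invalid JSON"}, 400
    if not isinstance(payload, dict) or payload.get("status") != "true":
        return {"result": "ignored"}, 400
    camera.capture_frame()
    if camera.get_image_frame() is None:
        return {"result": "no document captured"}, 500
    return {"result": "captured"}, 200


camera = FakeCamera()
print(handle_capture(camera, '{"status": "true"}'))
print(handle_capture(camera, 'not json'))
```

Because controller.js refreshes the image only on a 200 response, returning another status when nothing was captured keeps a stale image from being reloaded.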

    Figure: Python Flask web document scanner
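The streaming steps rely on the multipart/x-mixed-replace response type: the server keeps one HTTP response open and sends each JPEG as a new part, and the browser replaces the displayed image whenever a part arrives. The sketch below builds such parts from plain byte strings. mjpeg_part and mjpeg_stream are illustrative names, and the boundary name frame matches the one used in the Flask responses.

```python
BOUNDARY = b"frame"  # must match "boundary=frame" in the response mimetype


def mjpeg_part(jpeg_bytes):
    """Wrap one encoded JPEG image as a multipart body part."""
    return (b"--" + BOUNDARY + b"\r\n"
            b"Content-Type: image/jpeg\r\n\r\n" + jpeg_bytes + b"\r\n")


def mjpeg_stream(frames):
    """Yield one part per frame, skipping frames that failed to encode."""
    for jpeg_bytes in frames:
        if jpeg_bytes is not None:
            yield mjpeg_part(jpeg_bytes)


# Fake payloads stand in for real JPEG data from cv2.imencode.
parts = list(mjpeg_stream([b"first", None, b"second"]))
print(len(parts))
print(parts[0])
```

A generator like mjpeg_stream can be passed to a Flask Response in the same way as video_frame, since Flask sends each yielded chunk as it is produced.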

Source Code