Recognizing SEMI OCR Font with Python and Dynamsoft Capture Vision SDK
The SEMI (Semiconductor Equipment and Materials International) font is a dot matrix font used for marking silicon wafers. In this tutorial, we’ll build a Python application that recognizes these markings using the Dynamsoft Capture Vision SDK.
Demo: Recognize SEMI Font with Python
Prerequisites
- Python 3.8 or later
- Dynamsoft Capture Vision Trial License: Get a 30-Day trial license key for the Dynamsoft Capture Vision SDK.
- Python Packages: Install the required Python packages using the following command:
  pip install dynamsoft-capture-vision-bundle opencv-python
  - dynamsoft-capture-vision-bundle: Python binding for Dynamsoft Capture Vision SDK.
  - opencv-python: For displaying source images and overlaying recognition results.
Key Features
- Specialized SEMI Font Recognition: Uses a custom model trained for single-density dot matrix fonts (uppercase letters A-Z and digits 0-9).
- Visual Feedback: Draws bounding boxes around recognized text.
- Batch Processing: Processes single images or entire directories.
- Cross-Platform: Works on Windows, Linux, and macOS.
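Since the model covers only uppercase letters and digits, a small post-check can flag recognition results that contain anything else. The helper below is a minimal sketch and not part of the SDK: `is_plausible_semi_text` and `SEMI_CHARSET` are hypothetical names, and the character set is assumed from the feature list above.

```python
import string

# Character set assumed from the feature list: uppercase A-Z and digits 0-9.
SEMI_CHARSET = frozenset(string.ascii_uppercase + string.digits)


def is_plausible_semi_text(text: str) -> bool:
    """Return True if text is non-empty and uses only the assumed SEMI characters."""
    return bool(text) and all(ch in SEMI_CHARSET for ch in text)
```

A result that fails this check is not necessarily wrong, but it may be worth reviewing by hand.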

Step 1: Initialize the SDK
Create a new Python file and initialize the SDK with your license key:
from dynamsoft_capture_vision_bundle import *

err_code, err_str = LicenseManager.init_license("LICENSE-KEY")
if err_code != EnumErrorCode.EC_OK and err_code != EnumErrorCode.EC_LICENSE_CACHE_USED:
    print("License initialization failed: " + err_str)
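To avoid hard-coding the license key in the script, one option is to read it from an environment variable. The sketch below uses only the standard library; the helper name `load_license_key` and the variable name `DYNAMSOFT_LICENSE_KEY` are assumptions made for illustration, not something the SDK defines.

```python
import os


def load_license_key(env_var: str = "DYNAMSOFT_LICENSE_KEY") -> str:
    """Read the license key from an environment variable (the name is hypothetical)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your license key.")
    return key
```

The returned string could then be passed to `LicenseManager.init_license()` in place of the "LICENSE-KEY" literal.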
Step 2: Load the SEMI OCR Model
A custom model trained by Dynamsoft enables the Capture Vision SDK to recognize SEMI fonts:
cvr = CaptureVisionRouter()

# Load the SEMI OCR model
with open('models/semi-ocr.data', 'rb') as f:
    model_data = f.read()

err_code, err_str = cvr.append_model_buffer('semi-ocr', model_data, 1)
if err_code != EnumErrorCode.EC_OK:
    print("Model loading failed: " + err_str)
For model-related questions, please contact Dynamsoft Support.
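A missing or empty model file is easier to diagnose if it is caught before the buffer is handed to the SDK. The following is a minimal sketch of such a check using only the standard library; `read_model_bytes` is a hypothetical helper, not an SDK function.

```python
from pathlib import Path


def read_model_bytes(model_path: str) -> bytes:
    """Return the model file's contents, failing early with a clear message."""
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"Model file not found: {path.resolve()}")
    data = path.read_bytes()
    if not data:
        raise ValueError(f"Model file is empty: {path}")
    return data
```

The bytes returned by `read_model_bytes('models/semi-ocr.data')` could be used in place of `model_data` in the step above.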
Step 3: Load Recognition Settings
Besides the model file, recognition settings must be loaded from a semi-ocr.json file.
err_code, err_str = cvr.init_settings_from_file("semi-ocr.json")
if err_code != EnumErrorCode.EC_OK:
    print("Configuration loading failed: " + err_str)
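The template name passed to `capture()` later must match a template defined in the settings file, so it can help to list the names the file declares. The sketch below is a hypothetical helper that assumes the JSON has a top-level "CaptureVisionTemplates" array whose entries carry a "Name" field; if your semi-ocr.json is laid out differently, adjust the keys accordingly.

```python
import json
from typing import List


def template_names(settings_path: str) -> List[str]:
    """List template names, assuming a top-level "CaptureVisionTemplates" array."""
    with open(settings_path, "r", encoding="utf-8") as f:
        settings = json.load(f)
    templates = settings.get("CaptureVisionTemplates", [])
    # Skip entries that are not objects or that lack a "Name" field.
    return [t["Name"] for t in templates if isinstance(t, dict) and "Name" in t]
```

Under that assumption, "recognize_semi_ocr" should appear in the list returned for semi-ocr.json.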
Step 4: Implement Image Processing
Here’s the core recognition logic that processes images and displays results:
import os
import cv2
import numpy as np

# ANSI escape codes for highlighting recognized text in the terminal
RED = "\033[91m"
RESET = "\033[0m"

def process_image(image_path, cvr):
    cv_image = cv2.imread(image_path)
    if cv_image is None:
        print("Failed to load image: " + image_path)
        return
    result = cvr.capture(image_path, "recognize_semi_ocr")
    if result.get_error_code() != EnumErrorCode.EC_OK:
        print(f"Error: {result.get_error_code()} {result.get_error_string()}")
    else:
        for item in result.get_items():
            if isinstance(item, TextLineResultItem):
                print(f"{RED}{item.get_text()}{RESET}")
                location = item.get_location()
                points = [(p.x, p.y) for p in location.points]
                # Outline the text line and label it with the recognized string
                cv2.drawContours(cv_image, [np.intp(points)], 0, (0, 255, 0), 2)
                cv2.putText(cv_image, item.get_text(), (points[0][0] + 10, points[0][1] + 20),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)
    cv2.imshow(os.path.basename(image_path), cv_image)
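The location of each text line is a four-point quadrilateral. If an axis-aligned rectangle is more convenient, for example for cropping or for `cv2.rectangle`, the points can be reduced to a bounding box. This is a minimal pure-Python sketch; `quad_to_bbox` is a hypothetical helper that takes the same `(x, y)` tuples built in `process_image`.

```python
from typing import Iterable, Tuple


def quad_to_bbox(points: Iterable[Tuple[int, int]]) -> Tuple[int, int, int, int]:
    """Return (x, y, width, height) of the axis-aligned box enclosing the points."""
    pts = list(points)
    if not pts:
        raise ValueError("at least one point is required")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)
```

For a slightly rotated text line with corners (10, 20), (110, 25), (108, 60), and (8, 55), the function returns (8, 20, 102, 40).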
Step 5: Create the Main Application Loop
Add a loop to handle single files or directories:
import os

def main():
    cvr = CaptureVisionRouter()
    # ... initialization code from previous steps ...
    while True:
        path = input("Enter image path or directory (Q to quit): ").strip()
        if path.lower() == "q":
            break
        if not os.path.exists(path):
            print("File not found: " + path)
            continue
        else:
            if os.path.isfile(path):
                process_image(path, cvr)
            elif os.path.isdir(path):
                files = os.listdir(path)
                for file in files:
                    if file.endswith(".jpg") or file.endswith(".jpeg") or file.endswith(".png"):
                        process_image(os.path.join(path, file), cvr)
            cv2.waitKey(0)
            cv2.destroyAllWindows()

if __name__ == '__main__':
    main()
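The directory scan can also be factored into a small helper that matches extensions case-insensitively and returns files in a stable order. The sketch below uses only the standard library; `list_image_files` and `IMAGE_EXTENSIONS` are hypothetical names, and the extension list mirrors the one used in the loop.

```python
import os
from typing import List

# Extensions taken from the directory loop in this step.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def list_image_files(directory: str) -> List[str]:
    """Return sorted paths of image files in directory, ignoring extension case."""
    paths = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path) and name.lower().endswith(IMAGE_EXTENSIONS):
            paths.append(full_path)
    return paths
```

Each returned path could be passed to `process_image()` in turn, which also picks up files such as WAFER.JPG that a case-sensitive comparison would skip.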
Step 6: Run the Python Script
Save the code as read_semi_ocr.py and run it from a terminal:
python read_semi_ocr.py

Source Code
https://github.com/yushulx/python-barcode-qrcode-sdk/tree/main/examples/official/semi_font_ocr