From Text Recognition to Data Control

Jan 08, 2021

Optical Character Recognition (OCR) helps users capture and recognize text in images. However, basic OCR cannot meet the growing need for data control: in complex scenarios, we may need to extract critical data from a specified region only. Dynamsoft has developed new extraction techniques, powered by the Dynamsoft Label Recognition SDK, to give developers that control. Let us show you how they work.

Auto-Detect

By default, the Dynamsoft Label Recognition SDK detects text regions automatically, so you get all the text in one result. This is efficient when the image contains only a single straight line of text. For these scenarios, Dynamsoft Label Recognition offers the automatic region detection mode DLR_RPM_AUTO.

settings.regionPredetectionModes[0] = DLRRegionPredetectionMode::DLR_RPM_AUTO;

Zonal OCR

If one image contains multiple text areas, developers can use zonal OCR in Dynamsoft Label Recognition to recognize only a specified text area. In this example, we are going to recognize voucher codes on the back of gift cards.

[Image: OCR gift cards]

As you can see, the voucher code is always in the lower-left corner. Dynamsoft Label Recognition offers a flexible API that lets you specify a single region and avoid capturing unwanted text. The following code shows how to use referenceRegion and textArea in RuntimeSettings to control OCR results.

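    char error[512];

    // Start from the current runtime settings
    DLRRuntimeSettings settings;
    dlr.GetRuntimeSettings(&settings);

    // Reference region in percentage coordinates: left 20% of the width, full height
    settings.referenceRegion = { { {0,0}, {20,0}, {20,100}, {0,100} }, 1 };
    // Text area: y from 80% to 100%, where the voucher code is printed
    settings.textArea = { { {0,80}, {20,80}, {20,100}, {0,100} } };
    dlr.UpdateRuntimeSettings(&settings, error, 512);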

The coordinates are expressed as percentages. {x1,y1}, {x2,y2}, {x3,y3}, {x4,y4} are four points, usually given clockwise from the upper-left corner to the lower-left corner. X and Y each take a value from 0 to 100, so a point lies at X% along the x-axis and Y% along the y-axis.

In this case, the four points {0,0}, {20,0}, {20,100}, {0,100} define the reference region. The resulting region is shown below.

[Image: OCR specific region]

No matter how the image is scaled, this percentage-based area does not change. That makes automated batch processing possible and reduces manual work. For example, you can use a template to scan a large number of forms: configure the reference region and text area on one image, save them as a JSON template file, and then reuse that template for other documents in subsequent workflows.
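To make the percentage coordinates concrete, here is a minimal sketch of how such a region could be mapped to pixel positions for images of different sizes. The types PercentPoint and PixelPoint and the functions toPixels and regionToPixels are hypothetical names invented for this illustration; they are not part of the Dynamsoft Label Recognition API, and the SDK performs its own mapping internally.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration only; not part of the SDK.
struct PercentPoint { int x; int y; };  // each value in the range 0..100
struct PixelPoint   { int x; int y; };

// Map one percentage point to pixel coordinates for a given image size.
PixelPoint toPixels(PercentPoint p, int imageWidth, int imageHeight) {
    assert(imageWidth > 0 && imageHeight > 0);
    assert(p.x >= 0 && p.x <= 100 && p.y >= 0 && p.y <= 100);
    return { p.x * imageWidth / 100, p.y * imageHeight / 100 };
}

// Map a four-point region (clockwise from the upper-left corner) to pixels.
std::array<PixelPoint, 4> regionToPixels(const std::array<PercentPoint, 4>& region,
                                         int imageWidth, int imageHeight) {
    std::array<PixelPoint, 4> out{};
    for (std::size_t i = 0; i < region.size(); ++i) {
        out[i] = toPixels(region[i], imageWidth, imageHeight);
    }
    return out;
}
```

With the reference region {0,0}, {20,0}, {20,100}, {0,100}, the right edge falls at x = 200 on a 1000 x 600 image and at x = 400 on a 2000 x 1200 image. The pixel values differ, but the region still covers the same left 20% of the picture.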

Get the Source Code on Windows & Linux

Working with a Barcode

So, how do you determine the exact percentages for region control? It becomes much easier when the text sits near a barcode. If there is a small barcode on a large image, we recommend using the RelativeBarcodeRegions argument in pre-detection mode to speed up localization and improve recognition accuracy.

With the help of the Dynamsoft Barcode Reader SDK, developers can quickly decode barcodes and store the result. Dynamsoft Label Recognition provides two APIs that work with barcode results: RecognizeBasedOnDBRResultsByBuffer and RecognizeBasedOnDBRResultsByFile.

After getting the results, you can also compare the barcode result to the OCR result.

CLabelRecognition* recognizer = new CLabelRecognition();
recognizer->InitLicense("t0260NwAAAHV***************");
// Generate imageData from somewhere else
// Option 1: recognize from an image buffer in memory
int errorCode = recognizer->RecognizeBasedOnDBRResultsByBuffer(imageData, "");
// Option 2: recognize from an image file on disk
errorCode = recognizer->RecognizeBasedOnDBRResultsByFile("C:\\Program Files (x86)\\Dynamsoft\\{Version number}\\Images\\Sample.png", "");
delete recognizer;
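
Comparing the barcode result with the OCR result, as mentioned above, could look like the following sketch. It assumes both results are already available as plain strings; the helpers normalizeCode and resultsMatch are hypothetical and not part of either SDK, and the normalization rules (ignore whitespace and hyphens, ignore case) are only one possible choice.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: drop whitespace and hyphens, uppercase everything else.
std::string normalizeCode(const std::string& raw) {
    std::string out;
    for (unsigned char c : raw) {
        if (std::isspace(c) || c == '-') {
            continue;
        }
        out.push_back(static_cast<char>(std::toupper(c)));
    }
    return out;
}

// True when the barcode text and the OCR text agree after normalization.
bool resultsMatch(const std::string& barcodeText, const std::string& ocrText) {
    return normalizeCode(barcodeText) == normalizeCode(ocrText);
}
```

A mismatch does not say which of the two results is wrong, so in practice it would more likely trigger a rescan or a manual check than an automatic correction.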

Read Texts with a Specific Background Color

In our daily lives, we see a large number of colorful labels carrying rich text information. A common example is a price tag with text on a yellow background. Detecting the background area first can shorten recognition time.

[Image: Read Texts with a Specific Background Color]

If you want to specify the set of foreground and background colors used for region detection, we recommend using the ForeAndBackgroundColours argument.
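
As a rough illustration of why a known background color helps, the sketch below tests whether a pixel's hue is close to a target hue such as yellow. This is a generic RGB-to-hue check written for this article, not the SDK's actual detection algorithm, and it does not show the value format of ForeAndBackgroundColours. The functions hueOf and matchesBackground are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hue in degrees [0, 360) of an RGB pixel; returns -1 for grey pixels (no hue).
double hueOf(int r, int g, int b) {
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    const int maxC = std::max({r, g, b});
    const int minC = std::min({r, g, b});
    const int delta = maxC - minC;
    if (delta == 0) {
        return -1.0;
    }
    double hue;
    if (maxC == r) {
        hue = 60.0 * (static_cast<double>(g - b) / delta);
    } else if (maxC == g) {
        hue = 60.0 * (static_cast<double>(b - r) / delta + 2.0);
    } else {
        hue = 60.0 * (static_cast<double>(r - g) / delta + 4.0);
    }
    if (hue < 0.0) {
        hue += 360.0;
    }
    return hue;
}

// True when the pixel's hue lies within `tolerance` degrees of `targetHue`.
bool matchesBackground(int r, int g, int b, double targetHue, double tolerance) {
    const double hue = hueOf(r, g, b);
    if (hue < 0.0) {
        return false;  // grey, black, and white pixels never match a hue
    }
    double diff = std::fabs(hue - targetHue);
    diff = std::min(diff, 360.0 - diff);  // hue is circular
    return diff <= tolerance;
}
```

Pure yellow (255, 255, 0) has a hue of 60 degrees, so a call such as matchesBackground(255, 255, 0, 60.0, 15.0) returns true, while white or blue pixels are rejected. Restricting text recognition to areas where most pixels match is one way a detector can skip the rest of the image.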

In Summary

Dynamsoft Label Recognition is both a text recognition SDK and a data control tool. By restricting recognition to the regions that matter, developers can take full control of the data and improve recognition accuracy. Learn more details about defining multiple reference regions and text areas.

Take the Next Step

Want to get more out of your images? Please check out our Online Documentation or speak to one of our Technical Support Members.