Design of the CaptureVisionTemplate Object
A CaptureVisionTemplate object is the entry object of a parameter template in the Dynamsoft Capture Vision (DCV) SDK.
{
"Name" : "CV_0",
"ImageSourceName": "ISA_0",
"ImageROIProcessingNameArray": ["TA_0" ],
"SemanticProcessingNameArray": ["SP_0"],
"OutputOriginalImage": 0,
"MaxParallelTasks" : 4,
"Timeout" : 500
}
Example 1 – Parameters of CaptureVisionTemplate
Summary of CaptureVisionTemplate top-level parameters
Parameter Name | Description |
---|---|
Name | Represents the name of the parameter template, which serves as its unique identifier. |
ImageSourceName | Indicates the input source name, which refers to an ImageSource object. It defines the input image source of DCV. |
ImageROIProcessingNameArray | Represents the collection of image ROI processing object names, which refer to TargetROIDef objects. It defines the recognition tasks performed on ROIs of an image, including reading barcodes, recognizing labels, or detecting document quadrilaterals. |
SemanticProcessingNameArray | Represents the collection of semantic-processing object names, which refer to SemanticProcessing objects. It defines the post-processing code parsing tasks performed on input text/bytes. |
OutputOriginalImage | Indicates whether DCV outputs the original input image. |
MaxParallelTasks | Indicates the maximum number of parallel tasks for the DCV runtime. |
MinImageCaptureInterval | Specifies the minimum time interval (in milliseconds) allowed between consecutive image captures. |
Timeout | Indicates the maximum amount of time (in milliseconds) that the recognition tasks should take per page. |
Table 1 – Parameters Summary of CaptureVisionTemplate
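Because the two name arrays refer to objects defined elsewhere in the template, a mistyped name is an easy mistake to make. The following Python sketch is illustrative only and is not part of the DCV SDK: the helper unresolved_references and the sets of defined names passed to it are hypothetical, and the only assumption is that every name in the two arrays should match a defined object.

```python
def unresolved_references(cv_template, roi_names, semantic_names):
    """Return (parameter, name) pairs that do not match any defined object.

    Hypothetical helper: roi_names and semantic_names are the names of the
    TargetROIDef and SemanticProcessing objects defined in the same template.
    """
    checks = [
        ("ImageROIProcessingNameArray", roi_names),
        ("SemanticProcessingNameArray", semantic_names),
    ]
    missing = []
    for parameter, defined in checks:
        # A missing or null array is treated as "no references".
        for name in cv_template.get(parameter) or []:
            if name not in defined:
                missing.append((parameter, name))
    return missing


# The CaptureVisionTemplate object from Example 1.
cv_template = {
    "Name": "CV_0",
    "ImageSourceName": "ISA_0",
    "ImageROIProcessingNameArray": ["TA_0"],
    "SemanticProcessingNameArray": ["SP_0"],
    "OutputOriginalImage": 0,
    "MaxParallelTasks": 4,
    "Timeout": 500,
}

# Hypothetical situation: TA_0 is defined, but no SP_0 object exists.
problems = unresolved_references(cv_template, {"TA_0"}, set())
print(problems)
```

How DCV itself reports an unresolved name is not covered here; the sketch only shows the name-to-object relationship that the table describes.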
Input Source Configuration
In the parameter template, the ImageSourceName parameter refers to the ImageSource object, which defines the image input source of DCV. When DCV starts capturing, it parses the referenced ImageSource object, converts it into an Image Source Adapter (ISA) object, and then continuously obtains images from it.
Captured Output Configuration
OutputOriginalImage, ImageROIProcessingNameArray, and SemanticProcessingNameArray are three parameters that control the captured output, which DCV organizes as a CapturedResult interface. CapturedResult represents the set of all captured result items on an image. Each type of result item represents the output of a different task type. The following figure shows the approximate relationship between DCV output parameters and output results.
Figure 1 – Relationship between DCV parameters and output result item types
As illustrated in Figure 1, the left column lists the DCV output parameters, while the right column lists the DCV output result item types. The dashed lines between the two show the approximate relationship between parameters and result item types.
The ImageROIProcessingNameArray parameter can produce results such as BarcodeResultItem, TextLineResultItem, DetectedQuadResultItem, and NormalizedImageResultItem, while the SemanticProcessingNameArray parameter can produce ParsedResultItem results. This is because the ImageROIProcessingNameArray parameter refers to one or more TargetROIDef objects, and the SemanticProcessingNameArray parameter refers to one or more SemanticProcessing objects.
Next, we will focus on the core design of the TargetROIDef and SemanticProcessing objects.
Core Design of TargetROIDef Object
The TargetROIDef object is used to specify one or more recognition tasks to be performed on some regions of interest (ROIs) within an image. In simple terms, TargetROIDef can be expressed using the following formula:
TargetROIDef = Recognition Task Definition + Spatial Location Definition
Key Concepts
The following figure and table briefly illustrate the key concepts involved.
Figure 2 – An example showing the key concepts
Concept | Description | Explanation with example |
---|---|---|
Recognition Tasks | The tasks include barcode recognition, label recognition, document boundary detection, etc. | ROI1 and ROI2 are two TargetROIDef objects. A task named barcode_task is configured on ROI1 and a task named label_task is configured on ROI2 . |
Atomic Result | Represents the atomic result of the recognition task output. It can be a color detection region, a barcode, a text line, a table cell, a detected quadrilateral, etc. | T1 , T2 , T3 are three atomic result objects of TextLineResultItem type, and B1 is one atomic object of BarcodeResultItem type. |
Spatial Location | A Location configured on a TargetROIDef may generate zero or more target regions on which to perform a specific recognition task. | ROI1.Location is defined as null , which means only one target region is generated. ROI2.Location is defined as an upward offset relative to the output regions of ROI1 . |
Reference Region | A reference region is a physical quadrilateral region. There are two types: the entire image region and the atomic result region. The former refers to the quadrilateral extent of the original image, and the latter refers to the quadrilateral extent of each atomic result. | ROI1 has only one reference region, which is the entire image region. ROI2 has three reference regions, which are generated from T1 , T2 , T3 . |
Target Region | A target region is a physical quadrilateral region, which is calculated from a reference region and an offset. | ROI1 has only one target region, which is equal to the reference region. ROI2 has three target regions, which are calculated by applying offsets to the quadrilateral regions of T1 , T2 , T3 . |
Table 2 – Key Concepts of TargetROIDef
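The relationship "target region = reference region + offset" can be sketched in a few lines of Python. This is a simplified model, not DCV's actual geometry: the Region class, the target_region helper, the coordinates, and the offset representation (edge shifts expressed as fractions of the reference region's width and height) are all assumptions made for illustration, and regions are axis-aligned rectangles rather than general quadrilaterals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned stand-in for a quadrilateral region, in pixels."""
    left: float
    top: float
    right: float
    bottom: float


def target_region(reference, offset=None):
    """Compute a target region from a reference region and an offset.

    offset is a hypothetical (d_left, d_top, d_right, d_bottom) tuple whose
    values are fractions of the reference width (left/right) or height
    (top/bottom). None plays the role of a null offset: the target region
    equals the reference region.
    """
    if offset is None:
        return reference
    width = reference.right - reference.left
    height = reference.bottom - reference.top
    d_left, d_top, d_right, d_bottom = offset
    return Region(
        reference.left + d_left * width,
        reference.top + d_top * height,
        reference.right + d_right * width,
        reference.bottom + d_bottom * height,
    )


# ROI1: one reference region (the entire image) and no offset.
entire_image = Region(0, 0, 800, 600)
roi1_target = target_region(entire_image)

# ROI2: one reference region per atomic result (made-up coordinates),
# each shifted upwards by one region height.
atomic_results = [
    Region(100, 200, 300, 240),
    Region(100, 300, 300, 340),
    Region(100, 400, 300, 440),
]
SHIFT_UP = (0.0, -1.0, 0.0, -1.0)
roi2_targets = [target_region(r, SHIFT_UP) for r in atomic_results]
```

Each reference region yields exactly one target region here, which mirrors the table: one target region for ROI1 and three for ROI2.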
Workflow Design Based on Reference/Target Regions
Using the recursive definition of the reference relationship between different TargetROIDef objects, a complex workflow can be constructed to meet the requirements of complex scenarios.
Let’s consider the following example: a user wants to read the barcodes below P/N and above L/N in the image below. Upon analysis, we find the following positional dependencies:
- The rough location of each target barcode (blue box) can be calculated separately from the location of the text lines (P/N and L/N).
- The text lines (P/N and L/N) are on the first and fifth lines inside the label border (green box).
Figure 3 – A sample image illustrating workflow design
The following JSON is a parameter template fragment that configures the ROI dependencies needed to solve this problem.
{
    "TargetROIDefOptions" : [
        {
            "Name" : "ddn_roi",
            "TaskSettingNameArray": [ "ddn_task" ],
            "Location" : null
        },
        {
            "Name" : "dlr_roi",
            "TaskSettingNameArray": [ "dlr_task" ],
            "Location":
            {
                "ReferenceObjectFilter" : {
                    "ReferenceTargetROIDefNameArray": ["ddn_roi"]
                },
                "Offset": null
            }
        },
        {
            "Name" : "dbr_roi1",
            "TaskSettingNameArray": [ "dbr_task" ],
            "Location":
            {
                "ReferenceObjectFilter" : {
                    "ReferenceTargetROIDefNameArray": ["dlr_roi"]
                },
                "Offset":{
                    // offset downwards
                }
            }
        },
        {
            "Name" : "dbr_roi2",
            "TaskSettingNameArray": [ "dbr_task" ],
            "Location":
            {
                "ReferenceObjectFilter" : {
                    "ReferenceTargetROIDefNameArray": ["dlr_roi"]
                },
                "Offset":{
                    // offset upwards
                }
            }
        }
    ]
}
Example 2 – DCV parameter template fragment illustrating workflow design
- It configures four TargetROIDef objects: ddn_roi, dlr_roi, dbr_roi1, and dbr_roi2.
- ddn_roi is configured to depend on the entire image region directly.
- dlr_roi is configured to depend on ddn_roi and has a null offset, which means it uses the same region as the output region of ddn_roi.
- dbr_roi1 is configured to depend on dlr_roi and has an offset configured to be shifted downwards from the output region of dlr_roi.
- dbr_roi2 is also configured to depend on dlr_roi and has an offset configured to be shifted upwards from the output region of dlr_roi.
Construct a Dependency Graph
When DCV parses the TargetROIDef objects in the above parameter template, it constructs a directed dependency graph based on the configured dependencies between different TargetROIDef objects. During actual execution, the tasks are executed in the order specified by the dependency graph. The following figure illustrates the generated dependency graph after parsing the above template fragment.
Figure 4 – Dependency Graph
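To make the idea concrete, the sketch below builds a dependency graph from the four TargetROIDef objects of Example 2 and derives one valid execution order with Python's standard graphlib module. It illustrates the concept only and is not DCV's implementation: the helper names build_dependency_graph and execution_order are hypothetical, and the elided offsets are replaced by empty objects.

```python
from graphlib import TopologicalSorter

# Example 2 as a Python dict; the elided offsets are left empty.
TEMPLATE = {
    "TargetROIDefOptions": [
        {"Name": "ddn_roi", "Location": None},
        {"Name": "dlr_roi", "Location": {
            "ReferenceObjectFilter": {
                "ReferenceTargetROIDefNameArray": ["ddn_roi"]},
            "Offset": None}},
        {"Name": "dbr_roi1", "Location": {
            "ReferenceObjectFilter": {
                "ReferenceTargetROIDefNameArray": ["dlr_roi"]},
            "Offset": {}}},
        {"Name": "dbr_roi2", "Location": {
            "ReferenceObjectFilter": {
                "ReferenceTargetROIDefNameArray": ["dlr_roi"]},
            "Offset": {}}},
    ]
}


def build_dependency_graph(template):
    """Map each TargetROIDef name to the set of names it depends on."""
    graph = {}
    for roi in template["TargetROIDefOptions"]:
        # A null Location means the ROI depends on the entire image only.
        location = roi.get("Location") or {}
        ref_filter = location.get("ReferenceObjectFilter") or {}
        names = ref_filter.get("ReferenceTargetROIDefNameArray") or []
        graph[roi["Name"]] = set(names)
    return graph


def execution_order(template):
    """Return the names in an order that runs dependencies first.

    TopologicalSorter raises graphlib.CycleError for circular references.
    """
    graph = build_dependency_graph(template)
    return list(TopologicalSorter(graph).static_order())


order = execution_order(TEMPLATE)
print(order)
```

In any valid order, ddn_roi comes first and dlr_roi second. dbr_roi1 and dbr_roi2 do not depend on each other, so their relative order is not fixed by the graph.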
Filter Out the Desired Reference Regions
In practical applications, a TargetROIDef that depends on another TargetROIDef object named roi1 may not need to depend unconditionally on all the reference regions generated by roi1. Instead, it may need to depend only on those regions that meet certain filtering conditions, such as text matching a specific regular expression or a barcode having a specific format.
Based on Example 2, regular expression filtering conditions are added to dbr_roi1 and dbr_roi2:
{
    "TargetROIDefOptions" : [
        //......
        {
            "Name" : "dbr_roi1",
            //......
            "Location":
            {
                "ReferenceObjectFilter" : {
                    "ReferenceTargetROIDefNameArray": ["dlr_roi"],
                    "TextLineFilteringCondition":
                    {
                        "LineStringRegExPattern": "^P/N"
                    }
                },
                //......
            }
        },
        {
            "Name" : "dbr_roi2",
            //......
            "Location":
            {
                "ReferenceObjectFilter" : {
                    "ReferenceTargetROIDefNameArray": ["dlr_roi"],
                    "TextLineFilteringCondition":
                    {
                        "LineStringRegExPattern": "^L/N"
                    }
                },
                //......
            }
        }
    ]
}
- dbr_roi1 is configured to depend only on the reference regions generated by dlr_roi that match the regular expression “^P/N”.
- dbr_roi2 is configured to depend only on the reference regions generated by dlr_roi that match the regular expression “^L/N”.
At runtime, reference regions that do not meet the filtering conditions are discarded and are not passed as inputs to subsequent TargetROIDef objects for further processing.
For more details about filtering reference objects, please refer to ReferenceObjectFilter.
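The filtering behavior described above can be approximated with Python's re module. This is a minimal sketch under stated assumptions: the helper filter_reference_regions and the sample text lines are made up, and the pattern is applied with search semantics, which may differ from how DCV evaluates LineStringRegExPattern.

```python
import re


def filter_reference_regions(text_lines, line_pattern):
    """Keep only the text lines whose text matches line_pattern.

    Hypothetical helper: each text line is a dict with a "text" key.
    Lines that do not match are discarded and would not be passed on
    to the dependent TargetROIDef.
    """
    regex = re.compile(line_pattern)
    return [line for line in text_lines if regex.search(line["text"])]


# Made-up text line results standing in for the output of a label task.
label_lines = [
    {"text": "P/N 0001", "line": 1},
    {"text": "QTY 10", "line": 3},
    {"text": "L/N 0002", "line": 5},
]

below_pn = filter_reference_regions(label_lines, "^P/N")
above_ln = filter_reference_regions(label_lines, "^L/N")
print(below_pn)
print(above_ln)
```

Each dependent ROI receives only the region it needs, so the two barcode ROIs never process the unrelated text line.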
Core Design of SemanticProcessing Object
The SemanticProcessing object is used to specify one or more tasks that analyze and extract information from image ROI processing results. The whole workflow typically involves the following concepts.
Prerequisites
A SemanticProcessing object takes effect when its name is referenced in CaptureVisionTemplate.SemanticProcessingNameArray.
Launch Timing
Semantic processing is triggered only after all the recognition tasks referenced in CaptureVisionTemplate.ImageROIProcessingNameArray have been completed.
Data Filtering
In many cases, the process may involve filtering data to select only the relevant sources, such as a label text meeting a specific regular expression or a barcode meeting a specific format. ReferenceObjectFilter is used to specify such data filtering criteria.
Task Execution
This is the main part of the workflow where the actual tasks are defined. TaskSettingNameArray is used to specify such tasks by referencing the name of a CodeParserTaskSetting object.
Results Reporting
Currently, semantic processing supports code parsing tasks, so the results are returned through the OnParsedResultsReceived callback.
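The stages above (launch timing, data filtering, task execution, and results reporting) can be modeled as a small pipeline. The Python sketch below is a toy model, not the DCV API: run_semantic_processing, parse_key_value, and the sample strings are hypothetical, and a plain Python callable stands in for the OnParsedResultsReceived callback.

```python
import re


def run_semantic_processing(roi_results, line_pattern, parse,
                            on_parsed_results_received):
    """Toy model of the semantic-processing stages.

    roi_results must already contain the output of every recognition
    task, which models the launch timing described above.
    """
    # Data filtering: keep only the relevant sources.
    regex = re.compile(line_pattern)
    selected = [text for text in roi_results if regex.search(text)]
    # Task execution: run the (hypothetical) parsing task on each source.
    parsed = [parse(text) for text in selected]
    # Results reporting: hand the parsed results to the callback.
    on_parsed_results_received(parsed)


def parse_key_value(text):
    """Made-up parser: split 'KEY VALUE' at the first space."""
    key, _, value = text.partition(" ")
    return {"key": key, "value": value}


received = []
run_semantic_processing(
    ["P/N 0001", "QTY 10", "L/N 0002"],   # made-up recognition results
    r"^(P/N|L/N)",
    parse_key_value,
    received.extend,                       # stand-in for the callback
)
print(received)
```

Passing the complete result list as an argument keeps the ordering constraint explicit: parsing cannot start until the caller has collected every recognition result.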