Evaluating MRZ Scanning Accuracy: Insights from Scanned and Photographed Passports

Aug 27, 2025 · Admin

The Machine Readable Zone (MRZ) is the section of passports and IDs designed for automated reading. Despite its restricted character set (A–Z, 0–9, <), reliable recognition is not always straightforward in practice.

Common challenges include:

  • Character confusables such as 0 vs. O, 1 vs. I, and 5 vs. S.
  • Image capture issues like skew, glare, and perspective distortion.
  • Preprocessing needs, since correctly detecting and normalizing the MRZ region is essential before recognition.

This evaluation explores how Dynamsoft Capture Vision (DCV) performs on MRZ recognition for both scanned and photographed passport images.

Methodology

We used MRZ samples from the MIDV-2020 dataset[1], covering both clean scans and challenging photo captures.

Each MRZ string (88 characters across two lines) was compared against ground truth annotations under a strict “all-or-nothing” metric:

  • Perfect match → Correct
  • Any mismatch → Mismatch
  • Nothing detected → No Result
  • Missing ground truth → Excluded

This method reflects real-world needs, where the MRZ is parsed as a whole rather than character by character.
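The four outcomes above can be sketched as a small scoring function. This is a minimal illustration, not the evaluation script used for this article: the names `Outcome` and `classify` are hypothetical, and stripping whitespace before comparison is an assumption made so that a two-line MRZ compares as one 88-character string.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    CORRECT = "Correct"
    MISMATCH = "Mismatch"
    NO_RESULT = "No Result"
    EXCLUDED = "Excluded"


def classify(ground_truth: Optional[str], predicted: Optional[str]) -> Outcome:
    """Score one image under the strict all-or-nothing metric."""
    # No annotation: the image cannot be scored at all.
    if not ground_truth:
        return Outcome.EXCLUDED
    # Annotation exists but the reader returned nothing.
    if not predicted:
        return Outcome.NO_RESULT
    # Assumption: line breaks and spaces are not significant, so remove them.
    gt = "".join(ground_truth.split())
    pr = "".join(predicted.split())
    # A single differing character makes the whole MRZ a mismatch.
    return Outcome.CORRECT if gt == pr else Outcome.MISMATCH
```

Only images scored as Correct or Mismatch enter the accuracy denominator under this scheme; Excluded and No Result are counted separately.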

Results of Scanned Passports: Nearly Error-Free

Note: The results in this section are based on Dynamsoft Capture Vision 3.0.

On 400 upright scans, DCV achieved 100% MRZ accuracy: every image with available ground truth was recognized correctly.

MRZ Recognition Accuracy on Scan Subset

Confusable Character Pairs

Note: The numbers in these tables represent the count of confusable characters, not the number of document images. Only 400 images were evaluated, but each image can contribute multiple characters to the totals.

Pair 0/O — 2×2

| Ground Truth \ Predicted | 0 | O |
|---|---|---|
| 0 | 2232 | 0 |
| O | 0 | 387 |

Out-of-pair preds: 0→other = 0, O→other = 0

Pair 1/I — 2×2

| Ground Truth \ Predicted | 1 | I |
|---|---|---|
| 1 | 1697 | 0 |
| I | 0 | 700 |

Out-of-pair preds: 1→other = 0, I→other = 0

Pair 5/S — 2×2

| Ground Truth \ Predicted | 5 | S |
|---|---|---|
| 5 | 858 | 0 |
| S | 0 | 626 |

Out-of-pair preds: 5→other = 5, S→other = 0
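A tally like the ones above can be produced by walking each ground-truth/prediction pair character by character. The sketch below is one way to do it and is not the script behind these tables: `pair_confusion` is a hypothetical helper, and it assumes predictions are only compared position by position when they have the same length as the ground truth.

```python
from collections import Counter
from typing import Iterable, Tuple


def pair_confusion(
    samples: Iterable[Tuple[str, str]], a: str, b: str
) -> Tuple[Counter, Counter]:
    """Count in-pair and out-of-pair predictions for the confusable pair (a, b).

    samples yields (ground_truth, predicted) MRZ strings.
    Returns (in_pair, out_of_pair):
      in_pair[(gt_char, pred_char)] holds the 2x2 table cells,
      out_of_pair[gt_char] counts predictions that fell outside the pair.
    """
    in_pair: Counter = Counter()
    out_of_pair: Counter = Counter()
    for gt, pred in samples:
        # Assumption: without equal lengths the alignment is unknown, so skip.
        if len(gt) != len(pred):
            continue
        for g, p in zip(gt, pred):
            if g not in (a, b):
                continue  # only ground-truth characters in the pair are tallied
            if p in (a, b):
                in_pair[(g, p)] += 1
            else:
                out_of_pair[g] += 1
    return in_pair, out_of_pair
```

Because every occurrence of a pair character is counted, a single image can add several entries to one table, which is why the totals exceed the number of images.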

Photographed Passports: Major Accuracy Gains in Latest Version

In our October 2025 re-evaluation, we tested the latest DCV build (v3.2) against the same MIDV-2020 photo subset used in the previous benchmark.
The earlier version achieved 71% overall MRZ accuracy. The new release improves on every passport set, and per-country accuracy is now above 92% in all four.

These results suggest that the improved preprocessing, normalization, and checksum-aware validation handle blur, glare, and skew more effectively than before.

Below is a sample image of a photographed passport MRZ.

MIDV-2020 sample image (photographed passport MRZ)

Performance Comparison

| Dataset | Total | Compared | Correct | Accuracy (Old, Aug 2025) | Accuracy (New, Oct 2025) | Δ Change |
|---|---|---|---|---|---|---|
| AZE Passport | 100 | 93 | 88 | 77.0 % | 94.6 % | ▲ +17.6 pp |
| GRC Passport | 100 | 91 | 90 | 71.0 % | 98.9 % | ▲ +27.9 pp |
| LVA Passport | 100 | 87 | 83 | 62.0 % | 95.4 % | ▲ +33.4 pp |
| SRB Passport | 100 | 89 | 82 | 74.0 % | 92.1 % | ▲ +18.1 pp |
| Total | 400 | 360 | 343 | 71.0 % | 95.3 % | ▲ +24.3 pp |

Accuracy = Exact matches / Compared, where Compared counts only the images for which both ground truth (GT) and a prediction (PR) are available.
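As a quick arithmetic check, the new accuracy and change columns can be recomputed from the Compared and Correct counts. The snippet below is only a sketch: the `ROWS` dictionary and function names are illustrative, and the figures are copied from the table, not produced by running DCV.

```python
# (compared, correct, old_accuracy_pct) per dataset, copied from the table.
ROWS = {
    "AZE": (93, 88, 77.0),
    "GRC": (91, 90, 71.0),
    "LVA": (87, 83, 62.0),
    "SRB": (89, 82, 74.0),
}


def accuracy_pct(correct: int, compared: int) -> float:
    """Exact matches divided by compared images, as a percentage (1 decimal)."""
    if compared <= 0:
        raise ValueError("compared must be positive")
    return round(100.0 * correct / compared, 1)


def summarize(rows: dict) -> dict:
    """Return {dataset: (new_accuracy_pct, change_in_percentage_points)}."""
    summary = {}
    for name, (compared, correct, old_pct) in rows.items():
        new_pct = accuracy_pct(correct, compared)
        summary[name] = (new_pct, round(new_pct - old_pct, 1))
    return summary


def pooled_accuracy_pct(rows: dict) -> float:
    """Overall accuracy over all compared images (pooled, not a mean of rows)."""
    compared = sum(r[0] for r in rows.values())
    correct = sum(r[1] for r in rows.values())
    return accuracy_pct(correct, compared)
```

The Total row is the pooled figure (343 of 360), which is why it can differ slightly from a plain average of the four per-country percentages.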

Confusable Character Pairs

Even with the higher overall accuracy, we analyzed confusable substitutions to check how well MRZ-specific post-validation works on photos.

Pair 0/O — 2×2

| Ground Truth \ Predicted | 0 | O |
|---|---|---|
| 0 | 1993 | 8 |
| O | 0 | 350 |

Out-of-pair preds: 0→other=7, O→other=2

Pair 1/I — 2×2

| Ground Truth \ Predicted | 1 | I |
|---|---|---|
| 1 | 1529 | 0 |
| I | 0 | 628 |

Out-of-pair preds: 1→other=8, I→other=3

Pair 5/S — 2×2

| Ground Truth \ Predicted | 5 | S |
|---|---|---|
| 5 | 783 | 0 |
| S | 0 | 550 |

Out-of-pair preds: 5→other=4, S→other=0

Observations

  1. Substantial Accuracy Jump – Average MRZ accuracy rose from ~71 % to ~95 %, with some passport sets (e.g., GRC) now nearly perfect.
  2. Lower Character Confusables – DCV’s validation logic and checksum correction further minimize 0/O, 1/I, and 5/S mix-ups.
  3. Improved Robustness to Photo Artifacts – Enhanced normalization now tolerates perspective and lighting variations that previously caused “No Result” cases.
  4. Consistent Pipeline Reliability – The detection → normalization → recognition → validation flow proves scalable across real-world photo captures.
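For context on the checksum validation mentioned above, the sketch below shows the standard MRZ check-digit rule from ICAO Doc 9303 (weights 7, 3, 1; digits keep their value, A–Z map to 10–35, and < counts as 0), applied to the second line of a passport (TD3) MRZ. It illustrates the public rule only. It is not DCV's internal implementation, and the function names are hypothetical.

```python
WEIGHTS = (7, 3, 1)


def char_value(ch: str) -> int:
    """Numeric value of one MRZ character under ICAO Doc 9303."""
    if ch in "0123456789":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def check_digit(field: str) -> int:
    """Weighted sum with repeating weights 7, 3, 1, taken modulo 10."""
    return sum(char_value(c) * WEIGHTS[i % 3] for i, c in enumerate(field)) % 10


def td3_line2_checks(line2: str) -> dict:
    """Verify the five check digits on line 2 of a passport (TD3) MRZ."""
    if len(line2) != 44:
        raise ValueError("TD3 line 2 must be 44 characters")
    fields = {
        "document_number": (line2[0:9], line2[9]),
        "birth_date": (line2[13:19], line2[19]),
        "expiry_date": (line2[21:27], line2[27]),
        "personal_number": (line2[28:42], line2[42]),
        "composite": (line2[0:10] + line2[13:20] + line2[21:43], line2[43]),
    }
    results = {}
    for name, (data, digit) in fields.items():
        # An unused optional field may carry '<' in place of its check digit.
        if digit == "<" and set(data) <= {"<"}:
            results[name] = True
        else:
            results[name] = digit == str(check_digit(data))
    return results
```

With the ICAO specimen line `L898902C36UTO7408122F1204159ZE184226B<<<<<10`, all five checks pass. Replacing the `0` in the birth date with `O` makes the birth-date and composite checks fail, which is how a checksum can flag a 0/O substitution.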

Key Takeaways

  1. Preprocessing is critical: deskewing and perspective correction significantly improve recognition.
  2. Scans and photos differ: clean scans were recognized without error in this test, while handheld captures still pose challenges.
  3. Validation matters: MRZ-specific post-checks effectively reduce substitution errors.

In practice, accurate MRZ reading requires a pipeline approach: detection → normalization → recognition → validation.
DCV implements this pipeline effectively, showing high reliability on scans and resilience under photo conditions.
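One validation step such a pipeline can include is field-aware correction of confusable characters: in positions that may only hold digits, a recognized `O`, `I`, or `S` is almost certainly `0`, `1`, or `5`, and the reverse holds for letter-only positions. The sketch below applies this idea to line 2 of a passport (TD3) MRZ, using the field positions defined in ICAO Doc 9303. It is an illustrative assumption about how such a step might look, not a description of DCV's logic, and `normalize_td3_line2` is a hypothetical name.

```python
# Translation tables for the three confusable pairs discussed in this article.
TO_DIGIT = str.maketrans("OIS", "015")
TO_LETTER = str.maketrans("015", "OIS")

# Zero-based [start, end) spans on TD3 line 2.
# Digit-only (or filler '<') positions: check digits and the two date fields.
NUMERIC_SPANS = [(9, 10), (13, 20), (21, 28), (42, 44)]
# Letter-only (or filler '<') positions: the nationality code.
ALPHA_SPANS = [(10, 13)]


def normalize_td3_line2(line2: str) -> str:
    """Replace confusable characters where the field type leaves no ambiguity."""
    if len(line2) != 44:
        raise ValueError("TD3 line 2 must be 44 characters")
    chars = list(line2)
    for start, end in NUMERIC_SPANS:
        chars[start:end] = line2[start:end].translate(TO_DIGIT)
    for start, end in ALPHA_SPANS:
        chars[start:end] = line2[start:end].translate(TO_LETTER)
    # The document number and personal number are alphanumeric, so they are
    # left untouched; only a checksum can arbitrate confusables there.
    return "".join(chars)
```

Positional rules cannot resolve mixed alphanumeric fields such as the document number, so this kind of step complements check-digit validation and does not replace it.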

Workflow: detection, normalization, recognition, and validation

Next Steps

Accurate MRZ recognition underpins identity verification, border control, and KYC workflows. Our evaluation highlights how a structured OCR pipeline ensures robust performance in real-world use cases.

Want to test MRZ scanning with your own documents?
Download the MRZ Scanner under the Dynamsoft Capture Vision umbrella and try it out today. You can quickly prototype a pipeline that combines preprocessing, recognition, and validation for your application.


Reference

[1] K.B. Bulatov, E.V. Emelianova, D.V. Tropin, N.S. Skoryukina, Y.S. Chernyshova, A.V. Sheshkus, S.A. Usilin, Z. Ming, J.-C. Burie, M.M. Luqman, V.V. Arlazarov: “MIDV-2020: A Comprehensive Benchmark Dataset for Identity Document Analysis”, Computer Optics (submitted), 2021.