Decode South African Driver's License PDF417 Barcode in Python: Decrypt, Parse & Extract Data
If you are looking for an official specification of the South African driving license format, you may be disappointed. The only public references are a Stack Overflow Q&A, an incomplete document (ZA Drivers License format), and an open-source C# project. The Stack Overflow Q&A provides the RSA public keys for decrypting the data encoded in the PDF417 barcode, and the incomplete document helps with parsing the decrypted data. In this article, I will show you how to decode, decrypt, and parse a South African driving license in Python.
What you’ll build: A Python application that reads the PDF417 barcode from a South African driver’s license image, decrypts the RSA-encrypted payload, and extracts structured driver data (name, ID number, vehicle codes, license dates) using the Dynamsoft Barcode Reader SDK and the sadl package.
Key Takeaways
- South African driver’s licenses encode all driver data in an RSA-encrypted PDF417 barcode. There is no official specification, and the public keys used for decryption come from community sources, which makes third-party decoding libraries essential.
- The south-africa-driving-license Python package (sadl) handles the full pipeline: PDF417 decoding via Dynamsoft Barcode Reader, RSA decryption (v1 and v2 key sets), and binary data parsing into structured fields.
- The encrypted payload is exactly 720 bytes: a 6-byte header followed by five 128-byte RSA blocks and one 74-byte RSA block. The five 128-byte blocks share one public key, and the 74-byte block uses a second one.
- Parsed output includes the 13-digit SA ID number, vehicle codes (A/B/C), license issue and expiry dates, gender, and driver restrictions, ready for KYC, identity verification, or data-entry automation workflows.
Common Developer Questions
- How do I decode South African driver’s license PDF417 barcodes in Python?
- What RSA public keys are needed to decrypt a South African driving license barcode?
- How do I parse the decrypted binary data from a ZA driver’s license into structured fields like ID number and expiry date?
Quick Start with the sadl Python Package
The easiest way to decode a South African driving license is to use the south-africa-driving-license package:
pip install south-africa-driving-license
Command-line Tool
The package provides a command-line tool sadltool for quick decoding:
# Decode from PDF417 image
sadltool image.png -l <Dynamsoft-License-Key>
# Decode from base64 string (decrypted)
sadltool dlbase64.txt -t 2 -e 0
# Decode from raw bytes (decrypted)
sadltool dl.raw -t 3 -e 0

Python API
from sadl import parse_file, parse_base64, parse_bytes
# Parse from PDF417 image file
license = parse_file("driver_license.png", encrypted=True, license="YOUR-LICENSE-KEY")
# Parse from base64 string
license = parse_base64("base64_encoded_string", encrypted=False)
# Parse from raw bytes
license = parse_bytes(raw_bytes_data, encrypted=False)
# Access parsed data
print(f"Surname: {license.surname}")
print(f"ID Number: {license.idNumber}")
print(f"Birthdate: {license.birthdate}")
print(f"License Expiry: {license.licenseExpiryDate}")
print(f"Gender: {license.gender}")
print(f"Vehicle Codes: {license.vehicleCodes}")
The DrivingLicense object contains all parsed fields:
| Field | Description |
|---|---|
| surname | Driver’s surname |
| initials | Driver’s initials |
| idNumber | South African ID number (13 digits) |
| birthdate | Date of birth (YYYY/MM/DD) |
| gender | Gender (male/female) |
| licenseNumber | License number |
| licenseIssueDate | License valid from date |
| licenseExpiryDate | License valid to date |
| vehicleCodes | List of vehicle codes (A, B, C, etc.) |
| vehicleRestrictions | Restrictions for each vehicle code |
| driverRestrictionCodes | Driver restrictions (glasses, artificial limb) |
Technical Deep Dive
Step 1: Decode the PDF417 Barcode from a License Image
The South African driving license contains a PDF417 barcode on the back of the card. To extract the raw data:
- Install the Dynamsoft Barcode Reader SDK:
pip install dynamsoft-barcode-reader-bundle
The SDK works on Windows, Linux, and macOS with Python 3.6+ support.
- Get a 30-day free trial license and initialize the barcode reader:
from dynamsoft_barcode_reader_bundle import *

# Path to the license image (placeholder file name)
image_file = "driver_license.png"

# Initialize license
errorCode, errorMsg = LicenseManager.init_license("LICENSE-KEY")
if errorCode != EnumErrorCode.EC_OK:
    print(f"License error: {errorMsg}")

# Create barcode reader
cvr = CaptureVisionRouter()
result = cvr.capture(image_file, EnumPresetTemplate.PT_READ_BARCODES.value)

# Get decoded bytes
barcode_result = result.get_decoded_barcodes_result()
if barcode_result and len(barcode_result.get_items()) > 0:
    raw_bytes = barcode_result.get_items()[0].get_bytes()
Step 2: Decrypt the RSA-Encrypted Payload
The valid data decoded from the PDF417 barcode is 720 bytes long. There are two versions of the South African driving license, identified by the first four bytes of the payload:
v1 = [0x01, 0xe1, 0x02, 0x45] # Version 1
v2 = [0x01, 0x9b, 0x09, 0x45] # Version 2
The payload structure:
- First 6 bytes: Header (4-byte version identifier + 2 zero bytes)
- Remaining 714 bytes: Encrypted payload
  - 5 blocks × 128 bytes (encrypted with the primary key, pk_v1_128 or pk_v2_128)
  - 1 block × 74 bytes (encrypted with the secondary key, pk_v1_74 or pk_v2_74)
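The layout above can be sketched as a small splitting helper. This is only an illustration of the byte offsets: split_payload and the constant names are hypothetical and not part of the sadl package, and the sample payload is synthetic filler rather than real license data.

```python
HEADER_LEN = 6
BLOCK_128_COUNT = 5
BLOCK_128_LEN = 128
BLOCK_74_LEN = 74
# 6 + 5 * 128 + 74 = 720
PAYLOAD_LEN = HEADER_LEN + BLOCK_128_COUNT * BLOCK_128_LEN + BLOCK_74_LEN


def split_payload(data: bytes):
    """Split a 720-byte payload into (header, five 128-byte blocks, 74-byte block)."""
    if len(data) != PAYLOAD_LEN:
        raise ValueError(f"expected {PAYLOAD_LEN} bytes, got {len(data)}")
    header = data[:HEADER_LEN]
    offset = HEADER_LEN
    blocks_128 = []
    for _ in range(BLOCK_128_COUNT):
        blocks_128.append(data[offset:offset + BLOCK_128_LEN])
        offset += BLOCK_128_LEN
    block_74 = data[offset:offset + BLOCK_74_LEN]
    return header, blocks_128, block_74


# Synthetic payload (not real license data): v1 header followed by zero filler.
sample = bytes([0x01, 0xE1, 0x02, 0x45, 0x00, 0x00]) + bytes(714)
header, blocks_128, block_74 = split_payload(sample)
print(len(header), [len(b) for b in blocks_128], len(block_74))
```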
RSA Public Keys
pk_v1_128 = '''
-----BEGIN RSA PUBLIC KEY-----
MIGXAoGBAP7S4cJ+M2MxbncxenpSxUmBOVGGvkl0dgxyUY1j4FRKSNCIszLFsMNwx2XWXZg8H53gpCsxDMwHrncL0rYdak3M6sdXaJvcv2CEePrzEvYIfMSWw3Ys9cRlHK7No0mfrn7bfrQOPhjrMEFw6R7VsVaqzm9DLW7KbMNYUd6MZ49nAhEAu3l//ex/nkLJ1vebE3BZ2w==
-----END RSA PUBLIC KEY-----
'''
pk_v1_74 = '''
-----BEGIN RSA PUBLIC KEY-----
MGACSwD/POxrX0Djw2YUUbn8+u866wbcIynA5vTczJJ5cmcWzhW74F7tLFcRvPj1tsj3J221xDv6owQNwBqxS5xNFvccDOXqlT8MdUxrFwIRANsFuoItmswz+rfY9Cf5zmU=
-----END RSA PUBLIC KEY-----
'''
pk_v2_128 = '''
-----BEGIN RSA PUBLIC KEY-----
MIGWAoGBAMqfGO9sPz+kxaRh/qVKsZQGul7NdG1gonSS3KPXTjtcHTFfexA4MkGAmwKeu9XeTRFgMMxX99WmyaFvNzuxSlCFI/foCkx0TZCFZjpKFHLXryxWrkG1Bl9++gKTvTJ4rWk1RvnxYhm3n/Rxo2NoJM/822Oo7YBZ5rmk8NuJU4HLAhAYcJLaZFTOsYU+aRX4RmoF
-----END RSA PUBLIC KEY-----
'''
pk_v2_74 = '''
-----BEGIN RSA PUBLIC KEY-----
MF8CSwC0BKDfEdHKz/GhoEjU1XP5U6YsWD10klknVhpteh4rFAQlJq9wtVBUc5DqbsdI0w/bga20kODDahmGtASy9fae9dobZj5ZUJEw5wIQMJz+2XGf4qXiDJu0R2U4Kw==
-----END RSA PUBLIC KEY-----
'''
Decryption Implementation
import rsa

def decrypt_data(data):
    # Detect version from header
    header = data[0:6]
    if header[0:4] == bytes(v1):
        pk128, pk74 = pk_v1_128, pk_v1_74
    elif header[0:4] == bytes(v2):
        pk128, pk74 = pk_v2_128, pk_v2_74
    else:
        raise ValueError("Unknown license version")

    decrypted = bytearray()

    # Decrypt 5 blocks of 128 bytes
    pubKey = rsa.PublicKey.load_pkcs1(pk128)
    start = 6
    for i in range(5):
        block = data[start:start + 128]
        input_int = int.from_bytes(block, byteorder='big', signed=False)
        # Raw RSA: modular exponentiation with the public exponent
        output_int = pow(input_int, pubKey.e, pubKey.n)
        decrypted += output_int.to_bytes(128, byteorder='big', signed=False)
        start += 128

    # Decrypt 1 block of 74 bytes
    pubKey = rsa.PublicKey.load_pkcs1(pk74)
    block = data[start:start + 74]
    input_int = int.from_bytes(block, byteorder='big', signed=False)
    output_int = pow(input_int, pubKey.e, pubKey.n)
    decrypted += output_int.to_bytes(74, byteorder='big', signed=False)

    return decrypted
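Each block is processed with plain modular exponentiation using the public exponent, with no padding scheme involved. Here is a toy-sized sketch of the same operation. The parameters n = 3233, e = 17, d = 2753 are textbook illustration values unrelated to the license keys, public_op is a hypothetical helper name, and the "issuer side" step is my assumption about how such a block would be produced.

```python
# Toy RSA parameters for illustration only (far too small for real use).
n = 61 * 53   # modulus, 3233
e = 17        # public exponent
d = 2753      # private exponent, chosen so that e * d = 1 (mod 780)


def public_op(block: bytes, e: int, n: int, size: int) -> bytes:
    """Apply value^e mod n to a big-endian block and return `size` bytes."""
    value = int.from_bytes(block, byteorder="big")
    return pow(value, e, n).to_bytes(size, byteorder="big")


message = 65
# Assumed issuer side: transform the value with the private exponent.
sealed = pow(message, d, n)
block = sealed.to_bytes(2, byteorder="big")

# Reader side: the same steps as in the decryption loop, on a 2-byte block.
recovered = public_op(block, e, n, 2)
print(int.from_bytes(recovered, byteorder="big"))  # 65
```

The real blocks work the same way, only with 128-byte and 74-byte moduli loaded from the PEM keys.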
Step 3: Parse the Decrypted Binary Data into Structured Fields
The decrypted data consists of 4 sections:
- Header - Skip to the strings section by finding the 0x82 marker
- Strings Section - Vehicle codes, surname, initials, etc.
- Binary Data Section - Dates, ID type, gender
- Image Data Section - Photo dimensions
Reading Strings
Strings are delimited by 0xe0 (separator) and 0xe1 (empty string marker):
def readStrings(data, index, length):
    strings = []
    i = 0
    while i < length:
        value = ''
        while True:
            currentByte = data[index]
            index += 1
            if currentByte == 0xe0:
                break
            elif currentByte == 0xe1:
                # 0xe1 after a non-empty value also accounts for the next (empty) slot
                if value != '':
                    i += 1
                break
            value += chr(currentByte)
        i += 1
        if value != '':
            strings.append(value)
    return strings, index

def readString(data, index):
    value = ''
    delimiter = 0xe0
    while True:
        currentByte = data[index]
        index += 1
        if currentByte == 0xe0 or currentByte == 0xe1:
            delimiter = currentByte
            break
        value += chr(currentByte)
    return value, index, delimiter
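To see how the two delimiters behave, here is a small self-contained sketch that runs the same single-string reader on a synthetic buffer. The function is repeated under the hypothetical name read_string so the snippet runs on its own, and the sample bytes are made up for illustration.

```python
def read_string(data, index):
    """Read characters until 0xe0 or 0xe1; return (value, next index, delimiter)."""
    value = ''
    delimiter = 0xE0
    while True:
        current = data[index]
        index += 1
        if current in (0xE0, 0xE1):
            delimiter = current
            break
        value += chr(current)
    return value, index, delimiter


# Synthetic buffer: "DOE" 0xe0 "J" 0xe1 "ZA" 0xe0
sample = b"DOE\xe0" + b"J\xe1" + b"ZA\xe0"

surname, idx, delim1 = read_string(sample, 0)
initials, idx, delim2 = read_string(sample, idx)
country, idx, delim3 = read_string(sample, idx)

print(surname, hex(delim1))   # DOE 0xe0
print(initials, hex(delim2))  # J 0xe1
print(country, hex(delim3))   # ZA 0xe0
```

The returned delimiter matters for optional fields: when a field ends with 0xe1, the caller treats the following field as empty and does not read it.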
Reading Binary Data (Nibble-encoded Dates)
Dates are encoded using nibble (4-bit) values:
def readNibbleDateString(nibbleQueue):
    m = nibbleQueue.pop(0)
    if m == 10:  # Empty date marker
        return ''
    c = nibbleQueue.pop(0)
    d = nibbleQueue.pop(0)
    y = nibbleQueue.pop(0)
    m1 = nibbleQueue.pop(0)
    m2 = nibbleQueue.pop(0)
    d1 = nibbleQueue.pop(0)
    d2 = nibbleQueue.pop(0)
    return f'{m}{c}{d}{y}/{m1}{m2}/{d1}{d2}'  # YYYY/MM/DD format

def readNibbleDateList(nibbleQueue, length):
    dateList = []
    for i in range(length):
        dateString = readNibbleDateString(nibbleQueue)
        if dateString != '':
            dateList.append(dateString)
    return dateList
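A short worked example makes the nibble layout easier to follow. The sketch below is self-contained, with hypothetical helper names to_nibbles and read_date and made-up sample bytes. It shows that a date takes eight nibbles, an empty date takes a single 0xA nibble, and dates therefore do not have to start on a byte boundary.

```python
def to_nibbles(raw: bytes):
    """Split each byte into its high and low 4-bit halves."""
    nibbles = []
    for b in raw:
        nibbles += [b >> 4, b & 0x0F]
    return nibbles


def read_date(nibbles):
    """Consume one date: a single 0xA nibble (empty) or eight decimal digits."""
    first = nibbles.pop(0)
    if first == 10:
        return ''
    digits = [first] + [nibbles.pop(0) for _ in range(7)]
    text = ''.join(str(d) for d in digits)
    return f'{text[0:4]}/{text[4:6]}/{text[6:8]}'


# Synthetic bytes: A | 2 0 2 4 0 1 1 5 | A  (empty, 2024/01/15, empty)
queue = to_nibbles(bytes([0xA2, 0x02, 0x40, 0x11, 0x5A]))
dates = [read_date(queue) for _ in range(3)]
print(dates)  # ['', '2024/01/15', '']
```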
Complete Parsing Function
def parse_data(data):
    # Find string section marker
    index = 0
    for i in range(len(data)):
        if data[i] == 0x82:
            index = i
            break

    # Parse strings
    vehicleCodes, index = readStrings(data, index + 2, 4)
    surname, index, delimiter = readString(data, index)
    initials, index, delimiter = readString(data, index)
    PrDPCode = ''
    if delimiter == 0xe0:
        PrDPCode, index, delimiter = readString(data, index)
    idCountryOfIssue, index, delimiter = readString(data, index)
    licenseCountryOfIssue, index, delimiter = readString(data, index)
    vehicleRestrictions, index = readStrings(data, index, 4)
    licenseNumber, index, delimiter = readString(data, index)

    # ID Number (fixed 13 bytes)
    idNumber = ''
    for i in range(13):
        idNumber += chr(data[index])
        index += 1

    # Parse binary section
    idNumberType = f'{data[index]:02d}'
    index += 1

    # Read nibble queue until 0x57 marker
    nibbleQueue = []
    while True:
        currentByte = data[index]
        index += 1
        if currentByte == 0x57:
            break
        nibbleQueue += [currentByte >> 4, currentByte & 0x0f]

    licenseCodeIssueDates = readNibbleDateList(nibbleQueue, 4)
    driverRestrictionCodes = f'{nibbleQueue.pop(0)}{nibbleQueue.pop(0)}'
    PrDPermitExpiryDate = readNibbleDateString(nibbleQueue)
    licenseIssueNumber = f'{nibbleQueue.pop(0)}{nibbleQueue.pop(0)}'
    birthdate = readNibbleDateString(nibbleQueue)
    licenseIssueDate = readNibbleDateString(nibbleQueue)
    licenseExpiryDate = readNibbleDateString(nibbleQueue)
    gender = f'{nibbleQueue.pop(0)}{nibbleQueue.pop(0)}'
    gender = 'male' if gender == '01' else 'female'

    # Image dimensions
    index += 3
    width = data[index]
    index += 2
    height = data[index]

    return DrivingLicense(
        vehicleCodes, surname, initials, PrDPCode,
        idCountryOfIssue, licenseCountryOfIssue, vehicleRestrictions,
        licenseNumber, idNumber, idNumberType, licenseCodeIssueDates,
        driverRestrictionCodes, PrDPermitExpiryDate, licenseIssueNumber,
        birthdate, licenseIssueDate, licenseExpiryDate, gender,
        width, height
    )
Supported Input Formats
The library supports three input formats:
| Format | Method | Description |
|---|---|---|
| Image File | parse_file() | PDF417 barcode image (PNG, JPG, etc.) |
| Base64 String | parse_base64() | Base64-encoded data (encrypted or decrypted) |
| Raw Bytes | parse_bytes() | Raw byte array (encrypted or decrypted) |
Use Cases
- Identity Verification: Extract driver information for KYC processes
- License Validation: Check license validity and expiry dates
- Data Entry Automation: Auto-populate forms from driver license scans
- Access Control: Verify driver credentials at checkpoints
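For the license validation use case, a minimal sketch could compare the parsed date strings against a reference date. It assumes the YYYY/MM/DD format shown in the field table and treats the expiry date as inclusive, which is my assumption rather than a documented rule. The helper name is_license_valid is hypothetical and not part of the sadl package.

```python
from datetime import date, datetime

DATE_FORMAT = "%Y/%m/%d"  # matches the YYYY/MM/DD strings produced by the parser


def is_license_valid(issue: str, expiry: str, today: date) -> bool:
    """Return True if `today` falls within [issue, expiry] (both inclusive)."""
    start = datetime.strptime(issue, DATE_FORMAT).date()
    end = datetime.strptime(expiry, DATE_FORMAT).date()
    return start <= today <= end


# Made-up dates for illustration.
print(is_license_valid("2020/03/01", "2025/02/28", date(2024, 6, 1)))  # True
print(is_license_valid("2020/03/01", "2025/02/28", date(2025, 3, 1)))  # False
```

In practice you would pass license.licenseIssueDate, license.licenseExpiryDate, and date.today().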
Common Issues and Edge Cases
- Barcode not detected from low-quality images: The PDF417 barcode on South African licenses is dense and can be damaged or poorly printed. If cvr.capture() returns no results, try pre-processing the image (increase contrast, crop to the barcode region) or use Dynamsoft’s PT_READ_BARCODES preset which is optimized for damaged barcodes.
- Payload length mismatch: The raw PDF417 payload must be exactly 720 bytes before decryption. If you get fewer bytes, the barcode was only partially scanned, so ensure the entire barcode area is visible in the image. If you get more bytes, the barcode reader may have appended padding; trim to 720 bytes before decryption.
- Unknown license version header: If the first 4 bytes don’t match either v1 (0x01 0xe1 0x02 0x45) or v2 (0x01 0x9b 0x09 0x45), the card may use a newer format not yet covered by the public keys. Check the GitHub repo for updates.
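The length and header checks above can be combined into one guard that runs before decryption. This is a minimal sketch: check_payload and the constant names are hypothetical, and the sample input is synthetic.

```python
V1_HEADER = bytes([0x01, 0xE1, 0x02, 0x45])
V2_HEADER = bytes([0x01, 0x9B, 0x09, 0x45])
EXPECTED_LEN = 720


def check_payload(raw: bytes):
    """Return (version, 720-byte payload) or raise ValueError describing the problem."""
    if len(raw) < EXPECTED_LEN:
        raise ValueError(
            f"payload too short: {len(raw)} bytes, barcode may be partially scanned"
        )
    payload = raw[:EXPECTED_LEN]  # drop any trailing padding
    prefix = payload[:4]
    if prefix == V1_HEADER:
        version = 1
    elif prefix == V2_HEADER:
        version = 2
    else:
        raise ValueError(f"unknown version header: {prefix.hex()}")
    return version, payload


# Synthetic input: v2 header, zero filler, plus 10 bytes of trailing padding.
version, payload = check_payload(V2_HEADER + bytes(716) + bytes(10))
print(version, len(payload))  # 2 720
```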