Decode South African Driver's License PDF417 Barcode in Python: Decrypt, Parse & Extract Data

If you are looking for an official specification of the South African driving license format, you may be disappointed. The only references are a Stack Overflow Q&A, an incomplete document (ZA Drivers License format), and an open-source C# project. The Stack Overflow Q&A provides the RSA public keys for decrypting the data encoded in the PDF417 barcode, and the incomplete document helps to parse the decrypted data. In this article, I will show you how to decode, decrypt, and parse a South African driving license in Python.

What you’ll build: A Python application that reads the PDF417 barcode from a South African driver’s license image, decrypts the RSA-encrypted payload, and extracts structured driver data (name, ID number, vehicle codes, license dates) using the Dynamsoft Barcode Reader SDK and the sadl package.

Key Takeaways

  • South African driver’s licenses encode all driver data in an RSA-encrypted PDF417 barcode that is decrypted with publicly known RSA public keys. There is no official specification, which makes third-party decoding libraries essential.
  • The south-africa-driving-license Python package (sadl) handles the full pipeline: PDF417 decoding via Dynamsoft Barcode Reader, RSA decryption (v1 and v2 key sets), and binary data parsing into structured fields.
  • The encrypted payload is exactly 720 bytes: a 6-byte header followed by five 128-byte RSA blocks and one 74-byte RSA block. The five 128-byte blocks share one public key, and the 74-byte block uses a second one.
  • Parsed output includes 13-digit SA ID number, vehicle codes (A/B/C), license issue and expiry dates, gender, and driver restrictions — ready for KYC, identity verification, or data-entry automation workflows.

Common Developer Questions

  • How do I decode South African driver’s license PDF417 barcodes in Python?
  • What RSA public keys are needed to decrypt a South African driving license barcode?
  • How do I parse the decrypted binary data from a ZA driver’s license into structured fields like ID number and expiry date?

Quick Start with the sadl Python Package

The easiest way to decode a South African driving license is to use the south-africa-driving-license package:

pip install south-africa-driving-license

Command-line Tool

The package provides a command-line tool sadltool for quick decoding:

# Decode from PDF417 image
sadltool image.png -l <Dynamsoft-License-Key>

# Decode from base64 string (decrypted)
sadltool dlbase64.txt -t 2 -e 0

# Decode from raw bytes (decrypted)
sadltool dl.raw -t 3 -e 0

Python API

from sadl import parse_file, parse_base64, parse_bytes

# Parse from PDF417 image file
license = parse_file("driver_license.png", encrypted=True, license="YOUR-LICENSE-KEY")

# Parse from base64 string
license = parse_base64("base64_encoded_string", encrypted=False)

# Parse from raw bytes
license = parse_bytes(raw_bytes_data, encrypted=False)

# Access parsed data
print(f"Surname: {license.surname}")
print(f"ID Number: {license.idNumber}")
print(f"Birthdate: {license.birthdate}")
print(f"License Expiry: {license.licenseExpiryDate}")
print(f"Gender: {license.gender}")
print(f"Vehicle Codes: {license.vehicleCodes}")

The DrivingLicense object exposes the parsed fields, including:

Field                    Description
surname                  Driver’s surname
initials                 Driver’s initials
idNumber                 South African ID number (13 digits)
birthdate                Date of birth (YYYY/MM/DD)
gender                   Gender (male/female)
licenseNumber            License number
licenseIssueDate         License valid-from date
licenseExpiryDate        License valid-to date
vehicleCodes             List of vehicle codes (A, B, C, etc.)
vehicleRestrictions      Restrictions for each vehicle code
driverRestrictionCodes   Driver restrictions (glasses, artificial limb)

Technical Deep Dive

Step 1: Decode the PDF417 Barcode from a License Image

The South African driving license contains a PDF417 barcode on the back of the card. To extract the raw data:

  1. Install the Dynamsoft Barcode Reader SDK:

     pip install dynamsoft-barcode-reader-bundle

     The SDK works on Windows, Linux, and macOS and supports Python 3.6+.

  2. Get a 30-day free trial license and initialize the barcode reader:

     from dynamsoft_barcode_reader_bundle import *

     image_file = "driver_license.png"  # path to the license image
     raw_bytes = None                   # stays None if no barcode is found

     # Initialize license
     errorCode, errorMsg = LicenseManager.init_license("LICENSE-KEY")
     if errorCode != EnumErrorCode.EC_OK:
         print(f"License error: {errorMsg}")

     # Create barcode reader and scan the image
     cvr = CaptureVisionRouter()
     result = cvr.capture(image_file, EnumPresetTemplate.PT_READ_BARCODES.value)

     # Get the raw bytes of the first decoded barcode
     barcode_result = result.get_decoded_barcodes_result()
     if barcode_result and len(barcode_result.get_items()) > 0:
         raw_bytes = barcode_result.get_items()[0].get_bytes()


Step 2: Decrypt the RSA-Encrypted Payload

A valid payload decoded from the PDF417 barcode contains 720 bytes. There are two versions of the South African driving license, identified by the first four bytes of the payload:

v1 = [0x01, 0xe1, 0x02, 0x45]  # Version 1
v2 = [0x01, 0x9b, 0x09, 0x45]  # Version 2

The payload structure:

  • First 6 bytes: Header (4-byte version identifier + 2 zero bytes)
  • Remaining 714 bytes: Encrypted payload
    • 5 blocks × 128 bytes (decrypted with the 128-byte-block key, pk_v1_128 or pk_v2_128)
    • 1 block × 74 bytes (decrypted with the 74-byte-block key, pk_v1_74 or pk_v2_74)
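
The layout above can be sketched as a small helper that validates the length, detects the version, and slices out the six encrypted blocks. This is only an illustration: the names split_payload, HEADER_SIZE, and BLOCK_SIZES are hypothetical and not part of the sadl package, and the sample payload is synthetic filler rather than real license data.

```python
# Version identifiers from the article (first four bytes of the payload)
V1 = bytes([0x01, 0xE1, 0x02, 0x45])
V2 = bytes([0x01, 0x9B, 0x09, 0x45])

HEADER_SIZE = 6                                  # 4-byte version + 2 zero bytes
BLOCK_SIZES = [128] * 5 + [74]                   # five large blocks, one small block
PAYLOAD_SIZE = HEADER_SIZE + sum(BLOCK_SIZES)    # 6 + 640 + 74


def split_payload(data):
    """Return (version, blocks) for a raw barcode payload (hypothetical helper)."""
    if len(data) != PAYLOAD_SIZE:
        raise ValueError(f"expected {PAYLOAD_SIZE} bytes, got {len(data)}")

    magic = bytes(data[:4])
    if magic == V1:
        version = 1
    elif magic == V2:
        version = 2
    else:
        raise ValueError("unknown license version")

    # Walk through the payload and cut out each encrypted block in order
    blocks = []
    offset = HEADER_SIZE
    for size in BLOCK_SIZES:
        blocks.append(bytes(data[offset:offset + size]))
        offset += size
    return version, blocks


# Synthetic payload: v2 identifier, two zero bytes, 714 filler bytes
sample = V2 + b"\x00\x00" + bytes(714)
version, blocks = split_payload(sample)
print(version, [len(block) for block in blocks])
```

Keeping the block sizes in one list makes the 720-byte total easy to check against the header and block sizes listed above.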

RSA Public Keys

pk_v1_128 = '''
-----BEGIN RSA PUBLIC KEY-----
MIGXAoGBAP7S4cJ+M2MxbncxenpSxUmBOVGGvkl0dgxyUY1j4FRKSNCIszLFsMNwx2XWXZg8H53gpCsxDMwHrncL0rYdak3M6sdXaJvcv2CEePrzEvYIfMSWw3Ys9cRlHK7No0mfrn7bfrQOPhjrMEFw6R7VsVaqzm9DLW7KbMNYUd6MZ49nAhEAu3l//ex/nkLJ1vebE3BZ2w==
-----END RSA PUBLIC KEY-----
'''

pk_v1_74 = '''
-----BEGIN RSA PUBLIC KEY-----
MGACSwD/POxrX0Djw2YUUbn8+u866wbcIynA5vTczJJ5cmcWzhW74F7tLFcRvPj1tsj3J221xDv6owQNwBqxS5xNFvccDOXqlT8MdUxrFwIRANsFuoItmswz+rfY9Cf5zmU=
-----END RSA PUBLIC KEY-----
'''

pk_v2_128 = '''
-----BEGIN RSA PUBLIC KEY-----
MIGWAoGBAMqfGO9sPz+kxaRh/qVKsZQGul7NdG1gonSS3KPXTjtcHTFfexA4MkGAmwKeu9XeTRFgMMxX99WmyaFvNzuxSlCFI/foCkx0TZCFZjpKFHLXryxWrkG1Bl9++gKTvTJ4rWk1RvnxYhm3n/Rxo2NoJM/822Oo7YBZ5rmk8NuJU4HLAhAYcJLaZFTOsYU+aRX4RmoF
-----END RSA PUBLIC KEY-----
'''

pk_v2_74 = '''
-----BEGIN RSA PUBLIC KEY-----
MF8CSwC0BKDfEdHKz/GhoEjU1XP5U6YsWD10klknVhpteh4rFAQlJq9wtVBUc5DqbsdI0w/bga20kODDahmGtASy9fae9dobZj5ZUJEw5wIQMJz+2XGf4qXiDJu0R2U4Kw==
-----END RSA PUBLIC KEY-----
'''

Decryption Implementation

import rsa

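def decrypt_data(data):
    # A valid payload is 720 bytes: 6-byte header + 5 x 128 bytes + 74 bytes
    if len(data) < 720:
        raise ValueError(f"Expected 720 bytes, got {len(data)}")

    # Detect version from header
    header = data[0:6]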
    if header[0:4] == bytes(v1):
        pk128, pk74 = pk_v1_128, pk_v1_74
    elif header[0:4] == bytes(v2):
        pk128, pk74 = pk_v2_128, pk_v2_74
    else:
        raise ValueError("Unknown license version")
    
    decrypted = bytearray()
    
    # Decrypt 5 blocks of 128 bytes
    pubKey = rsa.PublicKey.load_pkcs1(pk128)
    start = 6
    for i in range(5):
        block = data[start:start + 128]
        input_int = int.from_bytes(block, byteorder='big', signed=False)
        output_int = pow(input_int, pubKey.e, mod=pubKey.n)
        decrypted += output_int.to_bytes(128, byteorder='big', signed=False)
        start += 128
    
    # Decrypt 1 block of 74 bytes
    pubKey = rsa.PublicKey.load_pkcs1(pk74)
    block = data[start:start + 74]
    input_int = int.from_bytes(block, byteorder='big', signed=False)
    output_int = pow(input_int, pubKey.e, mod=pubKey.n)
    decrypted += output_int.to_bytes(74, byteorder='big', signed=False)
    
    return decrypted
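
The function above applies plain modular exponentiation with the public exponent to each block, without any padding handling. To see why that recovers the original bytes, here is a toy example with deliberately tiny, insecure textbook numbers. The values are for illustration only and have nothing to do with the real keys; the step labelled "issuer side" is an assumption about how the data was presumably produced.

```python
# Toy RSA key pair (classic textbook values, far too small for real use)
p, q = 61, 53
n = p * q        # modulus
e = 17           # public exponent
d = 2753         # private exponent, chosen so that (e * d) % ((p - 1) * (q - 1)) == 1

message = 65                      # stands in for one plaintext block, must be < n

# Issuer side (assumed): transform the block with the private exponent
encoded = pow(message, d, n)

# Reader side: the same operation decrypt_data performs with pubKey.e and pubKey.n
recovered = pow(encoded, e, n)

print(recovered == message)
```

Because the public exponent undoes the private one, anyone who knows the public keys can read the data, but cannot produce a valid new barcode without the private keys.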

Step 3: Parse the Decrypted Binary Data into Structured Fields

The decrypted data consists of 4 sections:

  1. Header - Skip to string section by finding 0x82
  2. Strings Section - Vehicle codes, surname, initials, etc.
  3. Binary Data Section - Dates, ID type, gender
  4. Image Data Section - Photo dimensions
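
As a minimal sketch of the first step, the start of the strings section can be located with bytes.find. The helper name find_string_section is hypothetical, and the sample bytes are made up. The offset of 2 past the marker follows the parse_data function in this article.

```python
def find_string_section(data):
    """Return the index where the strings section starts (hypothetical helper)."""
    marker = bytes(data).find(b"\x82")
    if marker < 0:
        # Fail loudly instead of silently parsing from index 0
        raise ValueError("0x82 marker not found in decrypted data")
    # The strings begin two bytes after the marker
    return marker + 2


# Made-up bytes: some header filler, the 0x82 marker, one more byte, then text
sample = bytes([0x01, 0xE1, 0x02, 0x45, 0x00, 0x00, 0x82, 0x5A]) + b"B\xe0"
start = find_string_section(sample)
print(start, chr(sample[start]))
```

Raising an error when the marker is missing makes a corrupted or wrongly decrypted payload easier to diagnose.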

Reading Strings

Strings are delimited by 0xe0 (separator) and 0xe1 (empty string marker):

def readStrings(data, index, length):
    strings = []
    i = 0
    while i < length:
        value = ''
        while True:
            currentByte = data[index]
            index += 1
            if currentByte == 0xe0:
                break
            elif currentByte == 0xe1:
                if value != '':
                    i += 1
                break
            value += chr(currentByte)
        i += 1
        if value != '':
            strings.append(value)
    return strings, index

def readString(data, index):
    value = ''
    delimiter = 0xe0
    while True:
        currentByte = data[index]
        index += 1
        if currentByte == 0xe0 or currentByte == 0xe1:
            delimiter = currentByte
            break
        value += chr(currentByte)
    return value, index, delimiter
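
To make the delimiter convention concrete, the following sketch reads a fixed number of slots from made-up bytes and keeps the empty slots visible. It is a simplified variant written for illustration (read_slots is a hypothetical name, not part of the sadl package) and assumes the same convention as readStrings: 0xe0 ends the current string, while 0xe1 ends the current string, if any, and stands for one empty slot.

```python
SEPARATOR = 0xE0   # ends the current string
EMPTY = 0xE1       # ends the current string (if any) and marks one empty slot


def read_slots(data, index, count):
    """Read `count` string slots starting at `index`; empty slots are kept as ''."""
    slots = []
    current = ''
    while len(slots) < count:
        byte = data[index]
        index += 1
        if byte == SEPARATOR:
            slots.append(current)
            current = ''
        elif byte == EMPTY:
            if current:
                slots.append(current)
                current = ''
            slots.append('')
        else:
            current += chr(byte)
    return slots[:count], index


# Made-up data: two vehicle codes, two empty slots, then the next field
sample = b"A\xe0B\xe1\xe1SMITH\xe0"
slots, next_index = read_slots(sample, 0, 4)
print(slots, next_index)
```

On this sample, readStrings would return only the two non-empty codes and stop at the same index, so both approaches consume the same bytes.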

Reading Binary Data (Nibble-encoded Dates)

Each date is stored as eight 4-bit nibbles, one decimal digit per nibble (YYYYMMDD). A single nibble with the value 10 marks an empty date:

def readNibbleDateString(nibbleQueue):
    m = nibbleQueue.pop(0)
    if m == 10:  # Empty date marker
        return ''
    
    c = nibbleQueue.pop(0)
    d = nibbleQueue.pop(0)
    y = nibbleQueue.pop(0)
    m1 = nibbleQueue.pop(0)
    m2 = nibbleQueue.pop(0)
    d1 = nibbleQueue.pop(0)
    d2 = nibbleQueue.pop(0)
    
    return f'{m}{c}{d}{y}/{m1}{m2}/{d1}{d2}'  # YYYY/MM/DD format

def readNibbleDateList(nibbleQueue, length):
    dateList = []
    for i in range(length):
        dateString = readNibbleDateString(nibbleQueue)
        if dateString != '':
            dateList.append(dateString)
    return dateList
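
A short worked example may help here. The sketch below unpacks made-up bytes into nibbles and reads one date followed by one empty-date marker. The helper names to_nibbles and take_date are hypothetical, and the sample date is invented for illustration.

```python
def to_nibbles(raw):
    """Split each byte into its high and low 4-bit halves."""
    nibbles = []
    for byte in raw:
        nibbles += [byte >> 4, byte & 0x0F]
    return nibbles


def take_date(nibbles):
    """Pop one date from the front of the list; '' if the empty marker (10) is found."""
    first = nibbles.pop(0)
    if first == 10:
        return ''
    digits = [first] + [nibbles.pop(0) for _ in range(7)]
    text = ''.join(str(digit) for digit in digits)
    return f'{text[:4]}/{text[4:6]}/{text[6:]}'


# 0x20 0x24 0x01 0x31 holds the digits 2 0 2 4 0 1 3 1; 0xA0 starts with the empty marker
sample = bytes([0x20, 0x24, 0x01, 0x31, 0xA0])
queue = to_nibbles(sample)
first_date = take_date(queue)
second_date = take_date(queue)
print(repr(first_date), repr(second_date), queue)
```

An empty date consumes only one nibble, so the nibbles that follow it are no longer aligned to byte boundaries. That is why the parser works on a flat nibble queue instead of on whole bytes.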

Complete Parsing Function

def parse_data(data):
    # Find string section marker
    index = 0
    for i in range(len(data)):
        if data[i] == 0x82:
            index = i
            break
    
    # Parse strings
    vehicleCodes, index = readStrings(data, index + 2, 4)
    surname, index, delimiter = readString(data, index)
    initials, index, delimiter = readString(data, index)
    
    PrDPCode = ''
    if delimiter == 0xe0:
        PrDPCode, index, delimiter = readString(data, index)
    
    idCountryOfIssue, index, delimiter = readString(data, index)
    licenseCountryOfIssue, index, delimiter = readString(data, index)
    vehicleRestrictions, index = readStrings(data, index, 4)
    licenseNumber, index, delimiter = readString(data, index)
    
    # ID Number (fixed 13 bytes)
    idNumber = ''
    for i in range(13):
        idNumber += chr(data[index])
        index += 1
    
    # Parse binary section
    idNumberType = f'{data[index]:02d}'
    index += 1
    
    # Read nibble queue until 0x57 marker
    nibbleQueue = []
    while True:
        currentByte = data[index]
        index += 1
        if currentByte == 0x57:
            break
        nibbleQueue += [currentByte >> 4, currentByte & 0x0f]
    
    licenseCodeIssueDates = readNibbleDateList(nibbleQueue, 4)
    driverRestrictionCodes = f'{nibbleQueue.pop(0)}{nibbleQueue.pop(0)}'
    PrDPermitExpiryDate = readNibbleDateString(nibbleQueue)
    licenseIssueNumber = f'{nibbleQueue.pop(0)}{nibbleQueue.pop(0)}'
    birthdate = readNibbleDateString(nibbleQueue)
    licenseIssueDate = readNibbleDateString(nibbleQueue)
    licenseExpiryDate = readNibbleDateString(nibbleQueue)
    
    gender = f'{nibbleQueue.pop(0)}{nibbleQueue.pop(0)}'
    gender = 'male' if gender == '01' else 'female'
    
    # Image dimensions
    index += 3
    width = data[index]
    index += 2
    height = data[index]
    
    return DrivingLicense(
        vehicleCodes, surname, initials, PrDPCode,
        idCountryOfIssue, licenseCountryOfIssue, vehicleRestrictions,
        licenseNumber, idNumber, idNumberType, licenseCodeIssueDates,
        driverRestrictionCodes, PrDPermitExpiryDate, licenseIssueNumber,
        birthdate, licenseIssueDate, licenseExpiryDate, gender,
        width, height
    )
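
The DrivingLicense class used above is not shown in this article. A minimal stand-in, assuming it is a plain container whose fields follow the order of the positional arguments in parse_data, could look like the sketch below. The field names imageWidth and imageHeight are assumptions, and the class in the sadl package may differ. All sample values are fictitious.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DrivingLicense:
    """Hypothetical container; field order matches the parse_data call."""
    vehicleCodes: List[str]
    surname: str
    initials: str
    PrDPCode: str
    idCountryOfIssue: str
    licenseCountryOfIssue: str
    vehicleRestrictions: List[str]
    licenseNumber: str
    idNumber: str
    idNumberType: str
    licenseCodeIssueDates: List[str]
    driverRestrictionCodes: str
    PrDPermitExpiryDate: str
    licenseIssueNumber: str
    birthdate: str
    licenseIssueDate: str
    licenseExpiryDate: str
    gender: str
    imageWidth: int    # assumed name for the `width` argument
    imageHeight: int   # assumed name for the `height` argument


# Fictitious values, only to show positional construction
example = DrivingLicense(
    ['B'], 'SMITH', 'J', '', 'ZA', 'ZA', ['0'],
    '000000000000', '0000000000000', '02', ['2010/01/01'],
    '00', '', '01', '1990/01/01', '2020/01/01', '2025/01/01', 'male',
    100, 125,
)
print(example.surname, example.licenseExpiryDate)
```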

Supported Input Formats

The library supports three input formats:

Format          Method           Description
Image File      parse_file()     PDF417 barcode image (PNG, JPG, etc.)
Base64 String   parse_base64()   Base64-encoded data (encrypted or decrypted)
Raw Bytes       parse_bytes()    Raw byte array (encrypted or decrypted)

Use Cases

  • Identity Verification: Extract driver information for KYC processes
  • License Validation: Check license validity and expiry dates
  • Data Entry Automation: Auto-populate forms from driver license scans
  • Access Control: Verify driver credentials at checkpoints
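
For the license validation use case, a minimal sketch of an expiry check on the YYYY/MM/DD strings produced by the parser might look like this. The function name is_expired is hypothetical, and treating the license as still valid on the expiry date itself is an assumption.

```python
from datetime import date, datetime


def is_expired(expiry, today=None):
    """Return True if a 'YYYY/MM/DD' expiry string lies before `today`."""
    expiry_date = datetime.strptime(expiry, "%Y/%m/%d").date()
    # Assumption: the license is still valid on the expiry date itself
    return expiry_date < (today or date.today())


# Fixed reference date so the example is reproducible
reference = date(2024, 1, 1)
print(is_expired("2020/01/31", reference))
print(is_expired("2030/01/31", reference))
```

Passing the reference date explicitly keeps the check easy to test. In production code the default of date.today() would normally be used.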

Common Issues and Edge Cases

  • Barcode not detected from low-quality images: The PDF417 barcode on South African licenses is dense and can be damaged or poorly printed. If cvr.capture() returns no results, try pre-processing the image (increase contrast, crop to the barcode region) before retrying. The PT_READ_BARCODES preset used in Step 1 is a general-purpose template, so a customized template may be needed for badly damaged barcodes.
  • Raw data length mismatch: The raw PDF417 payload must be exactly 720 bytes. If you get fewer bytes, the barcode was only partially scanned, so make sure the entire barcode area is visible in the image. If you get more bytes, the barcode reader may have appended padding; trim to 720 bytes before decryption.
  • Unknown license version header: If the first 4 bytes don’t match either v1 (0x01 0xe1 0x02 0x45) or v2 (0x01 0x9b 0x09 0x45), the card may use a newer format not yet covered by the public keys. Check the GitHub repo for updates.

Source Code

https://github.com/yushulx/South-Africa-driving-license