How to Decode, Decrypt and Parse South African Driving License in Python
If you are looking for an official specification of the South African driving license, you may be disappointed. The only references available are a Stack Overflow Q&A, an incomplete document titled "ZA Drivers License format", and an open-source C# project. The Stack Overflow Q&A provides the RSA public keys for decrypting the data encoded in the PDF417 barcode, and the incomplete document helps with parsing the decrypted data. In this article, I will show you how to decode, decrypt and parse a South African driving license in Python.
Decode South African Driving License from PDF417 Barcode
- Download Dynamsoft Barcode Reader SDK:
pip install dbr
The SDK can be installed on any desktop or server operating system that supports Python 3.6 or later.
- Get a license key and then initialize the barcode reader as follows:
from dbr import *

BarcodeReader.init_license("DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==")
reader = BarcodeReader()
- Decode PDF417 to get the raw bytes of the driving license data:
results = reader.decode_file(image_file)
if results != None and len(results) > 0:
    return results[0].barcode_bytes
Dynamsoft Barcode Reader SDK takes care of locating and decoding the PDF417 barcode. The next step is to decrypt the data with the RSA public keys.
Decrypt Driving License Data with RSA Public Key
A valid payload decoded from the PDF417 barcode contains 720 bytes. According to the Stack Overflow Q&A, there are two versions of the South African driving license, identified by the first four bytes:
v1 = [0x01, 0xe1, 0x02, 0x45]
v2 = [0x01, 0x9b, 0x09, 0x45]
The next two bytes are zeros, so the encrypted payload contains 714 bytes. The 714 bytes form 6 blocks: 5 blocks of 128 bytes and 1 block of 74 bytes. The first 5 blocks are decrypted with one RSA public key, and the last block is decrypted with a different RSA public key, so each version has two keys. The RSA public keys are provided in the Stack Overflow Q&A.
pk_v1_128 = '''
-----BEGIN RSA PUBLIC KEY-----
MIGXAoGBAP7S4cJ+M2MxbncxenpSxUmBOVGGvkl0dgxyUY1j4FRKSNCIszLFsMNwx2XWXZg8H53gpCsxDMwHrncL0rYdak3M6sdXaJvcv2CEePrzEvYIfMSWw3Ys9cRlHK7No0mfrn7bfrQOPhjrMEFw6R7VsVaqzm9DLW7KbMNYUd6MZ49nAhEAu3l//ex/nkLJ1vebE3BZ2w==
-----END RSA PUBLIC KEY-----
'''
pk_v1_74 = '''
-----BEGIN RSA PUBLIC KEY-----
MGACSwD/POxrX0Djw2YUUbn8+u866wbcIynA5vTczJJ5cmcWzhW74F7tLFcRvPj1tsj3J221xDv6owQNwBqxS5xNFvccDOXqlT8MdUxrFwIRANsFuoItmswz+rfY9Cf5zmU=
-----END RSA PUBLIC KEY-----
'''
pk_v2_128 = '''
-----BEGIN RSA PUBLIC KEY-----
MIGWAoGBAMqfGO9sPz+kxaRh/qVKsZQGul7NdG1gonSS3KPXTjtcHTFfexA4MkGAmwKeu9XeTRFgMMxX99WmyaFvNzuxSlCFI/foCkx0TZCFZjpKFHLXryxWrkG1Bl9++gKTvTJ4rWk1RvnxYhm3n/Rxo2NoJM/822Oo7YBZ5rmk8NuJU4HLAhAYcJLaZFTOsYU+aRX4RmoF
-----END RSA PUBLIC KEY-----
'''
pk_v2_74 = '''
-----BEGIN RSA PUBLIC KEY-----
MF8CSwC0BKDfEdHKz/GhoEjU1XP5U6YsWD10klknVhpteh4rFAQlJq9wtVBUc5DqbsdI0w/bga20kODDahmGtASy9fae9dobZj5ZUJEw5wIQMJz+2XGf4qXiDJu0R2U4Kw==
-----END RSA PUBLIC KEY-----
'''
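The block layout described above can be sketched in a few lines. This is only an illustration and not part of the original project: the helper name `split_license_blocks` and the dummy all-zero payload are my own assumptions, and the version headers are repeated so that the snippet runs on its own.

```python
# Version headers from the article, repeated so the snippet is self-contained
V1 = [0x01, 0xe1, 0x02, 0x45]
V2 = [0x01, 0x9b, 0x09, 0x45]

def split_license_blocks(data):
    """Hypothetical helper: return (version, blocks) for a 720-byte payload."""
    if len(data) != 720:
        raise ValueError('expected 720 bytes, got %d' % len(data))
    header = list(data[0:4])
    if header == V1:
        version = 1
    elif header == V2:
        version = 2
    else:
        raise ValueError('unknown version header')
    # Bytes 4 and 5 are zeros, so the encrypted payload starts at offset 6
    payload = data[6:]
    blocks = [payload[i * 128:(i + 1) * 128] for i in range(5)]
    blocks.append(payload[5 * 128:])  # the remaining 74 bytes
    return version, blocks

# Dummy input: v2 header, two zero bytes, 714 filler bytes
sample = bytes(V2) + bytes(2) + bytes(714)
version, blocks = split_license_blocks(sample)
print(version, [len(b) for b in blocks])  # → 2 [128, 128, 128, 128, 128, 74]
```

Checking the total length and the header before decrypting makes it easier to reject barcodes that are not driving licenses at all.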
The following steps demonstrate how to use the RSA public key to decrypt the data:
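- Load the RSA public keys from the PEM format. Here pk128 and pk74 are the two keys that match the version header:

import rsa

def decrypt_data(data):
    # Pick the key pair by comparing the first four bytes with v1 and v2
    header = list(data[0:4])
    if header == v1:
        pk128, pk74 = pk_v1_128, pk_v1_74
    elif header == v2:
        pk128, pk74 = pk_v2_128, pk_v2_74
    else:
        raise ValueError('Unknown license version')

    pubKey = rsa.PublicKey.load_pkcs1(pk128)  # key for the 128-byte blocks
    pubKey = rsa.PublicKey.load_pkcs1(pk74)   # key for the 74-byte block

- Convert each block byte array to a big integer, and use the exponent e and modulus n to calculate the decrypted value.

    decrypted_data = bytearray()

    # The first 5 blocks: 128 bytes each, starting after the 6-byte header
    pubKey = rsa.PublicKey.load_pkcs1(pk128)
    start = 6
    for i in range(5):
        block = data[start: start + 128]
        value = int.from_bytes(block, byteorder='big', signed=False)
        result = pow(value, pubKey.e, mod=pubKey.n)
        decrypted_bytes = result.to_bytes(128, byteorder='big', signed=False)
        decrypted_data += decrypted_bytes
        start = start + 128

    # The last block: 74 bytes, decrypted with the second key
    pubKey = rsa.PublicKey.load_pkcs1(pk74)
    block = data[start: start + 74]
    value = int.from_bytes(block, byteorder='big', signed=False)
    result = pow(value, pubKey.e, mod=pubKey.n)
    decrypted_bytes = result.to_bytes(74, byteorder='big', signed=False)
    decrypted_data += decrypted_bytes

    return decrypted_data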
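The per-block calculation is a plain modular exponentiation, which is easier to see with a toy key. The sketch below uses the common textbook values n = 3233, e = 17, d = 2753 (p = 61, q = 53); these numbers and the helper name `public_op` are purely illustrative assumptions and have nothing to do with the real license keys.

```python
# Textbook toy key (p = 61, q = 53); far too small for real use
n, e, d = 3233, 17, 2753

def public_op(block, e, n, size):
    """Raw RSA public-key operation on one big-endian block."""
    value = int.from_bytes(block, byteorder='big', signed=False)
    return pow(value, e, n).to_bytes(size, byteorder='big', signed=False)

message = b'\x00A'  # the integer 65, which is smaller than n

# The issuing side would apply the private exponent d ...
encrypted = pow(int.from_bytes(message, 'big'), d, n).to_bytes(2, 'big')

# ... and the reader recovers the block with the public exponent e
recovered = public_op(encrypted, e, n, 2)
print(recovered == message)  # → True
```

The output size passed to `to_bytes` plays the same role as the 128 and 74 in `decrypt_data`: it keeps leading zero bytes so that every decrypted block has a fixed length.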
With the decrypted bytes in hand, we can start parsing the driving license information.
Parse South African Driving License
The decrypted data consists of 4 sections: header, strings, binary data, and image data. We can skip to the strings section by finding the byte 0x82.
index = 0
for i in range(0, len(data)):
    if data[i] == 0x82:
        index = i
        break
The next byte needs to be ignored, so the payload starts from index + 2. The strings are delimited by hex 0xe0 and 0xe1. 0xe1 is not only a delimiter but also represents an empty string. For example, 41 e0 42 e1 e1 means A,B.
We create two functions: one reads a single string and the other reads multiple strings.
def readStrings(data, index, length):
    strings = []
    i = 0
    while i < length:
        value = ''
        while True:
            currentByte = data[index]
            index += 1
            if currentByte == 0xe0:
                break
            elif currentByte == 0xe1:
                if value != '':
                    i += 1
                break
            value += chr(currentByte)
        i += 1
        if value != '':
            strings.append(value)
    return strings, index

def readString(data, index):
    value = ''
    delimiter = 0xe0
    while True:
        currentByte = data[index]
        index += 1
        if currentByte == 0xe0 or currentByte == 0xe1:
            delimiter = currentByte
            break
        value += chr(currentByte)
    return value, index, delimiter
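To see the delimiter rule at work on the example bytes 41 e0 42 e1 e1, here is a small standalone sketch. It is not the code used in the project: the name `split_fields` is hypothetical, and unlike readStrings it keeps the empty fields in the result so that the role of 0xe1 is visible.

```python
def split_fields(raw, start, count):
    """Hypothetical helper: read `count` fields beginning at `start`.

    0xe0 ends the current field. 0xe1 stands for an empty field and
    also ends any field that is still open.
    """
    fields = []
    value = ''
    pos = start
    while len(fields) < count:
        current = raw[pos]
        pos += 1
        if current == 0xe0:
            fields.append(value)
            value = ''
        elif current == 0xe1:
            if value != '':
                fields.append(value)
                value = ''
            fields.append('')
        else:
            value += chr(current)
    return fields, pos

fields, pos = split_fields(bytes.fromhex('41e042e1e1'), 0, 4)
print(fields, pos)  # → ['A', 'B', '', ''] 5
```

Four slots are consumed by five bytes: A, B and two empty strings. Dropping the empty entries gives the A,B from the example.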
Then read all strings one by one.
def parse_data(data):
    # Locate the 0x82 marker in front of the strings section (same loop as above)
    index = 0
    for i in range(0, len(data)):
        if data[i] == 0x82:
            index = i
            break

    # Skip the marker and the byte that follows it
    vehicleCodes, index = readStrings(data, index + 2, 4)
    surname, index, delimiter = readString(data, index)
    initials, index, delimiter = readString(data, index)
    PrDPCode = ''
    if delimiter == 0xe0:
        PrDPCode, index, delimiter = readString(data, index)
    idCountryOfIssue, index, delimiter = readString(data, index)
    licenseCountryOfIssue, index, delimiter = readString(data, index)
    vehicleRestrictions, index = readStrings(data, index, 4)
    licenseNumber, index, delimiter = readString(data, index)

    # The ID number has a fixed length of 13 characters
    idNumber = ''
    for i in range(13):
        idNumber += chr(data[index])
        index += 1
From the binary data section, we can get the date of birth, date of license issue, date of license expiry and gender.
    # Still inside parse_data()
    idNumberType = f'{data[index]:02d}'
    index += 1
    nibbleQueue = []
    while True:
        currentByte = data[index]
        index += 1
        if currentByte == 0x57:
            break
        nibbles = [currentByte >> 4, currentByte & 0x0f]
        nibbleQueue += nibbles
    licenseCodeIssueDates = readNibbleDateList(nibbleQueue, 4)
    driverRestrictionCodes = f'{nibbleQueue.pop(0)}{nibbleQueue.pop(0)}'
    PrDPermitExpiryDate = readNibbleDateString(nibbleQueue)
    licenseIssueNumber = f'{nibbleQueue.pop(0)}{nibbleQueue.pop(0)}'
    birthdate = readNibbleDateString(nibbleQueue)
    licenseIssueDate = readNibbleDateString(nibbleQueue)
    licenseExpiryDate = readNibbleDateString(nibbleQueue)
    gender = f'{nibbleQueue.pop(0)}{nibbleQueue.pop(0)}'
    if gender == '01':
        gender = 'male'
    else:
        gender = 'female'
The functions for nibble date are as follows:
def readNibbleDateString(nibbleQueue):
    m = nibbleQueue.pop(0)
    if m == 10:
        return ''
    c = nibbleQueue.pop(0)
    d = nibbleQueue.pop(0)
    y = nibbleQueue.pop(0)
    m1 = nibbleQueue.pop(0)
    m2 = nibbleQueue.pop(0)
    d1 = nibbleQueue.pop(0)
    d2 = nibbleQueue.pop(0)
    return f'{m}{c}{d}{y}/{m1}{m2}/{d1}{d2}'

def readNibbleDateList(nibbleQueue, length):
    dateList = []
    for i in range(length):
        dateString = readNibbleDateString(nibbleQueue)
        if dateString != '':
            dateList.append(dateString)
    return dateList
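A short worked example may help to picture the nibble encoding. The sketch below is a standalone variant of the functions above, not the project code: the names `to_nibbles` and `read_date` and the sample bytes are assumptions made for illustration, and a `collections.deque` replaces the list so that removing from the front stays cheap.

```python
from collections import deque

def to_nibbles(raw):
    """Split every byte into its high and low 4-bit halves."""
    queue = deque()
    for byte in raw:
        queue.append(byte >> 4)
        queue.append(byte & 0x0f)
    return queue

def read_date(queue):
    """Read one date: a single 0xa nibble is an empty date, otherwise 8 digits."""
    if queue[0] == 0x0a:
        queue.popleft()
        return ''
    digits = ''.join(str(queue.popleft()) for _ in range(8))
    return f'{digits[0:4]}/{digits[4:6]}/{digits[6:8]}'

# Made-up sample: one empty date (nibble a) followed by 1990/01/15
queue = to_nibbles(bytes.fromhex('a199001150'))
first = read_date(queue)
second = read_date(queue)
print(repr(first), second)  # → '' 1990/01/15
```

Because an empty date occupies only one nibble, the dates are not byte-aligned, which is why the bytes are split into a queue of nibbles before any field is read.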
The layout of the final image data section is still unknown.
Read South African Driving License from Image File, Byte Array or Base64 String
We can read a South African driving license from an image file, a byte array or a Base64 string. The byte array and Base64 string may contain either encrypted or already decrypted data.
- Image file:
BarcodeReader.init_license(key)
reader = BarcodeReader()
results = reader.decode_file(source)
if results != None and len(results) > 0:
    data = results[0].barcode_bytes
    if data == None or len(data) != 720:
        return None
    return parse_bytes(data, encrypted)
- Byte array:
from pathlib import Path

data = Path(source).read_bytes()
if len(data) != 720 and encrypted == True:
    return None
if encrypted:
    data = decrypt_data(data)
return parse_data(data)
- Base64 string:
import base64

with open(source, 'r') as f:
    source = f.read()
data = base64.b64decode(source)
if len(data) != 720 and encrypted == True:
    return None
if encrypted:
    data = decrypt_data(data)
return parse_data(data)
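The byte array and Base64 branches differ only in how the file is read, so they can share one loader. The following is a minimal sketch under my own assumptions: the function name `load_payload` is hypothetical, and the type numbers 2 and 3 simply mirror the command-line option introduced next. It stops before decryption and parsing so that it can run on its own.

```python
import base64
import os
import tempfile
from pathlib import Path

def load_payload(source, source_type, encrypted=True):
    """Hypothetical loader: 2 = Base64 text file, 3 = raw bytes file."""
    if source_type == 2:
        data = base64.b64decode(Path(source).read_text())
    elif source_type == 3:
        data = Path(source).read_bytes()
    else:
        raise ValueError('unsupported source type')
    # Encrypted input must have the full barcode length of 720 bytes
    if encrypted and len(data) != 720:
        return None
    return data

# Try it with a dummy Base64 file holding 720 zero bytes
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'license.txt')
    Path(path).write_text(base64.b64encode(bytes(720)).decode('ascii'))
    payload = load_payload(path, 2)
    print(len(payload))  # → 720
```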
Use ArgumentParser to parse command-line arguments for the different input types.
parser = argparse.ArgumentParser(description='Decode, decrypt and parse South Africa driving license.')
parser.add_argument('source', help='A source file containing information of driving license.')
parser.add_argument('-t', '--types', default=1, type=int, help='Specify the source type. 1: PDF417 image 2: Base64 string 3: Raw bytes')
parser.add_argument('-e', '--encrypted', default=1, type=int, help='Is the source encrypted? 0: No 1: Yes')
parser.add_argument('-l', '--license', default='', type=str, help='The license key is required for decoding PDF417')
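To check how the options combine without opening a shell, `parse_args` can be given an explicit argument list. The snippet below repeats the parser with shortened help strings so that it is self-contained; the file name `license.txt` is just a placeholder.

```python
import argparse

parser = argparse.ArgumentParser(description='Decode, decrypt and parse South Africa driving license.')
parser.add_argument('source', help='A source file containing information of driving license.')
parser.add_argument('-t', '--types', default=1, type=int, help='1: PDF417 image 2: Base64 string 3: Raw bytes')
parser.add_argument('-e', '--encrypted', default=1, type=int, help='0: No 1: Yes')
parser.add_argument('-l', '--license', default='', type=str, help='License key for decoding PDF417')

# Simulate: <script> license.txt -t 2 -e 0
args = parser.parse_args(['license.txt', '-t', '2', '-e', '0'])
print(args.source, args.types, bool(args.encrypted), repr(args.license))
# → license.txt 2 False ''
```

Since the license key is only needed for source type 1, leaving `--license` empty is fine when the input is a Base64 string or raw bytes.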