How to Generate and Decode a PDF with Vector Barcodes

Dec 06, 2023

PDF is based on the PostScript language, and each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, vector graphics, raster images and other information needed to display it. Storing barcodes as vector graphics is better for printing: the barcodes scale without losing quality, and the files tend to be smaller than with raster images.

Vector barcodes are common in PDF files exported directly from tools like Adobe Illustrator, while raster barcode images are more common in scanned PDF files.

In this article, we are going to talk about how to generate a PDF with vector barcodes and read barcodes from it.

We use Dynamsoft Barcode Reader to read the barcodes. Because the coordinates of the bars of a vector barcode can be read directly from the PDF, decoding is faster and more accurate than decoding a raster image.

Generate a Barcode and Save it as SVG

There are many barcode generation libraries, but not many of them can export the barcode as a vector graphic such as SVG. Here, we use the Python library python-barcode to do this.

Use the following code to generate a barcode in Code 128 format:

from barcode import Code128
from barcode.writer import SVGWriter
code1 = Code128("Code 128", writer=SVGWriter())
code1.save("out")  # save() appends the extension itself, so this writes out.svg
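A quick way to confirm that the output really is vector data is to count the `<rect>` elements in the SVG. The sketch below uses only the standard library and assumes the bars are drawn as SVG rectangles. The helper name `count_rects` and the sample SVG are made up for illustration and are not real python-barcode output.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def count_rects(svg_bytes: bytes) -> int:
    """Return the number of <rect> elements in an SVG document."""
    root = ET.fromstring(svg_bytes)
    # Accept both namespaced and plain tags.
    return sum(1 for el in root.iter() if el.tag in (SVG_NS + "rect", "rect"))


# Hand-written stand-in for a generated barcode (hypothetical sample):
# one white background rectangle and two black bars.
SAMPLE_SVG = b"""<svg xmlns="http://www.w3.org/2000/svg" width="10mm" height="5mm">
  <rect x="0" y="0" width="100%" height="100%" style="fill:white"/>
  <rect x="1mm" y="1mm" width="0.2mm" height="3mm" style="fill:black"/>
  <rect x="1.4mm" y="1mm" width="0.4mm" height="3mm" style="fill:black"/>
</svg>"""

print(count_rects(SAMPLE_SVG))  # 3
```

For a real barcode, the bytes could come from writing the barcode into a `BytesIO` buffer with `write()` and passing `buffer.getvalue()` to the helper.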

Embed a Vector Barcode in a PDF Page

Next, we are going to embed the vector barcode graphic in a PDF.

We can do this using vector design tools like Adobe Illustrator and Inkscape.

[Screenshot of Inkscape]

We can also generate a PDF with code. Here, we use Python and its fpdf2 library to do this.

from io import BytesIO
from fpdf import FPDF
from barcode import Code128
from barcode.writer import SVGWriter

# Create a new PDF document
pdf = FPDF()
pdf.add_page()

pdf.set_font("helvetica", "B", 16)
pdf.cell(40, 10, "A PDF page with Code128 barcodes.")

# Generate a Code128 barcode as SVG:
svg_img_bytes = BytesIO()
code1 = Code128("Code 128", writer=SVGWriter())
code1.write(svg_img_bytes)
pdf.image(svg_img_bytes, x=10, y=50, w=100, h=70)

# Generate a second Code128 barcode as SVG:
svg_img_bytes = BytesIO()
code2 = Code128("Second Code 128", writer=SVGWriter())
code2.write(svg_img_bytes)
pdf.image(svg_img_bytes, x=10, y=120, w=100, h=70)

# Output a PDF file:
pdf.output('code128_barcode.pdf')

Read Barcodes from PDF Files

Most barcode reading libraries can only process raster images, so PDF pages usually have to be rendered into images before barcodes can be read.

However, vector barcode graphics can also be parsed directly for decoding.

For example, we can use the PyMuPDF library to read the details of the vector graphics on a page with its get_drawings method.

>>> import fitz
>>> doc = fitz.open("vector.pdf")
>>> doc[0].get_drawings()
[{'items': [('re', Rect(28.34600067138672, 141.72999572753906, 365.26654052734375, 436.3016357421875), -1)], 'type': 'f', 'even_odd': False, 'fill_opacity': 1.0, 'fill': (1.0, 1.0, 1.0), 'rect': Rect(28.34600067138672, 141.72999572753906, 365.26654052734375, 436.3016357421875), 'seqno': 1, 'layer': '', 'closePath': None, 'color': None, 'width': None, 'lineCap': None, 'lineJoin': None, 'dashes': None, 'stroke_opacity': None}, {'items': [('re', Rect(52.604278564453125, 150.07992553710938, 56.42462158203125, 275.33087158203125), -1)], 'type': 'f', 'even_odd': False, 'fill_opacity': 1.0, 'fill': (0.0, 0.0, 0.0), 'rect': Rect(52.604278564453125, 150.07992553710938, 56.42462158203125, 275.33087158203125), 'seqno': 2, 'layer': '', 'closePath': None, 'color': None, 'width': None, 'lineCap': None, 'lineJoin': None, 'dashes': None, 'stroke_opacity': None}]
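To illustrate how such a list could feed a decoder, the sketch below turns the black rectangles into alternating bar and space widths measured in modules. It is a simplified illustration, not how Dynamsoft Barcode Reader works internally. It assumes that each bar is a single axis-aligned rectangle filled with pure black, that all bars belong to one barcode on one row, and that the narrowest bar is exactly one module wide. The function name `bar_pattern` and the sample coordinates are hypothetical.

```python
def bar_pattern(drawings):
    """Return alternating bar/space widths in modules, starting with a bar."""
    # Keep only black-filled shapes; the white background rectangle is skipped.
    # rect is indexed as (x0, y0, x1, y1), so [0] and [2] are the x-extents.
    bars = sorted(
        (d["rect"][0], d["rect"][2])
        for d in drawings
        if d.get("fill") == (0.0, 0.0, 0.0)
    )
    if not bars:
        return []
    # Assumption: the narrowest bar is one module wide.
    module = min(x1 - x0 for x0, x1 in bars)
    pattern = []
    for i, (x0, x1) in enumerate(bars):
        if i > 0:
            # Width of the space between the previous bar and this one.
            pattern.append(round((x0 - bars[i - 1][1]) / module))
        pattern.append(round((x1 - x0) / module))
    return pattern


# Hypothetical data shaped like get_drawings() output (plain tuples as rects).
SAMPLE = [
    {"fill": (1.0, 1.0, 1.0), "rect": (0.0, 0.0, 30.0, 20.0)},   # background
    {"fill": (0.0, 0.0, 0.0), "rect": (10.0, 2.0, 13.0, 18.0)},
    {"fill": (0.0, 0.0, 0.0), "rect": (14.5, 2.0, 16.0, 18.0)},
    {"fill": (0.0, 0.0, 0.0), "rect": (19.0, 2.0, 20.5, 18.0)},
]

print(bar_pattern(SAMPLE))  # [2, 1, 1, 2, 1]
```

A decoder would then look up groups of these widths in the symbology's pattern table. That step is omitted here.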

Dynamsoft Barcode Reader can read vector barcodes in PDF files. This feature is enabled by default.

Here is the code to read barcodes from a PDF (you have to apply for a license to use it):

from dbr import *
error = BarcodeReader.init_license("license")
if error[0] != EnumErrorCode.DBR_OK:
    # Add your code for license error processing
    print("License error: "+ error[1])
reader = BarcodeReader()
results = reader.decode_file("code128_barcode.pdf")
if results is not None:
    # enumerate() replaces the manual counter
    for i, text_result in enumerate(results):
        print("Barcode " + str(i))
        print("Barcode Format : " + text_result.barcode_format_string)
        print("Barcode Text : " + text_result.barcode_text)

You can change how it handles PDF files by updating the PDF-related runtime settings.

settings = reader.get_runtime_settings()
settings.pdf_reading_mode = EnumPDFReadingMode.PDFRM_VECTOR
reader.update_runtime_settings(settings)

You can set the reading mode to one of the following values:

EnumPDFReadingMode.PDFRM_RASTER: render the PDF to raster images for decoding
EnumPDFReadingMode.PDFRM_VECTOR: directly read the vector barcodes
EnumPDFReadingMode.PDFRM_AUTO: if the PDF contains vector barcodes, then use the vector mode

The vector mode is faster than the raster mode:

Time elapsed decoding a PDF file:
Vector mode: 15.6253ms.
Raster mode: 84.721ms.

However, the vector mode only works for 1D barcodes. To read 2D barcodes like QR codes, we have to use the raster mode.
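To take timing measurements like the ones above, one option is a small helper built on `time.perf_counter`. This is a sketch of one possible approach, not necessarily how the figures in this article were obtained. The name `time_call_ms` is made up here, and the `sum` call is only a stand-in workload.

```python
import time


def time_call_ms(func, *args, **kwargs):
    """Call func once and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


# Stand-in workload; replace it with the decoding call to be timed.
result, elapsed = time_call_ms(sum, range(1000))
print(f"Time elapsed: {elapsed:.4f}ms")
```

With a reader configured for one mode, the call would look like `time_call_ms(reader.decode_file, "code128_barcode.pdf")`. A single run is noisy, so averaging several runs per mode gives a fairer comparison.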

Determine Whether a PDF Is a Vector PDF

Let’s talk about another issue: how to determine whether a PDF is a vector PDF.

As we know, a PDF page can contain raster images, vector graphics and text.

So if a PDF page contains text or vector graphics, it is a vector PDF.

We can use the following code to do the detection:

import fitz

doc = fitz.open("merged.pdf")

# Page numbers are reported starting from 1
for index, page in enumerate(doc, start=1):
    has_images = len(page.get_images()) > 0
    has_text = page.get_text() != ""
    has_vector_graphics = len(page.get_drawings()) > 0

    if has_images and not has_text and not has_vector_graphics:
        print("Page "+str(index)+" is raster")
    elif has_vector_graphics or has_text:
        print("Page "+str(index)+" is vector")
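The same rule can be written as a small pure function, which is easier to test and makes the blank-page case explicit. The loop above prints nothing for a page with no images, text or drawings. The function name `classify_page` and the "empty" label are additions made for this sketch.

```python
def classify_page(has_images, has_text, has_vector_graphics):
    """Classify a page from three boolean flags."""
    if has_text or has_vector_graphics:
        return "vector"
    if has_images:
        return "raster"
    # Nothing found on the page at all.
    return "empty"


print(classify_page(True, False, False))   # raster
print(classify_page(True, True, False))    # vector
print(classify_page(False, False, False))  # empty
```

The three flags can be computed from `page.get_images()`, `page.get_text()` and `page.get_drawings()` exactly as in the loop above.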

Source Code

https://github.com/tony-xlh/Vector-Barcode-PDF-Generation-and-Reading