How to Generate and Decode a PDF with Vector Barcodes
Portable Document Format (PDF) is a file format developed by Adobe to present documents. Based on the PostScript language, each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, vector graphics, raster images and other information needed to display it.
A barcode or bar code is a method of representing data in a visual, machine-readable form. It can be stored as a raster image or a vector graphic in PDF.
Barcodes in PDFs are usually embedded as raster images, and few articles cover vector barcodes. In this article, we show how to generate a PDF with vector barcodes and how to read barcodes from it.
We use Dynamsoft Barcode Reader to read the barcodes. Because the coordinates of the black and white bars of a vector barcode can be read directly from the PDF, decoding it is faster and more accurate than decoding a raster image.
Generate a Barcode and Save it as SVG
There are many barcode generation libraries, but few of them can export a barcode as a vector graphic such as SVG. Here, we use the python-barcode library for Python to do this.
Use the following code to generate a barcode in Code 128 format:
from barcode import Code128
from barcode.writer import SVGWriter
code1 = Code128("Code 128", writer=SVGWriter())
code1.save("out")  # save() appends the file extension, producing out.svg
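To check that an exported barcode really is a vector graphic, we can list the rectangles in the SVG. The following is a minimal sketch that uses only the standard library. It assumes the bars are drawn as rect elements with a fill:black style and millimetre units, which is what python-barcode's SVGWriter appears to emit. The SAMPLE_SVG string and the black_bars helper are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written sample that mimics the assumed layout of a barcode SVG
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="20mm" height="10mm">
  <g id="barcode_group">
    <rect width="100%" height="100%" style="fill:white"/>
    <rect x="1.0mm" y="1.0mm" width="0.4mm" height="8.0mm" style="fill:black;"/>
    <rect x="1.6mm" y="1.0mm" width="0.2mm" height="8.0mm" style="fill:black;"/>
    <rect x="2.4mm" y="1.0mm" width="0.6mm" height="8.0mm" style="fill:black;"/>
  </g>
</svg>"""


def _mm(value):
    """Convert a length such as '0.4mm' to a float (the unit suffix is optional)."""
    return float(value[:-2]) if value.endswith("mm") else float(value)


def black_bars(svg_text):
    """Return (x, width) for every black rect element, sorted left to right."""
    root = ET.fromstring(svg_text)
    bars = []
    for element in root.iter():
        # Ignore the XML namespace prefix when matching the tag name
        if element.tag.split("}")[-1] != "rect":
            continue
        style = element.get("style", "").replace(" ", "")
        if "fill:black" not in style:
            continue  # skips the white background rectangle
        bars.append((_mm(element.get("x", "0")), _mm(element.get("width", "0"))))
    return sorted(bars)


bars = black_bars(SAMPLE_SVG)
print(len(bars), "black bars; first bar (x, width):", bars[0])
```

To inspect a real barcode, pass the contents of the saved SVG file to black_bars instead of the sample string.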
Embed a Vector Barcode in a PDF Page
Next, we are going to embed the vector barcode graphic in a PDF.
We can do this using vector design tools like Adobe Illustrator and Inkscape.
(Figure: screenshot of Inkscape)
We can also generate the PDF with code. Here, we use Python and the fpdf2 library to do this.
from io import BytesIO
from fpdf import FPDF
from barcode import Code128
from barcode.writer import SVGWriter
# Create a new PDF document
pdf = FPDF()
pdf.add_page()
pdf.set_font("helvetica", "B", 16)
pdf.cell(40, 10, "A PDF page with Code128 barcodes.")
# Generate a Code128 barcode as SVG:
svg_img_bytes = BytesIO()
code1 = Code128("Code 128", writer=SVGWriter())
code1.write(svg_img_bytes)
pdf.image(svg_img_bytes, x=10, y=50, w=100, h=70)
# Generate a second Code128 barcode as SVG:
svg_img_bytes = BytesIO()
code2 = Code128("Second Code 128", writer=SVGWriter())
code2.write(svg_img_bytes)
pdf.image(svg_img_bytes, x=10, y=120, w=100, h=70)
# Output a PDF file:
pdf.output('code128_barcode.pdf')
Read Barcodes from PDF Files
Most barcode reading libraries can only process raster images, so the PDF pages have to be rendered into images before the barcodes can be read.
However, a vector barcode can also be decoded by parsing its graphics directly.
For example, we can use the PyMuPDF library to read the details of the vector graphics on a page with its get_drawings method.
>>> import fitz
>>> doc = fitz.open("vector.pdf")
>>> doc[0].get_drawings()
[{'items': [('re', Rect(28.34600067138672, 141.72999572753906, 365.26654052734375, 436.3016357421875), -1)], 'type': 'f', 'even_odd': False, 'fill_opacity': 1.0, 'fill': (1.0, 1.0, 1.0), 'rect': Rect(28.34600067138672, 141.72999572753906, 365.26654052734375, 436.3016357421875), 'seqno': 1, 'layer': '', 'closePath': None, 'color': None, 'width': None, 'lineCap': None, 'lineJoin': None, 'dashes': None, 'stroke_opacity': None}, {'items': [('re', Rect(52.604278564453125, 150.07992553710938, 56.42462158203125, 275.33087158203125), -1)], 'type': 'f', 'even_odd': False, 'fill_opacity': 1.0, 'fill': (0.0, 0.0, 0.0), 'rect': Rect(52.604278564453125, 150.07992553710938, 56.42462158203125, 275.33087158203125), 'seqno': 2, 'layer': '', 'closePath': None, 'color': None, 'width': None, 'lineCap': None, 'lineJoin': None, 'dashes': None, 'stroke_opacity': None}]
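The output above already contains everything needed to measure the bars. As a rough sketch of the idea (not how Dynamsoft Barcode Reader works internally), the code below turns such a list of drawings into alternating bar and space widths. It assumes a single barcode on the page, bars drawn as filled black rectangles, and that the narrowest element is one module wide. The helper names bar_spans and module_widths and the sample data are hypothetical; plain tuples stand in for PyMuPDF's Rect objects.

```python
def bar_spans(drawings, eps=1e-3):
    """Return merged (x0, x1) spans of black filled rectangles, left to right."""
    spans = []
    for path in drawings:
        fill = path.get("fill")
        if "f" not in (path.get("type") or "") or fill is None:
            continue  # not a filled path
        if any(channel > 0.1 for channel in fill):
            continue  # skip white or colored fills such as the background
        for item in path["items"]:
            if item[0] == "re":
                x0, _, x1, _ = tuple(item[1])[:4]
                spans.append((x0, x1))
    # Merge rectangles that touch, in case one bar is drawn as several rects
    merged = []
    for x0, x1 in sorted(spans):
        if merged and x0 - merged[-1][1] <= eps:
            merged[-1] = (merged[-1][0], max(merged[-1][1], x1))
        else:
            merged.append((x0, x1))
    return merged


def module_widths(spans):
    """Alternating bar/space widths as multiples of the narrowest element."""
    widths = []
    for i, (x0, x1) in enumerate(spans):
        widths.append(x1 - x0)                   # bar
        if i + 1 < len(spans):
            widths.append(spans[i + 1][0] - x1)  # space before the next bar
    if not widths:
        return []
    unit = min(widths)
    return [round(width / unit) for width in widths]


# Illustrative data in the same shape as the get_drawings() output
sample = [
    {"type": "f", "fill": (1.0, 1.0, 1.0), "items": [("re", (0, 0, 100, 50), -1)]},
    {"type": "f", "fill": (0.0, 0.0, 0.0), "items": [("re", (10, 5, 12, 45), -1)]},
    {"type": "f", "fill": (0.0, 0.0, 0.0), "items": [("re", (13, 5, 14, 45), -1)]},
    {"type": "f", "fill": (0.0, 0.0, 0.0), "items": [("re", (14, 5, 15, 45), -1)]},
    {"type": "f", "fill": (0.0, 0.0, 0.0), "items": [("re", (17, 5, 18, 45), -1)]},
]
spans = bar_spans(sample)
widths = module_widths(spans)
print(spans, widths)
```

Mapping the resulting width sequence to characters still requires the symbology's pattern table (for example, the Code 128 table), which is outside the scope of this sketch.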
Dynamsoft Barcode Reader has the ability to read vector barcodes in PDF. This function is enabled by default.
Here is the code to read barcodes from a PDF (you have to apply for a license to use it):
from dbr import *
# Initialize the license first
error = BarcodeReader.init_license("license")
if error[0] != EnumErrorCode.DBR_OK:
    # Add your code for license error processing
    print("License error: " + error[1])
reader = BarcodeReader()
results = reader.decode_file("code128_barcode.pdf")
if results is not None:
    # Print the format and text of every barcode found
    for i, text_result in enumerate(results):
        print("Barcode " + str(i))
        print("Barcode Format : " + text_result.barcode_format_string)
        print("Barcode Text : " + text_result.barcode_text)
You can change how the reader handles PDFs by updating the PDF-related runtime settings.
settings = reader.get_runtime_settings()
settings.pdf_reading_mode = EnumPDFReadingMode.PDFRM_VECTOR
reader.update_runtime_settings(settings)
You can set the reading mode to one of the following values:
- EnumPDFReadingMode.PDFRM_RASTER: render the PDF to raster images for decoding
- EnumPDFReadingMode.PDFRM_VECTOR: directly read the vector barcodes
- EnumPDFReadingMode.PDFRM_AUTO: if the PDF contains vector barcodes, then use the vector mode
The vector mode is faster than the raster mode:
Time elapsed decoding a PDF file:
Vector mode: 15.6253ms.
Raster mode: 84.721ms.
However, the vector mode only works for 1D barcodes. To read 2D barcodes such as QR codes, we have to use the raster mode.
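One way to combine the two modes is to try the vector mode first and fall back to the raster mode when nothing is found, timing each attempt along the way. The following is a minimal sketch of that idea. The decode_with_fallback helper is hypothetical; it only relies on the get_runtime_settings, update_runtime_settings and decode_file calls shown above. StubReader is a stand-in so that the sketch runs without the SDK or a license, and its string modes replace the EnumPDFReadingMode values.

```python
import time


def decode_with_fallback(reader, path, modes):
    """Try each PDF reading mode in order.

    Returns (mode, results, elapsed_ms) for the first mode that finds a barcode,
    or (None, [], 0.0) if no mode finds one.
    """
    for mode in modes:
        settings = reader.get_runtime_settings()
        settings.pdf_reading_mode = mode
        reader.update_runtime_settings(settings)
        start = time.perf_counter()
        results = reader.decode_file(path)
        elapsed_ms = (time.perf_counter() - start) * 1000
        if results:  # covers both None and an empty list
            return mode, results, elapsed_ms
    return None, [], 0.0


class _StubSettings:
    pdf_reading_mode = None


class StubReader:
    """Stand-in for BarcodeReader: pretends that only the raster mode finds a barcode."""

    def __init__(self):
        self._settings = _StubSettings()

    def get_runtime_settings(self):
        return self._settings

    def update_runtime_settings(self, settings):
        self._settings = settings

    def decode_file(self, path):
        return ["QR_CODE"] if self._settings.pdf_reading_mode == "raster" else None


mode, results, elapsed_ms = decode_with_fallback(StubReader(), "qr.pdf", ["vector", "raster"])
print(mode, results)
```

With the real SDK, the call would presumably look like decode_with_fallback(BarcodeReader(), "file.pdf", [EnumPDFReadingMode.PDFRM_VECTOR, EnumPDFReadingMode.PDFRM_RASTER]).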
Determine whether a PDF is a Vector PDF
Let’s talk about another issue: how to determine whether a PDF is a vector PDF.
A PDF page can contain raster images, vector graphics, and text.
If a page contains text or vector graphics, we treat it as a vector page. If it contains only images, we treat it as a raster page.
We can use the following code to do the detection:
import fitz  # PyMuPDF
doc = fitz.open("merged.pdf")
for index, page in enumerate(doc, start=1):
    # Check which kinds of content the page contains
    has_images = len(page.get_images()) > 0
    has_text = page.get_text() != ""
    has_vector_graphics = len(page.get_drawings()) > 0
    if has_images and not has_text and not has_vector_graphics:
        print("Page " + str(index) + " is raster")
    elif has_vector_graphics or has_text:
        print("Page " + str(index) + " is vector")
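The decision rule itself does not depend on PyMuPDF, so it can be isolated in a small function that is easy to test. The sketch below is one possible way to do that. The classify_page name and the extra "empty" result for pages with no content at all are additions for illustration and are not part of the code above.

```python
def classify_page(has_images, has_text, has_vector_graphics):
    """Classify a page from three content flags."""
    if has_text or has_vector_graphics:
        return "vector"
    if has_images:
        return "raster"
    return "empty"  # no images, text, or drawings were found


# Try the rule on a few flag combinations: (images, text, vector graphics)
for flags in [(True, False, False), (True, True, False), (False, False, False)]:
    print(flags, "->", classify_page(*flags))
```

With PyMuPDF, the three flags can be computed from page.get_images(), page.get_text() and page.get_drawings() exactly as in the loop above. Note that a scanned page with an OCR text layer contains text and would therefore be classified as vector by this rule.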
Source Code
https://github.com/tony-xlh/Vector-Barcode-PDF-Generation-and-Reading