How to Build Flet Chat App with Barcode and Gemini APIs
Gemini is Google’s latest AI model. At the time of writing, it can be used for free at up to 60 queries per minute, and it can recognize text in images. 1D barcodes are usually accompanied by human-readable text, which can be used to verify barcode recognition results. In this article, we use the Flet Python API to build a desktop chat app that integrates both the barcode and Gemini APIs. The app reads barcodes from images with Dynamsoft Barcode Reader and uses Gemini to recognize the text in the same images.
This article is Part 8 in an 11-Part Series.
- Part 1 - Detecting and Decoding QR Codes in Python with YOLO and Dynamsoft Barcode Reader
- Part 2 - How to a GUI Barcode Reader with Qt PySide6 on Raspberry Pi
- Part 3 - Advanced GUI Python Barcode and QR Code Reader for Windows, Linux, macOS and Raspberry Pi OS
- Part 4 - Advanced QR Code Recognition: Handling Inverted Colors, Perspective Distortion, and Grayscale Images
- Part 5 - Scanning QR Code from Desktop Screen with Qt and Python Barcode SDK
- Part 6 - Building an Online Barcode and QR Code Scanning App with Python Django
- Part 7 - Real-Time Barcode and QR Code Scanning with Webcam, OpenCV, and Python
- Part 8 - How to Build Flet Chat App with Barcode and Gemini APIs
- Part 9 - Comparing Barcode Scanning in Python: ZXing vs. ZBar vs. Dynamsoft Barcode Reader
- Part 10 - Python Ctypes: Invoking C/C++ Shared Library and Native Threading
- Part 11 - A Guide to Running ARM32 and ARM64 Python Barcode Readers in Docker Containers
Installation
pip install -U google-generativeai dbr flet
Prerequisites
- A license key for Dynamsoft Barcode Reader
- An API key for Gemini
Flet Python API for Desktop Applications
Flet empowers developers to create desktop applications using Python. It offers a crash course for constructing a real-time chat application, which serves as an excellent starting point.
Our application features a list view for displaying chat messages, a text input field, a button for uploading images, a button for sending messages, and a button to clear the chat history.
- Chat messages:
  chat = ft.ListView(
      expand=True,
      spacing=10,
      auto_scroll=True,
  )
- Text input field:
  new_message = ft.TextField(
      hint_text="Write a message...",
      autofocus=True,
      shift_enter=True,
      min_lines=1,
      max_lines=5,
      filled=True,
      expand=True,
      on_submit=send_message_click,
  )
- Button to load an image:
  def pick_files_result(e: ft.FilePickerResultEvent):
      global image_path
      image_path = None
      if e.files is not None:
          image_path = e.files[0].path
          # TODO

  def pick_file(e):
      pick_files_dialog.pick_files()

  pick_files_dialog = ft.FilePicker(on_result=pick_files_result)
  page.overlay.append(pick_files_dialog)

  ft.IconButton(
      icon=ft.icons.UPLOAD_FILE,
      tooltip="Pick an image",
      on_click=pick_file,
  )
- Button to send a message:
  def on_message(message: Message):
      if message.message_type == "chat_message":
          m = ChatMessage(message)
          chat.controls.append(m)
      page.update()

  page.pubsub.subscribe(on_message)

  def send_message_click(e):
      global image_path
      if new_message.value != "":
          page.pubsub.send_all(
              Message("Me", new_message.value, message_type="chat_message"))
          question = new_message.value
          new_message.value = ""
          new_message.focus()
          page.update()
          page.pubsub.send_all(
              Message("Gemini", "Thinking...", message_type="chat_message"))
          # TODO

  ft.IconButton(
      icon=ft.icons.SEND_ROUNDED,
      tooltip="Send message",
      on_click=send_message_click,
  ),
  PubSub facilitates asynchronous communication across page sessions. The subscribe method enables the receipt of broadcast messages from other sessions, while the send_all method allows for sending messages to all active sessions. Whenever a new message is received, the list view is automatically updated to display this new message.
- Button to clear the chat history:
  def clear_message(e):
      global image_path
      image_path = None
      chat.controls.clear()
      page.update()

  ft.IconButton(
      icon=ft.icons.CLEAR_ALL,
      tooltip="Clear all messages",
      on_click=clear_message,
  )
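To make the subscribe and send_all flow more concrete, here is a minimal in-memory sketch of the same pattern that runs without Flet. The MiniPubSub class and the Message field names (user_name, text) are hypothetical stand-ins chosen for illustration; the real page.pubsub object and the Message class used in the app may differ.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    # Field names are assumptions made for this sketch.
    user_name: str
    text: str
    message_type: str
    is_image: bool = False


class MiniPubSub:
    """Hypothetical broadcast channel; not Flet's implementation."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[Message], None]] = []

    def subscribe(self, handler: Callable[[Message], None]) -> None:
        # Register a callback that receives every broadcast message.
        self._handlers.append(handler)

    def send_all(self, message: Message) -> None:
        # Deliver the message to every registered callback.
        for handler in list(self._handlers):
            handler(message)


chat_log: List[str] = []


def on_message(message: Message) -> None:
    # Mirror the app's handler: only chat messages reach the list.
    if message.message_type == "chat_message":
        chat_log.append(f"{message.user_name}: {message.text}")


pubsub = MiniPubSub()
pubsub.subscribe(on_message)
pubsub.send_all(Message("Me", "hello", message_type="chat_message"))
pubsub.send_all(Message("Gemini", "Thinking...", message_type="chat_message"))
print(chat_log)
```

In the app, the handler appends a ChatMessage control to the list view instead of a string, but the publish and subscribe roles are the same.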
Integrating the Dynamsoft Barcode Reader
The Dynamsoft Barcode Reader is an efficient library designed for barcode scanning. To enable barcode scanning in your app, you must integrate this library. Here’s how you can do it:
- Import the Dynamsoft Barcode Reader library and initialize a barcode reader instance using your license key.
  from dbr import *

  license_key = "LICENSE-KEY"
  BarcodeReader.init_license(license_key)
  reader = BarcodeReader()
- Decode the barcode from the uploaded image and send the result to the chat.
  def pick_files_result(e: ft.FilePickerResultEvent):
      global image_path, barcode_text
      barcode_text = None
      image_path = None
      if e.files is not None:
          image_path = e.files[0].path
          page.pubsub.send_all(
              Message("Me", image_path, message_type="chat_message", is_image=True))

          text_results = None
          try:
              text_results = reader.decode_file(image_path)
          except BarcodeReaderError as bre:
              print(bre)

          if text_results is not None:
              barcode_text = text_results[0].barcode_text
              page.pubsub.send_all(
                  Message("DBR", barcode_text, message_type="chat_message"))
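The handler above takes the first entry of the decode results after checking for None. As a small defensive sketch, the helper below treats None and an empty sequence the same way. The name first_barcode_text and the FakeResult stand-in are hypothetical, and whether the SDK can ever return an empty list is an assumption not confirmed by this article; the stand-in exists only so the example runs without the dbr package.

```python
from collections import namedtuple
from typing import Optional, Sequence

# Stand-in for the SDK's result objects (assumption: each result
# exposes a barcode_text attribute, as in the handler above).
FakeResult = namedtuple("FakeResult", ["barcode_text"])


def first_barcode_text(results: Optional[Sequence]) -> Optional[str]:
    """Return the text of the first result, or None if nothing was decoded."""
    if not results:
        # Covers both None and an empty sequence.
        return None
    return results[0].barcode_text


print(first_barcode_text(None))
print(first_barcode_text([FakeResult("9781234567897")]))
```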
Utilizing Google’s Gemini AI for Text Recognition
Gemini can extract text from images. Once you’ve decoded a barcode, you can employ Gemini to verify the accuracy of the text decoded from the barcode. Here are the steps to use Gemini:
- Set up the API key for Gemini.
  import google.generativeai as genai
  import google.ai.generativelanguage as glm

  genai.configure(api_key='API-KEY')
- Initialize the text and vision models. The vision model takes both text and images as input.
  model_text = genai.GenerativeModel('gemini-pro')
  chat_text = model_text.start_chat(history=[])

  model_vision = genai.GenerativeModel('gemini-pro-vision')
  chat_vision = model_vision.start_chat(history=[])
- Customize the command to effectively recognize text from the barcode image. When the user types :verify, the app replaces it with a text recognition prompt and sends the prompt together with the image bytes to the vision model.
  import pathlib  # needed for reading the image bytes

  def send_message_click(e):
      global image_path
      if new_message.value != "":
          ...
          if question == ":verify":
              question = "recognize text around the barcode"
              response = model_vision.generate_content(
                  glm.Content(
                      parts=[
                          glm.Part(text=question),
                          glm.Part(
                              inline_data=glm.Blob(
                                  mime_type='image/jpeg',
                                  data=pathlib.Path(image_path).read_bytes()
                              )
                          ),
                      ],
                  ))
              text = response.text
              page.pubsub.send_all(
                  Message("Gemini", text, message_type="chat_message"))
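The request above hard-codes the MIME type as image/jpeg. If the app should also accept other image formats, one possible approach is to derive the MIME type from the file extension with the standard library. This is a sketch only: guess_image_mime_type is a hypothetical helper, and which MIME types the Gemini vision model accepts is not covered in this article.

```python
import mimetypes
from pathlib import Path


def guess_image_mime_type(image_path: str, default: str = "image/jpeg") -> str:
    """Guess an image MIME type from the file extension, with a fallback."""
    mime_type, _ = mimetypes.guess_type(Path(image_path).name)
    if mime_type and mime_type.startswith("image/"):
        return mime_type
    # Unknown or non-image extension: fall back to the article's default.
    return default


print(guess_image_mime_type("barcode.png"))
print(guess_image_mime_type("photo.jpg"))
```

The returned value could then be passed as the mime_type argument in place of the literal string.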
Verifying the Barcode Decoding Results with the Accompanying Text
Now, we can check whether the text read from the barcode exists in the text recognized from the image. Since the text extracted by Gemini might include spaces, it’s essential to eliminate these spaces prior to comparison.
if barcode_text is None:
    return

text = text.replace(" ", "")
if text.find(barcode_text) != -1:
    page.pubsub.send_all(
        Message("Gemini", barcode_text + " is correct ✓", message_type="chat_message"))
else:
    page.pubsub.send_all(
        Message("Gemini", barcode_text + " may not be correct", message_type="chat_message"))
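The comparison above removes plain spaces only. Recognized text may also contain newlines or tabs, so a slightly more tolerant variant can strip all whitespace before checking for the barcode value. The helper below is a minimal sketch under that assumption; barcode_matches_text is a hypothetical name and not part of the app's code.

```python
import re
from typing import Optional


def _strip_whitespace(value: str) -> str:
    # Remove spaces, tabs, and line breaks in one pass.
    return re.sub(r"\s+", "", value)


def barcode_matches_text(barcode_text: Optional[str], recognized_text: str) -> bool:
    """Return True if the barcode value appears in the recognized text."""
    if not barcode_text:
        return False
    return _strip_whitespace(barcode_text) in _strip_whitespace(recognized_text)


print(barcode_matches_text("9781234567897", "ISBN 978 1234\n567897"))
```

This still performs an exact substring match, so characters that the model misreads (for example O for 0) would make the check fail.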
Launch the desktop application and test it with some images that contain 1D barcodes:
flet run chatbot.py