How to OCR PDF in a .NET Desktop Application

Jun 27, 2014

Convert a PDF file to an image and read text from the image in C# / VB.NET

Recently, several customers have asked us whether Dynamic .NET TWAIN can convert a PDF file to an image and then extract text from it, all within a .NET desktop app. In this article I'll provide some samples that show how to do this using Dynamic .NET TWAIN with our PDF Rasterizer and OCR add-ons. The solution works in both WinForms and WPF applications.

All the samples provided below (both C# and VB.NET) are included in the 30-day free trial installer of Dynamic .NET TWAIN.

Read a PDF file and convert it to an image

PDF Rasterizer

With the PDF Rasterizer add-on of Dynamic .NET TWAIN, you can load PDF files from your local disk as images and display them in the Dynamic .NET TWAIN viewer. You can also specify the resolution of the resulting image.

C# code snippet

private void btnLoadPDF_Click(object sender, EventArgs e)
{
    try
    {
        OpenFileDialog openfiledlg = new OpenFileDialog();
        openfiledlg.Filter = "PDF|*.PDF";
        openfiledlg.FilterIndex = 0;
        openfiledlg.Multiselect = true;
        if (openfiledlg.ShowDialog() == DialogResult.OK)
        {
            foreach (string strfilename in openfiledlg.FileNames)
            {
                // Convert each selected PDF to images at the resolution chosen in the combo box
                this.dynamicDotNetTwain1.ConvertPDFToImage(strfilename, float.Parse(cmbPDFResolution.SelectedItem.ToString()));
            }
        }
    }
    catch (Exception exp)
    {
        MessageBox.Show(exp.Message);
    }
}
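
The handler above passes the combo box value straight to float.Parse, so an empty selection or a culture-specific decimal separator ends up in the catch block. As a minimal sketch of a more forgiving approach, the helper below parses the value with the invariant culture and falls back to a default. The class name ResolutionInput and the 200 DPI fallback are assumptions made for illustration; they are not part of Dynamic .NET TWAIN.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of Dynamic .NET TWAIN.
public static class ResolutionInput
{
    // Assumed fallback resolution, used when the selection is unusable.
    public const float DefaultDpi = 200f;

    // Returns the selected resolution in DPI, or DefaultDpi when the item
    // is null, not numeric, not positive, or not finite.
    public static float Parse(object selectedItem)
    {
        string text = Convert.ToString(selectedItem, CultureInfo.InvariantCulture);
        float dpi;
        bool ok = float.TryParse(text, NumberStyles.Float,
                                 CultureInfo.InvariantCulture, out dpi);
        return (ok && dpi > 0f && !float.IsInfinity(dpi)) ? dpi : DefaultDpi;
    }
}
```

In the handler, the float.Parse(...) argument could then be replaced with ResolutionInput.Parse(cmbPDFResolution.SelectedItem), which keeps a missing selection from aborting the whole batch of files.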

Perform OCR

With the OCR add-on of Dynamic .NET TWAIN, you can extract text from images in your .NET application and save the OCR result as a text file or as a PDF file (text-only or image-over-text). The .NET OCR SDK supports more than 40 languages, including Arabic and Chinese.

C# code snippet

private void OCR(bool isOcrOnRectangleArea)
{
    string languageFolder = m_strCurrentDirectory;
    if (m_bSamplesExist)
    {
        languageFolder = m_strCurrentDirectory + @"Samples\Bin\";
    }
    // Specify the tessdata folder with the language package
    this.dynamicDotNetTwain1.OCRTessDataPath = languageFolder;
    // Specify the language for OCR
    this.dynamicDotNetTwain1.OCRLanguage = languages[this.cbxOCRLanguage.Text];
    // Specify the file format for the OCR result: text file, text PDF, or image-over-text PDF
    this.dynamicDotNetTwain1.OCRResultFormat = (Dynamsoft.DotNet.TWAIN.OCR.ResultFormat)this.ddlResultFormat.SelectedIndex;

    string strDllPath = m_strCurrentDirectory;
    if (m_bSamplesExist)
    {
        strDllPath = m_strCurrentDirectory + @"Redistributable\OCRResources\";
    }
    this.dynamicDotNetTwain1.OCRDllPath = strDllPath;

    if (this.dynamicDotNetTwain1.CurrentImageIndexInBuffer < 0)
    {
        MessageBox.Show("Please load an image before doing OCR!", "Index out of bounds", MessageBoxButtons.OK, MessageBoxIcon.Warning);
        return;
    }

    byte[] sbytes = null;
    if (!isOcrOnRectangleArea)
    {
        // OCR the selected images as a whole
        sbytes = this.dynamicDotNetTwain1.OCR(this.dynamicDotNetTwain1.CurrentSelectedImageIndicesInBuffer);
    }
    else
    {
        // OCR only the rectangle entered in the text boxes
        sbytes = this.dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, int.Parse(tbxLeft.Text), int.Parse(tbxTop.Text), int.Parse(tbxRight.Text), int.Parse(tbxButtom.Text));
    }

    if (sbytes != null && sbytes.Length > 0)
    {
        SaveFileDialog filedlg = new SaveFileDialog();
        if (this.ddlResultFormat.SelectedIndex != 0)
        {
            filedlg.Filter = "PDF File(*.pdf)| *.pdf";
        }
        else
        {
            filedlg.Filter = "Text File(*.txt)| *.txt";
        }
        if (filedlg.ShowDialog() == DialogResult.OK)
        {
            File.WriteAllBytes(filedlg.FileName, sbytes);
        }
    }
    else
    {
        if (this.dynamicDotNetTwain1.ErrorCode != 0)
            MessageBox.Show(this.dynamicDotNetTwain1.ErrorString);
    }
}
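
When OCR runs on a rectangle, the sample reads the four coordinates with int.Parse, which throws an unhandled exception on empty or non-numeric input and does not check that the rectangle has a positive size. The sketch below shows one way to validate the values first. The class name OcrRegion and its rules (non-negative coordinates, right greater than left, bottom greater than top) are assumptions made for illustration, not documented requirements of the OCR add-on.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of Dynamic .NET TWAIN.
public static class OcrRegion
{
    // Parses four text-box values into pixel coordinates.
    // Returns false (and zeroed outputs) when any value is not an integer,
    // is negative, or when the rectangle would have no area.
    public static bool TryParse(string left, string top, string right, string bottom,
                                out int l, out int t, out int r, out int b)
    {
        l = t = r = b = 0;
        int pl, pt, pr, pb;
        if (!TryParseInt(left, out pl) || !TryParseInt(top, out pt) ||
            !TryParseInt(right, out pr) || !TryParseInt(bottom, out pb))
        {
            return false;
        }
        if (pl < 0 || pt < 0 || pr <= pl || pb <= pt)
        {
            return false;
        }
        l = pl; t = pt; r = pr; b = pb;
        return true;
    }

    // Culture-independent integer parsing; surrounding whitespace is allowed.
    private static bool TryParseInt(string text, out int value)
    {
        return int.TryParse(text, NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out value);
    }
}
```

A caller could run OcrRegion.TryParse(tbxLeft.Text, tbxTop.Text, tbxRight.Text, tbxButtom.Text, ...) before the OCR call and show a warning when it returns false, instead of letting int.Parse throw.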

Get samples

You can download and try the .NET PDF samples above from the Dynamsoft sample gallery.
