Making an Android OCR Application with Tesseract
Tesseract is a well-known open source OCR engine released under the Apache License 2.0. In this tutorial, I’d like to share how to build the OCR library for Android and how to implement a simple Android OCR application with it.
Tesseract Android Tools
To build the Tesseract OCR library for Android, we can use the tesseract-android-tools provided by Google.
Get the source code:
git clone https://code.google.com/p/tesseract-android-tools/
Open the README and follow these steps:
cd <project-directory>
curl -O https://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz
curl -O http://leptonica.googlecode.com/files/leptonica-1.69.tar.gz
tar -zxvf tesseract-ocr-3.02.02.tar.gz
tar -zxvf leptonica-1.69.tar.gz
rm -f tesseract-ocr-3.02.02.tar.gz
rm -f leptonica-1.69.tar.gz
mv tesseract-3.02.02 jni/com_googlecode_tesseract_android/src
mv leptonica-1.69 jni/com_googlecode_leptonica_android/src
ndk-build -j8
android update project --target 1 --path .
ant debug    # or "ant release" for a release build
Note: if you are using NDK r9, the build will fail with the error:
format not a string literal and no format arguments [-Werror=format-security]
To solve it, open Application.mk, and add the following line:
APP_CFLAGS += -Wno-error=format-security
After the library builds successfully, you will find classes.jar in the bin folder and the relevant *.so files in the libs folder.
If you can’t build the source code, download jni.zip and copy all of its source code into your project folder.
Android OCR Application
Create an Android project and import the libraries built above (the jar and the *.so files).
To do OCR, we can create a class named TessOCR:
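import java.io.File;

import android.graphics.Bitmap;
import android.os.Environment;

import com.googlecode.tesseract.android.TessBaseAPI;

public class TessOCR {
    private TessBaseAPI mTess;

    public TessOCR() {
        mTess = new TessBaseAPI();
        // Language data is read from <datapath>/tessdata/
        String datapath = Environment.getExternalStorageDirectory() + "/tesseract/";
        String language = "eng";
        // init() throws if the tessdata subfolder is missing, so create it first
        File dir = new File(datapath + "tessdata/");
        if (!dir.exists()) {
            dir.mkdirs();
        }
        mTess.init(datapath, language);
    }

    public String getOCRResult(Bitmap bitmap) {
        mTess.setImage(bitmap);
        return mTess.getUTF8Text();
    }

    // Releases the native engine; meant to be called when the activity is destroyed
    public void onDestroy() {
        if (mTess != null) {
            mTess.end();
        }
    }
}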
In the constructor, we must make sure that the tessdata directory exists. If it doesn’t, init() throws an IllegalArgumentException. The source code of init() shows why:
public boolean init(String datapath, String language) {
    if (datapath == null) {
        throw new IllegalArgumentException("Data path must not be null!");
    }
    if (!datapath.endsWith(File.separator)) {
        datapath += File.separator;
    }
    File tessdata = new File(datapath + "tessdata");
    if (!tessdata.exists() || !tessdata.isDirectory()) {
        throw new IllegalArgumentException("Data path must contain subfolder tessdata!");
    }
    return nativeInit(datapath, language);
}
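The same checks can be done up front, so that a storage problem is reported before init() is ever called. Here is a minimal sketch in plain Java. The class name TessDataDir and the method prepare() are hypothetical and are not part of tesseract-android-tools; the sketch also assumes that creating the folder is the right response when it is missing.

```java
import java.io.File;

/** Hypothetical helper (an assumption for illustration), not part of the Tesseract library. */
final class TessDataDir {
    private TessDataDir() {}

    /**
     * Returns datapath with a trailing separator and makes sure that
     * datapath/tessdata exists as a directory, which is what init() checks.
     */
    static String prepare(String datapath) {
        if (datapath == null) {
            throw new IllegalArgumentException("Data path must not be null!");
        }
        if (!datapath.endsWith(File.separator)) {
            datapath += File.separator;
        }
        File tessdata = new File(datapath + "tessdata");
        // mkdirs() also returns false when the directory already exists,
        // so the result is re-checked with isDirectory().
        if (!tessdata.isDirectory() && !tessdata.mkdirs() && !tessdata.isDirectory()) {
            throw new IllegalStateException("Cannot create " + tessdata);
        }
        return datapath;
    }
}
```

The returned path could then be passed to init() in place of the hand-built string, for example mTess.init(TessDataDir.prepare(datapath), language).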
Pretty simple! Now we can load images and do OCR in three different ways:
Browsing images in the gallery and sending one to the OCR application
In AndroidManifest.xml, add the following intent filter:
<intent-filter>
    <action android:name="android.intent.action.SEND" />
    <category android:name="android.intent.category.DEFAULT" />
    <data android:mimeType="text/plain" />
    <data android:mimeType="image/*" />
</intent-filter>
Decode the image URI:
if (Intent.ACTION_SEND.equals(intent.getAction())) {
    Uri uri = (Uri) intent.getParcelableExtra(Intent.EXTRA_STREAM);
    uriOCR(uri);
}

private void uriOCR(Uri uri) {
    if (uri == null) {
        return;
    }
    InputStream is = null;
    try {
        is = getContentResolver().openInputStream(uri);
        Bitmap bitmap = BitmapFactory.decodeStream(is);
        // decodeStream() returns null if the data cannot be decoded as an image
        if (bitmap != null) {
            mImage.setImageBitmap(bitmap);
            doOCR(bitmap);
        }
    } catch (FileNotFoundException e) {
        // The URI does not point to a readable file
        e.printStackTrace();
    } finally {
        if (is != null) {
            try {
                is.close();
            } catch (IOException e) {
                // Nothing more to do if closing fails
                e.printStackTrace();
            }
        }
    }
}
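The doOCR() method called above is not shown in this tutorial. Recognition can take a while on a phone, so one option is to run it on a background thread. Here is a minimal sketch in plain Java under that assumption. The names OcrRunner, Recognizer and Callback are hypothetical and not part of the Tesseract library, and this is not necessarily how the original doOCR() works.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical helper: runs a recognizer on a single background thread. */
final class OcrRunner<I> {
    /** Turns an image of type I into text, e.g. by calling TessOCR.getOCRResult(). */
    interface Recognizer<I> {
        String recognize(I image);
    }

    /** Receives the recognized text. It is invoked on the worker thread. */
    interface Callback {
        void onResult(String text);
    }

    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Recognizer<I> recognizer;

    OcrRunner(Recognizer<I> recognizer) {
        this.recognizer = recognizer;
    }

    /** Queues one image; images are processed one at a time, in order. */
    void submit(final I image, final Callback callback) {
        worker.execute(new Runnable() {
            @Override
            public void run() {
                callback.onResult(recognizer.recognize(image));
            }
        });
    }

    /** Stops accepting new images; already queued ones still finish. */
    void shutdown() {
        worker.shutdown();
    }
}
```

In an activity, the Recognizer would wrap a call to getOCRResult(bitmap), and the callback should hop back to the main thread with runOnUiThread() before touching any views.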
Picking an image from the gallery
Send the Intent for picking images, and decode the returned URI in onActivityResult:
Intent intent = new Intent(Intent.ACTION_PICK, android.provider.MediaStore.Images.Media.EXTERNAL_CONTENT_URI);
startActivityForResult(intent, REQUEST_PICK_PHOTO);
Taking a picture with the camera
To get a full-size image instead of a small thumbnail, attach an output file URI to the Intent:
private void dispatchTakePictureIntent() {
    Intent takePictureIntent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
    // Ensure that there's a camera activity to handle the intent
    if (takePictureIntent.resolveActivity(getPackageManager()) != null) {
        // Create the File where the photo should go
        File photoFile = null;
        try {
            photoFile = createImageFile();
        } catch (IOException ex) {
            // Error occurred while creating the File; photoFile stays null
            ex.printStackTrace();
        }
        // Continue only if the File was successfully created
        if (photoFile != null) {
            takePictureIntent.putExtra(MediaStore.EXTRA_OUTPUT,
                    Uri.fromFile(photoFile));
            startActivityForResult(takePictureIntent, REQUEST_TAKE_PHOTO);
        }
    }
}
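The createImageFile() method called above is not shown in this tutorial. Here is a minimal sketch of what it could look like, written in plain Java so that it can be tried outside Android. The class name ImageFiles, the storageDir parameter and the "JPEG_" naming scheme are assumptions for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Hypothetical helper for creating the file that the camera app will write to. */
final class ImageFiles {
    private ImageFiles() {}

    /** Creates an empty, uniquely named .jpg file inside storageDir. */
    static File createImageFile(File storageDir) throws IOException {
        // mkdirs() returns false if the directory already exists, so re-check it.
        if (!storageDir.isDirectory() && !storageDir.mkdirs() && !storageDir.isDirectory()) {
            throw new IOException("Cannot create " + storageDir);
        }
        String timeStamp = new SimpleDateFormat("yyyyMMdd_HHmmss", Locale.US).format(new Date());
        // createTempFile() adds its own unique part to the name, so two photos
        // taken within the same second still get different files.
        return File.createTempFile("JPEG_" + timeStamp + "_", ".jpg", storageDir);
    }
}
```

On Android, storageDir could be, for example, Environment.getExternalStoragePublicDirectory(Environment.DIRECTORY_PICTURES), with createImageFile() in the activity simply forwarding to this helper.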
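Before running the Android OCR app, don’t forget to download the language data packages you need (for "eng", that is eng.traineddata) and push them to the tessdata folder on your phone's storage. With the TessOCR class above, that folder is tesseract/tessdata/ on external storage.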
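If a language file is missing, initialization will not succeed, so it can help to check for the files first. Tesseract looks for a file named <language>.traineddata inside the tessdata folder. Here is a minimal sketch in plain Java; the class name TrainedData and the method missing() are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that reports which language files have not been pushed yet. */
final class TrainedData {
    private TrainedData() {}

    /** Returns the language codes with no <code>.traineddata file in datapath/tessdata. */
    static List<String> missing(String datapath, String... languages) {
        File tessdata = new File(datapath, "tessdata");
        List<String> missing = new ArrayList<String>();
        for (String lang : languages) {
            if (!new File(tessdata, lang + ".traineddata").isFile()) {
                missing.add(lang);
            }
        }
        return missing;
    }
}
```

An empty list means every requested language is in place. Otherwise the app can show a message that names the missing files instead of failing inside init().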