How to convert an image of text into a binary view in Python using Deep Learning

Cloudmersive
Apr 29, 2020

Optical character recognition (OCR) works properly only if the image is preprocessed first. Generally, this involves rotating the image and converting it to a binary view, that is, pure black and white. There are many ways of accomplishing the latter, but the best method uses Deep Learning to produce a cleaner binary image, which in turn allows for better OCR accuracy. Today I’m going to show you the setup process for an API that is already equipped for just this role.
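To make "binary view" concrete, here is a minimal sketch of the simplest classical approach, a fixed global threshold, applied to a tiny grayscale grid. This is not the Deep Learning method the API uses; the function name `binarize`, the threshold of 128, and the sample pixel values are all assumptions chosen for illustration.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to pure black (0) or pure white (255)."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


# A hypothetical 3x4 grayscale patch: dark ink on a lighter page.
patch = [
    [200, 190, 60, 210],
    [180, 40, 50, 205],
    [220, 215, 70, 130],
]

# Print the black-and-white version row by row.
for row in binarize(patch):
    print(row)
```

A single fixed threshold like this tends to struggle with shadows and uneven lighting, which is the kind of case a learned method is meant to handle better.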

Start by installing the API client:

pip install cloudmersive-ocr-api-client

The code block below calls preprocessing_binarize_advanced, passing it the path to your image file:

from __future__ import print_function
import time
import cloudmersive_ocr_api_client
from cloudmersive_ocr_api_client.rest import ApiException
from pprint import pprint

# Configure API key authorization: Apikey
configuration = cloudmersive_ocr_api_client.Configuration()
configuration.api_key['Apikey'] = 'YOUR_API_KEY'
# Uncomment below to setup prefix (e.g. Bearer) for API key, if needed
# configuration.api_key_prefix['Apikey'] = 'Bearer'

# create an instance of the API class
api_instance = cloudmersive_ocr_api_client.PreprocessingApi(cloudmersive_ocr_api_client.ApiClient(configuration))
image_file = '/path/to/file' # file | Image file to perform OCR on.  Common file formats such as PNG, JPEG are supported.

try:
    # Convert an image of text into a binary (light and dark) view with ML
    api_response = api_instance.preprocessing_binarize_advanced(image_file)
    pprint(api_response)
except ApiException as e:
    print("Exception when calling PreprocessingApi->preprocessing_binarize_advanced: %s\n" % e)
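The sample hard-codes the key as 'YOUR_API_KEY'. If you would rather keep the key out of your source file, one option is to read it from an environment variable. The sketch below is only an illustration: the helper `load_api_key` and the variable name `CLOUDMERSIVE_API_KEY` are hypothetical choices, not something the client library provides or looks for on its own.

```python
import os


def load_api_key(env_var="CLOUDMERSIVE_API_KEY"):
    """Return the API key stored in the given environment variable.

    The default variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError("Set the %s environment variable first" % env_var)
    return key
```

With this in place, the assignment in the sample could become configuration.api_key['Apikey'] = load_api_key().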

Alright, there you go! You can binarize images to your heart’s content.
