How to Convert a Scanned Image of a Document to Plain Text using C/C++

Cloudmersive
Sep 6, 2023

Once we scan our documents, we’re only one step away from digitizing their contents — all we need is an Optical Character Recognition (OCR) service.

Using the code below, we can call a free OCR API designed specifically to convert scanned documents into plain text. The API returns the recognized text as a string, along with a confidence score indicating how reliable the recognition is judged to be.
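Once a response body is in hand, the confidence score could be pulled out with something like the sketch below. The field names "MeanConfidenceLevel" and "TextResult" in the usage note are assumptions about the response JSON and should be checked against the API documentation. The helper json_number_field is hypothetical: it does a naive substring search rather than real JSON parsing, so a proper JSON library would be the better choice in production.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: finds "key": <number> in json and stores the number
   in *out. Returns 0 on success, -1 if the key is missing or its value is
   not numeric. This is a naive substring search, not a JSON parser. */
static int json_number_field(const char *json, const char *key, double *out)
{
    assert(json != NULL && key != NULL && out != NULL);

    /* Build the quoted key we are looking for, e.g. "MeanConfidenceLevel". */
    char pattern[64];
    int n = snprintf(pattern, sizeof pattern, "\"%s\"", key);
    if (n < 0 || (size_t)n >= sizeof pattern)
        return -1;

    const char *p = strstr(json, pattern);
    if (p == NULL)
        return -1;

    /* Skip past the colon that separates key and value. */
    p = strchr(p + n, ':');
    if (p == NULL)
        return -1;
    p++;

    /* strtod skips leading whitespace; end == p means nothing was parsed. */
    char *end;
    double value = strtod(p, &end);
    if (end == p)
        return -1;

    *out = value;
    return 0;
}
```

For example, given a body such as {"MeanConfidenceLevel": 0.93, "TextResult": "Hello"} (an assumed shape), calling json_number_field(body, "MeanConfidenceLevel", &score) would store the score and return 0, while asking for a missing or non-numeric field returns -1.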

We first need to install libcurl in our project:

libcurl/7.75.0

After that, we can copy the following ready-to-run code example into our file to structure our API call:

#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
    CURLcode res = CURLE_FAILED_INIT;
    CURL *curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "POST");
        curl_easy_setopt(curl, CURLOPT_URL, "https://api.cloudmersive.com/ocr/image/toText");
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
        curl_easy_setopt(curl, CURLOPT_DEFAULT_PROTOCOL, "https");

        /* Request options and the API key travel as headers;
           replace each <string> placeholder with a real value. */
        struct curl_slist *headers = NULL;
        headers = curl_slist_append(headers, "recognitionMode: <string>");
        headers = curl_slist_append(headers, "language: <string>");
        headers = curl_slist_append(headers, "preprocessing: <string>");
        headers = curl_slist_append(headers, "Content-Type: multipart/form-data");
        headers = curl_slist_append(headers, "Apikey: YOUR-API-KEY-HERE");
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);

        /* Attach the scanned image as the multipart form field "imageFile". */
        curl_mime *mime = curl_mime_init(curl);
        curl_mimepart *part = curl_mime_addpart(mime);
        curl_mime_name(part, "imageFile");
        curl_mime_filedata(part, "/path/to/file");
        curl_easy_setopt(curl, CURLOPT_MIMEPOST, mime);

        /* Send the request; by default libcurl writes the response body to stdout. */
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
            fprintf(stderr, "Request failed: %s\n", curl_easy_strerror(res));

        /* Release everything allocated above. */
        curl_mime_free(mime);
        curl_slist_free_all(headers);
        curl_easy_cleanup(curl);
    }
    return res == CURLE_OK ? 0 : 1;
}
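
By default libcurl prints the response body to stdout. To work with the returned text in our program, one option is to collect the body in memory with a write callback. The sketch below is a minimal version of that idea: the struct response_buf and the function collect_response are hypothetical names introduced for illustration, and the function follows libcurl's documented write-callback signature so it can be tested without a network call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Growable buffer holding the response body as a NUL-terminated string. */
struct response_buf {
    char  *data;
    size_t len;
};

/* Follows libcurl's write-callback signature. Appends each received chunk
   to the buffer. Returning fewer bytes than were passed in tells libcurl
   to abort the transfer. */
static size_t collect_response(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    assert(userdata != NULL);
    struct response_buf *buf = userdata;
    size_t chunk = size * nmemb;

    /* Grow by the chunk size plus one byte for the terminating NUL. */
    char *grown = realloc(buf->data, buf->len + chunk + 1);
    if (grown == NULL)
        return 0;  /* out of memory: signal an error to libcurl */

    memcpy(grown + buf->len, ptr, chunk);
    buf->data = grown;
    buf->len += chunk;
    buf->data[buf->len] = '\0';
    return chunk;
}
```

To use it, we would declare struct response_buf buf = {NULL, 0}; and, before calling curl_easy_perform, set CURLOPT_WRITEFUNCTION to collect_response and CURLOPT_WRITEDATA to &buf. After the call, buf.data holds the response body and should be released with free(buf.data).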

We can authorize our requests with a free-tier API key to make up to 800 API calls per month (with no additional commitment).

In addition, we can customize the following request details:

  1. Recognition Mode: Basic, Normal or Advanced (default is Advanced)
  2. Language: three-letter language abbreviation; many common languages supported (default is ENG)
  3. Preprocessing: Auto or None. This engages further image preparation prior to performing the OCR operation (default is Auto).
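
The options above are sent as request headers, so it can help to validate a value and build the header line before appending it to the header list. The sketch below shows one way to do that. The helper names is_valid_recognition_mode and format_header are hypothetical, and the exact spelling and casing the API accepts is assumed to match the list above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if mode is one of the three recognition modes listed above,
   0 otherwise. The comparison is case-sensitive (an assumption). */
static int is_valid_recognition_mode(const char *mode)
{
    static const char *const modes[] = { "Basic", "Normal", "Advanced" };
    for (size_t i = 0; i < sizeof modes / sizeof modes[0]; i++)
        if (strcmp(mode, modes[i]) == 0)
            return 1;
    return 0;
}

/* Writes "name: value" into out. Returns 0 on success, or -1 if the
   result would not fit in out_size bytes (including the NUL). */
static int format_header(char *out, size_t out_size, const char *name, const char *value)
{
    assert(out != NULL && name != NULL && value != NULL);
    int n = snprintf(out, out_size, "%s: %s", name, value);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

A header built this way, for instance format_header(line, sizeof line, "recognitionMode", "Advanced"), can then be passed to curl_slist_append in place of the corresponding "<string>" placeholder line.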
