How to Convert a Scanned Document to Text using OCR in PHP

Cloudmersive
3 min readApr 30, 2024

We can easily convert scanned document images to plain text by calling a free API with PHP code examples.

We’ll simply structure our Scanned Document to Text API call using the ready-to-run PHP code examples included below, and we’ll authorize our API call with a free Cloudmersive API key (this allows up to 800 API calls per month with no commitments).

First, let’s install the PHP client via Composer by executing the below command from our command line:

composer require cloudmersive/cloudmersive_ocr_api_client

Next, let’s call the OCR function using the below code:

<?php
require_once(__DIR__ . '/vendor/autoload.php');

// Configure API key authorization: Apikey
$config = Swagger\Client\Configuration::getDefaultConfiguration()->setApiKey('Apikey', 'YOUR_API_KEY');



$apiInstance = new Swagger\Client\Api\ImageOcrApi(


new GuzzleHttp\Client(),
$config
);
$image_file = "/path/to/inputfile"; // \SplFileObject | Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.
$recognition_mode = "recognition_mode_example"; // string | Optional; possible values are 'Basic' which provides basic recognition and is not resillient to page rotation, skew or low quality images uses 1-2 API calls; 'Normal' which provides highly fault tolerant OCR recognition uses 14-16 API calls; and 'Advanced' which provides the highest quality and most fault-tolerant recognition uses 28-30 API calls. Default recognition mode is 'Advanced'
$language = "language_example"; // string | Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)
$preprocessing = "preprocessing_example"; // string | Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended).

try {
$result = $apiInstance->imageOcrPost($image_file, $recognition_mode, $language, $preprocessing);
print_r($result);
} catch (Exception $e) {
echo 'Exception when calling ImageOcrApi->imageOcrPost: ', $e->getMessage(), PHP_EOL;
}
?>

We can customize parameters in our request to improve document recognition (e.g., for skewed images), detect various languages, and improve the image quality prior to the operation. As outlined in the code comments, it’s worth noting that leaving $recognition_mode on the default Advanced setting will significantly increase the number of API calls used per document, which will limit the scope of our free scanned document image conversion processes.

That’s all the code we’ll need to make our API call! Now we can easily convert high quality scanned document images to text using minimal PHP code.

--

--

Cloudmersive

There’s an API for that. Cloudmersive is a leader in Highly Scalable Cloud APIs.