How to Convert a Photo of a Document into Text using Go

Cloudmersive
Nov 25, 2022

When we convert images of documents into plain, unformatted digital text, we first need to determine the source of the image. Was the document scanned, or was it photographed with a smartphone or another handheld camera? If the latter, our chosen OCR service needs stronger built-in fault tolerance to cope with background objects (such as the desk the document was photographed on) and crooked camera angles. Built specifically for hand-photographed documents, our Photo Document to Text OCR API provides exactly that fault tolerance, along with two optional, customizable parameters:

  1. RecognitionMode: this can be set to Basic (low fault tolerance), Normal (high fault tolerance), or Advanced (highest fault tolerance); Advanced is the default.
  2. Language selection: this API supports dozens of common languages, including English, Spanish, Bengali, Dutch, and more.
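
As a rough illustration of the two options above, the sketch below validates a recognition mode and builds the matching request headers before any call is made. The header names recognitionMode and language are taken from the request example in this article; the helper ocrHeaders and the language code "ENG" are assumptions made for illustration, so check the API documentation for the exact language codes it accepts.

```go
package main

import (
	"fmt"
	"net/http"
)

// validModes lists the RecognitionMode values named in this article.
var validModes = map[string]bool{"Basic": true, "Normal": true, "Advanced": true}

// ocrHeaders is a hypothetical helper (not part of any Cloudmersive SDK).
// It falls back to "Advanced", the documented default, when mode is empty,
// and leaves the language header unset when no language is given.
func ocrHeaders(mode, language string) (http.Header, error) {
	if mode == "" {
		mode = "Advanced"
	}
	if !validModes[mode] {
		return nil, fmt.Errorf("unsupported recognition mode %q", mode)
	}
	h := http.Header{}
	h.Set("recognitionMode", mode)
	if language != "" {
		h.Set("language", language)
	}
	return h, nil
}

func main() {
	// "ENG" is an assumed language code used only for this example.
	h, err := ocrHeaders("Normal", "ENG")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(h.Get("recognitionMode"), h.Get("language"))
}
```

Rejecting an unknown mode locally avoids spending an API call on a request that would likely fail anyway.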

To use this API, you’ll first need a Cloudmersive API key, which you can get by registering a free account on our website (free accounts are limited to 800 API calls per month). After that, simply copy and paste the Go code example below into your file to structure your API call. That’s all there is to it: just include your API key and document image (common formats such as JPG and PNG are supported) in their respective fields, and you’re ready to convert to text.

package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
	"path/filepath"
)

func main() {
	url := "https://api.cloudmersive.com/ocr/photo/toText"
	method := "POST"
	imagePath := "/path/to/file"

	// Open the photographed document; check the error before deferring Close.
	file, err := os.Open(imagePath)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer file.Close()

	// Build the multipart/form-data body with the image in the "imageFile" field.
	payload := &bytes.Buffer{}
	writer := multipart.NewWriter(payload)
	part, err := writer.CreateFormFile("imageFile", filepath.Base(imagePath))
	if err != nil {
		fmt.Println(err)
		return
	}
	if _, err = io.Copy(part, file); err != nil {
		fmt.Println(err)
		return
	}
	// Closing the writer appends the terminating multipart boundary.
	if err = writer.Close(); err != nil {
		fmt.Println(err)
		return
	}

	client := &http.Client{}
	req, err := http.NewRequest(method, url, payload)
	if err != nil {
		fmt.Println(err)
		return
	}
	// Optional parameters: replace the placeholders, or remove these two
	// lines to use the defaults.
	req.Header.Add("recognitionMode", "<string>")
	req.Header.Add("language", "<string>")
	req.Header.Add("Apikey", "YOUR-API-KEY-HERE")
	// The Content-Type must carry the boundary generated by the writer.
	req.Header.Set("Content-Type", writer.FormDataContentType())

	res, err := client.Do(req)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer res.Body.Close()

	// Read and print the raw response body.
	body, err := io.ReadAll(res.Body)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(body))
}
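
The example above prints the raw response body whatever the HTTP status. A possible next step, sketched below, is to check the status code and decode the JSON before using the text. The TextResult field name is an assumption about the response shape and is not confirmed in this article, and extractText is a hypothetical helper; verify both against the API documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// ocrResult holds the one response field this sketch reads.
// The field name "TextResult" is an assumption, not confirmed here.
type ocrResult struct {
	TextResult string `json:"TextResult"`
}

// extractText is a hypothetical helper: it rejects non-2xx statuses and
// otherwise decodes the recognized text from the JSON body.
func extractText(status int, body []byte) (string, error) {
	if status < 200 || status > 299 {
		return "", fmt.Errorf("OCR request failed: %s: %s", http.StatusText(status), body)
	}
	var r ocrResult
	if err := json.Unmarshal(body, &r); err != nil {
		return "", fmt.Errorf("decoding response: %w", err)
	}
	return r.TextResult, nil
}

func main() {
	// A canned body stands in for res.Body from the request example.
	text, err := extractText(http.StatusOK, []byte(`{"TextResult":"Invoice 42"}`))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(text)
}
```

In the request example, this would be called as extractText(res.StatusCode, body) once the body has been read.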
