Deep Learning OCR on Document and Receipt Photos with Ruby

In this post we’ll take a look at how to do Deep Learning Optical Character Recognition, or OCR, with Ruby.

The goal is to turn a photo of a document or receipt into text, instantly, with Ruby. Here is an example of what we’d like to do:

Convert a photo of a receipt or document into text with Ruby!

With this, our web and mobile apps can take photos of documents out in the real world and use them as input.

To get started, we need to install the free Cloudmersive Ruby gem from the command line by running gem install cloudmersive-ocr-api-client.

We can also install it by adding the line gem 'cloudmersive-ocr-api-client' to our Gemfile and running bundle install.

Now we’ll create a Ruby file, document-and-receipt-ocr.rb, and require the gem at the top of it with require 'cloudmersive-ocr-api-client'.

Next up, we’ll create our OCR client object, configured with our API key. Note that we need to replace YOUR_API_KEY with a free API key from Cloudmersive; this allows up to 50,000 OCR API operations per month at no cost, with no expiration.
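
As a minimal sketch of this step, the client setup might look like the following. It assumes the gem is cloudmersive-ocr-api-client and that it exposes a CloudmersiveOcrApiClient.configure block and an ImageOcrApi class, as its README describes; check those names against the version you install. The helper names fetch_api_key and build_ocr_client and the CLOUDMERSIVE_API_KEY environment variable are hypothetical, chosen here so the key is not hard-coded in the file.

```ruby
# Read the API key from the environment instead of hard-coding it.
# CLOUDMERSIVE_API_KEY is a hypothetical variable name chosen for this sketch.
def fetch_api_key(env = ENV)
  key = env['CLOUDMERSIVE_API_KEY'].to_s.strip
  if key.empty? || key == 'YOUR_API_KEY'
    raise ArgumentError, 'Set CLOUDMERSIVE_API_KEY to your Cloudmersive API key'
  end
  key
end

# Configure the gem and return an OCR client object.
# Assumes the configure block and ImageOcrApi class from the gem's README.
def build_ocr_client(api_key)
  require 'cloudmersive-ocr-api-client' # loaded lazily, only when a client is built

  CloudmersiveOcrApiClient.configure do |config|
    config.api_key['Apikey'] = api_key
  end
  CloudmersiveOcrApiClient::ImageOcrApi.new
end

# Example (needs the gem installed and the variable set):
#   api = build_ocr_client(fetch_api_key)
```

Keeping the key in an environment variable is a design choice of this sketch, not something the gem requires; pasting the key in place of YOUR_API_KEY works just as well for a quick test.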

Now, all we need to do is send in our file. We can optionally supply our desired language; the default is English, and there are over 90 languages to choose from, including right-to-left languages such as Arabic, as well as East Asian languages.
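
Sending the file can be sketched as below. This assumes the client object responds to image_ocr_photo_to_text(image_file, opts), accepts an optional language entry in opts as a three-letter code such as ENG, and returns an object with a text_result attribute, as the gem's README describes; treat those names as assumptions to verify against your installed version. The wrapper photo_to_text and the sample file name are hypothetical.

```ruby
# Send a photo to the OCR client and return the recognized text.
# `api` is the OCR client object created earlier; `language` is a
# three-letter code, with 'ENG' (English) assumed as the default.
def photo_to_text(api, path, language: 'ENG')
  raise ArgumentError, "No such file: #{path}" unless File.file?(path)

  # Open in binary mode so image bytes are passed through untouched.
  File.open(path, 'rb') do |image_file|
    result = api.image_ocr_photo_to_text(image_file, { language: language })
    result.text_result
  end
end

# Example (needs a configured client and a real photo on disk):
#   puts photo_to_text(api, 'receipt.jpg')
```

Passing the client in as an argument keeps the helper easy to try with a stand-in object before spending any API calls.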

That’s it! Now we have a fully working Ruby app that converts a smartphone photo of a receipt into text!

You can download the full source code on GitHub.

There’s an API for that. Cloudmersive is a leader in Highly Scalable Cloud APIs.
