How to generate an English text description of an image in Python using Deep Learning

If you have ever trained a deep learning model for a task, you probably know the time investment and fiddling involved. So how can we achieve excellent results without all the aggravation? Simple: we use an API that already has Deep Learning as part of its repertoire. Let me show you how to use it.

First we need to install our API client, like so:

pip install cloudmersive-image-api-client

And now we can call the function, recognize_describe:

from __future__ import print_function
import cloudmersive_image_api_client
from cloudmersive_image_api_client.rest import ApiException
from pprint import pprint

# Configure API key authorization: Apikey
configuration = cloudmersive_image_api_client.Configuration()
configuration.api_key['Apikey'] = 'YOUR_API_KEY'
# Uncomment below to set up a prefix (e.g. Bearer) for the API key, if needed
# configuration.api_key_prefix['Apikey'] = 'Bearer'

# Create an instance of the API class
api_instance = cloudmersive_image_api_client.RecognizeApi(cloudmersive_image_api_client.ApiClient(configuration))

# Image file to perform the operation on. Common file formats such as PNG and JPEG are supported.
image_file = '/path/to/file.jpg'

try:
    # Describe an image in natural language
    api_response = api_instance.recognize_describe(image_file)
    pprint(api_response)
except ApiException as e:
    print("Exception when calling RecognizeApi->recognize_describe: %s\n" % e)
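Before running the call for real, it can help to keep the API key out of the source file and to catch a bad file path early. Here is a minimal sketch of both ideas. The environment variable name CLOUDMERSIVE_API_KEY and the two helper functions are hypothetical names chosen for illustration, not part of the Cloudmersive client, and the extension list simply mirrors the PNG and JPEG formats mentioned above.

```python
import os

# Assumption: only the formats named in the article; the API may accept more.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def load_api_key(env_var="CLOUDMERSIVE_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is a hypothetical convention, not one the client requires.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError("Set the %s environment variable first" % env_var)
    return key


def check_image_path(path):
    """Return the path unchanged if it names an existing PNG or JPEG file."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in IMAGE_EXTENSIONS:
        raise ValueError("Not a PNG or JPEG file: %s" % path)
    if not os.path.isfile(path):
        raise ValueError("File not found: %s" % path)
    return path
```

With these in place, the key assignment could become configuration.api_key['Apikey'] = load_api_key(), and the image path could be passed through check_image_path() before it reaches recognize_describe.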

That wraps up our setup. Now let’s test it out on this image.

Image for post

And here we have our response from the API.

{
    "Successful": true,
    "Highconfidence": true,
    "BestOutcome": {
        "ConfidenceScore": 0.365462526679039,
        "Description": "A woman sitting in front of a laptop computer."
    },
    "RunnerUpOutcome": {
        "ConfidenceScore": 0.20903514232486486,
        "Description": "A woman sitting at a table using a laptop computer."
    }
}

We have two guesses at the possible image contents, along with confidence scores for each. Not bad.
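If you want to use the result in your own code rather than just print it, one option is to pick the more confident description and ignore it when the score is too low. The sketch below is only an illustration: it works on a plain dictionary with the same keys as the JSON shown above (the Python client may instead hand back a response object with its own attribute names), and both the pick_description helper and the 0.3 threshold are assumptions made for this example.

```python
import json


def pick_description(response, min_confidence=0.3):
    """Return the most confident description, or None if nothing qualifies.

    `response` is a dict shaped like the JSON above; `min_confidence` is an
    arbitrary cutoff chosen for illustration.
    """
    if not response.get("Successful"):
        return None
    outcomes = [response.get("BestOutcome"), response.get("RunnerUpOutcome")]
    # Keep only outcomes that are present and actually carry a description.
    outcomes = [o for o in outcomes if o and o.get("Description")]
    if not outcomes:
        return None
    best = max(outcomes, key=lambda o: o.get("ConfidenceScore", 0.0))
    if best.get("ConfidenceScore", 0.0) < min_confidence:
        return None
    return best["Description"]


sample = json.loads("""
{
    "Successful": true,
    "Highconfidence": true,
    "BestOutcome": {
        "ConfidenceScore": 0.365462526679039,
        "Description": "A woman sitting in front of a laptop computer."
    },
    "RunnerUpOutcome": {
        "ConfidenceScore": 0.20903514232486486,
        "Description": "A woman sitting at a table using a laptop computer."
    }
}
""")

print(pick_description(sample))
# prints: A woman sitting in front of a laptop computer.
```

Raising min_confidence to 0.5 would make the same call return None, which is a handy way to fall back to a generic caption when the API is unsure.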

Written by

There’s an API for that. Cloudmersive is a leader in Highly Scalable Cloud APIs.
