How to get document type information and file extension in Python
Sometimes bad things happen in the life of a file: it gets stripped of its extension, or ends up with the wrong extension entirely. This can be very inconvenient, because anything that relies on the extension to decide how to open the file will mishandle it. Having a reliable way to work out the correct extension for each file helps keep your app or website from coughing up error messages. So how can we perform this identification without wasting a ton of our precious time? Let me show you.
To make life easy, we will use the Cloudmersive Convert API for the task at hand. You can install its Python client with pip:
pip install cloudmersive-convert-api-client
Next comes the function call itself:
from __future__ import print_function
import time
import cloudmersive_convert_api_client
from cloudmersive_convert_api_client.rest import ApiException
from pprint import pprint

# Configure API key authorization: Apikey
configuration = cloudmersive_convert_api_client.Configuration()
configuration.api_key['Apikey'] = 'YOUR_API_KEY'
# Uncomment below to setup prefix (e.g. Bearer) for API key, if needed
# configuration.api_key_prefix['Apikey'] = 'Bearer'

# create an instance of the API class
api_instance = cloudmersive_convert_api_client.ConvertDocumentApi(cloudmersive_convert_api_client.ApiClient(configuration))
input_file = '/path/to/file' # file | Input file to perform the operation on.

try:
    # Get document type information
    api_response = api_instance.convert_document_autodetect_get_info(input_file)
    pprint(api_response)
except ApiException as e:
    print("Exception when calling ConvertDocumentApi->convert_document_autodetect_get_info: %s\n" % e)
As you can see, once the API instance is created, it’s just a matter of passing in our file path and making the call. The response that comes back describes the detected document type, including the file extension. Problem solved!
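Once you have the detected extension, a natural next step is to give the file its proper name. Here is a minimal sketch of that step. The helper path_with_detected_extension is hypothetical (it is not part of the Cloudmersive client), and it assumes you have already pulled the detected extension out of the response as a plain string. The exact attribute name on the response object, and whether the value includes a leading dot, should be checked against the client documentation, so the helper accepts both forms.

```python
import os


def path_with_detected_extension(path, detected_extension):
    """Return `path` with its extension swapped for `detected_extension`.

    Hypothetical helper for illustration only. `detected_extension` is
    assumed to be a string such as ".pdf" or "pdf" taken from the API
    response; an empty value leaves the path unchanged.
    """
    ext = (detected_extension or "").strip()
    if not ext:
        # Nothing was detected, so keep the original name.
        return path
    if not ext.startswith("."):
        ext = "." + ext
    # splitext drops the current extension (if any) from the last path component.
    root, _ = os.path.splitext(path)
    return root + ext.lower()


if __name__ == "__main__":
    # A file wrongly saved as .txt that was detected as a PDF.
    print(path_with_detected_extension("/tmp/report.txt", ".PDF"))
    # A file with no extension at all.
    print(path_with_detected_extension("/tmp/report", "docx"))
```

The function only builds the new path. Actually renaming the file (for example with os.rename) is left to the caller, so you can review the proposed name first.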