How to Autodetect and Validate Common Document Formats in Python
If your application handles many different file and document types in its workflow, an all-in-one document validation solution can come in handy. The API solution provided below is free to use and automatically detects and validates a wide range of common document types. Its response returns the detected file extension (so you can confirm the autodetection was correct) along with key information about the document’s validity. This information includes the following fields:
- “FileFormatExtension” — string identifying autodetected content type
- “DocumentIsValid” — boolean response indicating document validity
- “ErrorCount” — number of errors found in the document
- “WarningCount” — number of warnings found in the document
In addition to the above, a list of error descriptions and error paths may also be provided.
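As a rough illustration of how these fields might be consumed, the sketch below turns a response into a short text summary. It assumes the response has already been converted to a plain dictionary keyed by the field names listed above. The helper name summarize_validation, the "ErrorsAndWarnings", "Description" and "Path" keys, and the sample values are illustrative assumptions, not confirmed parts of the API.

```python
def summarize_validation(result):
    """Build a short text summary from a validation response given as a dict."""
    extension = result.get("FileFormatExtension", "unknown")
    status = "valid" if result.get("DocumentIsValid", False) else "invalid"
    error_count = result.get("ErrorCount", 0)
    warning_count = result.get("WarningCount", 0)
    lines = [f"{extension}: {status} ({error_count} errors, {warning_count} warnings)"]

    # Assumption: the optional list of issues is stored under "ErrorsAndWarnings",
    # and each entry carries a "Description" and a "Path".
    for issue in result.get("ErrorsAndWarnings") or []:
        lines.append(f"  - {issue.get('Description', '')} at {issue.get('Path', '')}")
    return "\n".join(lines)


# Made-up sample response, for illustration only
sample = {
    "FileFormatExtension": ".docx",
    "DocumentIsValid": False,
    "ErrorCount": 1,
    "WarningCount": 0,
    "ErrorsAndWarnings": [
        {"Description": "Unexpected element", "Path": "/word/document.xml"},
    ],
}
print(summarize_validation(sample))
```

The SDK may return a model object rather than a dictionary, in which case the same values would be read from attributes on that object instead.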
To take advantage of this API for free, first grab a free-tier API key from our website (create a free account with zero commitments) and then use the complimentary code examples provided below to structure your API call.
Run the following command to install the SDK:
pip install cloudmersive-convert-api-client
Then add the imports and call the function, setting your API key in the configuration.api_key['Apikey'] entry:
from __future__ import print_function
import time
import cloudmersive_convert_api_client
from cloudmersive_convert_api_client.rest import ApiException
from pprint import pprint
# Configure API key authorization: Apikey
configuration = cloudmersive_convert_api_client.Configuration()
configuration.api_key['Apikey'] = 'YOUR_API_KEY'
# create an instance of the API class
api_instance = cloudmersive_convert_api_client.ValidateDocumentApi(cloudmersive_convert_api_client.ApiClient(configuration))
input_file = '/path/to/inputfile' # file | Input file to perform the operation on.
try:
    # Autodetect content type and validate
    api_response = api_instance.validate_document_autodetect_validation(input_file)
    pprint(api_response)
except ApiException as e:
    print("Exception when calling ValidateDocumentApi->validate_document_autodetect_validation: %s\n" % e)
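If you need to check several files in one pass, the same call can be wrapped in a small loop. The sketch below is one possible approach rather than part of the SDK: validate_many is a hypothetical helper that takes any callable accepting a file path, and fake_validate is a stand-in used here only so the example runs without an API key.

```python
def validate_many(paths, validate_fn):
    """Call validate_fn on each path and return (results, failures)."""
    results = {}
    failures = {}
    for path in paths:
        try:
            results[path] = validate_fn(path)
        except Exception as exc:
            # Caught broadly for this sketch; with the SDK you would
            # likely catch ApiException specifically.
            failures[path] = str(exc)
    return results, failures


def fake_validate(path):
    """Stand-in validator so the sketch runs offline (hypothetical)."""
    if path.endswith(".bad"):
        raise ValueError("could not read file")
    return {"DocumentIsValid": True}


results, failures = validate_many(["report.docx", "scan.bad"], fake_validate)
print("validated:", sorted(results))
print("failed:", failures)
```

With the client configured as shown above, api_instance.validate_document_autodetect_validation could be passed in place of fake_validate.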
That’s all there is to it — no more code required!