How to get PDF Metadata in Python
Today we are going to set up PDF metadata retrieval in a matter of a few minutes. Get ready to save some time!
First, install the client library with pip so it is available in your project:
pip install cloudmersive-convert-api-client
Now we can call edit_pdf_get_metadata, which requires three things: an API instance, an API key, and an input file.
from __future__ import print_function
import time
import cloudmersive_convert_api_client
from cloudmersive_convert_api_client.rest import ApiException
from pprint import pprint

# Configure API key authorization: Apikey
configuration = cloudmersive_convert_api_client.Configuration()
configuration.api_key['Apikey'] = 'YOUR_API_KEY'
# Uncomment below to setup prefix (e.g. Bearer) for API key, if needed
# configuration.api_key_prefix['Apikey'] = 'Bearer'

# create an instance of the API class
api_instance = cloudmersive_convert_api_client.EditPdfApi(cloudmersive_convert_api_client.ApiClient(configuration))
input_file = '/path/to/file' # file | Input file to perform the operation on.

try:
    # Get PDF document metadata
    api_response = api_instance.edit_pdf_get_metadata(input_file)
    pprint(api_response)
except ApiException as e:
    print("Exception when calling EditPdfApi->edit_pdf_get_metadata: %s\n" % e)
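As an optional aside, here is a minimal sketch of two small helpers you might put in front of the call: one reads the API key from an environment variable instead of hard-coding it, and one checks that the input file exists before any request is made. The variable name CLOUDMERSIVE_API_KEY and both function names are assumptions made up for illustration; they are not part of the Cloudmersive client.

```python
import os


def load_api_key(env_var="CLOUDMERSIVE_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical convention, not something
    the client library looks for on its own.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError("Set the %s environment variable first" % env_var)
    return key


def check_input_file(path):
    """Return the path unchanged if it points at an existing file."""
    if not os.path.isfile(path):
        raise FileNotFoundError("Input file not found: %s" % path)
    return path


# Possible use with the configuration object from the example:
#   configuration.api_key['Apikey'] = load_api_key()
#   input_file = check_input_file('/path/to/file')
```

Failing early like this keeps the key out of source control and gives a clearer error than a rejected request would.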
Now if we run our code we will receive the extracted metadata in JSON format, like so:
{
  "Successful": true,
  "Title": "string",
  "Keywords": "string",
  "Subject": "string",
  "Author": "string",
  "Creator": "string",
  "DateModified": "2020-06-06T03:33:57.865Z",
  "DateCreated": "2020-06-06T03:33:57.865Z",
  "PageCount": 0
}
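If you want to work with these values rather than just print them, something like the following sketch may help. It assumes you have the response as a JSON string shaped like the sample above, with timestamps in the same format as the sample; the function names parse_timestamp and summarize_metadata are hypothetical helpers, not part of the client library.

```python
import json
from datetime import datetime, timezone

# Sample payload copied from the example response; real values will differ.
SAMPLE_RESPONSE = """
{
  "Successful": true,
  "Title": "string",
  "Keywords": "string",
  "Subject": "string",
  "Author": "string",
  "Creator": "string",
  "DateModified": "2020-06-06T03:33:57.865Z",
  "DateCreated": "2020-06-06T03:33:57.865Z",
  "PageCount": 0
}
"""


def parse_timestamp(value):
    """Parse a timestamp like '2020-06-06T03:33:57.865Z' into a UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def summarize_metadata(raw_json):
    """Pick out a few fields and convert the date strings to datetimes."""
    data = json.loads(raw_json)
    if not data.get("Successful"):
        raise ValueError("Metadata extraction was not successful")
    return {
        "title": data.get("Title"),
        "author": data.get("Author"),
        "page_count": data.get("PageCount"),
        "created": parse_timestamp(data["DateCreated"]),
        "modified": parse_timestamp(data["DateModified"]),
    }


summary = summarize_metadata(SAMPLE_RESPONSE)
print(summary["title"], summary["page_count"], summary["created"].isoformat())
```

Converting the dates up front makes it easier to sort or filter documents by when they were created or modified.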
Not bad for five minutes.
