How to get PDF Metadata in Python

Today we are going to set up PDF metadata retrieval in just a few minutes. Get ready to save some time!

First, install our API client with pip so it can be used in your project:

pip install cloudmersive-convert-api-client

Now we can call edit_pdf_get_metadata, which requires three things: an API instance, an API key, and an input file.

from __future__ import print_function
import time
import cloudmersive_convert_api_client
from cloudmersive_convert_api_client.rest import ApiException
from pprint import pprint

# Configure API key authorization: Apikey
configuration = cloudmersive_convert_api_client.Configuration()
configuration.api_key['Apikey'] = 'YOUR_API_KEY'
# Uncomment below to setup prefix (e.g. Bearer) for API key, if needed
# configuration.api_key_prefix['Apikey'] = 'Bearer'

# create an instance of the API class
api_instance = cloudmersive_convert_api_client.EditPdfApi(cloudmersive_convert_api_client.ApiClient(configuration))
input_file = '/path/to/file' # file | Input file to perform the operation on.

try:
    # Get PDF document metadata
    api_response = api_instance.edit_pdf_get_metadata(input_file)
    pprint(api_response)
except ApiException as e:
    print("Exception when calling EditPdfApi->edit_pdf_get_metadata: %s\n" % e)
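
Since the call needs both an API key and an input file, it can help to check those two inputs locally before spending a request. The sketch below is only one way to do that and is not part of the Cloudmersive client: the helper names looks_like_pdf and load_api_key, and the environment variable name CLOUDMERSIVE_API_KEY, are assumptions made for illustration. The file check relies on PDF files normally starting with the bytes %PDF-, so it is a quick sanity test rather than full validation.

```python
import os
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the path is a file that starts with the PDF header bytes."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        # PDF files normally begin with the marker "%PDF-" followed by a version.
        return fh.read(5) == b"%PDF-"


def load_api_key(env_var="CLOUDMERSIVE_API_KEY"):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError("Set the %s environment variable first" % env_var)
    return key
```

With these in place, the hard-coded 'YOUR_API_KEY' string could be replaced by load_api_key(), and the API call could be skipped whenever looks_like_pdf(input_file) returns False.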

Now if we run our code, the API will return the extracted metadata, which has the following JSON structure:

{
  "Successful": true,
  "Title": "string",
  "Keywords": "string",
  "Subject": "string",
  "Author": "string",
  "Creator": "string",
  "DateModified": "2020-06-06T03:33:57.865Z",
  "DateCreated": "2020-06-06T03:33:57.865Z",
  "PageCount": 0
}
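
If you have this payload as raw JSON text, a few lines of standard-library Python are enough to pull out the fields you care about. This is a minimal sketch that works on the sample above rather than on the client's response object: the SAMPLE string is copied from the example, and the summarize and parse_timestamp helpers are hypothetical names. It also assumes the timestamps always carry fractional seconds and a trailing Z, as they do in the sample.

```python
import json
from datetime import datetime, timezone

# Sample payload copied from the response structure shown above.
SAMPLE = """
{
  "Successful": true,
  "Title": "string",
  "Keywords": "string",
  "Subject": "string",
  "Author": "string",
  "Creator": "string",
  "DateModified": "2020-06-06T03:33:57.865Z",
  "DateCreated": "2020-06-06T03:33:57.865Z",
  "PageCount": 0
}
"""


def parse_timestamp(value):
    """Parse a timestamp like 2020-06-06T03:33:57.865Z into an aware UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def summarize(payload):
    """Return a small dict with the most commonly needed metadata fields."""
    data = json.loads(payload)
    if not data.get("Successful"):
        raise ValueError("Metadata extraction was not successful")
    return {
        "title": data.get("Title"),
        "author": data.get("Author"),
        "pages": data.get("PageCount"),
        "created": parse_timestamp(data["DateCreated"]),
        "modified": parse_timestamp(data["DateModified"]),
    }


summary = summarize(SAMPLE)
print(summary["title"], summary["pages"], summary["created"].isoformat())
```

Converting the date strings into datetime objects up front makes it easier to sort or filter documents by creation or modification time later.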

Not bad for five minutes.
