How to Check .POTX Template Files for Threats in your Node.js App
Just like their .pptx counterparts, .potx files are complex archives containing XML and media content, and that means they can be weaponized.
Understanding POTX Threats
It’s worth noting that PowerPoint template files are fairly niche — most developers aren’t likely to deal with them very often. They’re typically used in large corporate environments to standardize presentation formats.
As is the case with most Office Open XML (OOXML) formats, threat actors can embed harmful macros, scripts, or malformed content within the file structure to exploit vulnerabilities in weakly configured (or infrequently updated) file parsers. It doesn’t matter whether the file was opened manually or processed automatically in a Node.js web app; the risk of exploitation applies either way.
Protecting your Node.js App Against Insecure POTX Uploads
You might intentionally accept .potx uploads through a Node.js web app for an internal document repository or presentation generator, or you might maintain an application that sees template file uploads from time to time. If your application uses a file extension whitelist to limit uploads, you might only find .potx content stored within another archive. In any case, it’s critical to properly identify and scan all incoming files, including .potx, before handling them server-side.
The challenge is that relying on MIME-type checks or client-side validation won’t solve all our problems here, and neither will basic AV scanning. File extensions can be spoofed, and malicious code can be obfuscated. When code execution or data compromise is a possible outcome of an overlooked threat, we should remain on high alert.
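To illustrate why the extension alone is not enough, here is a minimal sketch of a content-based pre-check. Every OOXML file, .potx included, is a ZIP archive, so its first four bytes should be the ZIP local file header signature (PK followed by 0x03 0x04). The helper name looksLikeZipContainer is hypothetical, and this check only rules out obviously mislabeled files; it says nothing about whether the archive contents are safe.

```javascript
// ZIP local file header signature: "PK" followed by 0x03 0x04.
const ZIP_SIGNATURE = Buffer.from([0x50, 0x4b, 0x03, 0x04]);

// Hypothetical helper: returns true when the buffer starts with the ZIP signature.
// A .potx upload that fails this check is not a real OOXML container.
function looksLikeZipContainer(fileBuffer) {
  if (!Buffer.isBuffer(fileBuffer) || fileBuffer.length < ZIP_SIGNATURE.length) {
    return false;
  }
  return fileBuffer.subarray(0, ZIP_SIGNATURE.length).equals(ZIP_SIGNATURE);
}
```

A renamed HTML or script file fails this test immediately, but a genuine ZIP container with malicious contents passes it, which is why deeper content scanning is still needed.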
Scanning POTX with a Free API
Using the code examples below, we can take advantage of a free API that scans all major Office Open XML formats, from .potx to .docx and everything in between, and checks beyond the file extension for insecure content embedded within the document. It also identifies viruses and malware with traditional signature-based detection, which means we’re covered against a large number of threats at once.
To get started, we’ll install the SDK with the following NPM command:
npm install cloudmersive-virus-api-client --save
Next, we’ll import the API client, create a default instance, and configure API key authorization (we’ll need a free API key of our own to authorize API calls):
var CloudmersiveVirusApiClient = require('cloudmersive-virus-api-client');
var defaultClient = CloudmersiveVirusApiClient.ApiClient.instance;
// Configure API key authorization: Apikey
var Apikey = defaultClient.authentications['Apikey'];
Apikey.apiKey = 'YOUR API KEY';
Finally, we’ll create our API instance, buffer our files, and define our scanning options. This API offers numerous optional parameters to restrict risky content types like executables, scripts, macros, etc., and it also gives us the option to block specific file types via file extension (using deep content verification to check if file contents meet the extension standards):
var apiInstance = new CloudmersiveVirusApiClient.ScanApi();
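var fs = require('fs'); // Required for readFileSync below.
var inputFile = fs.readFileSync("C:\\temp\\inputfile"); // File | Input file to perform the operation on. readFileSync already returns a Buffer; going through .buffer can expose a larger shared ArrayBuffer.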
var opts = {
  'allowExecutables': false, // Boolean | Set to false to block executable files (program code) from being allowed in the input file. Default is false (recommended).
  'allowInvalidFiles': false, // Boolean | Set to false to block invalid files, such as a PDF file that is not really a valid PDF file, or a Word Document that is not a valid Word Document. Default is false (recommended).
  'allowScripts': false, // Boolean | Set to false to block script files, such as PHP files, Python scripts, and other malicious content or security threats that can be embedded in the file. Set to true to allow these file types. Default is false (recommended).
  'allowPasswordProtectedFiles': false, // Boolean | Set to false to block password protected and encrypted files, such as encrypted zip and rar files, and other files that seek to circumvent scanning through passwords. Set to true to allow these file types. Default is false (recommended).
  'allowMacros': false, // Boolean | Set to false to block macros and other threats embedded in document files, such as Word, Excel and PowerPoint embedded Macros, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
  'allowXmlExternalEntities': false, // Boolean | Set to false to block XML External Entities and other threats embedded in XML files, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
  'allowInsecureDeserialization': false, // Boolean | Set to false to block Insecure Deserialization and other threats embedded in JSON and other object serialization files, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
  'allowHtml': false, // Boolean | Set to false to block HTML input in the top level file; HTML can contain XSS, scripts, local file accesses and other threats. Set to true to allow these file types. Default is false (recommended) [for API keys created prior to the release of this feature default is true for backward compatibility].
  'restrictFileTypes': ".potx" // String | Example value, assuming only .potx uploads are expected. Specify a restricted set of file formats to allow as clean as a comma-separated list of file formats, such as .pdf,.docx,.png would allow only PDF, PNG and Word document files. All files must pass content verification against this list of file formats, if they do not, then the result will be returned as CleanResult=false. Set restrictFileTypes parameter to null or empty string to disable; default is disabled.
};
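var callback = function(error, data, response) {
  if (error) {
    console.error(error);
  } else {
    // Serialize the result; concatenating the object directly would print "[object Object]".
    console.log('API called successfully. Returned data: ' + JSON.stringify(data, null, 2));
  }
};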
apiInstance.scanFileAdvanced(inputFile, opts, callback);
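If the rest of the application uses Promises or async/await, the callback-style call above can be wrapped once and reused. The following is a minimal sketch; scanFileAsync is a hypothetical helper name, and it assumes only the scanFileAdvanced(inputFile, opts, callback) signature shown above.

```javascript
// Hypothetical helper: wraps the callback-style scan call in a Promise.
// `scanApi` is expected to expose scanFileAdvanced(inputFile, opts, callback),
// where the callback receives (error, data, response).
function scanFileAsync(scanApi, inputFile, opts) {
  return new Promise(function (resolve, reject) {
    scanApi.scanFileAdvanced(inputFile, opts, function (error, data) {
      if (error) {
        reject(error); // network or API error: treat the file as unscanned
      } else {
        resolve(data); // scan completed: the caller still has to inspect the result
      }
    });
  });
}
```

With this in place, a caller could write scanFileAsync(apiInstance, inputFile, opts).then(...) or await it inside an async upload handler. A rejected Promise means the scan did not complete, so the upload should not be treated as clean.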
After we make our API call, we can expect the threat response to come back in the format below (this is the full response model; we likely won’t see all of this information at once):
{
  "CleanResult": true,
  "ContainsExecutable": true,
  "ContainsInvalidFile": true,
  "ContainsScript": true,
  "ContainsPasswordProtectedFile": true,
  "ContainsRestrictedFileFormat": true,
  "ContainsMacros": true,
  "ContainsXmlExternalEntities": true,
  "ContainsInsecureDeserialization": true,
  "ContainsHtml": true,
  "ContainsUnsafeArchive": true,
  "ContainsOleEmbeddedObject": true,
  "VerifiedFileFormat": "string",
  "FoundViruses": [
    {
      "FileName": "string",
      "VirusName": "string"
    }
  ],
  "ContentInformation": {
    "ContainsJSON": true,
    "ContainsXML": true,
    "ContainsImage": true,
    "RelevantSubfileName": "string",
    "IsAuthenticodeSigned": true
  }
}
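One way to act on this response is to reduce it to an accept/reject decision plus a list of reasons for logging. The sketch below uses only the field names from the response model above; the helper name evaluateScanResult is hypothetical, and it assumes CleanResult is the overall verdict while the Contains* flags explain what was found.

```javascript
// Flags from the response model that point to a specific kind of risky content.
const THREAT_FLAGS = [
  'ContainsExecutable', 'ContainsInvalidFile', 'ContainsScript',
  'ContainsPasswordProtectedFile', 'ContainsRestrictedFileFormat',
  'ContainsMacros', 'ContainsXmlExternalEntities',
  'ContainsInsecureDeserialization', 'ContainsHtml',
  'ContainsUnsafeArchive', 'ContainsOleEmbeddedObject'
];

// Hypothetical helper: turns a scan result into an accept/reject decision.
function evaluateScanResult(result) {
  // Collect every flag the API set to true, for logging or user feedback.
  const reasons = THREAT_FLAGS.filter(function (flag) {
    return result[flag] === true;
  });
  // FoundViruses may be missing or null when nothing was detected.
  const viruses = (result.FoundViruses || []).map(function (v) {
    return v.VirusName;
  });
  return {
    // Assumption: accept only when the API reports a clean result and no viruses.
    accept: result.CleanResult === true && viruses.length === 0,
    reasons: reasons,
    viruses: viruses
  };
}
```

An upload handler could then discard the file whenever accept is false and log reasons and viruses for later review.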
And that’s all there is to it!
We now have a simple, straightforward way to block risky content in a wide range of forms at once, whether it’s stashed in a .potx file, another Office Open XML file type, a PDF, or something else.