How to Scan Excel XLSM Files for Viruses, Malware, and Other Threats in Node.js
Macro-enabled Excel workbooks (.XLSM) are ideal vehicles for malware. Macro code can run automatically when an Excel XLSM document is opened, and Excel documents are common enough that tricking prospective victims into opening them is relatively easy.
Thankfully, using a free API, we can scan Excel XLSM files for a variety of threats at once. We can deep-verify the XLSM file format (to ensure the file is valid and not masquerading as another file type), determine if macros are present within the document, identify if images are present within the document, scan the document for viruses and malware, and more.
To see how this works, we can simply structure an API call in two quick steps using ready-to-run Node.js code examples.
First, let’s install the client SDK by running the following npm command:
npm install cloudmersive-virus-api-client --save
We could also install by adding the Node client to our package.json:
"dependencies": {
"cloudmersive-virus-api-client": "^1.1.9"
}
Next, let’s copy the below code examples to call the function. After we provide our file path and API key (free Cloudmersive API keys allow 800 API calls per month with no commitments), we can set custom threat rules against invalid files, scripts, macros, unsafe archives, and more. Even without setting any threat rules, all possible content types will be identified in the API response:
var fs = require('fs');
var CloudmersiveVirusApiClient = require('cloudmersive-virus-api-client');
var defaultClient = CloudmersiveVirusApiClient.ApiClient.instance;
// Configure API key authorization: Apikey
var Apikey = defaultClient.authentications['Apikey'];
Apikey.apiKey = 'YOUR API KEY';
var apiInstance = new CloudmersiveVirusApiClient.ScanApi();
var inputFile = fs.readFileSync("C:\\temp\\inputfile"); // File | Input file to perform the operation on (readFileSync already returns a Buffer).
var opts = {
'allowExecutables': true, // Boolean | Set to false to block executable files (program code) from being allowed in the input file. Default is false (recommended).
'allowInvalidFiles': true, // Boolean | Set to false to block invalid files, such as a PDF file that is not really a valid PDF file, or a Word Document that is not a valid Word Document. Default is false (recommended).
'allowScripts': true, // Boolean | Set to false to block script files, such as PHP files, Python scripts, and other malicious content or security threats that can be embedded in the file. Set to true to allow these file types. Default is false (recommended).
'allowPasswordProtectedFiles': true, // Boolean | Set to false to block password protected and encrypted files, such as encrypted zip and rar files, and other files that seek to circumvent scanning through passwords. Set to true to allow these file types. Default is false (recommended).
'allowMacros': true, // Boolean | Set to false to block macros and other threats embedded in document files, such as Word, Excel and PowerPoint embedded Macros, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
'allowXmlExternalEntities': true, // Boolean | Set to false to block XML External Entities and other threats embedded in XML files, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
'allowInsecureDeserialization': true, // Boolean | Set to false to block Insecure Deserialization and other threats embedded in JSON and other object serialization files, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
'allowHtml': true, // Boolean | Set to false to block HTML input in the top level file; HTML can contain XSS, scripts, local file accesses and other threats. Set to true to allow these file types. Default is false (recommended) [for API keys created prior to the release of this feature default is true for backward compatibility].
'restrictFileTypes': "restrictFileTypes_example" // String | Specify a restricted set of file formats to allow as clean as a comma-separated list of file formats, such as .pdf,.docx,.png would allow only PDF, PNG and Word document files. All files must pass content verification against this list of file formats, if they do not, then the result will be returned as CleanResult=false. Set restrictFileTypes parameter to null or empty string to disable; default is disabled.
};
var callback = function(error, data, response) {
if (error) {
console.error(error);
} else {
console.log('API called successfully. Returned data: ' + JSON.stringify(data, null, 2)); // stringify so the object is not printed as [object Object]
}
};
apiInstance.scanFileAdvanced(inputFile, opts, callback);
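The callback-style call above can also be wrapped in a Promise so that it composes with `.then()` chains or `async`/`await`. The following is a minimal sketch, not part of the SDK: `scanFileAdvancedAsync` is a hypothetical helper name, and it only assumes that the client object exposes `scanFileAdvanced(inputFile, opts, callback)` with an `(error, data, response)` callback, as used above.

```javascript
// Hypothetical helper (not provided by the SDK): wraps the callback-style
// scanFileAdvanced call in a Promise. The client is passed in as a parameter,
// so the helper itself has no dependency on the SDK package.
function scanFileAdvancedAsync(apiInstance, inputFile, opts) {
  return new Promise(function (resolve, reject) {
    apiInstance.scanFileAdvanced(inputFile, opts, function (error, data) {
      if (error) {
        reject(error); // surface API or network errors to the caller
      } else {
        resolve(data); // parsed scan result object
      }
    });
  });
}
```

With the `apiInstance`, `inputFile`, and `opts` variables defined earlier, this could be called as `scanFileAdvancedAsync(apiInstance, inputFile, opts).then(function (result) { /* inspect result */ })`.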
With a complete understanding of our Excel document’s validity and the types of content enclosed within, we can take important steps to quarantine and analyze suspicious XLSM files, or simply delete them.
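As one illustration of the quarantine step, a flagged file could be moved out of its upload location into a separate directory for later analysis. This is only a sketch under stated assumptions: `quarantineFile` is a hypothetical helper, the quarantine directory is whatever location we choose, and `fs.renameSync` only works when source and destination are on the same volume.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: moves a suspicious file into a quarantine directory
// and returns the new path. A timestamp prefix reduces name collisions.
function quarantineFile(filePath, quarantineDir) {
  fs.mkdirSync(quarantineDir, { recursive: true }); // create the directory if missing
  var target = path.join(quarantineDir, Date.now() + '-' + path.basename(filePath));
  fs.renameSync(filePath, target); // assumption: both paths are on the same volume
  return target;
}
```

If deletion is preferred over quarantine, `fs.unlinkSync(filePath)` would remove the file instead.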
Here’s a look at what an example JSON response could look like:
{
"CleanResult": false,
"ContainsExecutable": false,
"ContainsInvalidFile": false,
"ContainsScript": false,
"ContainsPasswordProtectedFile": false,
"ContainsRestrictedFileFormat": false,
"ContainsMacros": true,
"ContainsXmlExternalEntities": false,
"ContainsInsecureDeserialization": false,
"ContainsHtml": false,
"ContainsUnsafeArchive": false,
"ContainsOleEmbeddedObject": true,
"VerifiedFileFormat": "string",
"FoundViruses": [
{
"FileName": "string",
"VirusName": "string"
}
],
"ContentInformation": {
"ContainsJSON": false,
"ContainsXML": false,
"ContainsImage": true,
"RelevantSubfileName": null
}
}
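The fields in a response like the one above can be reduced to a short summary before deciding what to do with the file. The sketch below assumes the response has already been parsed into an object with the field names shown in the example; `listThreatFlags` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: collects every "Contains..." flag that is true,
// plus the names of any viruses reported in FoundViruses.
function listThreatFlags(result) {
  var flagFields = [
    'ContainsExecutable', 'ContainsInvalidFile', 'ContainsScript',
    'ContainsPasswordProtectedFile', 'ContainsRestrictedFileFormat',
    'ContainsMacros', 'ContainsXmlExternalEntities',
    'ContainsInsecureDeserialization', 'ContainsHtml',
    'ContainsUnsafeArchive', 'ContainsOleEmbeddedObject'
  ];
  var flags = flagFields.filter(function (name) {
    return result[name] === true;
  });
  // FoundViruses may be missing or null; treat that as "none reported".
  var viruses = (result.FoundViruses || []).map(function (v) {
    return v.VirusName;
  });
  return { clean: result.CleanResult === true, flags: flags, viruses: viruses };
}
```

For the example response above, `flags` would contain `ContainsMacros` and `ContainsOleEmbeddedObject`, and `clean` would be `false`.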
Every major Office format, PDF format, and over 100 unique image formats are supported in this request, so we’re not limited to versions of Excel.