How to Check GCP Storage Buckets for Obfuscated Threats using Node.js
Files uploaded to cloud storage often remain unmodified and unopened for long periods of time. If threat actors successfully upload obfuscated threats to our storage locations, it might be weeks or even months before that content is downloaded to a new location and executed by an unsuspecting person or program.
With that in mind, it’s good practice to regularly check files in storage for hidden malware threats and establish strict rules against file types that are both unnecessary and inherently threatening. Even if we can’t always use antivirus software to detect obfuscated malicious code within a file, we can still identify threatening file types by verifying the makeup of the file. We can, for example, identify if a file is executable, contains scripts, has insecure links and objects embedded within it, and more.
Using the code below, we can take advantage of a free API designed to virus scan files stored in our Google Cloud Storage instances. We can also set custom rules in our API request parameters to block various threatening file types through in-depth content verification. For an extremely stringent approach, we can even whitelist acceptable file types by extension in a comma-separated string (e.g., ‘.pdf,.docx’) to disallow any files that fail to match our specific list. Files are scanned in memory and the data is released once the scan completes, so the process is fast and file contents are not retained afterward.
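As a minimal sketch of the whitelist idea, the helper below turns an array of extensions into the comma-separated string format described above. The name buildRestrictFileTypes is hypothetical and not part of any SDK; it only tidies up input we supply ourselves.

```javascript
// Hypothetical helper (not part of the Cloudmersive SDK): builds a
// comma-separated extension list such as '.pdf,.docx' from an array.
function buildRestrictFileTypes(extensions) {
  var seen = {};
  return extensions
    .map(function (ext) {
      // Remove surrounding whitespace from each entry.
      return String(ext).trim();
    })
    .filter(function (ext) {
      // Drop empty entries.
      return ext.length > 0;
    })
    .map(function (ext) {
      // Make sure every entry starts with a dot.
      return ext.charAt(0) === '.' ? ext : '.' + ext;
    })
    .filter(function (ext) {
      // Keep only the first occurrence of each extension.
      if (seen[ext]) {
        return false;
      }
      seen[ext] = true;
      return true;
    })
    .join(',');
}

console.log(buildRestrictFileTypes(['pdf', ' .docx ', 'pdf']));
```

The resulting string could then be passed as the restrictFileTypes request parameter; an empty array yields an empty string, which leaves the restriction disabled.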
To authorize our requests, we’ll just need a free-tier API key — this will allow a limit of 800 file scans per month with no commitment (great for smaller scale storage).
We can structure our API call in a few easy steps. To start, we’ll need to install the SDK — we can do that either by running the below command:
npm install cloudmersive-virus-api-client --save
Or by adding this snippet to our package.json:
"dependencies": {
  "cloudmersive-virus-api-client": "^1.1.9"
}
Next, we should retrieve a few of the GCP storage details we’ll need to complete our request. We’ll need the following information:
- Bucket Name: the name of our GCP storage bucket
- Object Name: the name of the object or file we’re scanning in GCP storage (if the object name contains Unicode characters, we’ll need to base64 encode the object name and prepend it with ‘base64:’)
- JSON Credential File: our Service Account credential for GCP stored in a JSON file
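The object name rule can be sketched as a small helper. This is an illustration only: encodeObjectName is a hypothetical name, and it assumes the object name should be encoded from its UTF-8 bytes before the prefix is added.

```javascript
// Hypothetical helper: returns the object name unchanged when it is plain
// ASCII; otherwise base64 encodes its UTF-8 bytes (an assumption about the
// expected encoding) and prepends the 'base64:' prefix.
function encodeObjectName(objectName) {
  var isAscii = /^[\x00-\x7F]*$/.test(objectName);
  if (isAscii) {
    return objectName;
  }
  return 'base64:' + Buffer.from(objectName, 'utf8').toString('base64');
}

console.log(encodeObjectName('reports/2023.pdf'));     // reports/2023.pdf
console.log(encodeObjectName('\u9555'.repeat(6)));     // base64:6ZWV6ZWV6ZWV6ZWV6ZWV6ZWV
```

The returned value can be used directly as the object name in the scan request.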
Once we have all that, we can copy the below code into our file and customize boolean request parameters to specify content threat rules.
var CloudmersiveVirusApiClient = require('cloudmersive-virus-api-client');
var defaultClient = CloudmersiveVirusApiClient.ApiClient.instance;
// Configure API key authorization: Apikey
var Apikey = defaultClient.authentications['Apikey'];
Apikey.apiKey = 'YOUR API KEY';
var apiInstance = new CloudmersiveVirusApiClient.ScanCloudStorageApi();
var bucketName = "bucketName_example"; // String | Name of the bucket in Google Cloud Storage
var objectName = "objectName_example"; // String | Name of the object or file in Google Cloud Storage. If the object name contains Unicode characters, you must base64 encode the object name and prepend it with 'base64:', such as: 'base64:6ZWV6ZWV6ZWV6ZWV6ZWV6ZWV'.
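var fs = require('fs'); // Node's built-in file system module, needed to read the credential file
var jsonCredentialFile = fs.readFileSync("C:\\temp\\inputfile"); // File | Service Account credential for Google Cloud stored in a JSON file. readFileSync already returns a Buffer.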
var opts = {
  'allowExecutables': true, // Boolean | Set to false to block executable files (program code) from being allowed in the input file. Default is false (recommended).
  'allowInvalidFiles': true, // Boolean | Set to false to block invalid files, such as a PDF file that is not really a valid PDF file, or a Word Document that is not a valid Word Document. Default is false (recommended).
  'allowScripts': true, // Boolean | Set to false to block script files, such as PHP files, Python scripts, and other malicious content or security threats that can be embedded in the file. Set to true to allow these file types. Default is false (recommended).
  'allowPasswordProtectedFiles': true, // Boolean | Set to false to block password protected and encrypted files, such as encrypted zip and rar files, and other files that seek to circumvent scanning through passwords. Set to true to allow these file types. Default is false (recommended).
  'allowMacros': true, // Boolean | Set to false to block macros and other threats embedded in document files, such as Word, Excel and PowerPoint embedded Macros, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
  'allowXmlExternalEntities': true, // Boolean | Set to false to block XML External Entities and other threats embedded in XML files, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
  'restrictFileTypes': "restrictFileTypes_example" // String | Specify a restricted set of file formats to allow as clean, as a comma-separated list of file formats. For example, .pdf,.docx,.png would allow only PDF, PNG and Word document files. All files must pass content verification against this list of file formats; if they do not, the result will be returned as CleanResult=false. Set restrictFileTypes parameter to null or empty string to disable; default is disabled.
};
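var callback = function(error, data, response) {
  if (error) {
    console.error(error);
  } else {
    // Serialize the result so its fields are printed instead of "[object Object]"
    console.log('API called successfully. Returned data: ' + JSON.stringify(data, null, 2));
  }
};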
apiInstance.scanCloudStorageScanGcpStorageFileAdvanced(bucketName, objectName, jsonCredentialFile, opts, callback);
It’s important to note that this request also performs an in-depth virus and malware scan, so we’ll be able to find out whether any of a wide range of potential threats is present in a given file.
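Once a scan returns, the result object can be reduced to a short verdict. The sketch below is an illustration under stated assumptions: CleanResult appears in the parameter comments above, but the other field names used here (FoundViruses, VirusName, and the Contains... flags) are assumptions about the response shape and should be checked against the API documentation. summarizeScanResult is a hypothetical helper, not part of the SDK.

```javascript
// Hypothetical helper: condenses a scan result into a clean/not-clean verdict
// plus a list of human-readable reasons.
// ASSUMPTION: every field name except CleanResult is a guess at the response
// shape and must be verified against the API documentation.
function summarizeScanResult(data) {
  if (!data) {
    return { clean: false, reasons: ['no result returned'] };
  }

  var flagLabels = {
    ContainsExecutable: 'executable content',
    ContainsInvalidFile: 'invalid file',
    ContainsScript: 'script content',
    ContainsPasswordProtectedFile: 'password protected file',
    ContainsMacros: 'macros',
    ContainsXmlExternalEntities: 'XML external entities',
    ContainsRestrictedFileFormat: 'restricted file format'
  };

  var reasons = [];

  // Collect a label for every content flag that is set to true.
  Object.keys(flagLabels).forEach(function (key) {
    if (data[key] === true) {
      reasons.push(flagLabels[key]);
    }
  });

  // Add one entry per reported virus, if any were found.
  (data.FoundViruses || []).forEach(function (virus) {
    reasons.push('virus: ' + virus.VirusName);
  });

  // Treat anything other than an explicit true as not clean.
  return { clean: data.CleanResult === true, reasons: reasons };
}

// Example with a mocked result object (not real API output).
var summary = summarizeScanResult({ CleanResult: false, ContainsScript: true });
console.log(summary.clean ? 'File is clean' : 'Blocked: ' + summary.reasons.join(', '));
```

Inside the callback, the same helper could be called with the data argument to decide whether a stored object should be quarantined or deleted.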