How to Scan Azure Blobs for Malicious Content in .NET Core

Cloudmersive
May 31, 2024

Using a free API, we can directly scan blobs in our Azure container for malicious content.

That includes malware AND specially crafted malicious files devised for targeted attacks on our network.

If we’re sending form uploads from a web application directly to an Azure container, we obviously need to check those files for malicious content before they get there.

It’s equally important, however, that we redouble our efforts and scan suspicious files directly in storage as well.

Viruses and malware aren’t the only threats we can expect to see in our Azure blobs; we might also encounter specially crafted malicious files (e.g., an executable disguised as a JPG, or a PDF with injected JavaScript) that were intentionally designed to bypass our initial antivirus (AV) policies. Other examples include scripts, unsafe archives (e.g., compressed threats or zip bombs), HTML, macros, and more.

For a sophisticated threat actor, obfuscating custom-written malicious code in a common file type is trivial, and it’s very worthwhile: files like these can cause just as much damage as virus- or malware-infected files, and they’re more likely to succeed in carrying out the threat actor’s planned attack once they successfully bypass AV checks at the network edge.

Rather than solely scanning files for viruses that aren’t necessarily present, we can instead identify specially crafted malicious content by comparing the contents of a file upload with the stringent file formatting standards laid out for that particular file type.

For example, if we compare a JS injection PDF with strict PDF formatting standards, we’ll find that the JS injection makes the file invalid — despite the invalid PDF bearing a legitimate PDF extension.

There are, of course, cases where JavaScript is included in a PDF for legitimate reasons (e.g., to allow for interactive PDF forms that communicate with an external server), but it’s highly unlikely we want this type of content to come from a form upload that accepts content from outside of our network. By categorically rejecting content like this, we can stay safe from a wide range of custom attacks.
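To make the idea of content verification concrete, here is a deliberately simplified sketch. It is not how the Cloudmersive API works internally; it only illustrates the principle of checking a file's bytes against what its extension claims. The class name `PdfContentCheck` and its methods are hypothetical, and the JavaScript check is a coarse heuristic (a real verifier would parse the PDF structure, including compressed streams).

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of any SDK.
public static class PdfContentCheck
{
    // PDF files begin with the ASCII header "%PDF-".
    private static readonly byte[] PdfHeader = Encoding.ASCII.GetBytes("%PDF-");

    // Returns true when the content starts with the PDF header bytes.
    public static bool LooksLikePdf(byte[] content)
    {
        if (content == null || content.Length < PdfHeader.Length) return false;
        for (int i = 0; i < PdfHeader.Length; i++)
        {
            if (content[i] != PdfHeader[i]) return false;
        }
        return true;
    }

    // Coarse heuristic: looks for the /JavaScript or /JS names in the raw bytes.
    // It misses names hidden in compressed streams and may over-match.
    public static bool MentionsJavaScript(byte[] content)
    {
        // Non-ASCII bytes become '?', which is fine for finding ASCII tokens.
        string text = Encoding.ASCII.GetString(content);
        return text.Contains("/JavaScript", StringComparison.Ordinal)
            || text.Contains("/JS", StringComparison.Ordinal);
    }

    // Rejects files that claim to be PDFs but either lack the PDF header
    // or appear to carry JavaScript. Other file types are out of scope here.
    public static bool ShouldReject(string fileName, byte[] content)
    {
        bool claimsPdf = fileName.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase);
        if (!claimsPdf) return false;
        return !LooksLikePdf(content) || MentionsJavaScript(content);
    }
}
```

Under this sketch, a Windows executable renamed to `invoice.pdf` fails the header check, and a genuine PDF carrying a `/JavaScript` action is rejected by the second check.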

To scan our blobs in Azure storage, we’ll first need to obtain some details from our Azure storage account. We’ll need the following:

1. Connection String (get from Access Keys tab of Storage Account blade)

2. Container Name

3. Blob Path (path to the blob within the Azure container)
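One way to keep these three values together, and to keep the connection string out of source code, is sketched below. The class name `AzureBlobScanSettings` and the environment variable names are assumptions made for illustration; any secure configuration source would work equally well.

```csharp
using System;

// Hypothetical settings holder for the three values listed above.
public sealed class AzureBlobScanSettings
{
    public string ConnectionString { get; }
    public string ContainerName { get; }
    public string BlobPath { get; }

    public AzureBlobScanSettings(string connectionString, string containerName, string blobPath)
    {
        ConnectionString = Require(connectionString, nameof(connectionString));
        ContainerName = Require(containerName, nameof(containerName));
        BlobPath = Require(blobPath, nameof(blobPath));
    }

    // Fails fast when a required value is missing or blank.
    private static string Require(string value, string name)
    {
        if (string.IsNullOrWhiteSpace(value))
            throw new ArgumentException(name + " must not be empty.", name);
        return value;
    }

    // Reads the account details from environment variables.
    // The variable names here are assumptions, not required by any SDK.
    public static AzureBlobScanSettings FromEnvironment(string blobPath)
    {
        return new AzureBlobScanSettings(
            Environment.GetEnvironmentVariable("AZURE_STORAGE_CONNECTION_STRING"),
            Environment.GetEnvironmentVariable("AZURE_BLOB_CONTAINER_NAME"),
            blobPath);
    }
}
```

Calling `AzureBlobScanSettings.FromEnvironment("folder/report.pdf")` throws an `ArgumentException` if either variable is unset, so a misconfigured deployment is caught before any scan request is made.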

With that information ready to go, we can structure our API call in a few quick steps using complimentary, ready-to-run C#/.NET Core examples. First, however, we should grab a free Cloudmersive API key to authorize our requests; this will allow us to make up to 800 in-storage scans per month with zero additional commitments.

We can install the SDK via NuGet by running the following command in our Package Manager console:

Install-Package Cloudmersive.APIClient.NETCore.VirusScan -Version 2.0.4

Then, we can copy the example below to add our namespace imports and structure our in-storage scanning function call:

using System;
using System.Diagnostics;
using Cloudmersive.APIClient.NETCore.VirusScan.Api;
using Cloudmersive.APIClient.NETCore.VirusScan.Client;
using Cloudmersive.APIClient.NETCore.VirusScan.Model;

namespace Example
{
    public class ScanCloudStorageScanAzureBlobAdvancedExample
    {
        public static void Main()
        {
            // Configure API key authorization: Apikey
            Configuration.Default.AddApiKey("Apikey", "YOUR_API_KEY");

            var apiInstance = new ScanCloudStorageApi();
            var connectionString = "YOUR_CONNECTION_STRING"; // string | Connection string for the Azure Blob Storage Account; you can get this connection string from the Access Keys tab of the Storage Account blade in the Azure Portal.
            var containerName = "YOUR_CONTAINER_NAME"; // string | Name of the Blob container within the Azure Blob Storage account
            var blobPath = "hello.pdf"; // string | Path to the blob within the container, such as 'hello.pdf' or '/folder/subfolder/world.pdf'. If the blob path contains Unicode characters, you must base64 encode the blob path and prepend it with 'base64:', such as: 'base64:6ZWV6ZWV6ZWV6ZWV6ZWV6ZWV'.
            // The flags below are set to false, the recommended default, so that each threat category is blocked.
            var allowExecutables = false; // bool? | Set to false to block executable files (program code) from being allowed in the input file. Default is false (recommended). (optional)
            var allowInvalidFiles = false; // bool? | Set to false to block invalid files, such as a PDF file that is not really a valid PDF file, or a Word Document that is not a valid Word Document. Default is false (recommended). (optional)
            var allowScripts = false; // bool? | Set to false to block script files, such as PHP files, Python scripts, and other malicious content or security threats that can be embedded in the file. Set to true to allow these file types. Default is false (recommended). (optional)
            var allowPasswordProtectedFiles = false; // bool? | Set to false to block password protected and encrypted files, such as encrypted zip and rar files, and other files that seek to circumvent scanning through passwords. Set to true to allow these file types. Default is false (recommended). (optional)
            var allowMacros = false; // bool? | Set to false to block macros and other threats embedded in document files, such as Word, Excel and PowerPoint embedded Macros, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended). (optional)
            var allowXmlExternalEntities = false; // bool? | Set to false to block XML External Entities and other threats embedded in XML files, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended). (optional)
            var restrictFileTypes = ""; // string | Comma-separated list of file formats to allow as clean; for example, .pdf,.docx,.png would allow only PDF, PNG and Word document files. All files must pass content verification against this list; if they do not, the result will be returned as CleanResult=false. Set to null or empty string to disable; default is disabled. (optional)

            try
            {
                // Advanced Scan an Azure Blob for viruses
                CloudStorageAdvancedVirusScanResult result = apiInstance.ScanCloudStorageScanAzureBlobAdvanced(connectionString, containerName, blobPath, allowExecutables, allowInvalidFiles, allowScripts, allowPasswordProtectedFiles, allowMacros, allowXmlExternalEntities, restrictFileTypes);
                Debug.WriteLine(result);
            }
            catch (Exception e)
            {
                Debug.Print("Exception when calling ScanCloudStorageApi.ScanCloudStorageScanAzureBlobAdvanced: " + e.Message);
            }
        }
    }
}

That’s all the code we’ll need!

We can incorporate this API call into our .NET Core application to retrieve and scan blobs from our Azure container.
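One detail from the blobPath parameter comment deserves a small helper: blob paths containing Unicode characters must be base64 encoded and prefixed with 'base64:'. The sketch below assumes the path is encoded as UTF-8 before base64 encoding, which is consistent with the sample value in that comment but is not stated explicitly there. The class name `BlobPathEncoder` is hypothetical.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper; not part of the Cloudmersive SDK.
public static class BlobPathEncoder
{
    // Returns ASCII-only paths unchanged; otherwise returns
    // "base64:" followed by the base64 of the UTF-8 bytes (UTF-8 is an assumption).
    public static string Encode(string blobPath)
    {
        if (blobPath == null) throw new ArgumentNullException(nameof(blobPath));

        bool isAscii = blobPath.All(c => c <= 0x7F);
        if (isAscii) return blobPath;

        byte[] utf8Bytes = Encoding.UTF8.GetBytes(blobPath);
        return "base64:" + Convert.ToBase64String(utf8Bytes);
    }
}
```

We could then pass `BlobPathEncoder.Encode(blobPath)` as the blob path argument, so that plain paths like 'hello.pdf' go through untouched and only Unicode paths are encoded.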

As for our API response, we can review the below JSON model to understand the extent of our threat diagnostic:

{
  "Successful": true,
  "CleanResult": true,
  "ContainsExecutable": true,
  "ContainsInvalidFile": true,
  "ContainsScript": true,
  "ContainsPasswordProtectedFile": true,
  "ContainsRestrictedFileFormat": true,
  "ContainsMacros": true,
  "VerifiedFileFormat": "string",
  "FoundViruses": [
    {
      "FileName": "string",
      "VirusName": "string"
    }
  ],
  "ErrorDetailedDescription": "string",
  "FileSize": 0,
  "ContentInformation": {
    "ContainsJSON": true,
    "ContainsXML": true,
    "ContainsImage": true,
    "RelevantSubfileName": "string"
  }
}
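To show how the fields in this model might drive a decision, here is a minimal sketch that parses a response body and lists the reasons a blob was flagged. It uses System.Text.Json and local classes (`ScanResultDto`, `FoundVirus`, `ScanResultInterpreter`) that are hypothetical stand-ins mirroring the JSON above; in a real application we would more likely read the same properties from the SDK's result object.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical DTOs mirroring a subset of the JSON model above.
public class FoundVirus
{
    public string FileName { get; set; }
    public string VirusName { get; set; }
}

public class ScanResultDto
{
    public bool Successful { get; set; }
    public bool CleanResult { get; set; }
    public bool ContainsExecutable { get; set; }
    public bool ContainsInvalidFile { get; set; }
    public bool ContainsScript { get; set; }
    public bool ContainsPasswordProtectedFile { get; set; }
    public bool ContainsRestrictedFileFormat { get; set; }
    public bool ContainsMacros { get; set; }
    public List<FoundVirus> FoundViruses { get; set; }
    public string ErrorDetailedDescription { get; set; }
}

public static class ScanResultInterpreter
{
    // Returns a human-readable list of findings; an empty list means nothing was flagged.
    public static List<string> ListFindings(string json)
    {
        // Fields not declared on the DTO (e.g. FileSize) are ignored by default.
        ScanResultDto result = JsonSerializer.Deserialize<ScanResultDto>(json);
        var findings = new List<string>();

        if (result == null || !result.Successful)
        {
            findings.Add("scan did not complete: " + result?.ErrorDetailedDescription);
            return findings;
        }

        if (result.ContainsExecutable) findings.Add("executable");
        if (result.ContainsInvalidFile) findings.Add("invalid file");
        if (result.ContainsScript) findings.Add("script");
        if (result.ContainsPasswordProtectedFile) findings.Add("password-protected file");
        if (result.ContainsRestrictedFileFormat) findings.Add("restricted file format");
        if (result.ContainsMacros) findings.Add("macros");

        foreach (FoundVirus virus in result.FoundViruses ?? new List<FoundVirus>())
        {
            findings.Add("virus: " + virus.VirusName + " in " + virus.FileName);
        }

        // Treat CleanResult as the overall verdict even when no specific flag is set.
        if (!result.CleanResult && findings.Count == 0) findings.Add("not clean (no specific flag set)");
        return findings;
    }
}
```

A caller could quarantine or delete the blob whenever the returned list is non-empty, and log the entries to explain why.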

In addition to verifying blob contents, each blob will also be checked against a continuously updated list of more than 17 million virus and malware signatures. This will help ensure we’re able to identify a very broad range of potential threats in a single request.
