How to Block Invalid File Uploads in PHP

Cloudmersive
4 min read · May 7, 2024

Invalid files, such as a file with a .pdf extension that isn't actually a well-formed PDF, can pose a serious risk to our systems.

Sophisticated threat actors often craft malicious files to exploit zero-day vulnerabilities in our file rendering & processing applications (even including our web browsers). These attacks can catch us completely off guard and lead to a variety of disastrous outcomes.

Most of the time, specially crafted malicious files will bypass a traditional antivirus scan (by design). They won’t, however, bypass a thorough deterministic scan that verifies the file contents against the expected format.
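To make the idea of checking contents against the expected format concrete, here is a minimal local sketch for PDFs. It only looks for the `%PDF-` header and an `%%EOF` marker near the end of the file, which is far less than a thorough content verification does, and it is not a description of how the API works internally. The function name `looksLikePdf` is hypothetical.

```php
<?php
// Illustrative only: a tiny structural check for PDFs.
function looksLikePdf(string $bytes): bool
{
    // A well-formed PDF normally starts with the "%PDF-" header...
    if (strncmp($bytes, '%PDF-', 5) !== 0) {
        return false;
    }
    // ...and carries an "%%EOF" marker near the end of the file.
    $tail = substr($bytes, -1024);
    return strpos($tail, '%%EOF') !== false;
}

// A text file renamed to "resume.pdf" fails, regardless of its extension.
var_dump(looksLikePdf("%PDF-1.7\n...objects...\n%%EOF\n"));   // bool(true)
var_dump(looksLikePdf("just some text saved as resume.pdf")); // bool(false)
```

A check like this catches renamed files, but a crafted file can easily satisfy both markers, which is why full format verification matters.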

Using a free threat detection API, we can rigorously verify file uploads in our PHP form to ensure they conform with file formatting standards. The API supports dozens of common file formats, including all major Office formats, PDFs, and 100+ image formats, so we can cover the typical file upload scenarios (e.g., resume uploads, profile picture uploads, insurance claim uploads).
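Before a form upload is scanned, we need the path of the temporary file PHP stored for it. The sketch below shows one way to pull that path out of a single `$_FILES` entry while rejecting failed or empty uploads. The helper name `uploadedTempPath` and the form field name `resume` are hypothetical.

```php
<?php
// Sketch: return the temp path for one $_FILES entry, or null if unusable.
function uploadedTempPath(array $file): ?string
{
    // Reject entries PHP flagged as failed or missing.
    if (($file['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return null;
    }
    // Empty uploads have nothing to verify.
    if (($file['size'] ?? 0) <= 0 || empty($file['tmp_name'])) {
        return null;
    }
    return $file['tmp_name'];
}

// In a real handler ("resume" is a hypothetical field name):
// $path = uploadedTempPath($_FILES['resume'] ?? []);
```

In a live request handler it is also worth confirming the path with PHP's built-in `is_uploaded_file()` before passing it on for scanning.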

We won’t have to worry about virus- and malware-infected files, either. This API will reference file contents against a regularly updated list of nearly 20 million virus and malware signatures, so we’ll know if established malware families are hiding within file uploads as well.

To structure our API call, we need to begin by installing the PHP client. We can install with Composer by executing the below command from the command line:

composer require cloudmersive/cloudmersive_virusscan_api_client

Next, we need to authorize our API calls. A free Cloudmersive API key allows up to 800 API calls per month with no additional commitments.
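Rather than hardcoding the key in source, one option is to read it from the environment. This is a minimal sketch; the variable name `CLOUDMERSIVE_API_KEY` and the helper `loadApiKey` are assumptions made for illustration, not names defined by the client library.

```php
<?php
// Assumption: the key lives in an environment variable named
// CLOUDMERSIVE_API_KEY (a hypothetical name chosen for this sketch).
function loadApiKey(string $name = 'CLOUDMERSIVE_API_KEY'): string
{
    $key = getenv($name);
    // Fail loudly if the variable is missing or blank.
    if ($key === false || trim($key) === '') {
        throw new RuntimeException("Environment variable {$name} is not set.");
    }
    return trim($key);
}
```

The returned string can then be supplied wherever the 'YOUR_API_KEY' placeholder appears in the configuration step.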

We can now call the scanFileAdvanced function using the ready-to-run PHP code example below. To specifically call out and block invalid file uploads of any kind, make sure the $allow_invalid_files threat rule is set to false:

<?php
require_once(__DIR__ . '/vendor/autoload.php');

// Configure API key authorization: Apikey
$config = Swagger\Client\Configuration::getDefaultConfiguration()->setApiKey('Apikey', 'YOUR_API_KEY');

$apiInstance = new Swagger\Client\Api\ScanApi(
    new GuzzleHttp\Client(),
    $config
);
$input_file = "/path/to/inputfile"; // \SplFileObject | Input file to perform the operation on.
$allow_executables = true; // bool | Set to false to block executable files (program code) from being allowed in the input file. Default is false (recommended).
$allow_invalid_files = false; // bool | Set to false to block invalid files, such as a PDF file that is not really a valid PDF file, or a Word Document that is not a valid Word Document. Default is false (recommended).
$allow_scripts = true; // bool | Set to false to block script files, such as PHP files, Python scripts, and other malicious content or security threats that can be embedded in the file. Set to true to allow these file types. Default is false (recommended).
$allow_password_protected_files = true; // bool | Set to false to block password protected and encrypted files, such as encrypted zip and rar files, and other files that seek to circumvent scanning through passwords. Set to true to allow these file types. Default is false (recommended).
$allow_macros = true; // bool | Set to false to block macros and other threats embedded in document files, such as Word, Excel and PowerPoint embedded Macros, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
$allow_xml_external_entities = true; // bool | Set to false to block XML External Entities and other threats embedded in XML files, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
$allow_insecure_deserialization = true; // bool | Set to false to block Insecure Deserialization and other threats embedded in JSON and other object serialization files, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
$allow_html = true; // bool | Set to false to block HTML input in the top level file; HTML can contain XSS, scripts, local file accesses and other threats. Set to true to allow these file types. Default is false (recommended) [for API keys created prior to the release of this feature default is true for backward compatibility].
$restrict_file_types = ""; // string | Specify a restricted set of file formats to allow as clean, as a comma-separated list; for example, .pdf,.docx,.png would allow only PDF, Word document, and PNG files. All files must pass content verification against this list; if they do not, the result is returned as CleanResult=false. Set to null or an empty string to disable (default is disabled).

try {
    $result = $apiInstance->scanFileAdvanced($input_file, $allow_executables, $allow_invalid_files, $allow_scripts, $allow_password_protected_files, $allow_macros, $allow_xml_external_entities, $allow_insecure_deserialization, $allow_html, $restrict_file_types);
    print_r($result);
} catch (Exception $e) {
    echo 'Exception when calling ScanApi->scanFileAdvanced: ', $e->getMessage(), PHP_EOL;
}
?>
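The print_r call above dumps the whole response, but an upload handler usually needs a simple accept or reject decision. The sketch below assumes the response has been decoded into an associative array. `CleanResult` is the flag named in the parameter notes above; `ContainsInvalidFile` is an assumed field name used only to pick a more specific message, and `uploadDecision` is a hypothetical helper.

```php
<?php
// Sketch: turn a scan response into an accept/reject decision.
// Assumption: $scan is the response decoded into an array; the
// "ContainsInvalidFile" key is an assumed field name.
function uploadDecision(array $scan): array
{
    // Fail closed: a missing or non-true CleanResult means "reject".
    if (($scan['CleanResult'] ?? false) !== true) {
        $reason = ($scan['ContainsInvalidFile'] ?? false) === true
            ? 'File content does not match a valid file format.'
            : 'File failed the threat scan.';
        return ['accepted' => false, 'reason' => $reason];
    }
    return ['accepted' => true, 'reason' => null];
}

print_r(uploadDecision(['CleanResult' => false, 'ContainsInvalidFile' => true]));
```

Treating a missing flag as a rejection keeps the handler safe if the response shape differs from what this sketch assumes.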

That’s all there is to it — now we can block invalid file uploads using a deterministic threat scanning solution.
