Sitemap

How to Prevent Malicious .EPUB File Uploads in Java

4 min readMay 1, 2025

--

If you’re developing applications in the digital publishing space, there’s a good chance you’ve worked with .epub files. They seem simple enough on the outside, but they aren’t just simple text containers — they’re zipped archives filled with HTML, CSS, images… and sometimes even embedded scripts.

Understanding EPUB Threats

The complexity of .epub files makes them vulnerable to malicious content.

Malicious content in an .epub can take several forms, including JavaScript exploits and malformed file structures — or even embedded payloads targeting e-reader software.

Malware hidden in the contents of an .epub can be activated automatically during parsing or preview generation, potentially compromising the system handling the file. That can lead to disastrous consequences; we don’t want to be stuck picking up the pieces from a sudden, unexpected attack.

If you’re building or maintaining a Java application that accepts.epub uploads — perhaps for a reading platform, a digital library, or even a publishing format conversion service — scanning these files for threats in-depth before processing them is essential.

Scanning EPUB Files with Full Content Verification

Using the below code, we can implement a free API that scans dozens of file types — including .epub files — for viruses, malware, and insecure content like scripts, executables, etc. This will ensure files rigorously conform with file formatting standards and protect you from embedded malicious content.

It’s extremely straightforward and easy to use. We can start by adding a reference to our pom.xml repository:

<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
</repository>
</repositories>

Following that, we can add a reference to our pom.xml dependency:

<dependencies>
<dependency>
<groupId>com.github.Cloudmersive</groupId>
<artifactId>Cloudmersive.APIClient.Java</artifactId>
<version>v4.25</version>
</dependency>
</dependencies>

Next we can add the imports to the top of our file:

// Import classes:
//import com.cloudmersive.client.invoker.ApiClient;
//import com.cloudmersive.client.invoker.ApiException;
//import com.cloudmersive.client.invoker.Configuration;
//import com.cloudmersive.client.invoker.auth.*;
//import com.cloudmersive.client.ScanApi;

Finally, we can set up API key authentication (we’ll need a free API key) and scan options, then scan files for threats and print or log our results:

ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");

ScanApi apiInstance = new ScanApi();
File inputFile = new File("/path/to/inputfile"); // File | Input file to perform the operation on.
Boolean allowExecutables = true; // Boolean | Set to false to block executable files (program code) from being allowed in the input file. Default is false (recommended).
Boolean allowInvalidFiles = true; // Boolean | Set to false to block invalid files, such as a PDF file that is not really a valid PDF file, or a Word Document that is not a valid Word Document. Default is false (recommended).
Boolean allowScripts = true; // Boolean | Set to false to block script files, such as a PHP files, Python scripts, and other malicious content or security threats that can be embedded in the file. Set to true to allow these file types. Default is false (recommended).
Boolean allowPasswordProtectedFiles = true; // Boolean | Set to false to block password protected and encrypted files, such as encrypted zip and rar files, and other files that seek to circumvent scanning through passwords. Set to true to allow these file types. Default is false (recommended).
Boolean allowMacros = true; // Boolean | Set to false to block macros and other threats embedded in document files, such as Word, Excel and PowerPoint embedded Macros, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
Boolean allowXmlExternalEntities = true; // Boolean | Set to false to block XML External Entities and other threats embedded in XML files, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
Boolean allowInsecureDeserialization = true; // Boolean | Set to false to block Insecure Deserialization and other threats embedded in JSON and other object serialization files, and other files that contain embedded content threats. Set to true to allow these file types. Default is false (recommended).
Boolean allowHtml = true; // Boolean | Set to false to block HTML input in the top level file; HTML can contain XSS, scripts, local file accesses and other threats. Set to true to allow these file types. Default is false (recommended) [for API keys created prior to the release of this feature default is true for backward compatability].
String restrictFileTypes = "restrictFileTypes_example"; // String | Specify a restricted set of file formats to allow as clean as a comma-separated list of file formats, such as .pdf,.docx,.png would allow only PDF, PNG and Word document files. All files must pass content verification against this list of file formats, if they do not, then the result will be returned as CleanResult=false. Set restrictFileTypes parameter to null or empty string to disable; default is disabled.
try {
VirusScanAdvancedResult result = apiInstance.scanFileAdvanced(inputFile, allowExecutables, allowInvalidFiles, allowScripts, allowPasswordProtectedFiles, allowMacros, allowXmlExternalEntities, allowInsecureDeserialization, allowHtml, restrictFileTypes);
System.out.println(result);
} catch (ApiException e) {
System.err.println("Exception when calling ScanApi#scanFileAdvanced");
e.printStackTrace();
}

That’s all there is to it! We can expect API responses to follow the below model:

{
"CleanResult": true,
"ContainsExecutable": true,
"ContainsInvalidFile": true,
"ContainsScript": true,
"ContainsPasswordProtectedFile": true,
"ContainsRestrictedFileFormat": true,
"ContainsMacros": true,
"ContainsXmlExternalEntities": true,
"ContainsInsecureDeserialization": true,
"ContainsHtml": true,
"ContainsUnsafeArchive": true,
"ContainsOleEmbeddedObject": true,
"VerifiedFileFormat": "string",
"FoundViruses": [
{
"FileName": "string",
"VirusName": "string"
}
],
"ContentInformation": {
"ContainsJSON": true,
"ContainsXML": true,
"ContainsImage": true,
"RelevantSubfileName": "string",
"IsAuthenticodeSigned": true
}
}

--

--

Cloudmersive
Cloudmersive

Written by Cloudmersive

There’s an API for that. Cloudmersive is a leader in Highly Scalable Cloud APIs.

No responses yet